c++ - Non-blocking VSync in Direct3D

Monday, March 12, 2018


I have a Direct3D application which uses PRESENTATION_INTERVAL_ONE. Unfortunately, it blocks and eats my CPU while waiting inside Present. I ended up searching for raster scan (BitBlt) solutions and sample code from the Internet, but none of them worked well: they all failed to get the timing exactly right prior to the Present call.

Question

So, the ideal solution for me would be to use PRESENTATION_INTERVAL_ONE (or, if that's not possible, IMMEDIATE) and call a function:

// blabla code

waitForVsync(); //loop here with sleeps or whatever, freeing my CPU
()->Present();  //just in time, no blocking wait, yay!

And upon return I call Present, which happens, let's say, 1ms (for _ONE) or 0ms (for _IMMEDIATE) before the physical top scanline.

I know I am asking a lot and do not hope for a solution in 2011, as Microsoft has also failed to improve in this area.

To clarify the issue, here is a link to a more detailed analysis of the problem: http://www.virtualdub.org/blog/pivot/entry.php?id=157

Answer

It seems to me that you're trying to achieve graceful co-existence with other processes in a PC game? I.e. to not occupy the entirety of the CPU all the time, so as to allow other processes to run (and, if you're running on a laptop, to allow speed-stepping to power down the CPU and preserve battery life)?

I've certainly tried to achieve this in the past, and never fully settled on a solution I liked. I never tried Crowley9's method of using the DONOT_WAIT flag, but to me that smells like closer to the right solution. If Present only blocks when the present can't be queued (because there's already another present queued), then you don't need to call Present 'just before' the VSync happens, you just need to sleep until it's likely that another call to Present will succeed (which will be any time after the next VSync).
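A minimal sketch of that polling idea, assuming the present call can be wrapped in a function that reports whether it was queued. The names tryPresent and presentWhenReady are hypothetical; in Direct3D 9 the wrapper would presumably call the swap chain's Present with the do-not-wait flag and treat D3DERR_WASSTILLDRAWING as "not yet", but that wiring is not shown here.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical wrapper: returns true once the present was queued, false if
// the runtime reported that it would have had to block.
using TryPresent = std::function<bool()>;

// Polls tryPresent, sleeping between attempts instead of blocking inside
// Present. Returns the number of attempts used, or 0 if maxAttempts ran out.
int presentWhenReady(const TryPresent& tryPresent,
                     std::chrono::milliseconds pollInterval,
                     int maxAttempts)
{
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (tryPresent()) {
            return attempt;
        }
        // Give the CPU back to the OS before trying again.
        std::this_thread::sleep_for(pollInterval);
    }
    return 0;
}
```

The poll interval trades latency against CPU usage, and it is subject to the same Sleep granularity problems discussed further down.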

The root problem is that you don't really know when the VSync occurs, and a system like that needs to stay synchronised.

The basic gist of my solution was: don't use Present to find out where the VSync boundary is, do the timing yourself, and make sure you're calling Present as close to the appropriate time as possible. If you're late, it won't matter because of the internal buffering D3D does, and if you're early then it will simply block and the worst you'll incur is some extra CPU usage.

What I did do was to use (microsecond accurate) timers, track the elapsed frametime, and sleep accordingly. There's no magic flag or function I know of to say 'wake me when VSync's about to happen', because there isn't an event available that is signalled when a VSync is about to happen. So instead I looked at how long the frame had taken so far, figured out when I thought the next VSync would occur, subtracted a grace period (I think I left 3ms), and called ::Sleep.
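The arithmetic behind that can be sketched as below. The function name sleepBudgetMs and its parameters are made up for illustration, and the refresh period is assumed to be known up front (about 16ms at 60Hz) rather than measured.

```cpp
#include <algorithm>
#include <cassert>

// Milliseconds to sleep before calling Present, given how long the frame has
// taken so far. periodMs is the assumed refresh period and graceMs is the
// safety margin kept before the expected VSync.
int sleepBudgetMs(int elapsedMs, int periodMs, int graceMs)
{
    assert(periodMs > 0 && graceMs >= 0 && elapsedMs >= 0);
    const int remainingMs = periodMs - elapsedMs;  // time until the expected VSync
    return std::max(0, remainingMs - graceMs);     // late frames get no sleep at all
}
```

With a 16ms period and a 3ms grace period, a frame that took 2ms gets a budget of 11ms, and anything that took 13ms or longer gets 0.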

E.g. if the frame processing took 2ms, I would get to the Present call early. If I called it straight away, I'd expect it to block (at 100% CPU) for 14ms. Instead, I'd call ::Sleep(11), and then the subsequent Present would only block for ~3ms. If the frame processing took longer than 13ms, I'd just call Present without sleeping. If it took longer than 16ms, I'd switch down to 30Hz rendering (with some hysteresis), and adjust the Sleep timings accordingly.
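One way the 60Hz/30Hz switch with hysteresis might look is sketched here. The class name RateSelector and all three thresholds (16ms period, 12ms recovery level, a streak of 3 frames) are illustrative assumptions, not values from the original implementation.

```cpp
#include <cassert>

// Picks a target frame period, only changing rate after several consecutive
// frames agree, so a single spike does not cause flip-flopping.
class RateSelector {
public:
    // Feed the processing time of the last frame; returns the target period in ms.
    int update(int frameMs)
    {
        assert(frameMs >= 0);
        if (frameMs > kFullPeriodMs) {
            ++slow_;
            fast_ = 0;
        } else if (frameMs < kRecoverMs) {
            ++fast_;
            slow_ = 0;
        } else {
            slow_ = 0;  // in the dead band: neither streak continues
            fast_ = 0;
        }
        if (!half_ && slow_ >= kStreak) {
            half_ = true;   // drop to half rate (30Hz)
            slow_ = 0;
        } else if (half_ && fast_ >= kStreak) {
            half_ = false;  // recover to full rate (60Hz)
            fast_ = 0;
        }
        return half_ ? 2 * kFullPeriodMs : kFullPeriodMs;
    }

private:
    static constexpr int kFullPeriodMs = 16;  // assumed 60Hz period
    static constexpr int kRecoverMs = 12;     // assumed "comfortably fast" level
    static constexpr int kStreak = 3;         // assumed frames needed to switch
    bool half_ = false;
    int slow_ = 0;
    int fast_ = 0;
};
```

The returned period would then be what the sleep calculation uses in place of a fixed 16ms.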

BUT I was never particularly happy with this solution, for several reasons. If you're running on a loaded system, Sleep is by no means guaranteed to give you control back within that 3ms window; you have no good control over the OS time-slicing.

Worse, the minimum sleep time is supposedly 10ms: anything less than that ends up sleeping for 10ms regardless of what you specify (in practice it's far more variable than that). So you also have to track how long you actually slept for, and factor that into your logic (so you avoid sleeping if you're running late). And then you end up calling Present late one frame, then skipping the sleep and calling it early the next. E.g.

0ms (0ms) : start
10ms (10ms) : process frame 1
10ms (20ms) : try to sleep for 3ms, end up sleeping 10ms
0ms (20ms) : present late, succeeds straight away
10ms (30ms) : process frame 2
2ms (32ms) : don't try to sleep, present early, blocks till the next vsync
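Tracking the real sleep duration, as opposed to the requested one, could look roughly like this. It uses std::this_thread::sleep_for as a portable stand-in for ::Sleep, and shouldSleep is a hypothetical policy, not something taken from the code described above.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Sleeps for roughly requestedMs and returns how long the call really took,
// in milliseconds. The scheduler may return late, so callers should use the
// measured value, not the requested one, when deciding what to do next.
double sleepAndMeasureMs(int requestedMs)
{
    assert(requestedMs >= 0);
    using Clock = std::chrono::steady_clock;
    const Clock::time_point before = Clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(requestedMs));
    const std::chrono::duration<double, std::milli> slept = Clock::now() - before;
    return slept.count();
}

// Hypothetical policy: only sleep if the budget still leaves room after
// subtracting the overshoot observed on the previous sleep.
bool shouldSleep(double budgetMs, double lastOvershootMs)
{
    return budgetMs - lastOvershootMs > 0.0;
}
```

In the timeline above, a 3ms request that actually took 10ms gives an overshoot of 7ms, so a 3ms budget on the next frame would be skipped.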

So for managing 60Hz rendering, the granularity of the time-slicing / Sleep function makes this a not great solution. And it also doesn't take into account the actual refresh rate of the system!
