Monday, October 29, 2018

Optimizing an XNA 2D game


Does it make sense to implement the logic to skip rendering objects outside the viewport or should I not care about it and let the Framework do it?



Answer



Culling is a performance optimisation, so it doesn't make sense to do it just for the sake of it. You need a reason.




The GPU (not the XNA Framework) culls triangles and pixels blindingly fast. Every triangle you submit must be transformed (via your vertex shader). Then the GPU culls the triangles that land off-screen. Then it fills the remaining triangles, culling pixels that are off-screen. The remaining pixels are then drawn to the back buffer (via your pixel shader).



(When I say "then" - it actually does all this in a massively parallel pipeline.)


So it is very rare that you need to cull individual triangles yourself. To hit vertex limits you have to be drawing an absurdly large number of triangles. To hit fill-rate, texture-fetch or pixel-shading limits, you generally need high depth complexity (in which case viewport/frustum culling will not help).




So there's usually little or no cost to having geometry off-screen.


The cost - particularly in the context of drawing "objects" (usually 3D objects) - is actually in submitting those objects to the GPU in the first place. Submit too many objects and you hit your batch limit (you get roughly a few thousand batches per frame).


I recommend reading this answer and this linked slide deck for an in-depth description of batches.


Because of this, implementing frustum culling can reduce the number of batches you submit to the GPU. If you're batch limited, this can get you under the limit.




Now - your question is about 2D XNA - so presumably you are using SpriteBatch. This is a bit different.


It is no mistake that it is called "Sprite Batch". What it is doing is taking the sprites you draw and doing the best it can to submit those sprites to the GPU in as few batches as possible by batching them together.



But SpriteBatch will be forced to start a new batch if:



  • You draw more sprites than it can fit into a single batch (2048 sprites in XNA 4)

  • You change texture (this is why there is a sort-by-texture option)
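The two rules above can be illustrated with a small model. This is a simplified, language-neutral sketch in Python, not XNA code: the function `count_batches` and the texture names are hypothetical, and it assumes sprites are submitted in draw order with a new batch started only on a texture change or when the 2048-sprite limit quoted above is reached.

```python
from typing import Iterable, Optional

# Per-batch sprite limit quoted in the text for XNA 4.
MAX_SPRITES_PER_BATCH = 2048


def count_batches(textures: Iterable[str],
                  max_per_batch: int = MAX_SPRITES_PER_BATCH) -> int:
    """Count the batches needed for a sequence of sprite draws.

    Each item is the (hypothetical) name of the texture used by one sprite.
    A new batch starts on the first sprite, on a texture change, or when
    the current batch is already full.
    """
    batches = 0
    current: Optional[str] = None
    in_batch = 0
    for tex in textures:
        if batches == 0 or tex != current or in_batch == max_per_batch:
            batches += 1
            current = tex
            in_batch = 0
        in_batch += 1
    return batches


# 1000 sprites that alternate between two textures.
draws = ["grass", "rock"] * 500

interleaved_batches = count_batches(draws)        # texture changes every sprite
sorted_batches = count_batches(sorted(draws))     # grouped by texture first
print(interleaved_batches, sorted_batches)
```

In this model the interleaved draw order needs one batch per sprite, while the same sprites grouped by texture need only two, which is the effect the sort-by-texture option is aiming for.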


So culling is a suitable optimisation if you are running into the first one: you are sending such a huge number of sprites that you end up with too many batches (you're probably using up bandwidth as well, but I'm pretty sure you'll hit the batch limit first). This will generally only happen if you have a truly enormous world, so you can generally get away with very simplistic, fast-but-inaccurate culling in this case.
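One way such a simplistic cull might look is sketched below. It is written in Python as a language-neutral illustration rather than XNA code, and the names `Rect`, `visible_sprites` and `margin` are hypothetical. It tests only each sprite's position against the camera view padded by a margin, assumed here to be at least the largest sprite dimension, so it keeps some sprites that are not actually visible but should not drop ones that are.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def inflate(self, margin: float) -> "Rect":
        """Return a copy grown by `margin` on every side."""
        return Rect(self.x - margin, self.y - margin,
                    self.width + 2 * margin, self.height + 2 * margin)

    def contains_point(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width and
                self.y <= py < self.y + self.height)


def visible_sprites(positions: Sequence[Tuple[float, float]],
                    camera_view: Rect,
                    margin: float) -> List[Tuple[float, float]]:
    """Keep sprites whose top-left position lies in the padded view.

    Sprite sizes are ignored on purpose: the margin stands in for them,
    which makes the test cheap but conservative.
    """
    padded = camera_view.inflate(margin)
    return [p for p in positions if padded.contains_point(p[0], p[1])]


view = Rect(0, 0, 800, 600)
world = [(100, 100), (-32, 50), (-100, 50), (5000, 5000), (820, 300)]
kept = visible_sprites(world, view, margin=64)
print(kept)
```

The sprite at (820, 300) is kept even though it starts past the right edge of the view. That is the "inaccurate" part: a few extra sprites are cheap, whereas an exact per-sprite test costs CPU time for little gain.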


Now, if you are drawing with enough texture-swaps to take you over the batch limit, and lots of those sprites are actually off-screen, and culling them would get you under the batch limit, then yes: culling is an optimisation that will work.


However - a better optimisation to pursue in this case is to use texture atlases (aka: sprite sheets). This allows you to reduce the number of texture-swaps and therefore batches - regardless of what is on screen or off. (This is the main reason you can specify a source rectangle for your sprites.)
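As a rough illustration of how a source rectangle selects one sprite from an atlas, here is a small Python sketch (language-neutral, not XNA code). It assumes a hypothetical atlas laid out as a uniform grid of equally sized tiles, numbered left to right and top to bottom; the function name `atlas_source_rect` and its parameters are made up for this example.

```python
from typing import Tuple


def atlas_source_rect(index: int, tile_width: int, tile_height: int,
                      columns: int) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of tile `index` in a grid atlas.

    Assumes tiles are packed with no padding, `columns` tiles per row.
    """
    if index < 0 or columns <= 0:
        raise ValueError("index must be >= 0 and columns must be > 0")
    column = index % columns
    row = index // columns
    return (column * tile_width, row * tile_height, tile_width, tile_height)


# A hypothetical atlas of 32x32 tiles, 8 tiles per row.
first = atlas_source_rect(0, 32, 32, 8)
end_of_row = atlas_source_rect(7, 32, 32, 8)
second_row = atlas_source_rect(9, 32, 32, 8)
print(first, end_of_row, second_row)
```

Because every sprite then draws from the same texture with a different source rectangle, consecutive draws no longer force a texture change.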




(As always: this is advice on performance optimisation. So you should measure and understand your game's own performance, and what limits you are hitting, before spending effort on adding optimisations.)


