Friday, September 29, 2017

optimization - "Optimal" game loop for 2D side-scroller


Is it possible to describe an "optimal" (in terms of performance) layout for a 2D side-scroller's game loop? In this context the "game loop" takes user input, updates the states of game objects and draws the game objects.


For example having a GameObject base class with a deep inheritance hierarchy could be good for maintenance... you can do something like the following:


foreach(GameObject g in gameObjects) g.update();

However I think this approach can create performance issues.



On the other hand all game objects' data and functions could be global. Which would be a maintenance headache but might be closer to an optimally performing game loop.


Any thoughts? I'm interested in practical applications of near optimal game loop structure... even if I get a maintenance headache in exchange for great performance.



Answer




For example having a GameObject base class with a deep inheritance hierarchy could be good for maintenance...



Actually, deep hierarchies are generally worse for maintainability than shallow ones, and the modern architectural style for game objects is trending towards shallowing, aggregation-based approaches.



However I think this approach can create performance issues. On the other hand all game objects' data and functions could be global. Which would be a maintenance headache but might be closer to an optimally performing game loop.




The loop you've shown potentially has performance issues, but not, as is implied by your subsequent statement, because you have instance data and member functions in the GameObject class. Rather, the problem is that you if you treat every object in the game as exactly the same, you probably aren't grouping those objects intelligently -- they're likely randomly scattered about throughout that list. Potentially, then, every call to the update method for that object (whether that method is a global function or not, and whether than object has instance data or "global data" floating around in some table you're indexing into or whatever) is different from the update call in the last loop iterations.


This can put increased pressure on the system as you may need to page the memory with the relevant function in and out of memory and re-fill the instruction cache more frequently, resulting in a slower loop. Whether or not this is observable by the naked eye (or even in a profiler) depends on exactly what is a considered a "game object," how many of them exist on average, and what else is going on in your application.


Component-oriented object systems are a popular trend right now, leveraging the philosophy that aggregation is preferable to inheritance. Such systems potentially allow you to split the "update" logic of components (where "component" is roughly defined as some unit of functionality, such as the thing that represents the physically-simulated portion of some object, which is processed by the physics system) on to multiple threads -- discriminated by component type -- if possible and desired, which could have a performance gain. At the very least you can organize the components such that all components of a given type update together, making optimal use of the CPU's cache. An example of a such component-oriented system is discussed in this thread.


Such systems are often highly decoupled, which is a boon to maintenance as well.


Data-oriented design is a related approach -- it's about orienting yourself around the data required of objects as a first priority, so that that data can be effectively processed in bulk (for example). This generally means an organization that tries to keep data used for the same purpose cluster together and operated on all at once. It is not fundamentally incompatible with OO design, and you can find some chatter on the subject here at GDSE in this question.


In effect, a more optimal approach to the game loop would be, instead of your original


foreach(GameObject g in gameObjects) g.update();

something more like


ProcessUserInput();

UpdatePhysicsForAllObjects();
UpdateScriptsForAllObjects();
UpdateRenderDataForAllObjects();
RenderEverything();

In such a world, each GameObject might have a pointer or reference to its own PhysicsData or Script or RenderData, for the cases where you might need to interact with objects on an individual basis, but the actual PhysicsData, Scripts, RenderData, et cetera would all be owned by their respective subsystems (physics simulator, script hosting environment, renderer) and processed in bulk as indicated above.


It is very important to note that this approach isn't a magic bullet and will not always yield performance improvement (although it is generally a better design than a deep inheritance tree). You're especially likely to notice basically no performance difference if you have very few objects, or very many objects you cannot effectively parallelize the updates for.


Unfortunately, there is no such magic loop that is the most optimal -- every game is different and may need performance tuning in different ways. So it's very important to measure (profile) things before you blindly go taking the advice of some random guy on the internet.


No comments:

Post a Comment

Simple past, Present perfect Past perfect

Can you tell me which form of the following sentences is the correct one please? Imagine two friends discussing the gym... I was in a good s...