I'm new to programming and want to speed up the collision detection in my C++ 2D platformer game engine.
Currently, only the player has collision detection: every frame, the player's update function iterates through every level tile (of which there can be many) and checks whether the player's hitbox collides with it.
I may be way off on this, but this seems like a really chunky way of doing it. Is this bad practice? If the level gets large enough, is iterating through this list going to take forever like it seems it should?
My second question builds off of that: if I start adding enemies to the engine, for example, and give them a similar method of collision detection, can I expect performance drops? I'd like some insight from more experienced people on a better way of setting this up. I would imagine that having ~20 enemies on screen, each iterating through a list of all possible collidable objects, could get slow fast.
Preliminary warning: optimisation is a dangerous kind of magic. You should always optimise as late as possible, because machine time is cheap and programmer time is not. Obfuscate your code for performance only when you know you need to; many small games manage just fine with the simple method you've described.
That said, here's the news:
The bad news is that you're right: giving larger inputs to any algorithm will eventually slow it down. Bigger problems simply take more work.
The good news is that different algorithms slow down at different rates. The rate at which an algorithm slows down with respect to the input size is called its computational complexity.
I'll first explain a bit of computational complexity theory (what "efficiency" means here), then some techniques to speed up your specific problem.
Let's start by making a slow algorithm, in a way no sane person would do it, because it's easier than making a fast one and better illustrates how to measure an algorithm's efficiency.
The way you're doing it, iterating through a list to find whether any tile collides with your player has linear complexity (often written as O(n)-complexity or order-n-complexity). This means that if you double the size of the input (how many tiles you have), that also doubles the time the algorithm takes to run.
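Concretely, the naive method looks something like this. The types and names here are invented for illustration; your engine's will differ:

```cpp
#include <vector>

// Hypothetical minimal axis-aligned bounding box.
struct AABB {
    float x, y, w, h;
};

// Overlap test between two axis-aligned rectangles.
bool overlaps(const AABB& a, const AABB& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// The O(n) approach: test the player against every tile, every frame.
bool playerCollides(const AABB& player, const std::vector<AABB>& tiles) {
    for (const AABB& tile : tiles) {
        if (overlaps(player, tile)) {
            return true;
        }
    }
    return false;
}
```

One overlap test per tile, so the work grows in direct proportion to the tile count.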
Iterating over a...
...1-element list: 1
...2-element list: 1 2
...3-element list: 1 2 3
...4-element list: 1 2 3 4
...5-element list: 1 2 3 4 5
...6-element list: 1 2 3 4 5 6
...7-element list: 1 2 3 4 5 6 7
Iterating over a large list of n elements this way has complexity O(n).
Some algorithms slow down even faster: for example, imagine that every time you wanted to access one of the tiles in your list, you had to start over and iterate to that tile from the beginning of the list. (No sane programmer would actually do this, but hey, it's a possible algorithm.)
Iterating over a whole list of different lengths would cause reads something like this:
Dumb-iterating over a...
...1-element list: 1
...2-element list: 1 1 2
...3-element list: 1 1 2 1 2 3
...4-element list: 1 1 2 1 2 3 1 2 3 4
...5-element list: 1 1 2 1 2 3 1 2 3 4 1 2 3 4 5
...6-element list: 1 1 2 1 2 3 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6
...7-element list: 1 1 2 1 2 3 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6 1 2 3 4 5 6 7
That blew up pretty fast in comparison. Given a list of n elements, this algorithm takes 0.5 * n * (n + 1) steps. In computing, we often shorten this to O(n^2), because the n-squared term dominates as the list grows.
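To make the counting concrete, here's a toy function (not something you'd ever ship) that tallies how many element reads the "restart from the front" scheme performs; its result matches the 0.5 * n * (n + 1) formula:

```cpp
#include <cstddef>
#include <vector>

// Counts the element reads made by the deliberately bad access pattern
// described above: to reach element i, walk from the start of the list
// all over again, every time.
std::size_t dumbIterationReads(const std::vector<int>& list) {
    std::size_t reads = 0;
    for (std::size_t target = 0; target < list.size(); ++target) {
        // Walk from the beginning up to (and including) the target element.
        for (std::size_t i = 0; i <= target; ++i) {
            ++reads;  // one read of list[i]
        }
    }
    return reads;
}
```

For a 7-element list this returns 1 + 2 + ... + 7 = 28, matching the last row of the table above.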
I'm sure we could come up with an even dumber way of iterating over a list with an exponential or even factorial complexity. Let's not do that. Instead, let's go to the other extreme.
Can we find the collision response of a player character with a large number of static walls in constant time? This is the fastest possible time complexity: It would mean taking the same amount of time regardless of the input size.
Yes! We could precompute and store every possible position that your player character could possibly be standing at and the appropriate precalculated collision response, in memory.
Assuming your game world is 2000 by 1000 pixels and that your player and walls sit at integer coordinates, you'd have to store a

struct CollisionResult {
    int collisionResponseX;
    int collisionResponseY;
};

for every possible position your player could be in (CollisionResult collisionResponseCache[2000][1000];), and you'd have a complete, immediately queryable cache of what collision to expect when the player is at any position.
This is a constant-time algorithm: indexing into an array (something like collisionResponseCache[player->posX][player->posY]) takes the same tiny amount of time, a few nanoseconds, for every possible input.
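Filling in the idea as a compilable toy, with a much smaller world than the 2000 by 1000 example so the sketch stays short, and an invented wall layout (a solid wall at x == 0 that pushes the player right):

```cpp
#include <array>

// Tiny toy world; the idea scales to larger worlds at the cost of memory.
constexpr int kWorldW = 8;
constexpr int kWorldH = 8;

struct CollisionResult {
    int collisionResponseX = 0;
    int collisionResponseY = 0;
};

// One precomputed entry per integer position the player can occupy.
std::array<std::array<CollisionResult, kWorldH>, kWorldW> collisionResponseCache;

// Hypothetical precomputation pass, run once at level load: here we
// pretend a wall at x == 0 pushes the player one pixel to the right.
void precompute() {
    for (int x = 0; x < kWorldW; ++x) {
        for (int y = 0; y < kWorldH; ++y) {
            collisionResponseCache[x][y] =
                (x == 0) ? CollisionResult{1, 0} : CollisionResult{0, 0};
        }
    }
}

// The O(1) query: two array indexings, regardless of world size.
CollisionResult lookUp(int posX, int posY) {
    return collisionResponseCache[posX][posY];
}
```

After precompute() runs once, every per-frame query is the same two indexings no matter how big the level is.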
We've traded memory space and flexibility to gain time complexity.
How much memory space? Assuming your machine has 4-byte integers, that big array would take up 2000 * 1000 * 2 * 4 = 16000000 bytes of space—that's 16MB. That's quite a lot of memory and it increases linearly with the size of your game world, though it's not too bad for a modern PC with gigabytes of RAM!
How much flexibility? This constant-time method assumes all of the things you want to test for collisions against are present and static (and hence precomputable). This is often limiting, as it prohibits moving enemies, collectibles, breakable walls and such.
A better algorithm would allow more flexibility (changing inputs) but still be faster than linear time. How might we do that?
Consider a two-dimensional game world with some game entities in it.
We could roughly halve the collision detection work by keeping two lists of objects instead of one: a list for objects on the left half of the level and another for the right half.
Whenever an object crosses the boundary, we'd move it to the other list. This way, we only need to check for collisions against the objects in the list corresponding to the side of the stage the player is on.
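As a sketch (the tile type and boundary value are invented; a tile straddling the boundary would need to live in both lists, which this toy version glosses over):

```cpp
#include <vector>

// Hypothetical minimal tile type; x is the tile's horizontal position.
struct Tile {
    float x, y, w, h;
};

struct SplitLevel {
    float stageMiddle;  // the x coordinate dividing left from right
    std::vector<Tile> leftHalf;
    std::vector<Tile> rightHalf;

    // File each tile into the list for its side of the stage.
    void add(const Tile& t) {
        (t.x < stageMiddle ? leftHalf : rightHalf).push_back(t);
    }

    // Only the half the player stands in needs to be scanned,
    // halving the expected number of overlap checks.
    const std::vector<Tile>& candidatesFor(float playerX) const {
        return playerX < stageMiddle ? leftHalf : rightHalf;
    }
};
```

The per-frame loop then iterates over candidatesFor(player.x) instead of the full tile list.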
We can take this further by introducing two more lists and splitting the level in four instead.
Or into twelfths. Hundredths? We can parametrise this as we like, to control how much flexibility and memory we trade for speed. Taken to the extreme, making each list represent a 1×1 area (a single pixel) essentially degenerates into the "store absolutely everything" approach above.
This is the gist of a spatial subdivision technique called spatial hashing. The details are a little more complex, as you have to account for entities' sizes (they might be in many cells at once). C++ implementation details are the domain of another question.
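The full details are indeed out of scope here, but the core bookkeeping of a toy spatial hash fits in a few lines. This sketch ignores entity sizes (each entity lives in exactly one cell); a real implementation would insert large entities into every cell they touch:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// The world is divided into square cells of kCellSize pixels; each cell
// keeps the ids of the entities currently inside it.
constexpr float kCellSize = 64.0f;

struct SpatialHash {
    std::unordered_map<std::uint64_t, std::vector<int>> cells;

    // Pack the (cellX, cellY) pair into one 64-bit map key.
    static std::uint64_t key(float x, float y) {
        auto cx = static_cast<std::int32_t>(x / kCellSize);
        auto cy = static_cast<std::int32_t>(y / kCellSize);
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32)
             | static_cast<std::uint32_t>(cy);
    }

    void insert(int entityId, float x, float y) {
        cells[key(x, y)].push_back(entityId);
    }

    // Only entities in the same cell as (x, y) are collision candidates;
    // everything else is skipped without a single overlap test.
    const std::vector<int>& query(float x, float y) const {
        static const std::vector<int> empty;
        auto it = cells.find(key(x, y));
        return it == cells.end() ? empty : it->second;
    }
};
```

Moving an entity across a cell boundary means removing it from the old cell's list and inserting it into the new one, just like the two-list version above.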
A quadtree works in a similar manner, but it subdivides cells only as needed:
(Quadtree illustration courtesy of David Eppstein of Wikipedia; the image page includes Python source code. Many thanks!)
If your game objects are relatively sparse, quadtrees may be significantly faster than a spatial hash, so they're also worth investigating. Again, implementations and guides exist that I won't replicate here, but this is the gist of it.