Monday, October 12, 2015

architecture - Grouping entities of the same component set into linear memory


We start from the basic systems-components-entities approach.


Let's create assemblages (a term derived from this article) purely out of information about component types. This is done dynamically at runtime, just as we would add/remove components to an entity one by one, but it deserves a more precise name since it carries type information only.
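For concreteness, a minimal sketch of what such a signature might look like - the names (assemblage, MAX_COMPONENT_TYPES, per-type ids) are illustrative assumptions, not something the design prescribes:

#include <bitset>
#include <cstddef>

constexpr std::size_t MAX_COMPONENT_TYPES = 64;

// An assemblage is nothing more than a set of component types,
// here represented as a bitset indexed by per-type ids.
struct assemblage {
    std::bitset<MAX_COMPONENT_TYPES> types;

    void add(std::size_t component_type_id)    { types.set(component_type_id); }
    void remove(std::size_t component_type_id) { types.reset(component_type_id); }

    bool operator==(const assemblage& other) const { return types == other.types; }
};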


Then we construct entities, specifying an assemblage for each of them. Once an entity is created, its assemblage is immutable, which means we cannot modify it in place; we can, however, copy an existing entity's signature (along with its contents) into a local variable, make the desired changes, and create a new entity out of it.


Now for the key concept: whenever an entity is created, it is assigned to an object called an assemblage bucket, which means that all entities with the same signature end up in the same container (e.g. a std::vector).
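A rough sketch of that mapping, continuing the signature sketch above (packed_entity, world and the member names are placeholders assumed for illustration):

#include <unordered_map>
#include <vector>

// Stands in for whatever per-signature component layout a bucket actually stores.
struct packed_entity { /* the components of this bucket's signature */ };

struct assemblage_bucket {
    assemblage signature;                // shared by every entity in this bucket
    std::vector<packed_entity> entities; // all entities of this signature, contiguous
};

struct world {
    // std::hash has a specialization for std::bitset, so the raw bitset can key the map
    std::unordered_map<std::bitset<MAX_COMPONENT_TYPES>, assemblage_bucket> buckets;
};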


Now the systems simply iterate through every bucket they are interested in and do their job.
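Continuing the sketch, the iteration might look roughly like this (run_system and the required mask are assumed names):

// A system visits only the buckets whose signature contains every
// component type it requires, then walks their entities linearly.
void run_system(world& w, const assemblage& required) {
    for (auto& entry : w.buckets) {
        assemblage_bucket& bucket = entry.second;
        if ((bucket.signature.types & required.types) != required.types)
            continue; // this bucket lacks a required component type
        for (packed_entity& e : bucket.entities) {
            (void)e; // process e's relevant components, laid out back to back
        }
    }
}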


This approach has some advantages:




  • components are stored in a few (precisely: as many as there are buckets) contiguous memory chunks - this improves memory friendliness and makes it easier to dump the whole game state

  • systems process components in a linear manner, which means improved cache coherency - bye bye dictionaries and random memory jumps

  • creating a new entity is as easy as mapping an assemblage to its bucket and pushing back the needed components into its vector

  • deleting an entity is as easy as one call to std::move to swap the last element with the deleted one, because order doesn't matter inside a bucket (both operations are sketched right after this list)
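A sketch of both operations under the same assumptions as above (create_entity and delete_entity are illustrative helpers, not names from the original design):

#include <utility> // std::move

// Creating an entity: find (or create) the bucket for the signature and
// push the new entity's components onto the end of its vector.
std::size_t create_entity(world& w, const assemblage& sig) {
    assemblage_bucket& bucket = w.buckets[sig.types];
    bucket.signature = sig;
    bucket.entities.push_back(packed_entity{});
    return bucket.entities.size() - 1; // the entity's index inside its bucket
}

// Deleting an entity: move the last entity into the freed slot and pop,
// since order inside a bucket does not matter.
void delete_entity(assemblage_bucket& bucket, std::size_t index) {
    if (index != bucket.entities.size() - 1)
        bucket.entities[index] = std::move(bucket.entities.back());
    bucket.entities.pop_back();
}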




If we have a lot of entities with completely different signatures, the benefits of cache coherency diminish somewhat, but I don't think that would happen in most applications.


There is also a problem with pointer invalidation once vectors are reallocated - this could be solved by introducing a structure like:


struct assemblage_bucket {
    struct entity_watcher {
        assemblage_bucket* owner;
        entity_id real_index_in_vector;
    };

    // maps an entity's index inside the bucket to the watchers tracking it
    std::unordered_map<entity_id, std::vector<entity_watcher*>> subscribers;

    //...
};

So whenever, for whatever reason, our game logic wants to keep track of a newly created entity, we register an entity_watcher inside the bucket; once an entity has to be std::move'd during removal, we look up the watchers of the moved entity and update their real_index_in_vector to the new value. Most of the time this imposes just a single dictionary lookup per entity deletion.
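A hedged sketch of what that removal path could look like - the entities vector and the exact layout of subscribers are assumptions on top of the struct above:

void remove_entity(assemblage_bucket& bucket, entity_id index) {
    const entity_id last = static_cast<entity_id>(bucket.entities.size() - 1);

    if (index != last) {
        // fill the hole with the last entity of the bucket
        bucket.entities[index] = std::move(bucket.entities[last]);

        // single dictionary lookup: rewire every watcher of the moved entity
        auto it = bucket.subscribers.find(last);
        if (it != bucket.subscribers.end()) {
            auto watchers = std::move(it->second);
            bucket.subscribers.erase(it);
            for (auto* watcher : watchers)
                watcher->real_index_in_vector = index;
            bucket.subscribers[index] = std::move(watchers);
        }
    }

    bucket.entities.pop_back();
    // watchers of the deleted entity itself would be unregistered separately
}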



Are there any more disadvantages to this approach?


Why is this solution mentioned nowhere, despite being fairly obvious?


EDIT: I'm editing the question to "answer the answers", since comments are insufficient.



you lose the dynamic nature of pluggable components, which was created specifically to get away from static class construction.



I don't. Maybe I did not explain it clearly enough:


auto signature = world.get_signature(entity_id); // this would just return entity_id.bucket_owner->bucket_signature or so
signature.add(foo_component);
signature.remove(bar_component);

world.delete_entity(entity_id); // entity_id would hold information about its bucket owner
world.create_entity(signature); // automatically assigns new entity to an existing or a new bucket

It's as simple as taking an existing entity's signature, modifying it, and uploading it again as a new entity. Pluggable, dynamic nature? Of course. Here I'd like to emphasize that there is only one "assemblage" class and one "bucket" class. Buckets are data-driven and created at runtime in exactly the quantity needed.



you'd need to go through all buckets that might contain a valid target. Without an external data structure, collision detection could be equally as difficult.



Well, this is why we have the aforementioned external data structures. The workaround is as simple as introducing an iterator in the System class that detects when to jump to the next bucket. The jumping would be completely transparent to the logic.
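A hypothetical sketch of such an iterator, reusing the bucket types sketched earlier (multi_bucket_iterator and its members are made-up names; empty buckets are assumed to be filtered out before construction):

struct multi_bucket_iterator {
    std::vector<assemblage_bucket*> buckets; // buckets matching the system's signature
    std::size_t bucket_index = 0;
    std::size_t entity_index = 0;

    bool done() const { return bucket_index >= buckets.size(); }

    packed_entity& operator*() const {
        return buckets[bucket_index]->entities[entity_index];
    }

    void operator++() {
        if (++entity_index >= buckets[bucket_index]->entities.size()) {
            entity_index = 0;
            ++bucket_index; // jump transparently to the next bucket
        }
    }
};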



Answer



You have essentially designed a static object system with a pool allocator and dynamic classes.



I wrote an object system that works almost identically to your "assemblages" system back in my school days, though I always tend to call "assemblages" either "blueprints" or "archetypes" in my own designs. The architecture was more of a pain in the butt than naive object systems and had no measurable performance advantages over some of the more flexible designs I compared it to. The ability to dynamically modify an object without needing to reify it or reallocate it is hugely important when you are working on a game editor. Designers will want to drag-n-drop components onto your object definitions. Runtime code might even have need to modify components efficiently in some designs, though I personally dislike that. Depending on how you link up object references in your editor, simply being able to add and remove components to existing objects with zero extra shenanigans will come in handy.


You're going to get worse cache coherency than you think in most non-trivial cases. Your AI system, for instance, doesn't care about Render components but ends up being stuck iterating over them as part of each entity. The objects being iterated over are larger, so each cache-line request pulls in unnecessary data and returns fewer whole objects. It'll still be better than the naive method, and the naive method of object composition is used even in big AAA engines, so you probably don't need better, but at least don't go thinking you can't improve it further.
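To make the point concrete (the component contents here are invented for illustration): an AI system that only needs the small AI state still strides over whole packed entities, so much of every cache line it fetches is render data it never reads.

struct render_component { float transform[16]; int mesh_id; int texture_id; }; // large, cold for the AI system
struct ai_component     { int current_goal; float cooldown; };                 // small, the only part AI touches

// one element of a bucket's vector: AI reads only `ai`, yet every
// cache line it pulls in is mostly `render`
struct packed_entity {
    render_component render;
    ai_component     ai;
};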


Your approach makes the most sense for some components, but not all. I strongly dislike ECS because it advocates always putting each component in a separate container, which makes sense for physics or graphics but no sense at all if you allow multiple script components or composable AI. If you let the component system be used not only for built-in objects but also as a way for designers and gameplay programmers to compose object behavior, then it can make sense to group together all AI components (which will often interact) or all script components (since you want to update them all in one batch). If you want the most performant system, you're going to need a mix of component allocation and storage schemes, and to put in the time to figure out conclusively which is best for each particular type of component.

