c# - Fastest way to look up an entity with a set of components?

I'm currently trying to implement an ECS system, but I've sort of ran into a problem trying to fetch all of my entities that have a specific set of components. I currently have a Dictionary that maps a component name to a list of Guids (entities), so I have no problem fetching a list of entities if I were to query for just one component. But if I were to query entities for more than one component (e.g. all entities that have Render and Transform component), I run into a problem where it's no longer a constant time lookup.

I could probably loop through the entities to see if they contain that set of component names (they are stored in a dictionary that maps Guid to a list of strings), but I was thinking that there could be a faster way of doing this?

Answer

Update

I have wrote Theraot.ECS inspired by this answer. It allows you to use Guid, int or whatever for the entities. It will allow you specify how sets of component kinds are handled. Two implementation provided: one uses a binary flag array, the other is based on hash sets.

Some lessons learned:

QueryCheck (see original answer) should return one of three cases: add, remove, nothing to do. I created an enum for this.

BitArray, not very good for this. I rolled my own FlagArray type.

With the design proposed in this answer getting the entities from a query yields a view, not a snaptshot. It was very easy to make it a custom type that also provide events among other things.

I decided to merge creating the query and getting the entities for the query. This means that the call can only expensive the first time (if there are already entities). Subsequent calls are O(1).

I also decided to change the phrase "component type" to "component kind" to avoid confusion with actual System.Type types.

The project is free and open source software, feel free to study it, use it, whatever. MIT license.

Original Answer

I want to suggest is to maintain a set for entities for each query.

When a system starts, it will report the queries it needs (I assume it is usually a single one, yet, multiple could be supported).

The engine will create (and populate) new sets for those queries. By doing this, you would only need to go over every entity to populate the dictionary when a system is created. By creating all necesary systems before the entities, you do not need to populate the sets on creation at all.

Instead, when a component is attached to an entity, you will add it to the sets according to the queries. Alright, that is not trivial... we need to figure out what queries could change their result depending on the attached component. Similarly when removing.

So, if we express a query as a list of components that must be present, we can also create a dictionary that give you queries based on components. In fact, it is relatively easy to extend to have negative queries (as in "the entity must not have this component").

The process when a component is attached or removed is as follows:

Use the component to get the list of active queries that could apply

For each query:

2.1 See if the entity passes or not.

2.2 If it passes: Add it to the set for the query (if it was not already there)

2.3 If it does not pass: Remove it from the set for the query (if it was already there)

Then the system can simply get the set for the query it wants. Of course, the query would not exist if it was not created first.

We need something like the following:

Dictionary> QueriesByComponentType;
Dictionary> EntitiesByQuery;
Dictionary> ComponentsByEntity;

Of course, you can use GUID for your entities, and I do not know if you want ConcurrentDictionary, and you would need a good hash for the HashSet, in fact a good hash for the Components is a good idea.

What follows is the same idea translated to code (some assumptions where made).

When the component is added or removed:

// O(n) where n = number of affected queries
var component = component_begin_added_or_removed;
var componentType = ComponentTypeManager.GetFrom(component_begin_added_or_removed);
var entity = this_entity;
// The code below should probably be extracted to another method:
// Try to update ComponentsByEntity, if no update you can return
if (QueriesByComponentType.TryGetValue(componentType, out var queries))

{
    foreach (var query in queries)
    {
        var set = EntitiesByQuery[query];
        if (query.CheckQuery(entity)) // Uses ComponentsByEntity
        {
            set.Add(entity);
        }
        else
        {

            set.Remove(entity);
        }
    }
}

Note: the remove case can be optimized futher if we know that all queries are positive (they only ask for a component to be present, but never for a component to not be present), which is the way entity-component-system is meant to be. If that is the case, you separete this code in a version for adding and another for removing, and the removing case does not need CheckQuery. You might also be interested in creating a version that takes multiple components to add at once (computing the union of the queries sets).

When the system is created:

// O(n) where n = number of components
var componentTypes = new []{componentTypeA, componentTypeB /*,...*/};
var query = QueryManager.GetFrom(componentTypes);

// The code below should probably be extracted to another method:
if (EntitiesByQuery.TryAdd(query, new HashSet()))
{
    foreach (var componentType in componentTypes)
    {
        if (!QueriesByComponentType.TryGetValue(componentType, out var set))
        {
            set = new HashSet();
            QueriesByComponentType.TryAdd(component, set);   
        }

        set.Add(query);
    }
}

When the system wants to query:

// O(1)
var entities = EntitiesByQuery[query];

I said twice in comments that code should be extracted to another method. That is because that code would be the same for all entities and systems. In fact, I think it is wise to not expose the dictionaries directly. I suggest a Façade.

How many components do you have? There is a change you can represet the list of components that make up a query as a bit array. Which would also be useful to represent the list of components an entity has... and then, checking is bit wise and.

In fact ComponentType does not need to be a class, nor Query. And you already know Entity does not have to be a class either. I wrote it that way to not get into the specifics of how they are represented. In fact, you might as well take advantage of using alias directive plus extension methods.

Addendum on the order of component types

This can work even without having a strict order for the components types of a query (and yes, even for negative queries).

With that said, if you want to use a bit array to represent a set of component types, the component types would need consecutive numeric codes that also act as the indexes for the bits in the bit array.

You could use an enum and flags, such that only the bit that represents the component type is set and the rest are not set. That makes doing that bit wise and very easy, and give give you the best performance. However, it would also limit the number of possible component types to 64, since the base type would be at best a ulong which has 64 bits.

You can carry on with that idea beyond the 64 component types by using a BitArray instead.

If you start with the enum and then for whatever reason you need a large number of component types, you would have to change that. Please notice I consider the bit array an optimization. You can still do the same with a set of component types and iterating.

In fact the advice would be the opposite: - Start with sets, but keep them isolated from the rest of the code. - If they are affecting your performance, and you have already settled on the number of component types for your game, then optimize accordingly.

If you are creating a generic ECS, you could offer different strategies, and let the developer decide. Keep the same façade so most of the code is unaware of the difference, and use dependency injection to pass the strategy the developer wants.

Addendum on the idea of negative component queries

Sometimes it is useful to have a system that must run on entities that do not have a particular component. For example, you could have the system detect these entities, do some computation on then, and then add the componet so it does not run on it anymore.

How to do it? The idea is go back to the initial algorithm I proposed, before any optimization. Realize it is the same for adding and removing, it has symmetry. We can exploit that symmetry... if you remove a component, perhaps you should add the entity to the set of a query that requires to not have that component. Similarly when adding a component, perhaps you want to remove the entity from the set of a query that do not wants that component.

We, of course, have the problem of how to represent these negative queries. We need a concept of the negation of a component type. That way you can have queries that say "must have componentA and no componentB".

So a query can contain a component type, its negative or neither (a query with a component type and its negative should be rejected, as it makes no sense for an entity to have a component and not have it). And yes, for the bit array, that would mean two bits per component. Which for the enum approach means you could only have half the ammount of possible component types. Again, this is a trade-off.

Addendum on disjuntion queries

Disjuntions are another kind of query that is missing (an "Any" query instead of an "All" query).

You got to treat them separately (have queries marked as disjuntion). The base algorithm continues to be the same (when you add or remove, you check the queries that has the component type that is being added or removed and check if the query is satisfied and add or remove the entity on the set of the query accordingly), but the optimizations are different.

Addendum on the idea of entities with multiple of the same component type

It usually does not make sense, and in the cases it does, you probably want a hierarchy of components, such that an aggregation of components of a given type can also act as a component.

However if you want to allow entities with multiple components of the same type, then ComponentsByEntity would not use HashSet, but some sort of list... which also makes the system code more complex, because it has to deal with a variable number of components.

Then, in that case, being able to use a sorted list would allow a faster algorithm for checking a query than a regular list. If the list of components is large, a binary search will be good, otherwise, simply iterating in order will allow to discard soon. How large? Test.

By allowing an entity to have multiple of the same component type, checking if it satiesfies a query is slower. Alternatively, you could have another level of dictionaries. Which means more indirection, which means more overhead. As you can see, this idea comes with a trade-off, as usual there is price for versatility.

Blog

Tuesday, August 14, 2018