How should content or asset items be referenced in their serialized form?
For example, a material might need to reference several textures. The simplest way would be to use a path, relative to the base directory of the assets.
Alternatively, all content items could be referenced by an ID. I implemented it this way (using GUIDs), which allows content items to be renamed or moved without breaking references. However, this makes it more difficult to replace a content item: deleting the original and renaming something else to the same name no longer works, and instead requires copying and pasting GUIDs around. It also makes debugging missing content trickier.
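To make the GUID approach concrete, here is a minimal Python sketch. The manifest layout, the slot names (`albedo`, `normal`), and the `resolve` helper are assumptions made up for this example, not taken from any particular engine. It shows a serialized material that stores only GUIDs, with a separate table mapping each GUID to its current path.

```python
import json
import uuid

# Hypothetical manifest: maps each asset GUID to its current path.
# Renaming or moving an asset only changes this table.
brick_albedo = str(uuid.uuid4())
brick_normal = str(uuid.uuid4())
manifest = {
    brick_albedo: "textures/brick_albedo.png",
    brick_normal: "textures/brick_normal.png",
}

# Serialized material: references textures by GUID, never by path.
material_json = json.dumps({
    "name": "brick",
    "textures": {"albedo": brick_albedo, "normal": brick_normal},
})


def resolve(material_text, manifest):
    """Return {slot: path} for a serialized material; raise on a dangling GUID."""
    material = json.loads(material_text)
    resolved = {}
    for slot, guid in material["textures"].items():
        if guid not in manifest:
            # Name the material and slot so a missing asset is easier to trace.
            raise KeyError(
                f"material {material['name']!r}: slot {slot!r} "
                f"references unknown GUID {guid}"
            )
        resolved[slot] = manifest[guid]
    return resolved


# Moving the texture touches the manifest only; the material is unchanged.
manifest[brick_albedo] = "textures/walls/brick_albedo.png"
paths = resolve(material_json, manifest)
```

Replacing an asset in this scheme means pointing the existing GUID at the new file in the manifest, which is the extra bookkeeping mentioned above.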
Is there a clear winning option here?
Answer
String-keying / Hashmaps
Are fast: lookups are O(1) on average, so read access is usually very fast. In the worst case (rare, but not unheard of), hash collisions can make a lookup quite slow.
Implementations sometimes have to be built, or found (for instance, in C). Writing / finding a performant string-keyed map implementation can be non-trivial.
String comparisons can be expensive, though whether this matters depends on how your language implements hashmap reads and string equality.
May make it possible for you to inadvertently create duplicates or overwrite existing values (depends on implementation).
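The overwrite risk in the last point can be guarded against with a thin wrapper. The following is only a sketch: `AssetRegistry` and its methods are hypothetical names, and it relies on Python's built-in `dict` as the hash map.

```python
class AssetRegistry:
    """String-keyed registry backed by a dict (Python's built-in hash map)."""

    def __init__(self):
        self._items = {}

    def add(self, path, item):
        # A plain `self._items[path] = item` would silently replace an
        # existing entry; refusing duplicates makes the mistake visible.
        if path in self._items:
            raise ValueError(f"asset already registered: {path}")
        self._items[path] = item

    def get(self, path):
        # Raises KeyError if the path was never registered.
        return self._items[path]


registry = AssetRegistry()
registry.add("textures/brick.png", {"width": 256})
```

Whether a duplicate key should raise, be ignored, or replace the old value is a design choice; the point is to make that choice explicit.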
Numeric keying / arrays
Are very fast, because an index maps directly to a memory offset (as with arrays in most languages).
Are very fast, because numeric equality is checked rapidly (at least for sufficiently small bitwidths).
Are very fast, because long, contiguously-allocated arrays can have superb cache performance (caveat: this depends very much on your choices of language and implementation).
Require that your "GUID" is not too large to act as an array index. However, fast options can be found to work around this, such as breaking up your ID (via bit ops) to use as two sufficiently-short indices per element in a 2D array.
Are inconvenient, for the reasons you've mentioned. If the IDs are compile-time values that you need to be able to develop with, it's less of a problem, since they can be assigned to appropriately named ALL_CAPS constants and used that way.
Can reference structures that contain a human readable name. This offers some convenience, even if they can't be accessed by this name.
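The bit-splitting workaround, named constants, and stored human-readable names from the list above can be sketched together. The 16-bit ID width, the 8/8 split, and the names `PagedTable`, `TEX_BRICK` and `TEX_GRASS` are all assumptions chosen for illustration. Python lists do not give the cache behaviour described above, so this shows only the indexing scheme, not the performance.

```python
ID_BITS = 16                      # assumed total ID width
PAGE_BITS = 8                     # assumed split: low 8 bits index within a page
PAGE_SIZE = 1 << PAGE_BITS
PAGE_COUNT = 1 << (ID_BITS - PAGE_BITS)
PAGE_MASK = PAGE_SIZE - 1

# Named constants stand in for raw numeric IDs during development.
TEX_BRICK = 0x0003
TEX_GRASS = 0x0102


class PagedTable:
    """2D table indexed as pages[high bits][low bits]; pages allocate lazily."""

    def __init__(self):
        self._pages = [None] * PAGE_COUNT

    @staticmethod
    def split(asset_id):
        """Break an ID into (page index, slot index) using bit operations."""
        if not 0 <= asset_id < (1 << ID_BITS):
            raise ValueError(f"ID out of range: {asset_id}")
        return asset_id >> PAGE_BITS, asset_id & PAGE_MASK

    def put(self, asset_id, entry):
        page, slot = self.split(asset_id)
        if self._pages[page] is None:
            self._pages[page] = [None] * PAGE_SIZE
        self._pages[page][slot] = entry

    def get(self, asset_id):
        page, slot = self.split(asset_id)
        slots = self._pages[page]
        return None if slots is None else slots[slot]


table = PagedTable()
# Each entry carries a human-readable name, even though lookup is by ID.
table.put(TEX_BRICK, {"name": "brick", "width": 256})
table.put(TEX_GRASS, {"name": "grass", "width": 512})
```

Here `TEX_GRASS` (0x0102) lands in page 1, slot 2, and pages that hold no IDs are never allocated.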
Conclusion
Unless you are going to process many elements (say, tens of thousands) per 16-20 ms tick, I wouldn't worry about numeric IDs. If you are using a language where efficient hashmaps are already available, just opt for a hashmap.
Otherwise, the numeric ID / array approach is good for conserving CPU cycles, but reserve it for cases where resources come at a premium.