When I am making a model for a 3D game, what should I use as the measure in my budget: polygons (triangles) or vertices? I ran an experiment with two sets of 40,000 cubes: one set with 8 vertices and 12 triangles per cube, the other with 24 vertices and 12 triangles per cube. Both were generated procedurally in Unity. To my surprise, both sets performed almost the same; there was only a very small difference between them.
Does that mean I should not worry about vertex count and only look at triangle count?
EDIT: I have made another experiment. I created a plane with 19,602 triangles and 10,000 vertices, and another with the same number of triangles but 39,204 vertices, then generated 4,000 of each. This time the version with fewer vertices won, 19 fps versus 14 fps. So I guess fewer is generally better, but only when the difference is large.
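For reference, here is a minimal sketch of how the two cube variants from the experiment could be built in Unity. The class and method names are illustrative, not the asker's actual code; the point is that both meshes have 12 triangles, but one reuses 8 vertices through the index buffer while the other gives every face its own 4 vertices.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical sketch: the same unit cube built two ways.
// SharedCube()   -> 8 vertices, 12 triangles (corners welded, indices reuse them)
// UnsharedCube() -> 24 vertices, 12 triangles (4 vertices per face, nothing shared)
public static class CubeMeshes
{
    static readonly Vector3[] corners =
    {
        new Vector3(-0.5f, -0.5f, -0.5f), new Vector3( 0.5f, -0.5f, -0.5f),
        new Vector3( 0.5f,  0.5f, -0.5f), new Vector3(-0.5f,  0.5f, -0.5f),
        new Vector3(-0.5f, -0.5f,  0.5f), new Vector3( 0.5f, -0.5f,  0.5f),
        new Vector3( 0.5f,  0.5f,  0.5f), new Vector3(-0.5f,  0.5f,  0.5f),
    };

    // Each face as 4 corner indices, wound clockwise when seen from outside the cube.
    static readonly int[,] faceQuads =
    {
        { 0, 3, 2, 1 }, // back   (-Z)
        { 5, 6, 7, 4 }, // front  (+Z)
        { 4, 7, 3, 0 }, // left   (-X)
        { 1, 2, 6, 5 }, // right  (+X)
        { 0, 1, 5, 4 }, // bottom (-Y)
        { 3, 7, 6, 2 }, // top    (+Y)
    };

    // 8 vertices shared by all faces; only the index buffer grows.
    public static Mesh SharedCube()
    {
        var tris = new List<int>();
        for (int f = 0; f < 6; f++)
        {
            tris.AddRange(new[] { faceQuads[f, 0], faceQuads[f, 1], faceQuads[f, 2],
                                  faceQuads[f, 0], faceQuads[f, 2], faceQuads[f, 3] });
        }
        var mesh = new Mesh { vertices = corners, triangles = tris.ToArray() };
        mesh.RecalculateNormals(); // normals averaged across faces -> "smooth" cube
        return mesh;
    }

    // 4 fresh vertices per face (24 total), so each face can keep its own hard normal.
    public static Mesh UnsharedCube()
    {
        var verts = new List<Vector3>();
        var tris = new List<int>();
        for (int f = 0; f < 6; f++)
        {
            int baseIndex = verts.Count;
            for (int c = 0; c < 4; c++) verts.Add(corners[faceQuads[f, c]]);
            tris.AddRange(new[] { baseIndex, baseIndex + 1, baseIndex + 2,
                                  baseIndex, baseIndex + 2, baseIndex + 3 });
        }
        var mesh = new Mesh { vertices = verts.ToArray(), triangles = tris.ToArray() };
        mesh.RecalculateNormals(); // nothing shared, so every face stays flat-shaded
        return mesh;
    }
}
```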
Answer
Let's imagine a big grid mesh, like one we might use for terrain. We'll render n triangles' worth of it, covering say half our 1080p screen, in a single draw call. If we weld all of our vertices and have no smoothing/texturing seams, then each triangle has 3 vertices and each vertex is shared by 6 triangles, so we have about n/2 vertices.
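To make that counting concrete, here is a small sketch (the grid size is just an example, not from the answer) of the vertex and triangle counts for a welded quad grid. Plugging in a 99x99 grid reproduces the 10,000-vertex / 19,602-triangle plane from the question's edit.

```csharp
using UnityEngine;

// Counting sketch: a welded grid of quadCount x quadCount quads.
// Each quad is 2 triangles; vertices sit on a (quadCount + 1) x (quadCount + 1) lattice.
public static class GridCounts
{
    public static void Print(int quadCount)
    {
        int vertices = (quadCount + 1) * (quadCount + 1);
        int triangles = 2 * quadCount * quadCount;
        // The vertex/triangle ratio tends toward 1/2 as the grid grows,
        // i.e. roughly n/2 vertices for n triangles.
        Debug.Log($"{quadCount}x{quadCount}: {vertices} vertices, {triangles} triangles");
    }
}

// GridCounts.Print(99) -> 10000 vertices, 19602 triangles,
// the same numbers as the welded plane in the question's edit.
```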
To render this we need to:
- Run the vertex shader at least n/2 times. ("At least" because our cache for vertex results is only so big. Sometimes we'll end up evicting a vertex we already transformed, then need it again for a later triangle that shares it, and so re-run the vertex shader on it. So we don't get quite as much savings as it looks like on paper; see the cache sketch after this list.)
- Clip & cull n triangles.
- Rasterize & interpolate over at least 1920x1080/2, or about 1 million pixels of the frame buffer (since we said our terrain covers about half the screen). ("At least" because of the way GPUs work on quads of pixels: some fragments just outside the edges of polygons still get rasterized but then masked, meaning we process fragments twice. For a bumpy mesh we'll also get overdraw anywhere the mesh occludes itself, if we're not lucky enough to draw the frontmost polygon into the depth buffer first.)
- Run the fragment shader for all those >= 1 million fragments.
- Blend ~1 million results into the frame & depth buffers.
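To see where that first "at least" comes from, here is a rough model of a post-transform vertex cache. The FIFO policy, the 32-entry default, and the method names are assumptions chosen for illustration; real GPUs vary in size and replacement strategy. It counts how many times the vertex shader would actually run for a given index buffer.

```csharp
using System.Collections.Generic;

// Rough model of a post-transform vertex cache (illustrative, not how any specific GPU works).
// Counts how many times the vertex shader would run for a given index buffer.
public static class VertexCacheModel
{
    public static int CountVertexShaderRuns(int[] triangleIndices, int cacheSize = 32)
    {
        var cache = new Queue<int>();     // FIFO of recently transformed vertex indices
        var inCache = new HashSet<int>();
        int shaderRuns = 0;

        foreach (int index in triangleIndices)
        {
            if (inCache.Contains(index)) continue; // cache hit: reuse the transformed vertex

            shaderRuns++;                          // cache miss: run the vertex shader again
            cache.Enqueue(index);
            inCache.Add(index);
            if (cache.Count > cacheSize)
                inCache.Remove(cache.Dequeue());   // evict the oldest entry
        }
        return shaderRuns;
    }
}

// For a welded mesh the result lands somewhere between the unique-vertex count (~n/2)
// and the index count (3n), depending on how cache-friendly the triangle order is.
// For a fully unwelded mesh every index is unique, so the answer is always 3n.
```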
Okay, now let's unweld all of our vertices, so we have 3n vertices to render, six times more than before! Our steps are...
- Run the vertex shader 3n times. (No asterisks due to caching, since every vertex is used only once; this also means the cache can't save us any time.)
- Clip & cull n triangles.
- Rasterize & interpolate over at least 1920x1080/2, or about 1 million pixels of the frame buffer.
- Run the fragment shader for all those >= 1 million fragments.
- Blend ~1 million results into the frame & depth buffers.
...wait, every step except the first one is the same! So most of the work that the GPU does in a typical draw call is not directly related to the number of vertices used. The amount of screen coverage, overdraw, and total triangle count make up much more of the cost.
That doesn't mean vertices are completely free. If you share vertices when you can, you get some modest savings from caching, especially if your vertex shaders are complicated or your hardware's vertex pipeline is weak (as was the case on some older consoles). But since vertex count scales roughly in proportion to triangle count, give or take a constant factor, it's usually not as interesting a metric of overall mesh cost.