"Liberty means responsibility. That is why most men dread it."
- George Bernard Shaw
Wednesday, January 14, 2009 | Permalink

When you look at a highly tessellated model, it's generally understood that it will be vertex processing heavy. Not quite as widely understood is the fact that increasing the polygon count also adds to the fragment shading cost, even if the number of pixels covered on the screen remains the same. This is because fragments are processed in quads of 2x2 pixels. So whenever a polygon edge cuts through a 2x2 pixel area, that quad will be processed twice, once for each of the polygons covering it. If several polygons cut through it, it will be processed once for every polygon that covers any of its pixels. If the fragment shader is complex, it could easily become the bottleneck instead of the vertex shader.

The rasterizer may also not be able to rasterize very thin triangles efficiently. Since only pixels that have their pixel centers covered (or any of the sample locations in the case of multisampling) are shaded, the quads that need processing may not be adjacent. This will in general cause the rasterizer to require additional cycles. Some rasterizers may also rasterize in fixed patterns, for instance a 4x4 square for a 16 pipe card, which further reduces the performance of thin triangles. In addition, you also get overhead because the memory accesses are less optimal than if everything were fully covered and written at once. Adding multisampling into the mix further adds to the cost of polygon edges.
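To make the quad argument concrete, here is a minimal sketch of a toy software rasterizer that counts how many 2x2 quads get shaded. It is not how any particular GPU works: the function names (`edge`, `covered_pixels`, `quads_shaded`) are made up for illustration, pixels are sampled at their centers only (no multisampling), and the geometry is chosen so that no pixel center lands exactly on an edge, which lets the sketch skip a proper fill rule.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: tells which side of the edge a->b the point p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covered_pixels(tri, width, height):
    """Return the pixels whose centers lie strictly inside the triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    pixels = set()
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(ax, ay, bx, by, px, py)
            w1 = edge(bx, by, cx, cy, px, py)
            w2 = edge(cx, cy, ax, ay, px, py)
            # Accept either winding order; centers exactly on an edge are skipped.
            if (w0 > 0 and w1 > 0 and w2 > 0) or (w0 < 0 and w1 < 0 and w2 < 0):
                pixels.add((x, y))
    return pixels


def quads_shaded(triangles, width, height):
    """Sum, over all triangles, the 2x2 quads containing at least one covered pixel."""
    total = 0
    for tri in triangles:
        quads = {(x // 2, y // 2) for (x, y) in covered_pixels(tri, width, height)}
        total += len(quads)
    return total


# A 16x8 pixel rectangle split into two triangles along its diagonal.
WIDTH, HEIGHT = 16, 8
lower = ((0, 0), (16, 0), (16, 8))
upper = ((0, 0), (16, 8), (0, 8))

pixels = len(covered_pixels(lower, WIDTH, HEIGHT)) + len(covered_pixels(upper, WIDTH, HEIGHT))
quads = quads_shaded([lower, upper], WIDTH, HEIGHT)
ideal = (WIDTH // 2) * (HEIGHT // 2)  # quads needed if nothing were shaded twice

print(pixels, quads, ideal)  # 128 40 32
```

In this toy setup the 128 covered pixels would fit in 32 quads, but the diagonal cuts through 8 of them, so 40 quads are shaded. Every extra interior edge adds more of these doubly shaded quads without adding a single visible pixel.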

The other day I was looking at a particularly problematic scene. I noticed that a rounded object in the scene was triangulated pretty much as a fan, creating many long and thin triangles, which is hardly optimal for rasterization. While this wasn't the main problem of the scene, it made me think about how bad such a topology could be. So I created a small test case to measure the performance of three different layouts of a circle. I used a non-trivial (but not extreme) fragment shader.

The most intuitive way to triangulate a circle would be to create a fan from the center. It's also a very bad way to do it. Another less intuitive but also very bad way to do it is to create a triangle strip. A good way to triangulate it is to start off with an equilateral triangle in the center and then recursively add new triangles along the edge. I don't know if this scheme has a particular name, but I call it "max area" here as it's a greedy algorithm that in every step adds the triangle that would grab the largest possible area out of the remaining parts on the circle. Intuitively I'd consider this close to optimal in general, but I'm sure there are examples where you could beat such a strategy with another division scheme. In any case, the three contenders look like this:
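The three layouts can be sketched as index lists over `n` vertices placed around the circle. This is only an illustration under a few assumptions, not the code used for the test: the function names are hypothetical, the fan uses one extra center vertex stored at index `n`, the strip is emitted as plain triangles whose winding alternates, and the max area version recurses depth first, so it produces the same triangles as the greedy scheme when `n` is 3 times a power of two but not in greedy order.

```python
import math


def circle_points(n, radius=1.0):
    """n vertices evenly spaced around a circle centered at the origin."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


def fan_triangles(n):
    """Fan from the center; the center vertex is assumed to sit at index n."""
    return [(n, i, (i + 1) % n) for i in range(n)]


def strip_triangles(n):
    """Zigzag strip across the circle: 0, 1, n-1, 2, n-2, ..."""
    order = [0]
    lo, hi = 1, n - 1
    while lo <= hi:
        order.append(lo)
        lo += 1
        if lo <= hi:
            order.append(hi)
            hi -= 1
    # Every three consecutive strip vertices form one triangle.
    return [tuple(order[i:i + 3]) for i in range(n - 2)]


def max_area_triangles(n):
    """Central triangle first, then keep splitting each remaining arc at its midpoint."""
    a, b, c = 0, n // 3, (2 * n) // 3
    tris = [(a, b, c)]

    def split(lo, hi):
        if hi - lo < 2:  # adjacent vertices: nothing left to fill on this arc
            return
        mid = (lo + hi) // 2
        tris.append((lo % n, mid % n, hi % n))
        split(lo, mid)
        split(mid, hi)

    split(a, b)
    split(b, c)
    split(c, n)  # the last arc wraps around to vertex 0
    return tris


def total_area(points, triangles):
    """Sum of unsigned triangle areas, as a sanity check on coverage."""
    area = 0.0
    for i, j, k in triangles:
        (ax, ay), (bx, by), (cx, cy) = points[i], points[j], points[k]
        area += abs((bx - ax) * (cy - ay) - (by - ay) * (cx - ax)) / 2.0
    return area


n = 48
rim = circle_points(n)
for name, pts, tris in [("fan", rim + [(0.0, 0.0)], fan_triangles(n)),
                        ("strip", rim, strip_triangles(n)),
                        ("max area", rim, max_area_triangles(n))]:
    print(name, len(tris), round(total_area(pts, tris), 6))
```

All three cover exactly the same polygon, so the printed areas agree. The fan needs `n` triangles because of its center vertex, while the other two need `n - 2`. The difference between them is purely in how long and thin the triangles are.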

And their performance looks like this. The number along the x-axis is the vertex count around the circle and the y-axis is frames per second.

Adding multisampling into the mix further adds to the burden with the first two methods, while the max area division is still mostly unaffected by the added polygons all the way across the chart.




Wednesday, January 14, 2009

Nice analysis. This supports the removal of triangle fans from Direct3D and OpenGL.

Thursday, January 15, 2009

You probably won't believe it, but I spent my whole day working on exactly such a tessellation, even designed especially for high multisampling levels.

I could have saved a lot of work by checking your blog earlier

*great* information, thank you!

Rohit Garg
Thursday, January 15, 2009

Nice job. The info on how to lower the pixel workload was particularly useful. But in the max area scheme, why do the tris that are added at the end give you less workload? They are very small too. Can you please explain this?


Thursday, January 15, 2009

Very interesting test! Thanks for sharing!

Thursday, January 15, 2009

Rohit Garg,
They get small, which of course is not optimal, but they are not long and thin, so it's not nearly as much of a problem. Look at the strip case for instance. Every triangle goes all the way from top to bottom. If you double the number of triangles you're almost doubling the number of quads that are intersected by a polygon edge. For the max area case you're only causing trouble along the outer circle edge. The triangles also never become long. Doubling the triangle count will basically only cause another round of fragment shading along the outer edge. The vast majority of the interior will only be shaded once no matter how many triangles you add.
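A rough way to check this reasoning is to add up the length of all interior edges for each layout, since the number of quads cut by an edge grows roughly with its length. The sketch below is only a proxy under stated assumptions: the helper names are made up, the strip is assumed to zigzag across the circle (so its interior edges are the chords spanning 2 to n-2 vertex steps), the fan is assumed to use a center vertex, and the max area layout is assumed to split each arc at its midpoint.

```python
import math


def chord(span, n, radius=1.0):
    """Length of a chord between two rim vertices that are `span` steps apart."""
    return 2.0 * radius * math.sin(math.pi * span / n)


def fan_interior_length(n, radius=1.0):
    """A fan from the center has n interior edges, each one radius long."""
    return n * radius


def strip_interior_length(n, radius=1.0):
    """A zigzag strip has one interior chord for every span from 2 to n - 2."""
    return sum(chord(span, n, radius) for span in range(2, n - 1))


def max_area_interior_length(n, radius=1.0):
    """Central triangle edges plus the chords created by each round of splitting."""
    total = 0.0

    def split(lo, hi):
        nonlocal total
        if hi - lo < 2:  # an outer circle edge, not an interior one
            return
        total += chord(hi - lo, n, radius)
        mid = (lo + hi) // 2
        split(lo, mid)
        split(mid, hi)

    a, b, c = 0, n // 3, (2 * n) // 3
    split(a, b)
    split(b, c)
    split(c, n)
    return total


for n in (12, 24, 48, 96, 192):
    print(n,
          round(fan_interior_length(n), 2),
          round(strip_interior_length(n), 2),
          round(max_area_interior_length(n), 2))
```

With a unit radius, each doubling of the vertex count roughly doubles the interior edge length of the fan and the strip, while the max area layout only gains a little less than one circumference (2*pi) per doubling. That extra ring of short chords hugging the rim is the "another round of fragment shading along the outer edge" described above.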

I've added some additional info about the rasterizer to the post as well.

Thursday, January 15, 2009

Great information.
Thanks for sharing.

Rohit Garg
Saturday, January 17, 2009


Thanks for explaining. Now I get it. It's sort of a perimeter/area thing. As far as Seth's suggestion of removing triangle fans from APIs is concerned, I'd like to ask what might be a better triangulation for an object that is not planar like a circle.

Saturday, January 17, 2009

Rohit Garg,

A circle is planar. Do you mean a sphere?
