"He is able who thinks he is able."
- Buddha
New dynamic branching demo
Thursday, July 1, 2004 | Permalink

Here's a demo that does dynamic branching without needing pixel shader 3.0, and it still gets the huge performance boost that pixel shader 3.0 dynamic branching supposedly gives when used in a similar fashion.

Enjoy!


2004-07-03:
Since it seems people have lost the ability to understand smilies, I removed the first line of this post. I couldn't have imagined that people would feel so seriously offended by it, and I certainly didn't foresee the amount of crap I would receive for it.

2004-07-04: Did a minor update to the demo to work around the performance drop issue on nVidia cards. The demo now lets you choose between doing a full stencil clear and simply zeroing the stencil for surviving fragments. The former method seems to be required for this technique to see any performance gains at all on nVidia hardware, while both methods run fast on ATI cards, the latter at a higher speed. To get maximum performance, the demo defaults to zeroing on ATI and to a full stencil clear on everything else. I don't know if that assumption holds true for other vendors though.
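
For readers who want the idea spelled out, here is a rough CPU-side sketch of how the two stencil reset methods described above differ. It is not the demo's code. It only simulates the stencil buffer as a Python list, and every name in it (tag_pass, lighting_pass, render_light, zero_on_pass) is made up for illustration. It also assumes the hardware rejects fragments on the stencil test before running the shader, which is what the technique depends on.

```python
# Illustrative simulation only; all function names here are hypothetical.

def tag_pass(distances, light_range):
    """Cheap pass: write stencil 1 where the branch condition holds."""
    return [1 if d < light_range else 0 for d in distances]

def lighting_pass(stencil, zero_on_pass):
    """Expensive pass: shade only pixels whose stencil value is 1.

    Assumes the stencil test runs before the shader (early stencil).
    """
    shaded = 0
    for i, value in enumerate(stencil):
        if value == 1:
            shaded += 1             # stand-in for the costly lighting shader
            if zero_on_pass:
                stencil[i] = 0      # zero the stencil for the surviving fragment
    return shaded

def render_light(distances, light_range, zero_on_pass):
    """Run both passes for one light and leave the stencil ready for the next."""
    stencil = tag_pass(distances, light_range)
    shaded = lighting_pass(stencil, zero_on_pass)
    if not zero_on_pass:
        stencil = [0] * len(stencil)  # full stencil clear instead
    return shaded, stencil

if __name__ == "__main__":
    pixel_distances = [10, 50, 200, 400]  # made-up distances to the light
    for mode in (True, False):
        print(mode, render_light(pixel_distances, 100, mode))
```

Either way the stencil ends up zeroed for the next light and only the tagged pixels pay for the lighting shader. The two methods differ only in how the reset is done, which is where the vendor-specific performance difference shows up.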

Andrew
Wednesday, July 7, 2004

This is basically the same concept as what Richard Huddy calls "Pre Zee" in his presentation, no?
http://www.ati.com/developer/gdc/Huddy_SaveTheNanosecond.pdf

Nonetheless, very cool stuff, Humus. I'm surprised that it made that huge a difference in a relatively simple shader, though. Anyway, keep up the great work.

Humus
Wednesday, July 7, 2004

Well, this demo uses that too, but it's not the same thing. What he describes is just a depth-only pass.

Andrew
Wednesday, July 7, 2004

Ah fair enough. On closer inspection I guess your playing with the stencil buffer does do more than a depth-only pass, although I still think the concepts are similar.

pro
Thursday, July 8, 2004

It seems that some newer nVidia graphics cards (including GeForce FX 5900) do not perform the stencil test before the actual shader execution.
The stencil test does work, but the lighting shader is executed full-screen for every light. I noticed this when I reduced the light range to 100 units (no lit pixels visible) and the fps was still the same. You can even clear the stencil buffer to 1 and remove the DrawIf() call, and the lighting shader still gets executed.
Is it possible that nVidia hardware does not support early stencil test?

Humus
Thursday, July 8, 2004

Did you try the latest version?

MesserFuerFrauSchmid
Thursday, July 8, 2004

@pro Yep, you are right, it does not. Stencil is after shader. That's the problem when you rely on undocumented stuff like that.

NitroGL
Thursday, July 8, 2004

Hm... On my system, the demo runs faster without the branching.

640x480 window
~290FPS with branching
~1500FPS without

Sunray
Friday, July 9, 2004

lol, 1500 fps? I've 40 fps without "branching" and 100 with. (Radeon 9700 Pro)
