LogicalError
Monday, May 25, 2009
Just a crazy idea: would it be useful to differentiate between semi-transparent and completely non-transparent pixels? The fully non-transparent triangles could be drawn in a separate pass. Just a random (probably bad) idea.
Humus
Sunday, May 24, 2009
TomF, yeah, the speedup is pretty much exactly the same as the reduction in area, but then most of the particles are large. If you're using smaller particles the gain may not be nearly as linear, or there may be no gain at all. I guess it could be a good idea to use fewer vertices for smaller particles, perhaps using the GS.
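If the speedup tracks the reduction in area, the expected gain for a given trimmed polygon can be estimated before running anything on the GPU. The following is a minimal sketch of that estimate using the shoelace formula. `Vec2`, `polygonArea` and `coverageOfUnitQuad` are illustrative names, not part of the tool, and the sketch assumes the polygon's vertices are given in order inside a unit quad.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative 2D point type (not taken from the tool's source).
struct Vec2 { double x, y; };

// Shoelace formula: area of a simple polygon whose vertices are listed in
// order (clockwise or counter-clockwise).
double polygonArea(const std::vector<Vec2>& p) {
    double twiceArea = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        const Vec2& a = p[i];
        const Vec2& b = p[(i + 1) % p.size()];
        twiceArea += a.x * b.y - b.x * a.y;
    }
    return std::fabs(twiceArea) * 0.5;
}

// Fraction of a unit quad (area 1) still covered by the trimmed polygon.
// 1 minus this value is the fraction of rasterized area that is saved.
double coverageOfUnitQuad(const std::vector<Vec2>& trimmed) {
    return polygonArea(trimmed);
}
```

For example, an octagon that cuts a quarter off each side at every corner of the unit quad covers 0.875 of it, so under the area-proportional assumption it would save about 12.5% of the fill cost. This is only a proxy: as noted above, the relationship may not hold for small particles.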
TomF
Saturday, May 23, 2009
Have you checked the actual speedup? Something that might counteract it is that the extra triangles generate extra internal edges. GPUs shade in at least 2x2 blocks, and some have larger internal groupings - 4x2, 4x4. At an internal edge, these blocks can be hit twice (once per tri). In some cases the GPU can coalesce them again, but not always. It would be interesting to see the falloff.
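The internal-edge effect can be illustrated with a toy model that counts how many 2x2 pixel blocks each triangle touches. This is a simplified sketch, not a description of any particular GPU: it samples at pixel centers, treats pixels exactly on an edge as covered (no fill rule), and assumes no coalescing across triangles. All names (`P`, `Tri`, `quadsTouched`) are made up for the illustration.

```cpp
#include <array>
#include <set>
#include <utility>

struct P { double x, y; };
using Tri = std::array<P, 3>;

// Signed edge function: positive on one side of the line a->b, negative on
// the other, zero exactly on the line.
double edgeFn(P a, P b, P c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// True if p lies inside or on the boundary of t, for either winding order.
bool covers(const Tri& t, P p) {
    const double e0 = edgeFn(t[0], t[1], p);
    const double e1 = edgeFn(t[1], t[2], p);
    const double e2 = edgeFn(t[2], t[0], p);
    return (e0 >= 0 && e1 >= 0 && e2 >= 0) || (e0 <= 0 && e1 <= 0 && e2 <= 0);
}

// Number of distinct 2x2 pixel blocks containing at least one pixel center
// covered by the triangle, within a width x height pixel grid.
int quadsTouched(const Tri& t, int width, int height) {
    std::set<std::pair<int, int>> blocks;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (covers(t, P{x + 0.5, y + 0.5}))
                blocks.insert({x / 2, y / 2});
    return static_cast<int>(blocks.size());
}
```

In this model an 8x8 pixel quad spans 16 blocks, but its two triangles touch 10 blocks each, 20 in total, because the four blocks along the diagonal are processed once per triangle. The relative overhead shrinks as the particle grows, which matches the expectation that the effect matters most for small particles.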
Overlord
Saturday, May 23, 2009
It depends a little on how you interpret that interview. I read it as saying there will come a time when programming for the GPU or the CPU becomes irrelevant, and you will basically write programs that run on any resource seamlessly, be it the GPU, the CPU or both.
It's interesting to note that the Cell processor has about the same processing power as the lowest end today, and so would any high-end Intel or AMD if they were designed a little bit differently.
I wouldn't say that the GPU is orders of magnitude faster. It might be for a single application, but not generally speaking.
I think that in the not so distant future CPUs (like the Cell) will be equipped with 2-4 different kinds of cores, all good at doing their own thing but still coded the same way.
BTW, Larrabee is both GPU and CPU, though I don't think it will come out in time to make any difference in the GPU market.
Humus
Saturday, May 23, 2009
I should add that while this tool finds something that intuitively would be the optimal solution (without any particular proof of this being the case), or at least close to it, this is just for convex polygons. There can of course be concave solutions that are better, for instance for a tree impostor with its trunk and top.
I also found a case where performance really suffered: an entirely circular particle. It generates a very high number of edges in the convex hull. Up to six vertices ran on the order of seconds, but my brute force approach really slowed down at seven vertices, which took almost a minute. The eight-vertex case would probably take something like an hour. I gave up after a few minutes and generated that one manually instead.
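To get a feel for why a brute-force search slows down so sharply with each added vertex, one can count candidates. The sketch below assumes, purely for illustration, that the search tries every way of picking k edges out of n convex hull edges; the comment above does not spell out the tool's actual search, so this is a guess at the shape of the problem, and `binomial` is a made-up helper.

```cpp
#include <cassert>
#include <cstdint>

// Number of ways to choose k items out of n, computed incrementally so that
// every intermediate value is itself an exact binomial coefficient.
// Assumes the result fits in 64 bits with some headroom.
std::uint64_t binomial(std::uint64_t n, std::uint64_t k) {
    assert(k <= n);
    std::uint64_t result = 1;
    for (std::uint64_t i = 1; i <= k; ++i) {
        // result holds C(n - k + i - 1, i - 1); this step yields C(n - k + i, i).
        result = result * (n - k + i) / i;
    }
    return result;
}
```

With a hypothetical hull of 100 edges, `binomial(100, 6)` is 1,192,052,400 and `binomial(100, 7)` is 16,007,560,800, roughly 13 times more, so each extra vertex multiplies the work by about an order of magnitude under this assumption.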
Humus
Saturday, May 23, 2009
Overlord, using discard in the fragment shader is unlikely to speed things up at all. In fact, it's more likely to slow you down. Unless of course you have a very heavy pixel shader.
Eric, as you say, the particle would have to be tiny. Exactly how tiny is hard to say without any actual test. I'd guesstimate it to be a few hundred pixels or so.
John, thanks for catching this. I didn't catch it because I've only tested on square textures. I'll make an updated tool tomorrow with a few other minor improvements as well.
John Burnett
Friday, May 22, 2009
Awesome - thanks for sharing! As a sidenote, while looking at the code, it seems like there's a copy/paste bug on line 91? ("const int h = img.getWidth();" should be "getHeight()"?)
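A bug of this kind is invisible on square textures and only shows up when width and height differ. The sketch below shows the corrected pattern in isolation. The `Image` class here is a hypothetical stand-in with the same `getWidth()`/`getHeight()` accessors, not the tool's actual class, and `countOpaquePixels` is an invented example function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal image holding one alpha byte per pixel.
class Image {
public:
    Image(int width, int height)
        : w_(width), h_(height),
          alpha_(static_cast<std::size_t>(width) * height, 0) {}

    int getWidth() const { return w_; }
    int getHeight() const { return h_; }

    unsigned char& alpha(int x, int y) {
        return alpha_[static_cast<std::size_t>(y) * w_ + x];
    }
    unsigned char alpha(int x, int y) const {
        return alpha_[static_cast<std::size_t>(y) * w_ + x];
    }

private:
    int w_, h_;
    std::vector<unsigned char> alpha_;
};

// Counts pixels with non-zero alpha.
int countOpaquePixels(const Image& img) {
    const int w = img.getWidth();
    const int h = img.getHeight(); // the reported bug used getWidth() here
    int count = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (img.alpha(x, y) != 0)
                ++count;
    return count;
}
```

With `h` mistakenly set to the width, a 4x2 image would be scanned as 4x4 and read past the end of the pixel data, while a 2x4 image would have its last two rows skipped.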
Eric Haines
Friday, May 22, 2009
Very cool, thanks! As particles get smaller, the vertex shader eventually becomes more important. Trimming quads is always going to be a win (no extra vertices). Any sense of what the break-even size point is for, say, 8 vertices? It is likely to be slower than a quad if the particle covers very few pixels.
My guess is that particles have to be tiny to make 8 vertices perform worse, but you never know without actual testing... Even then, of course, GPUs vary; but it would still be nice to know, to have some stick in the sand. I hope you'll give it a try and let us know.