"I do not feel obliged to believe that the same God who has endowed us with sense, reason, and intellect has intended us to forgo their use."
- Galileo Galilei
eXile
Saturday, October 22, 2005

Nice to see that, Humus! It seems that dual-core processors are the future... but they are too expensive for me right now.

Will there be more indoor demos anytime soon? Perhaps radiosity or realtime ambient occlusion could be a nice challenge for you.

Nitro
Saturday, October 22, 2005

Thanks for the explanation. I had all optimizations turned off when I recompiled it, so now it runs at 75 fps on FPU, 113 on SSE (1 thread) and 122 on SSE (2 threads). FPU with 2 threads is still slower: 53 fps. Maybe it's just because I only have HT. I need a new CPU.
Btw Humus, I wonder how fast the demo runs on your new Athlon64.

Humus
Friday, October 21, 2005

Hello to N! Så jo kän spik inglich?

Nitro,
yeah, I was a bit surprised how much VC2005 could optimize the FPU path. There's a new compiler option where you can set FPU to "fast", which ignores some of the IEEE standards and so on, much like -ffast-math in GCC. It boosted performance quite a bit. I also made some algorithmic improvements. I now bake in the radius in a preprocessing step, and I also refactored the math so that the division is deferred to a single final division at the end. I didn't try doing the same in the other paths, but they have fast RCP instructions anyway, so I don't think it's going to improve performance a lot, if at all.
It was interesting to see it improve performance almost as much on HT too. I tried it on my work laptop and saw a pretty decent increase there as well. But that further suggests that the cache/memory is the bottleneck, rather than computational power.
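
To illustrate the two algorithmic changes mentioned above (baking in the radius and deferring the division), here is a minimal sketch. It is not the demo's actual code: the `Ball` struct, the function names, and the field function (squared radius over squared distance) are all assumptions made for illustration. It keeps the running sum as a fraction and uses a/b + c/d = (a*d + c*b) / (b*d), so only one division is needed per sample point.

```cpp
#include <cmath>
#include <vector>

// Hypothetical metaball record; the names are illustrative, not from the demo.
struct Ball {
    float x, y, z;
    float radiusSq; // radius * radius, baked once in a preprocessing step
};

// Preprocessing: bake the squared radius so the inner loop never recomputes it.
Ball makeBall(float x, float y, float z, float radius) {
    return Ball{x, y, z, radius * radius};
}

// Reference version: one division per ball.
float fieldNaive(const std::vector<Ball>& balls, float px, float py, float pz) {
    float sum = 0.0f;
    for (const Ball& b : balls) {
        float dx = px - b.x, dy = py - b.y, dz = pz - b.z;
        sum += b.radiusSq / (dx * dx + dy * dy + dz * dz);
    }
    return sum;
}

// Deferred version: accumulate numerator and denominator separately and
// divide once at the end.
float fieldDeferred(const std::vector<Ball>& balls, float px, float py, float pz) {
    float num = 0.0f, den = 1.0f;
    for (const Ball& b : balls) {
        float dx = px - b.x, dy = py - b.y, dz = pz - b.z;
        float distSq = dx * dx + dy * dy + dz * dz;
        num = num * distSq + b.radiusSq * den; // cross-multiply into the running fraction
        den *= distSq;
    }
    return num / den; // the only division
}
```

Both versions should agree up to rounding. The deferred form trades each division for a few multiplies, but the running denominator is a product of squared distances, so with many balls it can overflow or underflow; a real implementation would probably process the balls in small groups.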

Nitro
Friday, October 21, 2005

Very interesting piece of code. I get about a 13% speed increase on my 2.8 GHz P4 HT. But what's weird is that the FPU path is faster than SSE (117 fps vs. 104 with 1 thread and 132 vs. 117 with 2 threads). Is the new VS2005 optimising that much? I recompiled the demo on VS2003 and the results are rather slower. 1 thread FPU/SSE: 54/97; 2 threads are even slower: 43/93. Does anybody know why that is?
Btw the old demo ran 80 fps on FPU and 115 on SSE.

You know who!
Friday, October 21, 2005

Hello hello! I will write in English this time, hope you understand. The thing is that this message is to Humus, if you didn't know?! :-P

Whooohooo!!!
Maybe I will go and see Howard Jones in Umeå!!!!!
Hug to you Humus and take care of yourself.

N

Sunray
Friday, October 21, 2005

The coolest would be to compute them entirely on the GPU (via raytracing).

Looks like this: http://jmb.mine.nu/~kma/x/metaballs_xvid.avi

Anonymous
Friday, October 21, 2005

"Anyhow, has this anything to say on my 'now-outdated-underpar-performing' A64 3400+?"
-Björn

Your A64 3400+ is an AMD Sempron, not an Athlon 64. It was never on par in the first place, so it could never change to underpar status.

Björn
Thursday, October 20, 2005

Hey nice! Congratulations on your new CPU! I want one of those myself *drool*

Anyhow, has this anything to say on my "now-outdated-underpar-performing" A64 3400+?

Cheers!
