Rohit
Friday, April 10, 2009
Actually, since all the crap is an extension now, won't all code have to put the ARB suffix on all of its GL calls? And BTW, whatever the merits of their decision, they are doing what they said they would do. Now that the deprecated stuff has been moved out to extensions and there is no requirement for its implementation, I'd expect these things to be dropped from consumer cards over time, at least in the future. IHVs will have an incentive to do this: they'll make sure that only the workstation cards have drivers with the legacy crap, so that they can screw those not updating their code. Some poetic justice, eh!
BTW, in your framework, you'll be enabling the forward compatible flag by default, right?
Humus
Friday, April 10, 2009
From WGL_ARB_create_context spec:
"If the WGL_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB is set in WGL_CONTEXT_FLAGS_ARB, then a <forward-compatible> context will be created. Forward-compatible contexts are defined only for OpenGL versions 3.0 and later. They must not support functionality marked as <deprecated> by that version of the API, while a non-forward-compatible context must support all functionality in that version, deprecated or not."
So yes, providing this flag disables all deprecated stuff. I have also found that both the AMD and Nvidia implementations do it right, so deprecated function calls are in fact ignored. I haven't checked whether any errors are generated, but at least no rendering comes out of them.
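To make the flag concrete, here is a minimal sketch of the attribute list that requests a forward-compatible context. The token values are the ones published in the WGL_ARB_create_context spec; the `k`-prefixed constant names and the `make_context_attribs` helper are made up for illustration, so the snippet compiles without the Windows or GL headers.

```cpp
#include <array>

// Token values as published in the WGL_ARB_create_context specification.
// The k-prefixed names are local stand-ins (an assumption of this sketch);
// real code would use the WGL_* macros from wglext.h instead.
constexpr int kContextMajorVersion  = 0x2091;  // WGL_CONTEXT_MAJOR_VERSION_ARB
constexpr int kContextMinorVersion  = 0x2092;  // WGL_CONTEXT_MINOR_VERSION_ARB
constexpr int kContextFlags         = 0x2094;  // WGL_CONTEXT_FLAGS_ARB
constexpr int kForwardCompatibleBit = 0x0002;  // WGL_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB

// Hypothetical helper: builds the zero-terminated name/value list that
// context creation expects. With forward_compatible == true the flags
// entry carries the forward-compatible bit; otherwise it is 0.
constexpr std::array<int, 7> make_context_attribs(int major, int minor,
                                                  bool forward_compatible) {
    return {{
        kContextMajorVersion, major,
        kContextMinorVersion, minor,
        kContextFlags, forward_compatible ? kForwardCompatibleBit : 0,
        0  // terminator
    }};
}
```

In a real application the array's `data()` would go in as the `attribList` argument of `wglCreateContextAttribsARB`, whose function pointer has to be fetched with `wglGetProcAddress` from an already current context. That part is left out here because it needs a window and a driver.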
Jan
Friday, April 10, 2009
I am in the process of porting to GL3. For now I still use GL2 but remove everything deprecated. The "forward compatible" flag is a good idea, but I wonder whether it is implemented at all. Just like with the ARB_compatibility extension, I fear that IHVs simply give us the old crap with a new label. So, do you know whether this flag actually has any influence? Does it raise errors and ignore calls to deprecated functions?
Jan
Aras Pranckevičius
Friday, April 10, 2009
Fully agree.
The biggest problem with OpenGL for production use is driver quality (and no, users never update their drivers, at least in casual/small game space). Having GL3 with all the legacy bits does nothing to improve the situation.
What I think should have been done: make GL3 a whole new headers/DLL/dylib/framework, and only have "the new way" APIs in there. Then provide a single "GL2 emulation on top of GL3" library (that is mostly hardware independent); which can be just statically linked into applications and whatnot. Make drivers only contain the GL3 bits, the old APIs should be emulated on top of actual GL3.
Of course, most of my views are probably very naïve, and I'm underestimating the impact of emulating old GL on top of new GL (performance, backwards compatibility, ...). But the approach taken by GL 3.1 does not solve one of the biggest problems with OpenGL either (stability).
Michael
Wednesday, April 8, 2009
trip down memory lane.
My favorite was the "Ate my Balls" meme. Ah, memories.
JKL
Tuesday, April 7, 2009
SSE instructions are two-operand, so a write mask would be useless anyway; you would need a separate destination register for it. However, the bigger reason for such a design is the problem of long dependency chains.
If we had a 3-operand DP with a write mask, the matrix-vector multiply code could look like this:
xmm0 - vector
xmm4-7 - matrix
xmm1 - result
dpps xmm1.x, xmm0, xmm4 // 0 (11)
dpps xmm1.y, xmm0, xmm5 // 11 (11)
dpps xmm1.z, xmm0, xmm6 // 22 (11)
dpps xmm1.w, xmm0, xmm7 // 33 (11)
The number on the right is the starting cycle, followed by the (latency).
Dpps has an 11-cycle latency on Core2, and each of these DPs depends on the previous one, so the whole routine takes 4*11=44 cycles.
Now the same with the real code:
movaps xmm1, xmm0 // 0 (1)
movaps xmm2, xmm0 // 0 (1)
movaps xmm3, xmm0 // 0 (1)
dpps xmm0.x, xmm4 // 1 (11)
dpps xmm1.y, xmm5 // 4 (11)
dpps xmm2.z, xmm6 // 7 (11)
dpps xmm3.w, xmm7 // 10 (11)
orps xmm0, xmm1 // 15 (1)
orps xmm2, xmm3 // 21 (1)
orps xmm0, xmm2 // 22 (1)
The DPs are independent here and can be issued every 3 cycles. The code takes 23 cycles to execute even though it's 6 instructions longer.
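The two cycle counts can be reproduced with a small back-of-the-envelope model. This is only a sketch of the arithmetic in the annotations above, not a pipeline simulator: it assumes the 11-cycle dpps latency, the 3-cycle issue interval and the 1-cycle movaps/orps latency quoted here, and the constant and function names are made up for illustration.

```cpp
#include <algorithm>

// Figures quoted in the post (Core2, untested); treat them as assumptions.
constexpr int kDppsLatency       = 11;  // cycles until a dpps result is ready
constexpr int kDppsIssueInterval = 3;   // a new independent dpps every 3 cycles
constexpr int kSimpleLatency     = 1;   // movaps / orps

// Hypothetical 3-operand version: every dpps writes xmm1, so each one
// has to wait for the previous result before it can start.
constexpr int dependent_chain_cycles() {
    int cycle = 0;
    for (int i = 0; i < 4; ++i) {
        cycle += kDppsLatency;
    }
    return cycle;
}

// Real 2-operand version: three movaps start at cycle 0, then four
// independent dpps issue back to back, then an orps tree merges them.
constexpr int independent_cycles() {
    const int copies_done = kSimpleLatency;
    int dp_done[4] = {};
    for (int i = 0; i < 4; ++i) {
        const int start = copies_done + i * kDppsIssueInterval;
        dp_done[i] = start + kDppsLatency;
    }
    // Each orps starts once both of its inputs are ready.
    const int or01 = std::max(dp_done[0], dp_done[1]) + kSimpleLatency;
    const int or23 = std::max(dp_done[2], dp_done[3]) + kSimpleLatency;
    return std::max(or01, or23) + kSimpleLatency;
}
```

With those figures `dependent_chain_cycles()` comes out at 44 and `independent_cycles()` at 23, matching the numbers above. The model ignores decode limits and port contention, so it says nothing about what real hardware would measure.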
Disclaimer

None of the above is actually tested; it is based on the Core2 cycle tables from http://www.agner.org/optimize/
yoav
Monday, April 6, 2009
auto and static_assert... ok
i just tried re-reading that article, and unless i miss something, this boils down to LESS overhead, not zero overhead
so in the case of the strcat example, you'd be better off using some kind of string builder if you were after speed (memcpy to the end of where you were last time)
but ok.

Humus
Monday, April 6, 2009
What's so ugly about it?
I don't think I'm going to use lambda expressions anytime soon, but auto, static_assert and rvalue references are very useful additions.