Framework 3 (Last updated: June 26, 2012)
Framework 2 (Last updated: October 8, 2006)
Framework (Last updated: October 8, 2006)
Libraries (Last updated: September 16, 2004)
Really old framework (Last updated: September 16, 2004)
Inferno
Sunday, August 5, 2007 | Permalink


Executable
Source code
Inferno.zip (1038 KB)
Required: Direct3D 10
This demo uses a couple of interesting features of Direct3D 10. It doesn't use any vertex or index buffers at all (except for what the framework uses for the GUI); instead everything is generated in the shader from the SV_VertexID and SV_InstanceID system-generated values. The skybox has only three vertices (a fullscreen triangle), so by generating that in the shader we avoid the API overhead of binding buffers (which is not the bottleneck of a skybox pass of course, but that's beside the point). The terrain renders instanced triangle strips which read their height from a heightmap. The heightmap is in BC4 format, or ATI1N as it was called in D3D9. This gives us a very compact geometric representation for the terrain. There are 1024 particle systems, all rendered in a single draw call by using instancing. The particle systems are stateless and are generated entirely in the vertex shader from the input vertex and instance IDs. The geometry shader is used to expand the incoming points from the vertex shader into quads in screen-space. This is similar to how point sprites used to work, except it's more flexible, and this demo uses rotation on the particles, something that point sprites can't do.
This demo should run on Radeon HD 2000 series and GeForce 8000 series. Since this demo uses D3D10 it only works in Windows Vista.
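As a rough illustration of the two buffer-free tricks described above, here is a minimal Python sketch (not the demo's actual shader code). The first function shows one common way to derive a fullscreen triangle from nothing but a vertex ID; the second shows how a point could be expanded into a rotated screen-space quad. The function names, the bit trick and the corner order are assumptions for illustration only.

```python
import math

def fullscreen_triangle_vertex(vertex_id):
    """Clip-space position of one corner of a triangle that covers the screen.

    Assumed mapping: IDs 0, 1, 2 give (0,0), (2,0), (0,2), which are then
    scaled to (-1,-1), (3,-1), (-1,3). The visible [-1,1] square lies inside.
    """
    u = (vertex_id << 1) & 2
    v = vertex_id & 2
    return (u * 2.0 - 1.0, v * 2.0 - 1.0)

def expand_point_to_quad(cx, cy, half_size, angle):
    """Four corners of a rotated quad around (cx, cy), in triangle-strip order.

    This mimics what a geometry shader could emit for one particle; the
    rotation is what a fixed-function point sprite cannot provide.
    """
    c, s = math.cos(angle), math.sin(angle)
    corners = []
    for ox, oy in ((-1, -1), (1, -1), (-1, 1), (1, 1)):
        # Rotate the unit corner offset, then scale and translate it.
        x = (ox * c - oy * s) * half_size
        y = (ox * s + oy * c) * half_size
        corners.append((cx + x, cy + y))
    return corners
```

Calling fullscreen_triangle_vertex for IDs 0 to 2 yields one oversized triangle, so no vertex buffer needs to be bound for the skybox pass.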
Domino
Sunday, February 4, 2007 | Permalink



Executable
Source code
Domino.zip (194 KB)
Required: GLSL
This demo is mostly for eye-candy, but contains a couple of interesting techniques too. OpenGL currently has no particular API for doing instancing. There exist some pseudo-instancing techniques for OpenGL, such as passing the instance data through a vertex attribute and issuing a new draw call for each instance. This of course does not cut down the number of draw calls, but should have a relatively low cost per call, making drawing lots of objects manageable compared to, for instance, setting shader constants. Another method, which this demo uses, is based on shader constants. It uses multiple copies of the model in the vertex buffer, each with a particular index. The index is used to grab the instance data from a constant array. This allows you to draw as many instances in a draw call as you can fit instance data in the vertex shader constant storage. While this doesn't cut the number of draw calls down to a single one, it at least divides the number of draw calls by a fair amount, in this demo 64 (set conservatively, could probably be increased).
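The constant-array approach can be sketched on the CPU side roughly as follows. This is a minimal Python illustration, not the demo's code: the per-instance data is assumed to be a simple translation, and build_batches and vertex_shader_lookup are hypothetical names.

```python
MAX_INSTANCES_PER_CALL = 64  # batch size mentioned in the text

def build_batches(instance_data, batch_size=MAX_INSTANCES_PER_CALL):
    """Split per-instance data into chunks that each fit one draw call.

    Each chunk would be uploaded to the shader constant array before
    drawing the vertex buffer that holds batch_size copies of the model.
    """
    return [instance_data[i:i + batch_size]
            for i in range(0, len(instance_data), batch_size)]

def vertex_shader_lookup(constants, instance_index, local_position):
    """Mimic the shader side: use the index stored with each model copy
    to fetch that copy's data (assumed here to be a translation)."""
    ox, oy, oz = constants[instance_index]
    x, y, z = local_position
    return (x + ox, y + oy, z + oz)

# 200 dominoes need 4 draw calls instead of 200.
offsets = [(float(i), 0.0, 0.0) for i in range(200)]
batches = build_batches(offsets)
```

With 200 instances this produces three full batches of 64 and one of 8; the last draw call would simply draw fewer copies of the model.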
The other interesting technique used here is the wood shader, which is loosely based on the wood shader that comes with RenderMonkey. The RenderMonkey wood shader suffers from a serious aliasing problem at a distance since the wood rings are mathematically generated. To solve this problem I'm deriving the rings from the noise texture too, which of course is mipmapped and thus doesn't suffer directly. I added a repeat factor on the returned noise, which adds back some aliasing, which I get rid of by adding a mipmap bias for the lookup.
This demo should run on Radeon 9500 and up and GeForce FX 5200 and up.
Ambient aperture lighting
Sunday, December 3, 2006 | Permalink
This demo implements a variation of Ambient Aperture Lighting. I went with a simpler implementation than what's in the paper, but the concept is the same. The idea behind ambient aperture lighting is that you approximate the directions from which light can reach the surface with a disc. So you store a direction vector to the centre of the disc and the size of the disc. In the more elaborate version in the paper they compute the intersection area between the light disc and the aperture disc. In my implementation I simply take the dot product between the light vector and the aperture direction, and use the difference between that and the aperture size as the shadow factor.
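One possible reading of the simplified shadow factor, as a minimal Python sketch. Several details here are assumptions that the text does not state: the aperture size is taken to be stored as the cosine of the disc's half-angle, and the difference is scaled by a hypothetical sharpness constant and clamped to [0, 1].

```python
def saturate(x):
    """Clamp to [0, 1], like the shader intrinsic of the same name."""
    return max(0.0, min(1.0, x))

def aperture_shadow(light_dir, aperture_dir, aperture_cos, sharpness=8.0):
    """Cheap self-shadowing term from an aperture direction and size.

    light_dir and aperture_dir are unit vectors. aperture_cos is the
    assumed encoding of the aperture size (cosine of the half-angle),
    and sharpness is a made-up tuning factor for the shadow edge.
    """
    d = sum(l * a for l, a in zip(light_dir, aperture_dir))
    # Positive when the light lies inside the aperture cone, negative outside.
    return saturate((d - aperture_cos) * sharpness)
```

A light pointing straight down the aperture direction gives 1.0 (fully lit), while a light well outside the cone gives 0.0.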
Ambient aperture lighting is a quite rough approximation, but one of the cases where it applies is terrains, which naturally also translates to bumpmaps. In this demo I use ambient aperture lighting to compute a cheap self-shadowing factor.
The advantage of this method compared to horizon mapping, which can be used for the same purpose, is that it's cheaper. Horizon mapping requires a 3D texture, plus either complex math or a cube lookup table for the angles. Ambient aperture lighting requires a 2D texture only. Horizon mapping is also more prone to aliasing, and there's also a problem with texture coordinate discontinuity when accessing the horizon map, meaning that you either can't mipmap it (more aliasing) or you'll have to use a lookup with gradients (only supported in SM3.0 hardware) to avoid artifacts. Additionally, a benefit of ambient aperture lighting is that the size of the aperture is pretty much an ambient occlusion factor and can thus be used for better-looking ambient lighting (see the outside of the castle in this demo for example). However, in horizon mapping's defence, it is a better approximation of the occluding geometry, so you can find cases where it will look more natural than ambient aperture lighting.
This demo should run on Radeon 9500 and up and GeForce FX 5200 and up.
Volumetric Fogging 2
Sunday, October 8, 2006 | Permalink



Executable
Source code
VolumetricFogging2.zip (858 KB)
Required: GL_ARB_shader_objects
GL_ARB_vertex_shader
GL_ARB_fragment_shader
GL_ARB_shading_language_100
This demo shows a method for generating volumetric fog where the fog itself has texture, which adds a lot to the realism compared to traditional uniform fog and allows the fog to be animated. The downside is that it's a lot more computationally expensive. It's implemented by raytracing from the surface back to the camera through the fog and iteratively mixing in the fog at each sampled position. This way you can also include shadows in the computation which gives you nice light shafts through the fog. In this demo I have just stored the shadows in a static volume lightmap for fast lookup and good quality.
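The back-to-front marching described above might look roughly like this on the CPU. This is a minimal Python sketch under stated assumptions: fog_density and light_at are hypothetical callbacks standing in for the fog texture and the volume lightmap lookups, and the per-step blend amount (density times segment length, clamped to 1) is an illustrative choice, not necessarily the one the demo uses.

```python
import math

def lerp(a, b, t):
    """Component-wise linear interpolation between two tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def march_fog(surface_pos, camera_pos, surface_color,
              fog_density, light_at, fog_color=(1.0, 1.0, 1.0), steps=16):
    """Walk from the surface back to the camera, mixing in fog at each sample.

    fog_density(pos) -> density at a world position (e.g. a 3D noise texture)
    light_at(pos)    -> 0..1 light/shadow term (e.g. a volume lightmap)
    """
    segment = math.dist(surface_pos, camera_pos) / steps
    color = surface_color
    for i in range(steps):
        t = (i + 0.5) / steps              # 0 at the surface, 1 at the camera
        pos = lerp(surface_pos, camera_pos, t)
        amount = min(1.0, fog_density(pos) * segment)
        lit_fog = tuple(c * light_at(pos) for c in fog_color)
        # Fog closer to the camera is blended on top of what is behind it.
        color = lerp(color, lit_fog, amount)
    return color
```

Because light_at is evaluated at every sample, shadowed stretches of the ray contribute darker fog, which is what produces the light shafts.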
This demo should run on Radeon 9500 and up and GeForce FX 5200 and up.
Dynamic lightmapping
Sunday, August 20, 2006 | Permalink
Generating lightmaps can be a quite time-consuming task, thus they are typically generated offline and shipped with the application. That's also what I've done in the past with the demos that use lightmaps. Even though my lightmap generation code probably could be optimized a fair bit, I doubt any CPU implementation could get anywhere close to the performance you can get by offloading this task to the GPU, which is what this demo does.
First a position map is generated on the CPU. The position map contains the worldspace position that each pixel in the lightmap maps to on the geometry it's used with. The position is preprocessed a bit to push it out slightly from the geometry to avoid precision problems. It's particularly important since I generate four position maps to antialias the shadow slightly, and the offset sample positions cannot cut into the geometry or you'd get artifacts.
The shadows are generated with a standard cubic shadow mapping technique, except it's done in the texture space of the lightmap with the position looked up from the position map. The process of generating the shadow map is quite fast and definitely real-time if you're doing plain hard shadows. The texture filter will then smooth the edges a bit to get somewhat soft shadows. This is slower than just doing shadow mapping directly though, and the quality improvement is relatively small. It does give you the option to blur the shadow in lightmap space, which is cheaper than doing it per pixel with regular shadow mapping. However, in order to really differentiate it from plain shadow mapping, this demo implements real soft shadows with the light sampled at 512 positions. The shadow for each light position sample is also 4x antialiased. The antialiasing was added since it adds some extra quality, especially with a small light radius, and adds very little to the cost (generating the shadow map is the bottleneck). Generating this soft shadow is almost real-time, but not fast enough to do every frame. However, once it's been generated it can be reused forever and give you soft shadows nearly for free.
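The idea of building a soft shadow by averaging many hard-shadow tests, one per light sample position, can be sketched as below. This is a simplified Python illustration and not the demo's method in detail: a single sphere occluder and a ray test stand in for the cubic shadow map lookup, and the spiral layout of light samples on a disc is an assumption.

```python
import math

def segment_hits_sphere(p0, p1, center, radius):
    """True if the segment p0 -> p1 passes through the sphere."""
    d = [b - a for a, b in zip(p0, p1)]
    m = [a - c for a, c in zip(p0, center)]
    # Parameter of the point on the segment closest to the sphere centre.
    t = -sum(x * y for x, y in zip(m, d)) / sum(x * x for x in d)
    t = max(0.0, min(1.0, t))
    closest = [a + x * t for a, x in zip(p0, d)]
    return math.dist(closest, center) < radius

def light_samples(center, radius, count):
    """Deterministic spiral of positions on a horizontal disc (assumed layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    samples = []
    for i in range(count):
        r = radius * math.sqrt((i + 0.5) / count)
        a = i * golden
        samples.append((center[0] + r * math.cos(a), center[1],
                        center[2] + r * math.sin(a)))
    return samples

def soft_shadow(point, light_center, light_radius,
                occluder, occluder_radius, count=512):
    """Fraction of light samples visible from point: 0 = umbra, 1 = fully lit."""
    lit = sum(0 if segment_hits_sphere(point, s, occluder, occluder_radius) else 1
              for s in light_samples(light_center, light_radius, count))
    return lit / count
```

In the demo's setting the result would be written once per lightmap texel and then reused, which is why the cost of 512 samples only has to be paid when the light moves.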
Typical applications of this technique could be to either generate lightmaps fast on end-user machines to reduce download size, or for semi-dynamic lights in games, where the light position is expected to remain static most of the time, such as lighting up a candle.
This demo should run on Radeon 9500 and up and GeForce FX 5200 and up.
On the first run it will generate four position maps, so it may take up to about 10 seconds to load. Later runs will start much quicker.
High Quality Texture Compression
Friday, June 2, 2006 | Permalink
This demo shows a way to achieve higher quality texture compression than DXT1 at a slightly higher bitrate by using 3Dc+. Don't complain about the lack of artwork or eye-candy in this demo (it's just a single textured quad); the point is to illustrate the quality difference between this method and DXT1.
In JPEG compression the first step is to convert RGB to YCbCr, a color space based on luminance (Y) and chrominance (Cb and Cr). The rationale for this is that the eye is much more sensitive to luminance information than chrominance, and by converting it to this color space we can sample luminance at full rate and chrominance at a lower rate. Typically JPEG files sample luminance at 1x1 and chrominance at 2x2, which already cuts down the data to half the size at nearly no visible quality degradation.
The method I use here is similar to this first compression step in JPEG, but I take it one step further by storing the Y channel in an ATI1N texture and CbCr in a lower resolution ATI2N texture. This essentially gives you a 6bpp texture format, with a lot better quality than DXT1. Now it's true that DXT1, which is 4bpp, looks good with most textures, but there are exceptions. DXT1 doesn't perform very well with photographic images, some textures with smooth gradients, some very detailed textures, non-uniformly colored textures, textures with diagonal features or features that otherwise line up badly with the 4x4 block pattern. In these cases this method looks much better. Additionally, it's in many cases possible to sample CbCr at 4x4 without significantly reducing quality, resulting in 4.5bpp. This will almost always look better than DXT1, but could see more color bleeding than 6bpp.
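The 6bpp and 4.5bpp figures follow from the block sizes of the two formats: ATI1N stores one channel in 8 bytes per 4x4 block (4bpp) and ATI2N stores two channels in 16 bytes per 4x4 block (8bpp). A small Python sketch of the arithmetic, with ycbcr_bits_per_pixel as a hypothetical helper name:

```python
ATI1N_BPP = 8 * 8 / 16    # 8 bytes per 4x4 block, one channel  -> 4 bpp
ATI2N_BPP = 16 * 8 / 16   # 16 bytes per 4x4 block, two channels -> 8 bpp

def ycbcr_bits_per_pixel(chroma_downsample):
    """Effective bitrate with Y at full resolution in ATI1N and CbCr in an
    ATI2N texture that is chroma_downsample times smaller in each axis."""
    return ATI1N_BPP + ATI2N_BPP / (chroma_downsample ** 2)

print(ycbcr_bits_per_pixel(2))  # CbCr sampled at 2x2
print(ycbcr_bits_per_pixel(4))  # CbCr sampled at 4x4
```

With CbCr at 2x2 the chroma texture adds 8/4 = 2 bits per full-resolution pixel, and at 4x4 it adds 8/16 = 0.5.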
Decoding YCbCr into RGB in the shader is very cheap and takes only three instructions. Generally speaking this method has quality close to RGB8 but performance close to DXT1. The default view is a bit zoomed in so you can judge quality; there the performance difference is small since everything is magnified, but if you zoom out so that most of the texture is visible and covers the screen you'll see a bigger performance difference.
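For reference, the colour conversion itself can be sketched in Python using the full-range JPEG (JFIF) coefficients. Whether the demo uses exactly these constants is an assumption, and this scalar version makes no attempt to match the three-instruction shader form.

```python
def rgb_to_ycbcr(r, g, b):
    """RGB in [0, 1] to Y, Cb, Cr in [0, 1] (JFIF full-range convention)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.5 + (b - y) / 1.772
    cr = 0.5 + (r - y) / 1.402
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr: one scale-and-add per output channel."""
    r = y + 1.402 * (cr - 0.5)
    g = y - 0.344136 * (cb - 0.5) - 0.714136 * (cr - 0.5)
    b = y + 1.772 * (cb - 0.5)
    return r, g, b
```

In a shader, Y would come from the ATI1N texture and Cb/Cr from the lower resolution ATI2N texture before a decode step like ycbcr_to_rgb is applied.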
In addition to using ATI1N and ATI2N I've also added similar compression using DXT1 and DXT5. This gives the same performance as the 3Dc modes, but visibly worse quality. It's still better than DXT1 though.
Use the keys 1-6 to toggle between DXT1, YCbCr DXT/3Dc & 4.5bpp/6bpp and RGB.
This demo should run on Radeon 9500 and up and GeForce FX and up.
Dynamic branching 3
Tuesday, April 25, 2006 | Permalink



Executable
Source code
DynamicBranching3.zip (1139 KB)
Required: GL_ARB_shader_objects
GL_ARB_vertex_shader
GL_ARB_fragment_shader
GL_ARB_shading_language_100
GL_EXT_framebuffer_object
This demo illustrates the benefit of dynamic branching in a per-pixel lighting scenario. On the F1 dialog you can toggle dynamic branching, as well as shadows and single/multi pass. Branching is used in several places in the shader. It skips past instructions if the pixel is outside the light range, is backfacing the light, or is in shadow.
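The three early-outs can be sketched as a plain Python function. This is only an illustration of the control flow, not the demo's shader: the attenuation and diffuse terms are placeholder lighting math, and in_shadow is a hypothetical callback standing in for the shadow map lookup.

```python
import math

def light_pixel(pos, normal, light_pos, light_radius, in_shadow):
    """Per-pixel light contribution with branches that skip work early."""
    to_light = [l - p for l, p in zip(light_pos, pos)]
    dist2 = sum(x * x for x in to_light)
    if dist2 >= light_radius * light_radius:
        return 0.0                      # outside the light range
    n_dot_l = sum(n * x for n, x in zip(normal, to_light))
    if n_dot_l <= 0.0:
        return 0.0                      # surface faces away from the light
    if in_shadow(pos):
        return 0.0                      # shadowed, skip the lighting math
    # Only pixels that pass all three tests pay for the full lighting.
    dist = math.sqrt(dist2)
    attenuation = 1.0 - dist / light_radius
    return attenuation * (n_dot_l / dist)
```

Note that the shadow lookup itself is also skipped for pixels rejected by the first two tests, which is where hardware with real dynamic branching can save time.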
This demo should run on Radeon 9500 and up and GeForce FX and up. Only those cards supporting dynamic branching will see a performance benefit from enabling this path, that is, X1300 and up and possibly also the GeForce 6 series.
Hair
Monday, April 3, 2006 | Permalink


Executable
Source code
Hair.zip (925 KB)
Required: R2VB
Pixel shader 2.0
This demo illustrates a hair simulation using R2VB. The hair consists of a large number of strands. Each strand is a line in a render target, and each node in the strand is represented by a pixel in the render target. The simulation is similar to a typical cloth simulation, except springs only connect in one direction, along the strands. The first node of every strand (the leftmost column of pixels) is locked to the moving balloon head. The rest of the hair will follow it wherever the head is bouncing. Collision is computed against the head and the floor.
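A single strand of such a simulation could be sketched on the CPU like this. It is a minimal Python illustration under several assumptions: Verlet integration is used, a hard distance constraint to the previous node stands in for the demo's springs, and all constants (rest length, damping, time step) are made up. In the demo each node would instead be a pixel updated by a pixel shader.

```python
import math

def step_strand(pos, prev, root, head_center, head_radius,
                floor_y=0.0, rest_len=0.1, gravity=-9.8,
                dt=1.0 / 60.0, damping=0.98):
    """Advance one strand by one time step.

    pos and prev are lists of [x, y, z] node positions for the current and
    previous step. Returns (new_positions, positions_to_use_as_prev).
    """
    new = []
    for p, q in zip(pos, prev):
        # Verlet step: velocity is implied by the previous position.
        new.append([p[k] + (p[k] - q[k]) * damping
                    + (gravity * dt * dt if k == 1 else 0.0)
                    for k in range(3)])
    new[0] = list(root)                 # first node is locked to the head
    for i in range(1, len(new)):
        # Links run only along the strand: keep distance to the previous node.
        d = [new[i][k] - new[i - 1][k] for k in range(3)]
        length = math.sqrt(sum(x * x for x in d)) or 1e-9
        new[i] = [new[i - 1][k] + d[k] * rest_len / length for k in range(3)]
        # Collision with the head: push nodes out to the sphere surface.
        c = [new[i][k] - head_center[k] for k in range(3)]
        dist = math.sqrt(sum(x * x for x in c))
        if 0.0 < dist < head_radius:
            new[i] = [head_center[k] + c[k] * head_radius / dist for k in range(3)]
        # Collision with the floor.
        new[i][1] = max(new[i][1], floor_y)
    return new, pos
```

Running step_strand once per frame, with root moved along with the head, makes the remaining nodes trail after it while staying outside the head and above the floor.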
Currently the demo draws the hair as lines. This has the effect that the hair will look thinner at higher resolutions or when the head is close up. An easy fix for that would have been to scale the line width as appropriate. Unfortunately, D3D doesn't expose any way to draw wide lines. With additional work a shader could have expanded the lines into a triangle strip with adjustable width, but I was too lazy to do that.
This demo should run on Radeon 9500 and up.