Hi,
A few weeks ago I decided to compute how much GPU memory the terrain was using. It turned out to be a pretty huge amount. I have spent the past 4 days trying to deal with this issue, and here is what I've come up with.
First things first, let's compute how much memory the terrain was using. I'm using geometry shaders to generate, well, the terrain's geometry: vertices and vertex indices. Each vertex holds a position and a normal, that is, two 3-component float vectors: 24 bytes per vertex. Indices are plain integers, 4 bytes each. With

I've spent a week refactoring a lot of code in the deferred renderer to make it suitable for the needs of the voxel terrain system. The terrain needed special shaders for rendering the shadow maps, which couldn't be specified in the deferred renderer. I also removed the need for shaders when rendering shadow maps (except for point lights), which makes the code much simpler and faster.

Levels of detail often cross each other, so to keep lower LODs from shadowing higher ones, I had to slightly move the low LOD vertices along their normal to shrink the shape of the terrain.
I will now focus on optimizing the terrain rendering process. It is completely straightforward for now, and neither culling nor batching is done yet.
As promised, I have now added some texturing to the terrain. Nothing very interesting here, but it looks much better now. The texturing is pretty basic though: texture repetition is visible, blending between textures is soft and doesn't use noise perturbation, etc. I'm now working on a proper integration of the terrain rendering shaders into the engine. I also need to make them compatible with the deferred renderer.

On the right you can see different texture blending transitions. The way to achieve this is pretty simple. Let weights be a two-dimensional vector whose components are the blending factors for the grass and the rock, respectively. The sum of the components of weights is always 1 so that the image is never lightened or darkened. Making the transition rougher simply consists of raising weights to a power, then dividing by the resulting weights.x + weights.y so that the components stay between 0 and 1 and sum to 1 again.
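The blending described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the actual shader: the function names `sharpen_weights` and `blend_colors` and the `power` parameter are made up for the example.

```python
def sharpen_weights(weights, power):
    """Raise each blend weight to `power`, then renormalize.

    `weights` is assumed to be (grass, rock) with grass + rock == 1.
    A higher `power` gives a rougher (sharper) transition.
    """
    raised = [w ** power for w in weights]
    total = sum(raised)  # the new weights.x + weights.y
    return [r / total for r in raised]


def blend_colors(grass, rock, weights):
    """Weighted sum of two RGB colors; weights summing to 1 keep brightness."""
    return tuple(weights[0] * g + weights[1] * r for g, r in zip(grass, rock))


if __name__ == "__main__":
    w = sharpen_weights([0.75, 0.25], 2)
    print(w, blend_colors((0.2, 0.6, 0.1), (0.5, 0.5, 0.5), w))
```

With a power of 1 the weights are unchanged; larger powers push the dominant texture toward a weight of 1, which is what makes the transition look harder.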
There are more screenshots in the media page.
These last weeks I've been working on a seamless LOD transition mechanism for my voxel terrain system. It does not seem to be a widely discussed topic. NVidia didn't talk about the issue in their now
So from here I started searching for a technique on my own. I've had some ideas, but many seemed very unreliable compared to the work required to implement them (yes, I'm lazy). But let me first introduce the problem to you.
The LOD system currently works like this. Each level is defined by a cube. A cube is a regular 3D grid containing the density values that represent the terrain. If you take 8 neighboring points you get a cell to which the marching cubes algorithm can be applied. Lower LOD cubes wrap higher ones. Since all cubes have the same grid resolution, lower LOD cubes are made twice as big (see the image on the right). Now at the intersection of two cubes, you get this:

As can be seen in those screenshots, there are holes in the terrain. Those holes match one of the six planes of the cube that forms the inner level of detail. The problem is that at the intersection between two levels, the vertices of one level do not match those of the other. This is both because there are twice as many vertices in the higher LOD and because the LOD generation filter tends to smooth the topology of the terrain. Let's take a look at a slice of the terrain:

Each cross is a single density value. The marching cubes algorithm is applied to each cell. This is a slice of the terrain as seen from the side; the region below the line is underground, the region above it is the air. When computing a lower LOD, the resolution is the same but the area is twice as big, so the cells are twice as big. Lower LOD cells appear darkened:

As you can see, the topology has been simplified. As a result, bumps have diminished while pits have been filled (les bosses creusent et les creux bossent). Knowing this behavior, I thought I could just move the high LOD (blue) vertices along their normal to bring them closer to the simplified surface. Vertex normals are shown in orange (I like orange):

The final position of a vertex is then computed as follows: final = position + λ * normal. The only question is: what is λ and how do we compute it? You might have guessed that this coefficient needs to be negative on the bumps in order to shrink the shape. To know whether we are on a bump or in a pit we can just sample the neighboring densities. If their normalized (say between -1 and 1) sum is less than 0 then the vertex is mostly surrounded by air: we are on a bump. For values greater than 0, we are in a pit. So we can basically say that λ is the normalized sum of the surrounding densities. We don't actually need to compute this value; in fact we already have it in the lower LOD density map.
This works pretty well but it doesn't account for the density at the vertex we are moving. We have to compute the difference between the density in the high and low LOD maps to know whether the vertex is likely to have swollen or shrunk. This produces a much more accurate result. You may think that we could run several iterations of this scheme, each time using the modified vertex position computed in the previous iteration. Actually, from what I've tested, it doesn't improve the final result much, and it may even degrade it sometimes; I don't think it's worth the additional texture fetches.
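To make the displacement concrete, here is a minimal Python sketch of final = position + λ * normal. It rests on assumptions that are not spelled out in the post: λ is taken to be the low LOD density minus the high LOD density at the vertex, multiplied by a hypothetical tuning factor `scale`, and densities are assumed negative in the air and positive underground.

```python
def displace_vertex(position, normal, high_density, low_density, scale=1.0):
    """Move a high LOD vertex along its normal toward the simplified surface.

    high_density: density sampled at the vertex in the high LOD map.
    low_density:  density sampled at the same point in the low LOD map.
    scale:        hypothetical tuning factor (an assumption of this sketch).
    """
    # On a bump the low LOD map sees mostly air, so the difference is
    # negative and the vertex is pulled inward; in a pit it is pushed out.
    lam = scale * (low_density - high_density)
    return tuple(p + lam * n for p, n in zip(position, normal))


if __name__ == "__main__":
    # A vertex on the surface (density 0) sitting on a bump.
    print(displace_vertex((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), 0.0, -0.5))
```

In the real thing both densities would come from texture fetches in the vertex shader; here they are plain numbers to keep the sketch runnable.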
Last but not least, it is very important to blend the vertex normals between LODs. Computing the normal from the density map is very easy: given a point p, we compute the difference between the neighboring values of p in the density map along the X axis in order to get the rate of change along this axis. We do the same for each of the other axes:
gradient.x = density_map (p + [1, 0, 0]) - density_map (p + [-1, 0, 0])
gradient.y = density_map (p + [0, 1, 0]) - density_map (p + [0, -1, 0])
gradient.z = density_map (p + [0, 0, 1]) - density_map (p + [0, 0, -1])
normal = normalize (gradient)
Using the displaced vertex, we compute the normal in the lower LOD density map and blend it with the high quality normal as we get closer to the intersection. The screenshot on the left shows the difference.
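The gradient formula and the normal blending can be tried out with a small Python sketch. The density map is represented here by a plain function `density(x, y, z)`, and the blend factor `t` (0 far from the intersection, 1 at the intersection) is an assumed parameter; how it is derived from the distance to the LOD boundary is not covered.

```python
import math


def normalize(v):
    """Return v scaled to unit length (or a zero vector if v is zero)."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(c / length for c in v)


def gradient_normal(density, p):
    """Central differences of the density around p, then normalization."""
    x, y, z = p
    gradient = (
        density(x + 1, y, z) - density(x - 1, y, z),
        density(x, y + 1, z) - density(x, y - 1, z),
        density(x, y, z + 1) - density(x, y, z - 1),
    )
    return normalize(gradient)


def blend_normals(high_normal, low_normal, t):
    """Linear blend from the high LOD normal (t=0) to the low LOD one (t=1)."""
    mixed = tuple((1.0 - t) * h + t * l for h, l in zip(high_normal, low_normal))
    return normalize(mixed)


if __name__ == "__main__":
    n_high = gradient_normal(lambda x, y, z: y, (0, 0, 0))
    n_low = gradient_normal(lambda x, y, z: x + y, (0, 0, 0))
    print(blend_normals(n_high, n_low, 0.5))
```

Renormalizing after the linear blend matters, since the mix of two unit vectors is generally shorter than 1.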
This technique is only an approximation though, and I've spent a lot of time tweaking it to improve the overall result. It runs at render time, that is, it isn't part of the geometry generation process. As a result, generating the geometry (when editing the terrain or moving around) doesn't take much longer. The rendering, however, has to run the algorithm every frame for every vertex in the terrain. Fortunately, it is a fairly lightweight vertex shader (up to 8 texture fetches only) with almost no branching.
Vertex position and normal are especially important in the context of volumetric terrain rendering, because they're used to compute the texture coordinates. Even a slight disturbance in the normals or the vertex position has a major impact on the texture projection, leading to quite visible artifacts. I haven't tried texturing yet; that will be part of another post. Let's just hope that the current seamless LOD system will be good enough to minimize the rendering glitches (which I predict will be most visible with normal mapping).
As you might already have noticed, I gave the website a little love today. Well, this has actually been waiting for months, but it's finally there after a week of work: the website got a whole redesign!
That's all I wanted to say, since there's no need to mention that the skin changed a lot or that the homepage now shows the latest news, latest commits, forum posts and more. Maybe I could tell you about the few core improvements, but let's be honest: it's still nothing attractive, and nobody cares.
So, voila, that's it. Enjoy your tour!
Hi there,
For the last three weeks (20 days precisely) I've been working on a terrain system using voxels. Only the basics have been implemented so far, yet I think most of the dirty work is done anyway. But let's talk about the technique itself.

Voxel is quite a generic buzzword which usually means that you're representing the 3D world using a regular grid. It doesn't, however, say anything about the way you exploit this grid. There are many things that can be done with voxels, from ray-marching (for advanced volume rendering) to straightforward rendering of a bunch of cubes (like the well-known game Minecraft does). The grid usually represents a volume, each cell telling us something about how the world is shaped or defined at the position of the cell. In my case I store the density of a terrain, i.e. whether a point is inside or outside the volume of the terrain. The first screenshot shows such a grid, the second is a close-up view.

It gets interesting when it comes to making a nice-looking terrain surface out of this rough grid. Actually there's a quite well-known technique to achieve this: marching cubes. It's an old technique that even has a patent on it (now expired afaik). The principle is simple: it consists of placing triangles into a cube depending on whether the corners are inside or outside the volume, which leads to 256 triangulation cases for a cube (8 corners, 2^8 = 256). I'll let you search the web for the details of the technique. From there, constructing the iso-surface of our terrain is very easy: we simply walk (yes, we're not into Mordor... yet.) through the cells of the grid and generate the appropriate triangulation for each of them. Unfortunately, the straightforward method isn't suitable for real-time applications. Roughly, when the geometry shader outputs a lot of vertices it becomes a huge bottleneck. Moreover, vertices shared by adjacent cells are generated twice. nVidia has published a nice article about this problem and how to solve it pretty efficiently: GPU Gems 3's 
I won't go deep into the technique, but the idea is to generate each individual vertex only once and to use an index array, which is also generated. The tricky part is the index array generation; you may guess that you just have to march through the grid as in the first method and generate indices instead of vertices, but one does not simply walk into a regular grid to generate indices. The first problem is: how do you know the index of the vertices when marching the grid to generate the index array? You'll have to do a first pass to store them into a huge 3D texture at wisely chosen positions. The second pass walks through the non-empty cells (i.e. those that contain vertices) and fetches the 3D texture to know the index of the vertices it is marching on. Then output the indices and blahblah, it's done; kid's stuff.
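As a rough CPU-side illustration of these ideas (not the GPU implementation, and not the GPU Gems 3 code), the Python sketch below computes the 8-bit marching cubes case of a cell and mimics the two passes with a dictionary standing in for the 3D texture. The corner ordering, the "density > 0 means inside" convention and the `(x, y, z, axis)` edge key are all assumptions made for the example.

```python
def case_index(corner_densities):
    """Pack the inside/outside state of 8 corners into a value in 0..255.

    Assumes density > 0 means "inside the terrain" and that the corner
    order matches whatever triangulation table would be used.
    """
    index = 0
    for bit, density in enumerate(corner_densities):
        if density > 0:
            index |= 1 << bit
    return index


def is_non_empty(index):
    """Cells entirely inside (255) or entirely outside (0) have no triangles."""
    return index not in (0, 255)


def build_vertex_indices(edges_with_vertex):
    """Pass 1: give every grid edge carrying a vertex a unique index.

    The dictionary plays the role of the 3D texture; an edge is keyed
    by the hypothetical tuple (x, y, z, axis).
    """
    return {edge: i for i, edge in enumerate(edges_with_vertex)}


def build_index_array(triangles, vertex_indices):
    """Pass 2: each triangle names 3 edges; look their indices up."""
    return [vertex_indices[edge] for tri in triangles for edge in tri]


if __name__ == "__main__":
    edges = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2)]
    lookup = build_vertex_indices(edges)
    print(case_index([1, -1, -1, -1, -1, -1, -1, -1]))
    print(build_index_array([tuple(edges)], lookup))
```

Two neighboring cells that share an edge fetch the same entry, so the shared vertex is stored once and referenced twice.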
I believe, however, that you are experienced enough to guess that it's not the kind of technique you manage to get working without a few surprise bugs! Funny thing: the more stupid and obvious they are, the harder they are to find. Moreover, the visual output is rarely meaningful in bugged GL apps *but* you struggle hard to believe it is. I honestly wonder if I'm the only one to be that stupid, but sometimes I only see what I wanna see; it's like searching the ground to find that plane in the sky.
Rage self-analysis paragraph behind us, let's talk about my plans for the next few weeks. Even if you're not much into 3D rendering you might have noticed that there is a huge problem in this sweet terrain system: the regular grid has to be gigantic to describe even a small 1 km^2 area. You can of course choose 1 cell = 5 m to manage large terrains, but needless to say you'll end up with very poor details; a trade-off is not a solution in itself. What I thought of instead is a kind of level of detail system (how surprising) where the terrain is kept in CPU memory or even on the HDD and progressive levels are uploaded as needed (i.e. when the player moves through the terrain). On the GPU side I see it much like a geometry clipmap, but in 3D instead of 2D. I believe the tough part will be the seamless LOD transition, but I have some ideas about it.
PS: the title states that I'm using marching tetrahedra (which is true) but I completely forgot to mention them! I'll do it in the next post, this one is big enough.
Hi!
I've been working on a deferred renderer for the SCEngine these past months, and here are the first results! For those who don't know what a deferred renderer is, I recommend these nice articles:

http://http.developer.nvidia.com/GPUGems3/gpugems3_ch19.html
http://www.gamerendering.com/2008/11/01/deferred-lightning/
In short, it consists of rendering the scene geometry into a G-buffer (for geometry buffer). The G-buffer is made of textures that allow the visible geometry of the scene to be reconstructed: one texture holds the pixels' normals, another the pixels' depth and one the pixels' color from texturing (this one isn't shown in my screenshot because there is no texturing). Then, to shade the scene with our lights, we just have to apply the following algorithm:
setBlending (additive);
foreach (light in scene.lights) {
    use (gbuffer);
    use (light.shader);
    renderFullScreenQuad ();
}
Though specific to each type of light (spot lights, point lights, ...), the shader basically consists of fetching the G-buffer to reconstruct the 3D world position of the pixel being rendered, and then applying some lighting to it using e.g. Phong shading and the information from the light being rendered. The 3D reconstruction consists of a simple unprojection, using the pixel's coordinates and depth. The additive blending makes the contributions of the different lights add up properly.
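The effect of the additive blending can be mimicked for a single pixel in Python. This is a simplified sketch under stated assumptions, not the SCEngine shaders: the pixel's position is assumed to be already reconstructed from depth, lights are hypothetical `(position, color)` point lights, and a plain Lambert diffuse term is used in place of Phong shading to keep it short.

```python
import math


def shade_pixel(position, normal, albedo, lights):
    """Accumulate the diffuse contribution of each light for one pixel.

    position, normal, albedo: values read back from the G-buffer.
    lights: list of (light_position, light_color) tuples (assumed layout).
    """
    color = [0.0, 0.0, 0.0]
    for light_position, light_color in lights:
        to_light = [l - p for l, p in zip(light_position, position)]
        length = math.sqrt(sum(c * c for c in to_light))
        if length == 0.0:
            continue  # light sits exactly on the pixel, skip it
        direction = [c / length for c in to_light]
        # Lambert term: cosine of the angle between normal and light.
        diffuse = max(0.0, sum(n * d for n, d in zip(normal, direction)))
        for i in range(3):
            # One additive "pass" per light, like additive blending.
            color[i] += albedo[i] * light_color[i] * diffuse
    return tuple(color)


if __name__ == "__main__":
    lights = [((0.0, 2.0, 0.0), (1.0, 0.0, 0.0)),
              ((0.0, 3.0, 0.0), (0.0, 0.0, 1.0))]
    print(shade_pixel((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0), lights))
```

Each loop iteration corresponds to one full-screen quad in the pseudocode above: the result of a light is added on top of whatever the previous lights wrote.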
With the SCEngine you may now choose between the basic forward renderer and the deferred one. I also implemented shadow mapping for each type of light. I'm doing a simple cubemap rendering for point lights and I use Cascaded Shadow Maps for "sun-like" lights that light the whole scene. Once again, if you're interested in knowing what this technique is, I suggest you take a look at these:
http://http.developer.nvidia.com/GPUGems3/gpugems3_ch10.html
http://msdn.microsoft.com/en-us/library/windows/desktop/ee416307%28v=vs.85%29.aspx
The deferred renderer is still being developed and lacks optimizations:
There are also a lot of features that wait to be implemented:
Hope you enjoyed the post!
Today (August 8), Khronos released the OpenGL 4.2 specification. It comes with a few new extensions, but most of them were chosen so that they are compatible with older hardware (GL2 and/or GL3). So there isn't really any new killer feature. nVidia has already released an OpenGL 4.2 capable driver, and a short changelog of the added extensions is available here:
http://developer.nvidia.com/opengl-driver
The extension that caught my attention the most is GL_ARB_texture_storage. According to the specs, current GL textures are defined in such a way that they force implementations to perform useless checks at draw time. Basically, any part of any mipmap level can be specified in any format at any moment, etc. This extension allows texture data to be uploaded with the usual routines, but with the guarantee that the texture properties won't change afterwards. This suggests that current textures are "slow"; I doubt the performance gain will be noticeable, though I'm not experienced at all with driver implementation details.
Yesterday I worked on implementing geometry shaders in SCEngine. It was quite easy, but I had to deal with the shader code of the engine. What a mess. Let your code rot for 3 years and when you come back you get to explore a jungle. The geometry shaders thus came with a little update of the API; SCE_Shader_Load() now only accepts a single source file, which must contain the code of all the shaders. Apart from this little feature loss, resulting in a smaller SLOC than before, two new functions have been added: Shader_InputPrimitive() and Shader_OutputPrimitive(). Though deprecated in the 3.2 and newer GL specs, these let you specify the input and output primitive types for the geometry shader, respectively. In modern GL this is now specified directly in the shader source.

I've added a new demo that shows a very basic use of geometry shaders with the SCEngine. You may edit the shader source and modify the value of the MODE macro in the geometry shader to change its effect. There is one problem I've encountered during the development of this demo so far: backface culling. I was seeing triangles being culled even though they were facing the camera. The point is, when backface culling is performed, each face's normal is computed on the fly by taking the cross product of two edges of the face. These edges depend on the order of the face's vertices, so you must control that order in the geometry shader. Also don't forget to increase the maximum number of output vertices if needed; I had this little "problem" along with the culling one, and mixed issues always give you the best headache.
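The dependence on vertex order is easy to check with a small Python sketch. It works on 2D screen-space positions and assumes OpenGL's default convention, where counter-clockwise triangles are front-facing; the function names are made up for the example.

```python
def cross_z(a, b, c):
    """Z component of the cross product of edges (b - a) and (c - a)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def is_front_facing(a, b, c):
    """True for counter-clockwise winding (the default GL front face)."""
    return cross_z(a, b, c) > 0


if __name__ == "__main__":
    a, b, c = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
    print(is_front_facing(a, b, c))  # counter-clockwise order
    print(is_front_facing(a, c, b))  # same triangle, two vertices swapped
```

Swapping any two vertices flips the sign of the cross product, which is exactly how a geometry shader emitting vertices in the wrong order gets visible triangles culled.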
I'm looking forward to doing some work on the engine this summer, and hope to implement some cool features.
Hi there!
Since the Khronos folks are good at what they do, they often release a little new version of OpenGL, and then we get to geek out over it, which we quite enjoy.
I don't feel like summarizing what's new in this version, first because I haven't looked into it in much detail yet, and from what I've seen there is nothing extraordinary, but also because the specifications, and in particular the appendices covering the additions, are there for that.
What caught my attention the most is the ARB_separate_shader_objects extension. Classic OpenGL shaders work as follows. There are different types of shaders (vertex, pixel, geometry and so on), and to make them work you first have to combine them into what OpenGL calls a "program". Once they are combined, it is impossible to change only the vertex shader, or only the pixel or geometry shader, without rebuilding the whole program, which can be costly. With DirectX, as everybody knows, each type of shader can be specified independently, the API taking care of the rest on its side. The advantage is that you don't need to build as many programs as there are combinations of different shaders. The drawback is that you potentially don't get the same performance; changing one type of shader on the fly isn't great, in theory.
Anyway, this extension fills that gap! The least we can say is that people who enjoy writing 3D engines that use both OpenGL and DirectX are going to be happy. It would be interesting to benchmark this extension against the usual OpenGL shaders.
Good night!
PS: manual pages for OpenGL 4.1 are available online: http://www.opengl.org/sdk/docs/man4/
Edit: an article on g-truc.net that discusses the extension and a good implementation of it on nVidia and ATI: http://www.g-truc.net/post-0348.html#menu.