Rigged Meshes
About Last Week
Last week's driving animation (steering and spinning) was very hacky. Instead of handling rigged meshes correctly, I created highly specific shaders for my wheels, and that approach would never scale to any other type of rigged/skinned/animated mesh, which is kind of sad. It looked something like:
// c++
if (isWheelMesh(mesh))
{
useWheelShader();
sendValue("steering", steering);
render(mesh);
}
else
{
// normal mesh rendering
}
// gpu's wheel shader
moveVertex(steering);
In addition, this hack prevented me from using a powerful performance optimization: the depth pre-pass.
When rendering meshes, each triangle is rasterized (meaning the GPU finds the fragments, i.e. the pixel and depth, that this triangle covers), then a shader can be run for each fragment to decide how the triangle looks on the affected pixel. If multiple triangles land on the same pixel, the depth of each fragment (how far it is from the camera) is used to decide whether the new triangle's fragment should replace the currently drawn one. If the initial fragment was costly to draw (e.g. expensive lighting computations) but is replaced later by a fragment from a closer triangle, then the GPU wasted time. Luckily, if the later fragment is further away, its shader is not executed at all.
The idea of a depth pre-pass is to perform a first rendering pass where we don't even care about the fragments' colors: only about storing the closest fragment's depth for every pixel. Then we can run the expensive shaders knowing that only one of them (the one that matches the depth stored by the pre-pass) will be executed per screen pixel.
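To make the savings concrete, here is a tiny CPU-side simulation (plain C++, not real GPU code; the `Fragment` struct and the two counting functions are purely illustrative) of how many expensive shader invocations a single pixel triggers with and without a pre-pass:

```cpp
#include <algorithm>
#include <vector>

// Simulated fragments competing for the same pixel, in submission order.
// depth: smaller = closer to the camera.
struct Fragment { float depth; };

// Naive pass: the expensive fragment shader runs every time the incoming
// fragment is closer than what is currently stored in the depth buffer.
int shadeCountNaive(const std::vector<Fragment>& frags)
{
    float depthBuffer = 1.0f; // farthest possible depth
    int shaded = 0;
    for (const Fragment& f : frags)
    {
        if (f.depth < depthBuffer)
        {
            depthBuffer = f.depth;
            ++shaded; // expensive shading, possibly overwritten later
        }
    }
    return shaded;
}

// With a depth pre-pass: a cheap first pass records the closest depth,
// then the expensive shader only runs for the fragment matching it.
int shadeCountWithPrePass(const std::vector<Fragment>& frags)
{
    float closest = 1.0f;
    for (const Fragment& f : frags)  // pass 1: depth only, cheap
        closest = std::min(closest, f.depth);
    int shaded = 0;
    for (const Fragment& f : frags)  // pass 2: expensive shading
    {
        if (f.depth == closest)
        {
            ++shaded;                // only the winning fragment is shaded
            break;
        }
    }
    return shaded;
}
```

With triangles arriving back-to-front (the worst case), the naive pass shades every fragment while the pre-pass version shades only the closest one.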
By moving my wheels with an ad hoc shader, I would have needed an equally specific depth pre-pass shader for the wheels. This was annoying to do and would have been a waste of time, since I knew I wanted to implement a proper rigged mesh rendering pipeline.
Addendum: How Rendering Works
I feel like what comes next may get some readers lost, so here is a quick recap of how mesh rendering works:
First, we must upload textures, meshes (this means all vertices and their properties such as position in mesh space, normals, UV coordinates, etc. as well as data to describe how they form triangles), and code (called shader or program) to the VRAM.
Then every frame and for every mesh, CPU does:
- tell the GPU which program to use
- send the GPU the uniform variables to use with that program; think of these as constants that remain fixed through the entire rendering of a single mesh (e.g. the mesh's world transform, texture ids, etc.)
- tell the GPU to render all triangles of a given mesh id
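The three steps above can be sketched as a small command recorder. The names below (`useProgram`, `setUniform`, `drawMesh`) are hypothetical stand-ins for a real graphics API, just to show the per-mesh ordering:

```cpp
#include <string>
#include <vector>

// Hypothetical command recorder standing in for a real graphics API:
// it just logs the calls so we can see the order they happen in.
struct RenderQueue
{
    std::vector<std::string> commands;

    void useProgram(int programId)
    { commands.push_back("useProgram " + std::to_string(programId)); }

    void setUniform(const std::string& name)
    { commands.push_back("setUniform " + name); }

    void drawMesh(int meshId)
    { commands.push_back("drawMesh " + std::to_string(meshId)); }
};

// One mesh's worth of the per-frame CPU work described above.
void renderMesh(RenderQueue& q, int programId, int meshId)
{
    q.useProgram(programId);      // 1. which program to use
    q.setUniform("uMeshToWorld"); // 2. per-mesh uniform constants
    q.drawMesh(meshId);           // 3. kick off all of the mesh's triangles
}
```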
A standard GPU program is composed of two main shaders:
- the vertex shader, responsible for telling the GPU where each of the mesh's vertices ends up on screen (for that we provide uniforms that represent the camera's position, where it aims, its FOV, etc.): if a vertex should be in front of the camera, its (x, y, z) coordinates will end up within [-1.0, 1.0]
- the fragment shader, responsible for telling what color a triangle's fragment should have when displayed on screen
The GPU is then responsible for running the rasterizer on the results of the vertex shader, and then executing the fragment shader (among other things...).
In particular, the code of a vertex shader defines key information such as the structure of each vertex (what data it contains; ideally as little as possible per vertex to avoid wasting space, which is why static meshes usually use different vertex shaders than rigged meshes: they don't need bone information per vertex). This means that shader programs (the combo of vertex + fragment shaders) must be different for static and rigged meshes. However, the code of a fragment shader can often be shared between static mesh and rigged mesh programs.
Rigged Mesh Rendering
The first step to implement rigged mesh rendering was to slightly rework how my code is organized and how mesh data is stored and passed to the GPU. I used the classic boneIndices / boneWeights structure for rigged vertices, giving a vertex structure as follows:
vec3 position;
vec3 normal;
vec2 UV;
vec3 tangent;
ivec4 boneIndices; // (rigged vertices only) the indices of up to 4 bones that affect this vertex
vec4 boneWeights; // (rigged vertices only) how much each of the 4 bones affects this vertex
I also wrote a simple GLSL (a GPU language, very similar to C) pre-processor to allow include directives in my GPU code (otherwise it all has to be loaded as one big string), so that I could easily share most of the shading code (especially the fragment shader, but also parts of the vertex shader) between my static mesh programs and my rigged mesh programs.
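Such a pre-processor is mostly string splicing. Here is a minimal sketch (not my actual implementation; it assumes sources live in an in-memory map, where a real one would read files and guard against circular includes):

```cpp
#include <map>
#include <sstream>
#include <string>

// Minimal include-resolving GLSL pre-processor sketch: every line of the
// form  #include "name"  is replaced by the (recursively pre-processed)
// source registered under that name.
std::string preprocess(const std::string& name,
                       const std::map<std::string, std::string>& sources)
{
    std::istringstream in(sources.at(name));
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line))
    {
        const std::string directive = "#include \"";
        if (line.rfind(directive, 0) == 0) // line starts with the directive
        {
            // extract the name between the quotes and splice the source in
            std::string included = line.substr(directive.size());
            included = included.substr(0, included.find('"'));
            out << preprocess(included, sources);
        }
        else
        {
            out << line << '\n';
        }
    }
    return out.str();
}
```

With this, shared lighting code can live in one snippet that both the static and rigged fragment shaders include.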
On top of needing this extra per-vertex information, rigged mesh shaders must be provided a uniform array of matrices that represent how each bone moved compared to the bind pose (the reference pose the bones had when the mesh was exported). Then, in the vertex shader, the bone transforms can be used to compute where each vertex should end up:
uniform mat4 uWorldToScreen;
uniform mat4 uMeshToWorld;
uniform mat4 uBones[100];
// this is called for each of the mesh's vertices, expressed in the mesh's local space
vec4 getVertexPositionScreen(vec3 positionLocal, ivec4 boneIndices, vec4 boneWeights)
{
mat4 skin = uBones[boneIndices.x] * boneWeights.x
          + uBones[boneIndices.y] * boneWeights.y
          + uBones[boneIndices.z] * boneWeights.z
          + uBones[boneIndices.w] * boneWeights.w;
return uWorldToScreen * uMeshToWorld * skin * vec4(positionLocal, 1.0);
}
Maybe you can see now how an animation system, running on the CPU, would update every bone's transform every frame to animate the mesh? In my case I don't have to write a full animation system for a car; a simpler procedural system (one that applies steering, suspension, and spin to the appropriate bones) is usually enough. That is what I did, and here are a few bloopers along the way (I think I heard some of you really enjoyed last week's bloopers? :D):
But anyway, sorry for the long and verbose post (I don't have much more than words to share this week as I could barely make any progress), here is the final result:
