In my latest project, I’ve explored WebGPU to create a ray tracer capable of rendering the Stanford dragon model (87,130 triangles) in real-time. This project is my first step into WebGPU, and I’m excited to share the result.

You can find the demo here.

Accelerating Ray Tracing with BVH

I have implemented a Bounding Volume Hierarchy (BVH) to accelerate ray tracing. To implement and build the BVH, I followed Sebastian Lague’s excellent tutorial (“Coding Adventure: Optimizing a Ray Tracer (by building a BVH)”) pretty closely.

BVHs work by dividing the 3D space containing a model into a tree structure of axis-aligned bounding boxes (AABBs). Each AABB encloses a portion of the model’s triangles. Starting with a root AABB encompassing the entire model, the algorithm recursively subdivides AABBs containing many triangles into smaller AABBs. This process continues until the leaf AABBs contain a very small number of triangles (or just one).

This hierarchical structure significantly reduces the number of triangle-intersection tests required during ray tracing. When a ray is cast into the scene, it first intersects with the root AABB. If an intersection is found, the algorithm traverses down the BVH tree, only testing for intersections with the AABBs and triangles within the intersected AABBs. This eliminates unnecessary intersection tests with triangles in distant or irrelevant parts of the scene, leading to a major performance improvement.

WebGPU Implementation

For the WebGPU implementation, I used a framework largely developed by Johan Holwerda. This framework greatly simplified the boilerplate code, allowing me to focus on the core ray tracing algorithm.

The inner loop of my shader now looks like this:

fn intersectWorld(tMin: f32, tMax: f32, ro: vec3<f32>, rd: vec3<f32>, intersection: ptr<function, Intersection>) -> f32 {
    var stack = array<u32, 32>();
    var dist = tMax;
    let invRd = -1. / rd;

    stack[0u] = 0u;
    var stackSize = 1u;

    while (stackSize > 0u) {
        stackSize = stackSize - 1u;

        var node = getNode(stack[stackSize]);

        if (intersectBox(node.min.xyz, node.max.xyz, ro, invRd) < dist) {
            if (node.index >= 0.) {
                let triangleEnd = u32(node.index + node.triangleCount);
                for (var i = u32(node.index); i < triangleEnd; i = i + 1u) {
                    let tri = getTriangle(i);
                    let d = intersectTriangle(tMin, dist, ro, rd, tri.v0, tri.v1, tri.v2, &(*intersection).normal);
                    if (d < dist) {
                        dist = d;
                    }
                }
            } else {
                let nodeIndex = u32(-node.index);
                let nodeA = getNode(nodeIndex);
                let dA = intersectBox(nodeA.min, nodeA.max, ro, invRd);
                let nodeB = getNode(nodeIndex + 1u);
                let dB = intersectBox(nodeB.min, nodeB.max, ro, invRd);

		let dMin = min(dA, dB);
		let dMax = max(dA, dB);

		if (dMax < dist) {
			stack[stackSize] = select(nodeIndex, nodeIndex + 1u, dA < dB);
			stackSize = stackSize + 1u;
		}
		if (dMin < dist) {
			stack[stackSize] = select(nodeIndex, nodeIndex + 1u, dA >= dB);
			stackSize = stackSize + 1u;
		}
            }
        }
    }
    return dist;
}Code language: Rust (rust)

A Note on WGSL

WGSL, WebGPU’s shading language, employs the keywords varlet, and const similarly to JavaScript, yet it confers different meanings.

In JavaScript, var has function scope, let allows block scoping for variables, and const ensures block-scoped immutability. Conversely, in WGSL, all variables have block scope: var is mutable, let represents an immutable value, and const is reserved for compile-time constants and is unusable at runtime.

This decision to mimic JavaScript’s keywords yet alter their functionality seems unnecessarily confusing and likely to frustrate developers without a clear rationale. Retaining JavaScript’s functionality for let and const and introducing a new keyword like def for compile-time constants could have minimized this confusion.

The RenderQueue

I have pushed this experiment to the RenderQueue. You can find it here.

Similar posts

If you like this post, you may also like one of my other posts:

WebGPU Ray Tracer: Accelerating the Stanford Dragon