Year of Prototypes: Topple

2022 Prototypes - This article is part of a series.

Part 1: Year of Prototypes: GPU Rigid Body Simulation

Part 2: Year of Prototypes: Quest 2 GPU Physics

Part 3: This Article

Towards the latter half of 2022, two things were clear: I didn’t like working in off-the-shelf engines, and I didn’t have the skills to ship a game without one. So I decided to get good.

That’s an ongoing journey, but Topple was an important step on it. For a while I’d been looking at the Odin programming language as a sort of friendlier C. It ships with all the graphics APIs and C libs I tend to use (like stb), and has some things I love about scientific langs, like array programming and good SIMD intrinsics. Perhaps just as importantly, it lacks the things that I don’t like about C++.

Learning Odin
#

Building real things is the best way to learn, so I ported over some particle XPBD code to Odin to run on the CPU, wrote a basic OpenGL render loop and had this working in under a day:

I think it’s a testament to Bill’s great language design that I was able to go from reading the docs to a working rendered real-time physics sim so quickly. Since I knew I wanted to properly understand VR programming, I found some Odin bindings for OpenXR, read the spec, and in 2 more days had my little particle demo working in VR!

This felt great, and was a huge confidence boost. In many ways, directly using the OpenXR C API felt easier than the higher level VR interfaces in Unity. At this point I’d already made the decision to pursue a PhD, but the idea of getting good at engine-level programming was pretty exciting. I decided to give the XPBD rigid body solver another go.

Solving the Solver
#

You may recall that the bouncy stacks from my GPU Rigid Bodies article were due to using a Jacobi solver instead of a Gauss-Seidel one. That was largely a result of putting the solver on the GPU. But now that I had a low-level CPU language at my disposal, the possibility of writing a CPU solver became more and more interesting to me.
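To make the Jacobi versus Gauss-Seidel distinction concrete, here is a minimal sketch on a toy 1D chain of distance constraints. It is not the solver from either article: the names `gauss_seidel_sweep`, `jacobi_sweep`, `N` and `REST` are made up for illustration, and the Jacobi variant assumes the common trick of averaging the corrections that land on each particle.

```odin
package main

import "core:fmt"

N    :: 4   // particles in a 1D chain; particle 0 is pinned in place
REST :: 1.0 // rest distance between neighbours

// Gauss-Seidel: constraints are solved one after another, and each one sees
// the positions already corrected by the constraints before it.
gauss_seidel_sweep :: proc(x: ^[N]f64) {
	for i in 0..<N-1 {
		err := (x[i+1] - x[i]) - REST
		if i == 0 {
			x[i+1] -= err // pinned neighbour: only the free particle moves
		} else {
			x[i]   += 0.5 * err
			x[i+1] -= 0.5 * err
		}
	}
}

// Jacobi: every correction is computed from the same old positions, then
// averaged per particle and applied at the end. Easy to run in parallel.
jacobi_sweep :: proc(x: ^[N]f64) {
	delta, count: [N]f64
	for i in 0..<N-1 {
		err := (x[i+1] - x[i]) - REST
		if i == 0 {
			delta[i+1] -= err
		} else {
			delta[i]   += 0.5 * err
			delta[i+1] -= 0.5 * err
			count[i]   += 1
		}
		count[i+1] += 1
	}
	for i in 1..<N {
		x[i] += delta[i] / count[i]
	}
}

main :: proc() {
	gs := [N]f64{0, 2, 4, 6} // stretched chain, should relax to 0, 1, 2, 3
	ja := gs
	for _ in 0..<10 {
		gauss_seidel_sweep(&gs)
		jacobi_sweep(&ja)
	}
	fmt.println("gauss-seidel:", gs)
	fmt.println("jacobi:      ", ja)
}
```

The only structural difference is when a correction becomes visible to the next constraint. The Jacobi form has no ordering dependency, which is why it maps so well to the GPU. The sequential form passes information down the chain within a single sweep and typically needs fewer sweeps to settle.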

To summarise: here was the main “conceit” of writing my own solver.

With XPBD, solver iterations are replaced with substeps, so your simulation can run at very high frame rates (1000s of FPS).
So even if the simulation gets too big, in VR it can just run in slow-mo and still hit your render frame rate target.
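The second point can be sketched as a frame loop with a fixed substep budget. This is an illustration under assumptions, not the Topple code: `step_frame`, `SUBSTEP_DT` and `MAX_SUBSTEPS` are hypothetical names and values, and the loop body is left as a comment where a real substep would go.

```odin
package main

import "core:fmt"

SUBSTEP_DT   :: 1.0 / 2000.0 // assumed substep length in seconds
MAX_SUBSTEPS :: 16           // assumed per-frame budget the CPU can afford

// Returns how much simulated time was advanced for one rendered frame.
// If the frame would need more substeps than the budget allows, fewer are
// run, so the simulation falls behind real time: slow motion.
step_frame :: proc(frame_dt: f64) -> f64 {
	wanted := int(frame_dt / SUBSTEP_DT + 0.5)
	count  := min(wanted, MAX_SUBSTEPS)
	for _ in 0..<count {
		// One substep would go here: integrate, then solve each contact once.
	}
	return f64(count) * SUBSTEP_DT
}

main :: proc() {
	frame_dt  := 1.0 / 90.0 // one frame on a 90 Hz headset
	simulated := step_frame(frame_dt)
	fmt.printf("time scale: %.2f\n", simulated / frame_dt)
}
```

With these made-up numbers, a 90 Hz frame wants 22 substeps but only gets 16, so the world runs at roughly 0.72x speed while rendering stays on target. Nothing in the solver itself has to change for that to happen.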

Getting a slow motion effect for free and also gracefully degrading for bigger sims was and still is a pretty compelling feature of these solvers. So I began porting my GPU code over, just replacing the batched update with a sequential one. Pretty quickly I had this:

It already felt so much more real than the Jacobi solver. Then I optimised what I had using Odin’s SIMD intrinsics. I’ve written a decent amount of “vectorised” and “SPMD” code on the CPU before using OpenCL and ISPC, but this was my first time doing direct SIMD programming. Two key features in Odin made that a delight. The first is the #soa directive.

I can take a struct like this:

Rigid_Body :: struct {
	// Static quantities
	kind:                  Body_Kind,
	inv_mass:              f64,
	inv_inertia:           #simd[4]f64,
	extents:               #simd[4]f64,

	// Dynamic quantities
	position:              #simd[4]f64,
	rotation:              #simd[4]f64,
	velocity:              #simd[4]f64,
	angular_velocity:      #simd[4]f64,

	// etc.
}

And make an “array” of them which is actually a struct of arrays of each field, simply by writing #soa[]Rigid_Body. Awesome! Second, those #simd[4]f64 types support all the builtin arithmetic operations directly. So instead of messing with _mm256_ AVX intrinsic soup, I can just write:

rb.position = rb.position + (rb.velocity * dt)

This automatically handles both the SIMD instructions and the field access into the struct of arrays. No matter how much junk is on that Rigid_Body struct, the line above will only pull the memory for position and velocity into cache. Even with the redundant fourth lane on the 3D vectors, this was still a huge speedup. So lovely.
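For anyone who hasn't used #soa before, here is a small self-contained sketch of the layout idea. It is deliberately simplified and makes a few assumptions: `Body`, `COUNT`, `bodies` and `integrate` are hypothetical names, a fixed-size `#soa[COUNT]Body` is used instead of a slice, and plain `[4]f64` arrays stand in for the `#simd[4]f64` fields so the example relies only on Odin's ordinary array programming.

```odin
package main

import "core:fmt"

COUNT :: 3

// Hypothetical, trimmed-down body for illustration. Plain [4]f64 arrays
// stand in for the article's #simd[4]f64 fields.
Body :: struct {
	inv_mass: f64,
	position: [4]f64,
	velocity: [4]f64,
}

// Stored as three separate arrays (inv_mass, position, velocity) rather than
// one array of whole Body structs.
bodies: #soa[COUNT]Body

// Reads and writes only the position and velocity arrays.
integrate :: proc(dt: f64) {
	for i in 0..<COUNT {
		bodies[i].position = bodies[i].position + bodies[i].velocity * dt
	}
}

main :: proc() {
	for i in 0..<COUNT {
		bodies[i].velocity = {0, f64(i), 0, 0}
	}
	integrate(0.5)
	fmt.println(bodies[2].position) // element-style access
	fmt.println(bodies.position[2]) // field-style access to the same data
}
```

The two print statements read the same memory. The first spelling looks like an ordinary array of structs, while the second exposes the per-field arrays that actually sit in memory.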

As a side note, the reason those are f64 and not f32 is that once I got past 2000 FPS, the per-step contribution from gravity became so small that f32 rounded it away to zero. Whoops!
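One plausible way this shows up (an illustration, not a record of the exact failure in Topple) is in the position update. At 2000 steps per second, one step of gravity moves a body by about g * dt * dt, roughly 2.5 micrometres. That is less than half the spacing between neighbouring f32 values around 64 m, so the addition changes nothing. The names `GRAVITY`, `gravity_lost_f32` and `gravity_lost_f64` below are hypothetical.

```odin
package main

import "core:fmt"

GRAVITY :: 9.81

// True if adding one substep's worth of gravity displacement (g * dt * dt)
// leaves an f32 position unchanged.
gravity_lost_f32 :: proc(position, dt: f32) -> bool {
	step := f32(GRAVITY) * dt * dt
	return position + step == position
}

// The same check in f64.
gravity_lost_f64 :: proc(position, dt: f64) -> bool {
	step := f64(GRAVITY) * dt * dt
	return position + step == position
}

main :: proc() {
	fmt.println("f32 loses the step:", gravity_lost_f32(64.0, 1.0 / 2000.0))
	fmt.println("f64 loses the step:", gravity_lost_f64(64.0, 1.0 / 2000.0))
}
```

The f32 version reports that the step is lost and the f64 version does not. Closer to the origin f32 has finer spacing and the step survives, which is part of why this kind of bug only appears once the frame rate or the scene gets big enough.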

With that done, I was getting to quite satisfying scale even on a single core. Still everything felt pretty firm!

Stable Stacks
#

But still it was falling over. Without randomising the contact solve order (something not mentioned in the paper) it barely worked at all. No matter how small I made the time step, I was still using a local solver for a very long chain of constraints. As I read more and more into the literature on offline-quality rigid body simulation, it became clear that my game idea, motivated by a particular experience I’d wanted to build, was also an unsolved problem in the literature.

The sort of direct solvers you’d want to use to handle these scenarios (the kind used in the offline Bullet simulations I was trying to recreate) do not run in real time at the scale I wanted.

But that doesn’t mean we can’t enjoy a bit of destruction in VR.

This is exactly the “graceful slow motion” I was talking about before. I really appreciate the fidelity of interaction the high simulation frame rate gets you.

Moving Forward
#

By the time I’d gotten Topple to this point, it was already about time to start my PhD. So though it didn’t work how I’d wanted, I was glad there was a good reason.

Getting a more “traditional” rigid body solver built is still on my bucket list, and there’s plenty more possible in real-time than I achieved here. Something for another time!

For now, I’ll leave you with a final video of Topple.
