Towards the latter half of 2022, two things were clear: I didn’t like working in off-the-shelf engines, and I didn’t have the skills to ship a game without one. So I decided to get good.
That’s an ongoing journey, but Topple was an important step on it. For a while I’d been looking at the Odin programming language as a sort of friendlier C. It ships with all the graphics APIs and C libs I tend to use (like stb), and has some things I love about scientific langs, like array programming and good SIMD intrinsics. Perhaps as importantly, it lacks the things that I don’t like about C++.
Learning Odin
Building real things is the best way to learn, so I ported over some particle XPBD code to Odin to run on the CPU, wrote a basic OpenGL render loop and had this working in under a day:
I think it’s a testament to Bill’s great language design that I was able to go from reading the docs to a working rendered real-time physics sim so quickly. Since I knew I wanted to properly understand VR programming, I found some Odin bindings for OpenXR, read the spec, and in 2 more days had my little particle demo working in VR!
This felt great, and was a huge confidence boost. In many ways, directly using the OpenXR C API felt easier than the higher level VR interfaces in Unity. At this point I’d already made the decision to pursue a PhD, but the idea of getting good at engine-level programming was pretty exciting. I decided to give the XPBD rigid body solver another go.
Solving the Solver
You may recall that the bouncy stacks from my GPU Rigid Bodies article were due to using a Jacobi solver instead of a Gauss-Seidel one. That was largely a result of putting the solver on the GPU. But now that I had a low-level CPU language at my disposal, the possibility of writing a CPU solver became more and more interesting to me.
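As a rough illustration of the difference between the two solver styles (not the solver from either article), the sketch below relaxes a hypothetical 1D chain of distance constraints. The names `N`, `REST`, `chain_error`, `gauss_seidel_pass` and `jacobi_pass` are made up for this example. The Gauss-Seidel pass lets each constraint see the corrections made by the ones before it. The Jacobi pass computes every correction from the same stale positions and averages them, which parallelises well but typically converges more slowly.

```odin
package main

import "core:fmt"

N    :: 5   // hypothetical number of particles
REST :: 1.0 // hypothetical rest distance between neighbours

// Sum of constraint violations along the chain.
chain_error :: proc(x: []f64) -> f64 {
	err := 0.0
	for i in 0..<len(x)-1 {
		err += abs(x[i+1] - x[i] - REST)
	}
	return err
}

// Gauss-Seidel: each correction is applied immediately, so the next
// constraint already sees the moved particles.
gauss_seidel_pass :: proc(x: []f64) {
	for i in 0..<len(x)-1 {
		c := x[i+1] - x[i] - REST
		x[i]   += 0.5 * c
		x[i+1] -= 0.5 * c
	}
}

// Jacobi: every correction is computed from the same old positions,
// then averaged per particle. The first loop has no ordering dependency.
jacobi_pass :: proc(x: []f64) {
	assert(len(x) == N)
	delta: [N]f64
	count: [N]f64
	for i in 0..<len(x)-1 {
		c := x[i+1] - x[i] - REST
		delta[i]   += 0.5 * c
		delta[i+1] -= 0.5 * c
		count[i]   += 1
		count[i+1] += 1
	}
	for i in 0..<len(x) {
		x[i] += delta[i] / count[i]
	}
}

main :: proc() {
	start := [N]f64{0, 0.5, 1.0, 1.5, 2.0} // chain compressed to half its rest length
	gs := start
	jc := start
	for _ in 0..<10 {
		gauss_seidel_pass(gs[:])
		jacobi_pass(jc[:])
	}
	fmt.println("Gauss-Seidel error:", chain_error(gs[:]))
	fmt.println("Jacobi error:", chain_error(jc[:]))
}
```

Comparing the two printed errors after the same number of passes gives a feel for why a sequential solve tends to produce firmer stacks.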
To summarise: here was the main “conceit” of writing my own solver.
- With XPBD, solver iterations are replaced with substeps, so your simulation can run at very high framerates (1000s of FPS)
- So even if the simulation gets too big, in VR it could just run in slow-mo and still hit your render frame target.
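The two points above can be sketched as a per-frame loop. This is a minimal sketch under stated assumptions, not Topple's actual code: `SUBSTEP_DT`, `substep`, `step_frame` and the `max_substeps` cap are all hypothetical. Each rendered frame runs as many fixed-size substeps as it can afford. If that is fewer than real time needs, simulated time simply advances more slowly.

```odin
package main

import "core:fmt"

SUBSTEP_DT :: 1.0 / 2000.0 // hypothetical fixed substep, in seconds

// Placeholder for one XPBD substep (integrate, solve, update velocities).
// Here it only advances the simulation clock.
substep :: proc(sim_time: ^f64) {
	sim_time^ += SUBSTEP_DT
}

// Advance the simulation for one rendered frame. `max_substeps` stands in
// for however many substeps the CPU can afford per frame.
// Returns the time scale: 1.0 is real time, below 1.0 is slow motion.
step_frame :: proc(sim_time: ^f64, frame_dt: f64, max_substeps: int) -> f64 {
	wanted := int(frame_dt / SUBSTEP_DT) // substeps needed to keep up with real time
	steps  := min(wanted, max_substeps)
	for _ in 0..<steps {
		substep(sim_time)
	}
	return f64(steps) / f64(wanted)
}

main :: proc() {
	sim_time := 0.0
	frame_dt := 1.0 / 90.0 // hypothetical 90 Hz headset

	// A small scene can afford every substep it wants.
	fmt.println("time scale:", step_frame(&sim_time, frame_dt, 100))
	// A big scene can only afford a few, so it runs in slow motion.
	fmt.println("time scale:", step_frame(&sim_time, frame_dt, 5))
}
```

The render loop never waits on the simulation in this scheme. The cost of an oversized scene shows up as a lower time scale rather than as dropped frames.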
Getting a slow motion effect for free and also gracefully degrading for bigger sims was and still is a pretty compelling feature of these solvers. So I began porting my GPU code over, just replacing the batched update with a sequential one. Pretty quickly I had this:
That already felt so much more real than the Jacobi solver. Next, I optimised what I had using Odin’s SIMD intrinsics. I’ve written a decent amount of “vectorised” and “SPMD” code on the CPU before using OpenCL and ISPC, but this was my first time doing direct SIMD programming. Two key features in Odin made that a delight. The first is the #soa directive.
I can take a struct like this:
    Rigid_Body :: struct {
        // Static quantities
        kind: Body_Kind,
        inv_mass: f64,
        inv_inertia: #simd[4]f64,
        extents: #simd[4]f64,
        // Dynamic quantities
        position: #simd[4]f64,
        rotation: #simd[4]f64,
        velocity: #simd[4]f64,
        angular_velocity: #simd[4]f64,
        // etc.
    }
And make an “array” of them which is actually a struct of arrays of each field, simply by writing #soa[]Rigid_Body. Awesome! Second, those #simd[4]f64 types support all the builtin arithmetic operations directly. So instead of messing with _mm256_ AVX intrinsic soup, I can just write:
    rb.position = rb.position + (rb.velocity * dt)
This automatically handles both the SIMD instructions and accessing the fields from a struct of arrays. No matter how much junk is on that Rigid_Body struct, the above method will only pull the memory for the position and velocity into cache. Even with the redundant fourth lane in each vector, this was still a huge speedup. So lovely.
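For readers who have not used #soa, here is a small self-contained sketch of the same pattern. It makes a few loud assumptions: `Body` is a trimmed-down, hypothetical version of the struct above, the array is fixed-size (`#soa[N]Body`) rather than a slice, and plain `[4]f64` arrays stand in for `#simd[4]f64`. Odin's fixed arrays support the same element-wise arithmetic, so the update line reads the same, but this sketch does not use the SIMD types themselves.

```odin
package main

import "core:fmt"

// Plain array stand-in for #simd[4]f64; the fourth lane is padding.
Vec4 :: [4]f64

// Trimmed-down, hypothetical version of the struct in the text.
Body :: struct {
	inv_mass: f64,
	position: Vec4,
	velocity: Vec4,
}

N :: 3

main :: proc() {
	// Laid out in memory as separate arrays of inv_mass, position and
	// velocity, but indexed as if it were an array of Body.
	bodies: #soa[N]Body
	for i in 0..<N {
		bodies[i].inv_mass = 1.0
		bodies[i].position = {0, f64(i), 0, 0}
		bodies[i].velocity = {0, -1, 0, 0}
	}

	dt := 1.0 / 2000.0
	for i in 0..<N {
		// Reads and writes only the position and velocity arrays;
		// inv_mass is never touched.
		bodies[i].position = bodies[i].position + (bodies[i].velocity * dt)
	}

	for i in 0..<N {
		fmt.println(bodies[i].position)
	}
}
```

The call sites look exactly like array-of-structs code, so switching layouts is a one-line change to the declaration.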
As a side note, the reason those are f64 and not f32 is that once I got past 2000 FPS, applying acceleration due to gravity actually underflowed to zero. Whoops!
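A quick sketch of how this kind of precision loss can show up, using hypothetical values rather than numbers from Topple. In f32, the spacing between representable values near 100 is 2^-17, or about 7.6e-6. One substep of gravity from rest at 2000 FPS moves a body by roughly 2.5e-6, which is less than half that spacing, so the addition rounds straight back to the old position. In f64 the same change survives comfortably.

```odin
package main

import "core:fmt"

main :: proc() {
	h: f32 = 1.0 / 2000.0 // substep length at 2000 FPS
	g: f32 = -9.81        // gravity, in m/s^2

	// Position change from one substep of gravity, starting at rest
	// (v = g*h, then x += v*h).
	dx := g * h * h

	y32: f32 = 100.0 // hypothetical height in metres
	y64: f64 = 100.0

	// Compare each position before and after adding the tiny change.
	fmt.println("dx:", dx)
	fmt.println("f32 position changed:", y32 + dx != y32)
	fmt.println("f64 position changed:", y64 + f64(dx) != y64)
}
```

The effect gets worse as the substep shrinks, because the per-substep change scales with the square of the step length.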
With that done, I was getting to quite a satisfying scale even on a single core. Still, everything felt pretty firm!
Stable Stacks
But still it was falling over. Without randomising the contact solve order (something not mentioned in the paper) it barely worked at all. No matter how small I made the time step, I was still using a local solver for a very long chain of constraints. As I read more and more into the literature on offline-quality rigid body simulation, it became clear that my game idea, motivated by a particular experience I’d wanted to build, was also an unsolved problem in the literature.
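Randomising the solve order can be as simple as shuffling a list of contact indices before each substep. The sketch below assumes a hypothetical `Contact` struct and a fixed list of four contacts, and it only prints the order instead of solving anything. It uses `rand.shuffle` from Odin's `core:math/rand` package.

```odin
package main

import "core:fmt"
import "core:math/rand"

NUM_CONTACTS :: 4 // hypothetical contact count

Contact :: struct {
	a, b: int, // hypothetical indices of the two touching bodies
}

main :: proc() {
	// A stack of five bodies touching their neighbours.
	contacts := [NUM_CONTACTS]Contact{{0, 1}, {1, 2}, {2, 3}, {3, 4}}

	// Shuffle indices rather than the contacts themselves.
	order: [NUM_CONTACTS]int
	for i in 0..<NUM_CONTACTS {
		order[i] = i
	}

	for step in 0..<3 {
		rand.shuffle(order[:])
		for idx in order {
			c := contacts[idx]
			// A real solver would project the contact constraint here.
			fmt.println("substep", step, "solve contact", c.a, c.b)
		}
	}
}
```

Shuffling avoids always pushing error in the same direction along the stack, which is one common reason a fixed order biases a sequential solver.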
The sort of direct solvers you’d want to use to handle these scenarios (the kind used in the offline Bullet simulations I was trying to recreate) do not run in real time at the scale I want.
But that doesn’t mean we can’t enjoy a bit of destruction in VR.
This is exactly the “graceful slow motion” I was talking about before. I really appreciate the fidelity of interaction the high simulation frame rate gets you.
Moving Forward
By the time I’d gotten Topple to this point, it was already about time to start my PhD. So though it didn’t work how I’d wanted, I was glad there was a good reason.
Getting a more “traditional” rigid body solver built is still on my bucket list, and there’s plenty more possible in real-time than I achieved here. Something for another time!
For now, I’ll leave you with a final video of Topple.