Optimization can be tricky.
Here’s how to go from drawing a few hundred trees to a virtually unlimited number in Three.js, step by step.

This will be high level, but not so much that you can’t fill in the details.

#threejs

SimonDev Feb 4

The first step is to make sure you’re measuring the right things. You need both CPU time and GPU time to understand where the problems lie.

I use three-perf or the Three.js Inspector so I can see both numbers easily.
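
As a rough illustration of the CPU half of that measurement, here is a minimal sketch of a rolling frame-time average. The `FrameTimer` class and its 60-sample window are hypothetical, not part of Three.js or the tools mentioned above. GPU time can't be captured this way, so that part is best left to those tools.

```javascript
// Rolling average of CPU frame times. The class name and the 60-sample
// window are illustrative choices, not part of any library.
class FrameTimer {
  constructor(windowSize = 60) {
    this.windowSize = windowSize;
    this.samples = [];
    this.startTime = 0;
  }

  // Call at the start of the frame's CPU work.
  begin() {
    this.startTime = performance.now();
  }

  // Call once the frame's CPU work has been submitted.
  end() {
    this.addSample(performance.now() - this.startTime);
  }

  // Record a duration in milliseconds, dropping the oldest sample when full.
  addSample(ms) {
    this.samples.push(ms);
    if (this.samples.length > this.windowSize) this.samples.shift();
  }

  // Mean of the recorded samples, or 0 if nothing has been recorded yet.
  averageMs() {
    if (this.samples.length === 0) return 0;
    const total = this.samples.reduce((sum, ms) => sum + ms, 0);
    return total / this.samples.length;
  }
}
```

Wrapping the render call in `begin()` / `end()` and reading `averageMs()` every second or so gives a steadier number than a single frame would.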

SimonDev Feb 4

Draw calls are often the first hurdle.

In this example, as the trees stream in, the FPS drops, and we cap out at around 500 trees.

SimonDev Feb 4

Low-hanging fruit: stop duplicating assets.
Share geometry & materials across instances and you'll see immediate improvements in the framerate.

This change alone gets us to ~700–800 trees at 60fps.
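
One way to picture the sharing is a small cache that hands every tree the same object. This is a sketch under assumptions: `AssetCache` and `buildPineGeometry` are hypothetical names, and the plain object stands in for a real geometry or material.

```javascript
// Cache keyed by asset name. `create` is a caller-supplied factory; in a
// Three.js app it would build a geometry or material (an assumption here).
class AssetCache {
  constructor() {
    this.assets = new Map();
  }

  // Return the cached asset for `key`, creating it only on first request.
  get(key, create) {
    if (!this.assets.has(key)) {
      this.assets.set(key, create());
    }
    return this.assets.get(key);
  }
}

// Usage: every tree asks the cache instead of building its own copy.
const cache = new AssetCache();
let geometriesBuilt = 0;
const buildPineGeometry = () => ({ id: ++geometriesBuilt }); // stand-in object

const trees = [];
for (let i = 0; i < 1000; i++) {
  trees.push({ geometry: cache.get('pine', buildPineGeometry), x: i * 2 });
}
```

All 1000 trees end up referencing one geometry object, so the factory runs once instead of once per tree.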

SimonDev Feb 4

To draw a lot of stuff, you want to reduce materials. Collapse different materials into a single material by packing textures into atlases.

Then you can collapse draw calls with:
• InstancedMesh (same geo)
• BatchedMesh (different geo)
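
Packing textures into an atlas means each mesh's UVs have to be squeezed into its tile. A minimal sketch, assuming a square grid atlas with equal-sized tiles (`tileIndex` and `tilesPerSide` are illustrative parameters; real atlases often use irregular rectangles and padding):

```javascript
// Remap a mesh's 0..1 UVs into one tile of a square grid atlas.
// Tiles are numbered row by row, starting at 0.
function remapUvsToAtlas(uvs, tileIndex, tilesPerSide) {
  const tileSize = 1 / tilesPerSide;
  const offsetU = (tileIndex % tilesPerSide) * tileSize;
  const offsetV = Math.floor(tileIndex / tilesPerSide) * tileSize;

  const out = new Float32Array(uvs.length);
  for (let i = 0; i < uvs.length; i += 2) {
    out[i] = offsetU + uvs[i] * tileSize;         // u
    out[i + 1] = offsetV + uvs[i + 1] * tileSize; // v
  }
  return out;
}
```

One caveat: this only works for UVs that stay inside 0..1, since a texture that relied on repeat wrapping can't wrap inside a single atlas tile.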

SimonDev Feb 4

Instancing allows us to tell the GPU in a single draw call: "hey, draw this thing a zillion times".

No need for the CPU to constantly submit draw commands, which alleviates the load on the CPU and shifts the bottleneck to the GPU.

We're hitting 30k+ trees now.
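
To make the "single draw call" idea concrete: the per-instance data is just one flat buffer of transforms. The sketch below builds such a buffer by hand, 16 floats per instance in column-major order, which mirrors how InstancedMesh stores its matrices. The function name and the input shape are assumptions for illustration.

```javascript
// Build a flat buffer of 4x4 transforms, 16 floats per instance, column-major.
// Each tree is { x, y, z, scale }; rotation is left out to keep this short.
function buildInstanceMatrices(trees) {
  const matrices = new Float32Array(trees.length * 16);
  trees.forEach((tree, i) => {
    const o = i * 16;
    // Uniform scale on the diagonal.
    matrices[o + 0] = tree.scale;
    matrices[o + 5] = tree.scale;
    matrices[o + 10] = tree.scale;
    // Translation lives in the last column (elements 12..14).
    matrices[o + 12] = tree.x;
    matrices[o + 13] = tree.y;
    matrices[o + 14] = tree.z;
    matrices[o + 15] = 1;
  });
  return matrices;
}
```

In Three.js itself you would normally call `setMatrixAt(index, matrix)` on the InstancedMesh and set `instanceMatrix.needsUpdate = true` rather than fill the buffer manually.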

SimonDev Feb 4

Now take a look at your data.

You want GPU-friendly assets, not just smaller downloads.

Meshes: weld verts, simplify, quantize
Textures: use GPU-compressed formats (like ETC1S/UASTC)

I use my in-browser GLB optimizer to do most of this: https://gltf-optimizer.simondev.io
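
For a feel of what "weld verts" does, here is a minimal sketch that merges vertices sharing a position and builds an index buffer. It is not how the optimizer above works internally; `weldVertices` and its `tolerance` parameter are hypothetical, and only positions are compared, so normals and UVs would need the same treatment in practice.

```javascript
// Merge vertices whose positions match after snapping to a grid, and build
// an index buffer that points at the surviving vertices.
function weldVertices(positions, tolerance = 1e-4) {
  const seen = new Map(); // snapped position -> new vertex index
  const welded = [];      // deduplicated xyz triples
  const indices = [];     // one entry per original vertex

  for (let i = 0; i < positions.length; i += 3) {
    const x = positions[i];
    const y = positions[i + 1];
    const z = positions[i + 2];
    const key = [x, y, z].map((v) => Math.round(v / tolerance)).join(',');

    if (!seen.has(key)) {
      seen.set(key, welded.length / 3);
      welded.push(x, y, z);
    }
    indices.push(seen.get(key));
  }
  return {
    positions: new Float32Array(welded),
    indices: new Uint32Array(indices),
  };
}
```

Two triangles that share an edge go from 6 vertices to 4, and the GPU's vertex cache gets a chance to reuse the shared ones.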

SimonDev Feb 4

You can quantize way further than most people think.
It’s possible to squeeze a ~56-byte vertex down to ~16 bytes with packing + quantization.

TSL makes unpacking clean (override attributes via node API).

Source: https://x.com/SebAaltonen/status/1515735247928930311

This gets us to 50k+ trees.
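
As an illustration of the packing idea, here is a sketch of one possible 16-byte vertex. The layout is an assumption made up for this example, not the one from the thread or the linked source. The saving comes from swapping float32s for normalized integers: a position takes 12 bytes as three float32s but 6 bytes as three uint16s.

```javascript
// Hypothetical 16-byte vertex (an assumed layout, for illustration only):
//   bytes 0-5    position, 3 x uint16, normalized to the mesh bounding box
//   bytes 6-7    spare
//   bytes 8-11   uv, 2 x uint16, assuming UVs in 0..1
//   bytes 12-15  left free for a packed normal/tangent
const VERTEX_STRIDE = 16;

// Map v from [min, max] to an integer in [0, 65535].
function quantize16(v, min, max) {
  const range = max - min || 1; // avoid dividing by zero on flat bounds
  const t = (v - min) / range;
  return Math.round(Math.min(1, Math.max(0, t)) * 65535);
}

// Inverse of quantize16; this is the step a vertex shader would perform.
function dequantize16(q, min, max) {
  return min + (q / 65535) * (max - min);
}

function packVertex(view, index, pos, uv, boundsMin, boundsMax) {
  const o = index * VERTEX_STRIDE;
  for (let axis = 0; axis < 3; axis++) {
    const q = quantize16(pos[axis], boundsMin[axis], boundsMax[axis]);
    view.setUint16(o + axis * 2, q, true); // little-endian
  }
  view.setUint16(o + 8, quantize16(uv[0], 0, 1), true);
  view.setUint16(o + 10, quantize16(uv[1], 0, 1), true);
}

function unpackPosition(view, index, boundsMin, boundsMax) {
  const o = index * VERTEX_STRIDE;
  return [0, 1, 2].map((axis) =>
    dequantize16(view.getUint16(o + axis * 2, true), boundsMin[axis], boundsMax[axis])
  );
}
```

With 16 bits per axis, the worst-case position error is half of (bounding box size / 65535), which is far below what you can see on a tree.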

SimonDev Feb 4

At some point it’s hard to “draw faster”. So stop drawing stuff you can’t see.

Frustum culling removes anything offscreen.

Three.js doesn't do this per instance for an InstancedMesh, so you can either:
• Instance within a chunk, then cull by chunk.
• Cull manually per instance.
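
The chunk option above can be sketched as a bounding-sphere test against the six frustum planes. This is a hand-rolled illustration with hypothetical names and data shapes. In Three.js the `Frustum` class offers `intersectsSphere` for the same job.

```javascript
// A plane is { nx, ny, nz, d } with the normal pointing into the frustum, so
// a point is inside when nx*x + ny*y + nz*z + d >= 0.
function sphereInFrustum(planes, center, radius) {
  for (const p of planes) {
    const distance = p.nx * center.x + p.ny * center.y + p.nz * center.z + p.d;
    if (distance < -radius) return false; // entirely behind this plane
  }
  return true; // inside or intersecting (conservative)
}

// Keep only chunks whose bounding sphere touches the frustum. Each chunk is
// { center: { x, y, z }, radius } and would own one InstancedMesh.
function visibleChunks(chunks, planes) {
  return chunks.filter((c) => sphereInFrustum(planes, c.center, c.radius));
}
```

The test is conservative: a sphere near a frustum corner can pass even though it is just outside, which costs a little extra drawing but never makes trees pop out of view.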

We’re at 250k+ trees now.

(Side note: occlusion culling is scene-dependent, but huge when it applies.)

SimonDev Feb 4

At this point, the last lever left is reducing quality, but it’s a powerful one.

LOD (level-of-detail) works by dropping detail with distance. As an object gets farther away, you swap meshes (LOD0 -> LOD1 -> LOD2) and nobody notices (hopefully).

With instancing, you'll have to do this manually with an InstancedMesh for each level.
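
A minimal sketch of that manual step: sort instances into one bucket per LOD level by camera distance, then let each bucket fill its own InstancedMesh. The function name, data shapes, and thresholds are made-up example values.

```javascript
// Sort instances into LOD buckets by distance from the camera.
// `lodDistances` holds the far edge of each level, ascending; anything past
// the last edge is dropped entirely.
function bucketByLod(instances, camera, lodDistances) {
  const buckets = lodDistances.map(() => []);
  for (const inst of instances) {
    const dist = Math.hypot(inst.x - camera.x, inst.y - camera.y, inst.z - camera.z);
    const level = lodDistances.findIndex((maxDist) => dist <= maxDist);
    if (level !== -1) buckets[level].push(inst);
  }
  return buckets; // buckets[i] feeds the InstancedMesh for LOD i
}
```

After bucketing, each level's InstancedMesh would get its matrices rewritten and its `count` set to the bucket length. Doing this per chunk rather than per tree keeps the CPU cost down.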

SimonDev Feb 4

Octahedral imposters are a powerful technique, where we render the object from many angles into an atlas texture, and then just show a billboard in the world.

The key detail: it responds to camera movement and lighting, but it's just smoke and mirrors.

TSL makes it easy to hook into the lighting system, making it seamless.
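
The "octahedral" part refers to how view directions are laid out in the atlas. Here is a sketch of that mapping, assuming a full-sphere octahedral layout and a unit-length direction from the object toward the camera in the object's local space. The function names are hypothetical, and real implementations often blend neighbouring frames instead of snapping to one.

```javascript
// sign() that treats 0 as positive, so the fold below never collapses to 0.
const signNotZero = (v) => (v >= 0 ? 1 : -1);

// Map a unit-length direction to octahedral coordinates in [0, 1] x [0, 1].
function directionToOctahedralUv([x, y, z]) {
  const sum = Math.abs(x) + Math.abs(y) + Math.abs(z);
  let u = x / sum;
  let v = y / sum;
  if (z < 0) {
    // Fold the lower hemisphere over the diagonals of the square.
    const foldedU = (1 - Math.abs(v)) * signNotZero(u);
    const foldedV = (1 - Math.abs(u)) * signNotZero(v);
    u = foldedU;
    v = foldedV;
  }
  return [u * 0.5 + 0.5, v * 0.5 + 0.5];
}

// Pick the nearest pre-rendered frame in a framesPerSide x framesPerSide atlas.
function imposterFrame(direction, framesPerSide) {
  const [u, v] = directionToOctahedralUv(direction);
  return {
    column: Math.round(u * (framesPerSide - 1)),
    row: Math.round(v * (framesPerSide - 1)),
  };
}
```

With a 9 x 9 atlas, looking straight down the +z axis lands on the centre frame, and the opposite direction folds out to a corner.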

SimonDev Feb 4

Once you've made it through all these steps:

• Reuse materials
• Batch/instance
• Optimize data
• Cull
• LOD/imposters

we're hitting 1 million+ trees for very little CPU/GPU cost.

SimonDev Feb 4

Full video: https://youtu.be/phbaxNPJxss
Full course: https://simondev.io/lessons/gamedev/

How to optimize (almost) anything