Compressing Gaussian Splats 32% Smaller Than SPZ

SPZ is Niantic's format for compressed Gaussian splats. It quantizes float32 fields to uint8/uint16, packs 20 bytes per point, and gzips the result. SHARP always outputs 1,179,648 points (1024 x 1152) — about 60 MB uncompressed. SPZ gets that down to ~11.7 MB.

We built MSPZ to serve splats faster in Mukbang. Across 50 real captures, MSPZ v3 averages 7.95 MB — 32% smaller than SPZ, and never produces a larger file.

Format Comparison

Format	Avg Size	vs SPZ
SPZ (gzip)	11.69 MB	baseline
MSPZ v1 (interleaved + delta + zstd level 5)	9.42 MB	19.4% smaller
MSPZ v3 (byte-plane + delta + zstd level 1)	7.95 MB	32.0% smaller

MSPZ v1 produced larger files than SPZ on 5 of 50 test images. MSPZ v3 beats SPZ on every single one.

What SPZ Does

SPZ quantizes each point's fields (position, color, scale, rotation, opacity) to 20 bytes, packs them sequentially, and gzips the whole thing. The compressor sees interleaved heterogeneous data — position bytes next to color bytes next to rotation bytes — and has to find patterns across unrelated fields.

What MSPZ v1 Changed

v1 sorts points along a Morton curve for spatial locality, delta-encodes, and compresses 256 KB chunks with zstd level 5 instead of gzip. Better compressor, better point ordering — but the data is still interleaved, so zstd is still fighting mixed data types within each chunk.
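The Morton ordering step can be sketched as follows. This is a minimal illustration, not MSPZ's actual code: the function names, the 10 bits per axis, and the bounding-box normalization are all assumptions made for the example.

```python
def morton3(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)        # x takes bit 0 of each triple
        code |= ((y >> i) & 1) << (3 * i + 1)    # y takes bit 1
        code |= ((z >> i) & 1) << (3 * i + 2)    # z takes bit 2
    return code


def morton_order(points, bits: int = 10):
    """Return point indices sorted along the Morton curve.

    `points` is a list of (x, y, z) floats. Each axis is rescaled to an
    integer grid cell in [0, 2**bits - 1] using the bounding box
    (an assumption for this sketch).
    """
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    top = (1 << bits) - 1

    def key(i):
        cell = []
        for a in range(3):
            span = hi[a] - lo[a]
            t = (points[i][a] - lo[a]) / span if span > 0 else 0.0
            cell.append(min(top, int(t * top)))
        return morton3(cell[0], cell[1], cell[2], bits)

    return sorted(range(len(points)), key=key)
```

Points that are close in space tend to land near each other in this ordering, which is what makes a following delta step produce small values.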

What MSPZ v3 Does Differently

v3 separates each byte position into its own plane before compression. Instead of storing 20 interleaved bytes per point, it creates 20 planes of N bytes each: all "position X byte 0" values together, all "position X byte 1" values together, all "color R" values together, and so on.

Then it delta-encodes and compresses each plane independently with zstd level 1 — the fastest setting.

This works because bytes from the same sub-channel are far more correlated than bytes from different fields. Delta encoding on a homogeneous stream produces long runs of near-zero values. zstd crushes these trivially, even at its lowest level. v1 at zstd level 5 can't match v3 at level 1 because it's trying to compress a fundamentally harder input.
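A rough sketch of the byte-plane idea, under stated assumptions: records are 20 bytes, the delta is a byte-wise difference with wraparound, and Python's standard-library zlib stands in for zstd so the example has no dependencies. The function names are hypothetical and the real MSPZ container layout is not shown.

```python
import zlib

RECORD_SIZE = 20  # bytes per point, as described above


def to_planes(data: bytes, record: int = RECORD_SIZE) -> list:
    """Split interleaved records into one byte stream per byte position."""
    if len(data) % record:
        raise ValueError("data length must be a multiple of the record size")
    return [data[k::record] for k in range(record)]


def delta_encode(plane: bytes) -> bytes:
    """Store each byte as the difference from the previous one (mod 256)."""
    out = bytearray(len(plane))
    prev = 0
    for i, b in enumerate(plane):
        out[i] = (b - prev) & 0xFF
        prev = b
    return bytes(out)


def delta_decode(deltas: bytes) -> bytes:
    """Invert delta_encode with a running sum (mod 256)."""
    out = bytearray(len(deltas))
    acc = 0
    for i, d in enumerate(deltas):
        acc = (acc + d) & 0xFF
        out[i] = acc
    return bytes(out)


def compress_planes(data: bytes, record: int = RECORD_SIZE, level: int = 1) -> list:
    """Delta-encode and compress every plane independently.

    zlib is a stand-in here; MSPZ uses zstd.
    """
    return [zlib.compress(delta_encode(p), level) for p in to_planes(data, record)]


def decompress_planes(blobs: list) -> bytes:
    """Rebuild the original interleaved records from compressed planes."""
    planes = [delta_decode(zlib.decompress(b)) for b in blobs]
    out = bytearray(len(planes[0]) * len(planes))
    for k, plane in enumerate(planes):
        out[k::len(planes)] = plane  # scatter plane k back to byte position k
    return bytes(out)
```

Because each plane is compressed on its own, the compressor only ever sees one kind of byte at a time, which is the property the paragraph above relies on.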

v3 also prunes splats with sigmoid opacity below 0.05 — nearly invisible points that cost bytes but contribute nothing visible. This drops 1.5–3% of points per scene.
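The pruning rule is simple enough to show directly. The sketch below assumes opacity is stored as a pre-sigmoid logit under a hypothetical `opacity_logit` key; only the 0.05 threshold comes from the text.

```python
import math

OPACITY_THRESHOLD = 0.05  # from the text above


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prune_transparent(splats, threshold: float = OPACITY_THRESHOLD):
    """Keep only splats whose sigmoid opacity reaches the threshold.

    `splats` is a list of dicts with an 'opacity_logit' entry
    (a hypothetical layout chosen for this example).
    """
    return [s for s in splats if sigmoid(s["opacity_logit"]) >= threshold]
```

In logit terms the 0.05 cutoff sits at ln(0.05 / 0.95), roughly -2.94, so the comparison could also be done on the raw value without calling the sigmoid at all.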

Quantization

All three formats use the same lossy quantization from float32 down to integer representations. The errors are well below perceptual thresholds for Gaussian splats:

Field	Encoding	RMSE
Position	float32 → 24-bit fixed-point	~0.00007
Color (SH DC)	float32 → uint8	~0.0075
Scale	float32 → uint8 (log-scale)	~0.018
Rotation	float32 → uint8 [-1, 1]	0.009–0.057

Gaussian splats are fuzzy overlapping blobs — sub-millimeter position errors and sub-percent color errors are invisible in the final render.
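As an illustration of the position row, here is a minimal sketch of 24-bit signed fixed-point encoding. The split of 12 fractional bits is an assumption, chosen because a uniform rounding error with step 2^-12 has an RMSE of step / sqrt(12), about 0.00007, which is consistent with the table. The function names and little-endian byte order are also assumptions.

```python
import math

FRACTIONAL_BITS = 12  # assumed; gives a step of 1/4096


def encode_fixed24(value: float, frac_bits: int = FRACTIONAL_BITS) -> bytes:
    """Quantize a float to signed 24-bit fixed point, little-endian."""
    q = round(value * (1 << frac_bits))
    q = max(-(1 << 23), min((1 << 23) - 1, q))  # clamp to the 24-bit range
    return (q & 0xFFFFFF).to_bytes(3, "little")


def decode_fixed24(raw: bytes, frac_bits: int = FRACTIONAL_BITS) -> float:
    """Invert encode_fixed24, restoring the sign from bit 23."""
    q = int.from_bytes(raw, "little")
    if q & 0x800000:
        q -= 1 << 24
    return q / (1 << frac_bits)


def uniform_rmse(frac_bits: int = FRACTIONAL_BITS) -> float:
    """RMSE of rounding to the nearest step, assuming uniform error."""
    step = 1.0 / (1 << frac_bits)
    return step / math.sqrt(12)
```

The three output bytes per coordinate are what later become separate byte planes: the high byte changes slowly between neighboring points, while the low byte carries most of the noise.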

zstd Level Doesn't Matter When the Layout Is Right

We benchmarked MSPZ v1 and v3 across zstd levels 1 through 19, averaged over 5 captures. The table shows a representative subset of levels:

Level	v1 Size	v1 Time	v3 Size	v3 Time	v3 Δ from L1
1	11.98 MB	352 ms	9.34 MB	262 ms	baseline
3	11.79 MB	282 ms	9.44 MB	278 ms	+1.1%
5	11.58 MB	373 ms	9.40 MB	328 ms	+0.6%
8	11.48 MB	502 ms	9.32 MB	375 ms	−0.2%
12	11.21 MB	1599 ms	9.31 MB	453 ms	−0.3%
16	11.01 MB	2639 ms	9.17 MB	887 ms	−1.8%
19	10.97 MB	3424 ms	9.17 MB	2838 ms	−1.8%

v3 at level 1 (9.34 MB, 262 ms) already beats v1 at level 19 (10.97 MB, 3424 ms) — both smaller and 13x faster. The byte-plane layout is so much more compressible that the fastest zstd setting on v3 wins over the slowest on v1.

For v1, going from level 1 to 19 saves 1.01 MB (8.4%) but costs 10x more time. For v3, the same sweep saves just 0.17 MB (1.8%). Levels 2–7 on v3 are actually larger than level 1 — zstd's more aggressive strategies backfire on data that's already this clean. The total range across all 19 levels is under 3%.
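The percentages quoted above follow directly from the table. The short script below recomputes them from the listed rows only; the helper name is ours.

```python
# level: (v1 size MB, v1 time ms, v3 size MB, v3 time ms), copied from the table
SWEEP = {
    1: (11.98, 352, 9.34, 262),
    3: (11.79, 282, 9.44, 278),
    5: (11.58, 373, 9.40, 328),
    8: (11.48, 502, 9.32, 375),
    12: (11.21, 1599, 9.31, 453),
    16: (11.01, 2639, 9.17, 887),
    19: (10.97, 3424, 9.17, 2838),
}


def pct_change(new: float, base: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return 100.0 * (new - base) / base


v1_l1, v1_l19 = SWEEP[1][0], SWEEP[19][0]
v3_l1, v3_l19 = SWEEP[1][2], SWEEP[19][2]
v3_sizes = [row[2] for row in SWEEP.values()]

v1_saving = -pct_change(v1_l19, v1_l1)                   # v1 saving, level 1 to 19
v3_saving = -pct_change(v3_l19, v3_l1)                   # v3 saving, level 1 to 19
v3_spread = 100.0 * (max(v3_sizes) - min(v3_sizes)) / v3_l1
speedup = SWEEP[19][1] / SWEEP[1][3]                     # v1 at L19 vs v3 at L1

print(f"v1 saves {v1_saving:.1f}%, v3 saves {v3_saving:.1f}%")
print(f"v3 spread across listed levels: {v3_spread:.1f}%")
print(f"v3 L1 is {speedup:.0f}x faster than v1 L19")
```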

Level 1 is the sweet spot. The layout does the work, not the compressor.

MSPZ powers splat delivery in Mukbang: 32% less data means faster loads on every capture.