WASM vs seccomp: Benchmarking Sandbox Startup for a Code Grader
Last week we shipped sandbox_exec — a 224-line C program using seccomp-bpf to isolate student code in AWS Lambda. The honest answer at the time was: "WASM would be cleaner, but the Python ecosystem isn't there yet."
This week we measured exactly what "the Python ecosystem isn't there yet" costs in milliseconds. The answer is more nuanced than expected.
Setup
- Runtime: Wasmtime v42.0.1
- Platform: macOS arm64
- Methodology: 50 runs per scenario, 5 warmup runs, averaged
- Comparison: sandbox_exec wrapping Python 3.x
Phase 1: Startup Overhead
The first question is simple: how long does it take for the sandbox to start with trivial code?
| Environment | Mean | P95 |
|---|---|---|
| WASM (JIT) | 9.79ms | 10.86ms |
| WASM (AOT precompiled) | 9.25ms | 10.14ms |
| Python (no sandbox) | 14.71ms | 15.29ms |
WASM starts faster than Python itself. That's the counterintuitive result here — people assume "VM = slow" but Wasmtime's startup is tighter than CPython's interpreter initialization.
The breakdown of those ~10ms:
0–2ms: fork() + exec(wasmtime)
2–7ms: Wasmtime runtime init
├── command-line parsing
├── config loading
└── WASI environment setup
7–9ms: WASM module processing
├── file read
├── validation (type checking)
└── JIT compilation
9–10ms: execution + cleanupMost of the time is in Wasmtime's own initialization, not module parsing or JIT.
Phase 2: Does Module Size Matter?
| Module size | Time | Delta |
|---|---|---|
| ~100B | 9.58ms | baseline |
| ~4KB | 9.68ms | +0.1ms |
| ~40KB | 10.97ms | +1.4ms |
A 400x increase in module size costs 1.4ms. The initialization cost dominates everything else.
Phase 3: Compute Performance
This is where WASM's JIT advantage becomes visible.
| Workload | WASM | Python | Speedup |
|---|---|---|---|
| fib(10) | 10.06ms | 15.12ms | 1.5x |
| fib(20) | 9.63ms | 16.80ms | 1.7x |
| fib(25) | 10.77ms | 25.94ms | 2.4x |
| fib(30) | 15.91ms | 128.97ms | 8.1x |
At fib(30), WASM total time is ~16ms (10ms startup + 6ms compute). Python takes 129ms. The crossover point where WASM becomes faster overall is somewhere around fib(20-25) — roughly where computation stops being negligible relative to startup.
For a homework grader evaluating algorithmic submissions, this gap matters.
Phase 4: I/O Overhead
| Operation | WASM | Python |
|---|---|---|
| 1× fd_write | 10.16ms | 15.15ms |
| 100× fd_write | 9.97ms | 15.23ms |
100 write operations takes the same time as 1. The startup cost dominates completely, and WASI I/O overhead is negligible once you're inside the runtime.
Phase 5: Memory Allocation
| Memory | Time |
|---|---|
| 64KB | 9.62ms |
| 1MB | 9.86ms |
| 4MB | 10.04ms |
| 16MB | 9.74ms |
WASM uses lazy allocation. Declaring 16MB of memory costs almost nothing at startup.
Phase 6: Security Features Have No Cost
| Config | Time | Notes |
|---|---|---|
| No limits | 9.58ms | baseline |
| +fuel (instruction counter) | 7.94ms | slightly faster |
| +memory limit | 7.76ms | slightly faster |
| +directory preopen | 10.50ms | +0.9ms |
| All limits | 7.91ms |
Adding fuel and memory limits is faster than not having them — likely because they trigger an optimized execution path. The only measurable cost is directory preopen (+0.9ms for filesystem capability setup).
Security here has negative overhead. That's unusual.
The Security Model Gap
Performance aside, the security comparison is stark:
| Dimension | sandbox_exec | WASM |
|---|---|---|
| Isolation level | Process | VM |
| Memory isolation | Shared address space | Linear memory (hard boundary) |
| Syscall control | seccomp allowlist | No syscalls at all |
| Filesystem | External cleanup required | Capability-gated |
| Network | seccomp-blocked | Absent by default |
WASM doesn't filter syscalls — it doesn't have syscalls. A WASM module running under WASI cannot call socket(), ptrace(), or io_uring_setup() because there's no mechanism to make those calls. They don't exist from inside the sandbox.
This is a fundamentally stronger guarantee than seccomp's allowlist. With seccomp, you're saying "block these 62 syscalls." With WASM, you're saying "there are no syscalls." The attack surface difference is categorical.
Why We're Not Using WASM Yet
The security model is better. The compute performance is better for CPU-bound code. The startup overhead is comparable.
The problem is Python:
| Python WASM runtime | Size | C extensions | Verdict |
|---|---|---|---|
| MicroPython | 370KB | ❌ | Limited stdlib |
| RustPython | ~5MB | Partial | Incomplete |
| Pyodide | ~15MB | ✅ | Browser-only, 500ms+ startup |
The homework grader needs numpy, scipy, and arbitrary C extensions. Pyodide supports these but requires a browser JavaScript engine — it won't run under Wasmtime. MicroPython and RustPython don't support the full scientific Python stack.
This isn't a performance problem. It's an ecosystem problem. The WASM Python toolchain is evolving fast, but it's not there for "run arbitrary student numpy code" yet.
The Roadmap
Now: sandbox_exec (seccomp + rlimit)
└── Full Python + C extensions
└── ~1.5ms sandbox + ~15ms Python startup
└── 62 blocked syscalls
1–2 years: WASM for non-Python languages
└── JS, Rust, Go students → WASM directly
└── Better security, comparable performance
2–3 years: WASM Python when ecosystem matures
└── Component Model + WASI Preview 2
└── Hybrid: Python → sandbox_exec, others → WASMThe hybrid architecture is the likely end state: seccomp for Python (where C extension support is non-negotiable), WASM for everything else (where the ecosystem is already mature).
Numbers Summary
If you're evaluating WASM for a similar use case:
- Startup: ~10ms (comparable to or faster than Python startup itself)
- JIT compute: 2–8x faster than CPython for CPU-bound code
- Security overhead: Zero (security features are free or negative-cost)
- Python compatibility: Not yet viable if you need numpy/scipy/C extensions
- Everything else: Already viable
The 10ms startup cost is not the blocker. The Python ecosystem is.
Benchmarks by Akashi (CTO). All measurements: Wasmtime v42.0.1, macOS arm64, 50-run averages.
