Sandlock v1.4: From Single File to Full-Stack Sandbox
I've been documenting the evolution of sandbox_exec into something more general. This post covers Sandlock v1.4.0 — the point where it became a proper multi-layer security system rather than a clever wrapper.
Repo: github.com/bkmashiro/Sandlock
The Refactor: 822 Lines → 8 Modules
The v1.3.0 single file hit 822 lines and was getting unwieldy. We split it:
src/
├── sandlock.h (156 lines) — shared types, config struct
├── main.c (261 lines) — CLI parsing, fork/exec orchestration
├── config.c (80 lines) — validation, conflict detection
├── strict.c (350 lines) — seccomp notify path-level control
├── seccomp.c (76 lines) — BPF filter generation
├── landlock.c (102 lines) — Landlock LSM filesystem rules
├── rlimits.c (31 lines) — resource limits
├── pipes.c (94 lines) — I/O pipe handling
└── isolation.c (110 lines) — /tmp isolation and cleanup
The entry point, main.c, is down to 261 lines, and the longest file is now strict.c at 350. make single still builds the monolith for simpler deployments.
v1.3: Log Levels
Simple but necessary — before this, sandlock output was all-or-nothing.
./sandlock # INFO (default)
./sandlock -v # DEBUG: shows "executing python3"
./sandlock -vv # TRACE: maximum verbosity
./sandlock -q # WARN: errors and warnings only
./sandlock -qqq # SILENT: child output only
In testing, -v is invaluable for seeing exactly what the strict mode interceptor is doing. In production, -q keeps Lambda logs clean.
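The flag-to-level mapping above can be sketched in a few lines. This is an illustration rather than Sandlock's actual parser: the names Level and level_from_flags are hypothetical, and the ERROR level for -qq is an assumption, since the post only lists -q and -qqq.

```python
from enum import IntEnum


class Level(IntEnum):
    SILENT = 0
    ERROR = 1   # assumed level for -qq; not stated in the post
    WARN = 2
    INFO = 3    # default
    DEBUG = 4
    TRACE = 5


def level_from_flags(args):
    """Start at INFO; each 'v' raises the level, each 'q' lowers it, clamped."""
    level = int(Level.INFO)
    for arg in args:
        if arg == "--":
            break  # everything after "--" belongs to the child command
        body = arg[1:] if arg.startswith("-") else ""
        if body and set(body) == {"v"}:
            level += len(body)
        elif body and set(body) == {"q"}:
            level -= len(body)
    return Level(max(Level.SILENT, min(Level.TRACE, level)))
```

Counting repeated letters keeps -v and -q symmetric, and stopping at "--" means a -v meant for the child program is not mistaken for a sandlock flag.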
v1.4: Strict Mode
This is the interesting one. The existing seccomp filter works at the syscall level — "block socket(), allow read()." That doesn't help if the threat is reading /etc/passwd or /proc/self/environ via an allowed openat().
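Strict mode uses seccomp notify (kernel 5.0+, SECCOMP_FILTER_FLAG_NEW_LISTENER) to intercept specific syscalls and hand the decision to the parent process rather than blocking them outright. The SECCOMP_USER_NOTIF_FLAG_CONTINUE response used for allowed calls needs kernel 5.5+: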
Parent Child
│ │
│ fork() │
│ │
│ install seccomp filter
│ with NEW_LISTENER
│◄──── send notify_fd ─────────┤
├──────── "ready" ────────────►│
│ │
├── notify handler thread │ execvp()
│ │
│◄── openat("/etc/passwd") ────┤
│
├── is_path_allowed()?
│ ├─ YES → SECCOMP_USER_NOTIF_FLAG_CONTINUE
│ └─ NO → EACCES
Usage:
# Allow only /tmp access
./sandlock --strict --allow /tmp -- python3 student.py
# Debug: see what's being blocked
./sandlock --strict --allow /tmp -v -- python3 student.py
# sandlock: DEBUG: BLOCKED: openat(/etc/passwd)
# sandlock: DEBUG: BLOCKED: openat(/proc/self/environ)
The filter always allows system paths needed for execution (/bin, /lib, /lib64, /usr/bin, /etc/ld.so.*, /dev/null, /dev/urandom). Everything else defaults to denied unless you --allow it.
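To make the default-deny rule concrete, here is a minimal Python sketch of a path check in the spirit of is_path_allowed(). The real check lives in C in strict.c and may differ. The constant names and the lexical normalization are assumptions made for illustration. Only the list of system paths comes from the post.

```python
import fnmatch
import posixpath

# System paths the post says are always allowed.
SYSTEM_PREFIXES = ("/bin", "/lib", "/lib64", "/usr/bin")
SYSTEM_EXACT = ("/dev/null", "/dev/urandom")
SYSTEM_GLOBS = ("/etc/ld.so.*",)


def _under(path, prefix):
    # Component-wise prefix match, so "/tmpfoo" does not count as "/tmp".
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def is_path_allowed(path, allowed):
    """Default deny: allow only system paths and the --allow prefixes."""
    if not path.startswith("/"):
        return False  # relative paths need the child's cwd; denied in this sketch
    # Lexical normalization only: collapses "." and "..", does not resolve symlinks.
    norm = posixpath.normpath(path)
    if norm in SYSTEM_EXACT:
        return True
    if any(fnmatch.fnmatchcase(norm, glob) for glob in SYSTEM_GLOBS):
        return True
    return any(_under(norm, p) for p in SYSTEM_PREFIXES + tuple(allowed))
```

With allowed set to ["/tmp"], a path such as /tmp/../etc/passwd normalizes to /etc/passwd and is denied. A real handler also has to deal with relative paths, symlinks, and dirfd arguments, which this sketch ignores.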
Config Conflict Detection
A new config.c module validates the configuration at startup before forking:
| Conflict | Action |
|---|---|
| --strict without --allow | Error: won't start |
| --strict + --pipe-io | Warning: disables pipe-io (deadlock risk) |
| --landlock + --strict | Warning: both work, but redundant |
| --isolate-tmp + --cleanup-tmp | Warning: redundant |
| --cpu > --timeout | Warning: timeout triggers first |
No more silent failures from incompatible options.
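The conflict table translates almost directly into code. The following Python sketch mirrors it rule for rule. The Config fields and the validate function are hypothetical names chosen for illustration, and the actual config.c may structure this differently.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    # Field names are illustrative; they mirror the CLI flags in the table.
    strict: bool = False
    allow: list = field(default_factory=list)
    pipe_io: bool = False
    landlock: bool = False
    isolate_tmp: bool = False
    cleanup_tmp: bool = False
    cpu: int = 0       # seconds, 0 means unset
    timeout: int = 0   # seconds, 0 means unset


def validate(cfg):
    """Return (errors, warnings). Any error means the sandbox must not start."""
    errors, warnings = [], []
    if cfg.strict and not cfg.allow:
        errors.append("--strict requires at least one --allow")
    if cfg.strict and cfg.pipe_io:
        warnings.append("--strict with --pipe-io: disabling pipe-io (deadlock risk)")
        cfg.pipe_io = False  # the table says the option is disabled, not just flagged
    if cfg.landlock and cfg.strict:
        warnings.append("--landlock with --strict: both work, but redundant")
    if cfg.isolate_tmp and cfg.cleanup_tmp:
        warnings.append("--isolate-tmp with --cleanup-tmp: redundant")
    if cfg.cpu and cfg.timeout and cfg.cpu > cfg.timeout:
        warnings.append("--cpu exceeds --timeout: timeout triggers first")
    return errors, warnings
```

Running all checks before the fork means a bad combination is reported once, up front, instead of surfacing as a hang or a half-applied policy in the child.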
Language-Level Sandboxes
The C core handles the OS layer. v1.5.0 (released the same day) added language-specific layers on top.
Python (lang/python/sandbox.py)
Import hook + restricted builtins:
# These modules are blocked at import time:
# socket, ssl, requests, subprocess, os, sys, ctypes, pickle, ...
# These builtins are removed:
# exec, eval, compile, input, open (replaced with restricted version)
# Allowed:
# math, json, re, collections, datetime, random, statistics, hashlib
The restricted open() allows /tmp reads/writes only.
Known bypass vector: ().__class__.__bases__[0].__subclasses__(), the classic Python sandbox escape via introspection. A partial mitigation is in place; the source scanner is the backstop for the rest.
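For readers who have not built one before, here is a minimal sketch of the import-guard and restricted-builtins idea. It is not the code in lang/python/sandbox.py. The function names are hypothetical, the blocked list is only the subset spelled out above, and the restricted open() is left out for brevity.

```python
import builtins

# Subset of the blocked modules and removed builtins listed in the post.
BLOCKED_MODULES = {"socket", "ssl", "requests", "subprocess",
                   "os", "sys", "ctypes", "pickle"}
REMOVED_BUILTINS = {"exec", "eval", "compile", "input", "open"}


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Check the top-level package so "os.path" is caught along with "os".
    if name.split(".")[0] in BLOCKED_MODULES:
        raise ImportError(f"module '{name}' is blocked by the sandbox")
    return builtins.__import__(name, globals, locals, fromlist, level)


def make_restricted_builtins():
    # Copy the real builtins, drop the dangerous ones, swap in the import guard.
    safe = {k: v for k, v in vars(builtins).items() if k not in REMOVED_BUILTINS}
    safe["__import__"] = guarded_import
    return safe


def run_untrusted(source):
    """Execute source with restricted builtins and return its namespace."""
    env = {"__builtins__": make_restricted_builtins(), "__name__": "__sandbox__"}
    exec(compile(source, "<untrusted>", "exec"), env)
    return env
```

Only imports written in the untrusted code go through guarded_import; allowed modules still load their own dependencies with the real importer. The introspection escape described above works against this sketch as written, which is why the language layer is never the only layer.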
JavaScript (lang/javascript/)
Two variants:
sandbox.js: strict VM isolation via Node's vm module, no process/eval/Function, module whitelist
wrapper.js: npm packages available, runtime patching at the require level
Source Code Scanner (lang/scanner/scanner.py)
Pre-execution static analysis for C/C++/Python/JavaScript/Rust/Go:
| Severity | Pattern | Example |
|---|---|---|
| 🔴 Critical | Inline assembly | asm("syscall") |
| 🔴 Critical | Direct syscall instruction | int 0x80 |
| 🔴 Critical | Custom entry point | _start() |
| 🟠 High | FFI/ctypes | dlopen, cffi, ffi-napi |
| 🟡 Medium | Dangerous functions | fork, socket, eval |
This runs before compilation or execution, which makes it the only layer that can flag direct syscall attempts in inline assembly before the code ever runs.
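A pattern scanner of this kind can be sketched with a handful of regular expressions. The rules below cover only the examples in the table, and the regexes are illustrative assumptions, not the patterns scanner.py actually uses.

```python
import re

# (severity, label, pattern) tuples; the patterns are illustrative only.
RULES = [
    ("critical", "inline assembly",
     re.compile(r"\b(?:asm|__asm__)\s*(?:volatile\s*)?\(")),
    ("critical", "direct syscall instruction",
     re.compile(r"\bint\s+\$?0x80\b")),
    ("critical", "custom entry point",
     re.compile(r"\b_start\s*\(")),
    ("high", "FFI/ctypes",
     re.compile(r"\b(?:dlopen|ctypes|cffi|ffi-napi)\b")),
    ("medium", "dangerous function",
     re.compile(r"\b(?:fork|socket|eval)\s*\(")),
]


def scan(source):
    """Return (line_number, severity, label) for every rule hit, in file order."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for severity, label, pattern in RULES:
            if pattern.search(line):
                findings.append((lineno, severity, label))
    return findings
```

This version matches raw text, so it will also flag patterns inside comments and string literals. For a pre-execution gate that errs on the side of rejecting, that false-positive bias is usually acceptable.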
LD_PRELOAD Hook (lang/preload/sandbox_preload.c)
For compiled binaries where you can't modify the source:
LD_PRELOAD=./sandbox_preload.so \
SANDBOX_NO_NETWORK=1 \
SANDBOX_NO_FORK=1 \
SANDBOX_ALLOW_PATH=/tmp \
./program
It hooks socket, connect, bind, fork, execve, execvp, open, and fopen, and also blocks unsetenv/putenv to prevent LD_PRELOAD removal.
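Known bypasses: static linking (the preload library is never loaded) and direct system calls via syscall() or inline asm, which skip the hooked wrappers. The scanner is the defense against these.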
The Full Defense Matrix
The real value of the modular design is how the layers compose. Here's how Full-Stack Sandlock covers the attack surface:
| Attack | seccomp | Landlock/Strict | Language sandbox | Scanner | Result |
|---|---|---|---|---|---|
| Network exfiltration | ✅ | — | ✅ | — | 🔴 Blocked |
| Reverse shell | ✅ | — | ✅ | — | 🔴 Blocked |
| Fork bomb | ✅ | — | ✅ | — | 🔴 Blocked |
| Read /etc/passwd | — | ✅ | ✅ | — | 🔴 Blocked |
| Write outside /tmp | — | ✅ | ✅ | — | 🔴 Blocked |
| ptrace | ✅ | — | — | — | 🔴 Blocked |
| Inline asm syscall | ✅ | — | — | ✅ | 🔴 Blocked |
| dlopen/FFI | ✅ | — | ✅ | ✅ | 🔴 Blocked |
| Direct syscall (asm) | ✅ | — | ⚠️ | ✅ | 🟡 Hard |
| /proc info leak | — | ⚠️ | ⚠️ | — | 🟡 Partial |
The remaining gaps — /proc information leakage, kernel 0-days — require mount namespaces and OS-level updates respectively. Neither is solvable in pure userspace.
Kernel Compatibility
| Feature | Min Kernel | AWS Lambda (5.10) | Modern (6.x) |
|---|---|---|---|
| seccomp-bpf | 3.5 | ✅ | ✅ |
| seccomp notify | 5.0 | ✅ | ✅ |
| Landlock | 5.13 | ❌ | ✅ |
Lambda runs kernel 5.10 via Firecracker, so Landlock isn't available. Firecracker also applies its own seccomp filter that blocks installing additional ones, so the seccomp rows above describe what the 5.10 kernel supports rather than what a Lambda function can actually use. For Lambda, the defense stack is: rlimits + language sandbox + LD_PRELOAD + source scanner + env cleanup + VPC egress rules.
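The version thresholds in the table can be turned into a quick pre-flight check. The sketch below only compares kernel release numbers. The names MIN_KERNEL and supported_features are hypothetical, and a version match does not prove a feature is compiled in or permitted by the host, as the Lambda case shows.

```python
import platform
import re

# Minimum kernel versions taken from the compatibility table.
MIN_KERNEL = {
    "seccomp-bpf": (3, 5),
    "seccomp notify": (5, 0),
    "landlock": (5, 13),
}


def parse_release(release):
    """Extract (major, minor) from a release string such as '5.10.0-aws'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def supported_features(release=None):
    """Map each feature to True if the kernel version meets its minimum."""
    version = parse_release(release or platform.release())
    return {name: version >= minimum for name, minimum in MIN_KERNEL.items()}
```

Comparing (major, minor) tuples avoids the string-comparison trap where "5.9" would sort after "5.13". A production check would go on to probe each feature directly, for example by attempting the relevant syscall and handling the failure.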
Performance
| Configuration | Overhead |
|---|---|
| Minimal (seccomp + rlimits) | ~1.5ms |
| Full (all options) | ~2.5ms |
| Strict mode (per-intercepted syscall) | ~0.1ms |
| Python sandbox overhead | ~8ms |
The ~8ms Python sandbox overhead comes from the import hook checking module names on every import. Worth it for the protection, but worth knowing.
What v1.5.0 Looks Like
The total codebase is now ~4,700 lines across C, Python, and JavaScript:
src/*.c + *.h ~1,500 lines
lang/python/ ~320 lines
lang/javascript/ ~670 lines
lang/scanner/ ~450 lines
lang/preload/ ~250 lines
tests/ ~500 lines framework + 48 attack tests
CI triggers on changes to sandlock.c/Makefile. Bomb tests (fork bomb, memory bomb, CPU bomb) require manual opt-in. They pass through three layers of timeout (sandlock internal → shell timeout 10 → GitHub timeout-minutes: 10) so they can't harm the runner, but they're still gated to avoid accidental triggers.
