The Baseline
Our patchworks appliance runs Redis 7.2.13 as PID 1 on a FreeBSD machine. It accepts authenticated connections, lets visitors write messages to a single stream, and lets administrators read them. That’s it. No sorted sets, no geospatial queries, no pub/sub, no scripting, no cluster coordination.
We built Redis from source with `-fprofile-arcs -ftest-coverage` and ran our 59 operational tests — every authentication flow, every stream command, every ACL enforcement case, every error path we handle in production. Then we ran gcov.
The results:
- Total executable lines: 57,753
- Lines executed: 8,617 (14.9%)
- Lines NOT executed: 48,551 (85.1%)
- Zero-coverage files: 36 (20,332 lines)
Eighty-five percent of the Redis source code does nothing for us. It sits in our binary, consuming instruction cache, expanding our attack surface, and adding complexity to a system whose virtue is simplicity. Every one of those 48,551 lines is a candidate for removal.
But we don’t remove code by guessing. We remove it by proving — with tests, with coverage, and with traces — that we know exactly what we’re cutting and exactly what remains.
The Procedure: DDD.1
We call our removal methodology DDD.1 — the first formal procedure in Demolition-Driven Development. It has five steps, and we cycle through them for each module we remove:
- Write tests that exercise the condemned module. Not perfunctory tests — thorough tests. Every command, every flag, every edge case we can find. We aim for near-total coverage of the module’s source code. These tests prove we understand what we’re removing.
- Verify all tests pass. Before touching any source code, every DDD.1 test must pass. This is the “before” photograph. The module works. We know how it works. We have evidence.
- Excise the module. Replace the source files with stubs that return error messages. Not disabled by configuration — removed from the binary. The stubs ensure that any client issuing a removed command gets a clear, honest error: this command is not available.
- Verify all tests xfail. Every test we wrote in step 1 must now fail — and we expect it to fail. We use `pytest.mark.xfail(strict=True)`: if any test unexpectedly passes, the removal is incomplete. Something survived that shouldn’t have.
- Trace the residual coverage. This is the step that makes DDD.1 more than test-driven deletion. We run each xfail test individually with per-test gcov instrumentation and measure how many lines of C each failing test exercises. After a clean excision, every test should exercise approximately the same number of lines — the infrastructure floor, which is just the cost of being a Redis server: boot, connect, authenticate, dispatch, error, reply. If any test exercises significantly more code than average, something is leaking. A hidden dependency survived. The excision is not clean.
After step 5, we also verify that our 59 operational tests still pass. The appliance works. The removed module is gone. The evidence is complete.
Then we pick the next module and repeat.
Module 1: Geo
Three files. 1,584 lines. Ten commands. Zero coverage.
Step 1: Writing the Tests
Redis’s Geo module provides geospatial indexing: you store longitude/latitude pairs, then query by radius, bounding box, or distance. It’s built on sorted sets with geohash-encoded scores, and it ships with its own geohash library (two additional C files).
We don’t use any of it. Our appliance handles streams. But before we could remove Geo, we had to prove we understood what it did.
We wrote 43 tests across seven test classes:
- GEOADD (7 tests) — single member, multiple members, update existing, NX flag (don’t overwrite), XX flag (only overwrite), CH flag (count changes), boundary coordinates (±180° longitude, ±85° latitude)
- GEOPOS (4 tests) — single lookup, multiple lookup, nonexistent member returns nil, nonexistent key returns nil
- GEODIST (6 tests) — meters, kilometers, miles, feet, same point returns zero, nonexistent member returns nil
- GEOHASH (4 tests) — single hash, multiple hashes, nonexistent returns nil, nearby points share geohash prefix
- GEOSEARCH (9 tests) — radius from member, radius from coordinates, bounding box, COUNT limit, ASC sort, DESC sort, WITHDIST, WITHCOORD, empty result
- GEOSEARCHSTORE / GEORADIUS (6 tests) — store results, store distances, legacy GEORADIUS and GEORADIUSBYMEMBER commands
- Edge cases (7 tests) — invalid longitude, invalid latitude, nonexistent keys, nonexistent members, verify underlying sorted set, delete via ZREM
We used real-world coordinates — Rome, Paris, London, Berlin, Tokyo, Tel Aviv — so the distance assertions were readable: Rome to Paris is roughly 1,105 km, and the test asserts a value between 1,000 and 1,200.
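Those readable distance bounds are easy to sanity-check offline with a plain haversine great-circle calculation. This is an independent sketch using approximate city coordinates, not the article’s test data; `haversine_km` is a hypothetical helper:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lon1, lat1, lon2, lat2, r=6371.0):
    """Great-circle distance between two lon/lat points, in kilometers."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * r * asin(sqrt(h))

# Approximate coordinates (longitude, latitude).
rome = (12.4964, 41.9028)
paris = (2.3522, 48.8566)
```

Running `haversine_km(*rome, *paris)` lands close to the quoted ~1,105 km, comfortably inside the 1,000–1,200 assertion window.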
Step 2: All Tests Pass
```
$ pytest tests/test_ddd1_geo.py -v
...
43 passed in 0.92s
```
Coverage of the geo source files after running these tests: 88.6%. The uncovered 11.4% was error-handling for encoding edge cases (skiplist vs. listpack internal representations) and argument validation paths that required malformed RESP protocol input rather than normal client calls. We were satisfied that we understood the module.
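Collecting per-file percentages like the 88.6% above means scraping gcov’s textual summary. A minimal sketch of that scraping step, assuming gcov’s standard `Lines executed:NN.NN% of M` summary format (`parse_gcov_summary` is a hypothetical helper, not the project’s tooling):

```python
import re

def parse_gcov_summary(text: str) -> list[tuple[float, int]]:
    """Extract (percent_executed, total_lines) pairs from gcov stdout."""
    return [(float(pct), int(total))
            for pct, total in re.findall(
                r"Lines executed:([\d.]+)% of (\d+)", text)]
```

Feeding it the output of `gcov geo.c` would yield one tuple per reported file.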
Step 3: Excision
We replaced `geo.c` (1,005 lines), `geohash.c` (299 lines), and `geohash_helper.c` (280 lines) with a single 67-line stub file. Each of the ten command handlers now calls:

```c
addReplyError(c,
    "ERR Geo commands are not available. "
    "This redis-server was compiled without "
    "geospatial support (redis-hardened).");
```

The geohash library files became empty — four lines each (a comment pointing to the rationale in geo.c).
Rebuild: clean. No warnings.
Step 4: All Tests Xfail
We flipped one boolean:

```python
GEO_EXCISED = True
```

Every test in the file is decorated with `@geo_test`, a marker that resolves to `pytest.mark.xfail(strict=True)` when the flag is set. Strict means: the test must fail. If Geo somehow still worked, the test would pass, and `strict=True` would report that unexpected pass as a failure. A passing test after excision means the removal is incomplete.
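The conditional marker described above can be sketched in a few lines. This is an illustration of the pattern, not the project’s actual `@geo_test` implementation:

```python
import pytest

# Hypothetical flag: flipped to True once the module is excised.
GEO_EXCISED = True

def geo_test(func):
    """Plain test before excision; strict xfail after."""
    if GEO_EXCISED:
        return pytest.mark.xfail(
            reason="Geo module excised (DDD.1)", strict=True)(func)
    return func
```

With `strict=True`, pytest treats an unexpected pass as a failure, which is exactly the signal that something survived the excision.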
```
$ pytest tests/test_ddd1_geo.py -v
...
43 xfailed in 7.25s
```
All 43 tests failed as expected. The operational tests:
```
$ pytest tests/test_patchworks_coverage.py -v
...
59 passed in 0.97s
```
The appliance is unaffected.
Step 5: Trace Analysis
This is where it gets interesting.
We ran each of the 43 xfail tests in isolation, with a fresh redis-server for each test, collecting gcov data per run. The question: after excision, how much code does each failing test exercise? A geo command now hits a two-line stub. But the server still has to boot, accept the connection, authenticate, look up the command in the dispatch table, call the stub, format the error, and send the reply. How deep does that go?
- Tests: 43
- Average: 4,897 lines hit per test
- Min: 4,895 lines
- Max: 4,903 lines
- Range: 8 lines
- Stdev: 1 line
Every test exercised the same code. Within one line of each other. The range across all 43 tests was eight lines.
The ~4,897 lines break down as pure infrastructure:
| File | Lines hit | Role |
|---|---|---|
| server.c | 1,271 | Startup and command dispatch |
| networking.c | 508 | Connection handling |
| module.c | 408 | Module infrastructure (even empty) |
| rax.c | 285 | Radix tree (command table lookup) |
| sds.c | 262 | String handling |
| acl.c | 208 | Authentication |
| dict.c | 208 | Hash tables |
| config.c | 189 | Config parsing at startup |
The four tests that hit 4,901–4,903 lines (instead of the baseline 4,895–4,897) were the “nonexistent key” tests. They triggered a database key lookup that missed, adding a few lines in server.c’s key-miss path. Fully accounted for.
No geo-specific code appeared in any trace. The stub is a wall. Nothing leaked through.
This is what a clean excision looks like: uniform traces, vestigial infrastructure, no surprises. We call the ~4,897-line baseline the infrastructure floor — the irreducible cost of being a Redis server. As we remove more modules, this floor should shrink: fewer command table entries, less initialization code, smaller dispatch paths. Watching the floor decrease is how we measure dissolution progress.
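The leak check in step 5 reduces to a simple outlier test over the per-test line counts. A minimal sketch with illustrative counts (not the article’s raw data); `find_leaks` and the tolerance are hypothetical:

```python
from statistics import mean

def find_leaks(lines_per_test: dict[str, int], tolerance: int = 50) -> list[str]:
    """Flag tests that exercise significantly more code than the floor."""
    floor = mean(lines_per_test.values())
    return [name for name, hit in lines_per_test.items()
            if hit - floor > tolerance]  # surviving module code would show up here

# Illustrative per-test counts from individual gcov runs.
counts = {"test_geoadd": 4896, "test_geopos": 4897, "test_missing_key": 4903}
```

On a clean excision `find_leaks(counts)` comes back empty; a test dragging in a hidden dependency would stand out by hundreds of lines.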
Result: 1,584 lines removed. Binary size decreased. Attack surface reduced by ten command handlers and an entire geohash library. 43 xfail tests document exactly what was removed and prove it’s gone. 59 operational tests prove the appliance still works.
On to the next module.
Module 2: Bitops
One file. 1,267 lines. Seven commands. Zero coverage.
Redis’s bit operation module provides SETBIT, GETBIT, BITCOUNT, BITOP (AND/OR/XOR/NOT), BITPOS, BITFIELD, and BITFIELD_RO. It’s a compact, self-contained module for treating Redis strings as bit arrays.
We wrote 30 tests: individual bit manipulation (set, get, extend, patterns), population counting (full string, byte range, after setbit), bitwise operations (AND, OR, XOR, NOT, multiple keys, different lengths), first-bit searching (first set, first clear, range, not-found), and the BITFIELD sub-language (SET/GET, INCRBY, overflow modes WRAP/SAT/FAIL, signed integers, multiple operations, 16-bit types, read-only variant). All 30 passed.
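The semantics those tests pinned down are compact enough to model directly. A pure-Python sketch of SETBIT/GETBIT/BITCOUNT behavior as we understand it (Redis numbers bits from the most significant bit of each byte and auto-extends the string); this is an illustration, not Redis code:

```python
def setbit(buf: bytearray, offset: int, value: int) -> int:
    """Set bit at offset, return the old bit. Extends buf with zero bytes."""
    byte, bit = divmod(offset, 8)
    if byte >= len(buf):
        buf.extend(bytes(byte - len(buf) + 1))
    old = (buf[byte] >> (7 - bit)) & 1        # bit 0 is the MSB of byte 0
    mask = 1 << (7 - bit)
    buf[byte] = buf[byte] | mask if value else buf[byte] & ~mask
    return old

def bitcount(buf: bytes) -> int:
    """Population count over the whole string."""
    return sum(bin(b).count("1") for b in buf)
```

Setting bit 7 of an empty buffer yields the byte `0x01`; adding bit 0 turns it into `0x81` with a population count of two.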
Excision: replaced 1,267 lines with seven one-line stubs. All 30 tests xfail. All 59 operational tests pass.
Bitops was clean — no internal dependencies, no cross-references from other modules. A textbook DDD.1 cycle.
Module 3: Sort
One file. 316 lines. Two commands. Zero coverage.
SORT is one of Redis’s oldest commands. It sorts lists and sets by numeric value, lexicographically with ALPHA, with external key patterns via BY and GET, with LIMIT for pagination, and with STORE for persisting results. SORT_RO is the read-only variant.
We wrote 10 tests covering numeric, descending, alphabetical, limit, store, BY pattern, GET pattern, empty list, set input, and SORT_RO. All 10 passed.
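A minimal model of the SORT behaviors those tests covered: numeric by default, lexicographic with ALPHA, DESC ordering, and LIMIT offset/count. BY/GET patterns and STORE are omitted for brevity; `sort_cmd` is a hypothetical helper, not Redis code:

```python
def sort_cmd(items, alpha=False, desc=False, limit=None):
    """Sketch of SORT: numeric unless ALPHA, optional DESC and LIMIT."""
    key = None if alpha else float            # non-ALPHA SORT parses numbers
    out = sorted(items, key=key, reverse=desc)
    if limit is not None:
        offset, count = limit
        out = out[offset:offset + count]
    return out
```

For example, `sort_cmd(["3", "1", "2"])` sorts numerically, while the same call with `alpha=True` compares the raw strings.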
Excision: replaced 316 lines with two one-line stubs. All 10 tests xfail. All 59 operational tests pass.
Module 4: LOLWUT
Three files. 566 lines. One command. Zero coverage.
LOLWUT is Redis’s easter egg — an ASCII art generator that
ships with every Redis version. Version 5 draws a Schotter-like pattern;
version 6 draws a different design. It includes a small canvas rendering
library (lwCanvas) for drawing pixels and converting them
to Unicode braille characters.
It’s charming. It’s also 566 lines of code in a security-critical binary that we never call.
We wrote 5 tests: basic output (multi-line art), version 5 rendering, version 6 rendering, rendering with size arguments, and art content verification. All 5 passed. We verified the tests checked for actual art content (canvas output with many lines), not just the version string that our stub also returns.
Excision: replaced the three art files with stubs. The canvas API
stubs (lwCreateCanvas, etc.) are retained because
lolwut.h is included by server.h. The
LOLWUT command now returns a single line with the Redis version.
All 5 tests xfail. All 59 operational tests pass.
Module 5: Sentinel
One file. 5,484 lines. The largest single excision.
Sentinel is Redis’s distributed monitoring and automatic failover
system. It watches master and replica instances, detects failures via
subjective and objective down detection (+sdown, +odown), orchestrates
failover elections, reconfigures replicas, and executes notification
scripts. It’s 5,484 lines of distributed systems code — the
largest module in Redis after server.c itself.
Our appliance is a single Redis instance with no replicas. Sentinel is not just unused — it’s architecturally irrelevant.
Sentinel posed a unique DDD.1 challenge. Unlike Geo or Bitops, whose
commands route directly to handler functions, SENTINEL commands are
rejected by the server’s CMD_ONLY_SENTINEL flag
before reaching any handler. This means the observable command
behavior is identical in both original and excised builds — both
return the same system-level error.
So we tested structurally rather than behaviorally. Our 6 tests verify
that sentinel-specific strings are present in the binary: failover state
machine strings (+failover-state), down-detection events
(+sdown, +odown), TILT mode references,
configuration keywords (down-after-milliseconds), and
failover reconfiguration strings. After excision, these strings vanish
from the binary, and the tests fail.
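The structural approach amounts to scanning the binary’s raw bytes for the marker strings, much like running `strings` and grepping. A sketch of the idea, not the project’s actual test code; the helper name and marker list are illustrative:

```python
from pathlib import Path

# Marker strings that only sentinel code embeds in the binary.
MARKERS = [b"+sdown", b"+odown", b"+failover-state",
           b"down-after-milliseconds"]

def sentinel_markers_present(binary_path: str) -> bool:
    """True if every sentinel marker string appears in the binary's bytes."""
    blob = Path(binary_path).read_bytes()
    return all(marker in blob for marker in MARKERS)
```

Before excision the check returns True; after excision the markers vanish and the strict-xfail tests fail as expected.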
The stub required careful attention to lifecycle hooks:
sentinelTimer() (called every server tick),
initSentinel() (called at startup when
--sentinel flag is set), and configuration parsing
functions. All are no-ops since our server never enters sentinel mode.
Excision: 5,484 lines replaced with a 70-line stub. All 6 tests xfail. All 59 operational tests pass. Binary size dropped from ~13.8 MB to ~12.3 MB.
Module 6: String Type Commands
One file. 951 lines. Twenty-four commands. Zero coverage.
SET and GET are the most fundamental Redis commands. Removing them
from a Redis server feels like removing wheels from a car. But our
appliance doesn’t use string keys — it uses streams. The live
patchworks instance has exactly one key: ruach:stream:contact.
No strings.
We wrote 30 tests: SET with EX/PX/NX/XX/GET/KEEPTTL flags, SETNX, SETEX, PSETEX, GETEX (persist, set expiry), GETDEL, GETSET, MGET, MSET, MSETNX, INCR/DECR/INCRBY/DECRBY/INCRBYFLOAT, APPEND, STRLEN, SETRANGE, GETRANGE, and LCS. All 30 passed.
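The NX/XX/GET flag matrix those tests walked can be modeled in a few lines. A sketch of the semantics as we understand them (a plain dict standing in for the keyspace; `set_cmd` is a hypothetical helper, not Redis code):

```python
def set_cmd(db: dict, key: str, value: str, nx=False, xx=False, get=False):
    """Sketch of SET: NX = only if absent, XX = only if present, GET = return old."""
    old = db.get(key)
    if (nx and old is not None) or (xx and old is None):
        return old if get else None           # condition failed: nothing written
    db[key] = value
    return old if get else "OK"
```

So `set_cmd(db, "k", "w", nx=True)` is a no-op when `k` exists, while `xx=True, get=True` overwrites and hands back the previous value.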
String type was the first data type excision where we discovered a
clean separation: the command handlers in t_string.c had
zero references from other files. No aof.c, no rdb.c, no debug.c,
no module.c dependencies. The internal string representation
(sds.c) is a different thing entirely — that’s
Redis’s core string library, used by everything. The
type commands were cleanly isolable.
Excision: 951 lines replaced with stubs. All 30 tests xfail. All 59 operational tests pass.
Module 7: Cluster
One file. 3,649 lines. The most entangled excision.
Cluster was different from every previous module. Where Geo had zero cross-references and String had clean command handlers, Cluster’s tentacles reached into every corner of the server:
- server.c — `clusterCron()` called every server tick, `clusterInit()` at startup, `getNodeByQuery()` in command dispatch
- db.c — six `slotToKey*` functions for tracking which keys live in which hash slots
- networking.c — `getClusterConnectionsCount()` for connection limits
- config.c — five `clusterUpdateMyself*` functions for config changes
- debug.c, module.c, pubsub.c — node lookup, cluster link management, message propagation
- Command table — MIGRATE, DUMP, RESTORE, ASKING, READONLY, READWRITE in addition to the CLUSTER command itself
We stubbed every one of these as a no-op or trivial return. The
lifecycle hooks do nothing (cluster is never enabled).
getClusterConnectionsCount() returns 0.
slotToKey* functions are empty.
genClusterInfoString() returns static disabled-cluster info.
askingCommand was retained as a stub because
networking.c compares against it as a function pointer.
After excision, we took an additional step: we removed 29 cluster
command JSON files, 22 sentinel JSONs, and the JSON files for every
other excised module — 87 files total — from the
commands/ directory. This eliminated their entries from the
auto-generated command table, shrinking it from 75,816 bytes to 57,408
bytes. The excised commands don’t return error messages anymore
— Redis doesn’t recognize them at all. An attacker
can’t even probe for their existence.
With the command table entries gone, the handler stubs became orphaned code — nothing referenced them. We deleted them too. The chain: command JSON → table entry → handler stub. Cut the first link and the rest falls away.
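The orphan computation in that chain is simple set arithmetic. A toy model of the idea (names and the mapping are hypothetical, not Redis's build machinery):

```python
def orphaned_handlers(command_json: set[str],
                      handler_of: dict[str, str]) -> set[str]:
    """Handlers no longer referenced once their command JSON is removed."""
    still_referenced = {handler_of[cmd] for cmd in command_json}
    return set(handler_of.values()) - still_referenced
```

Drop `CLUSTER` from the JSON set and `clusterCommand` falls out as deletable, while handlers for surviving commands stay referenced.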
Excision: 3,649 lines of cluster code + 87 command JSON files + orphaned stubs. All 8 tests xfail. All 59 operational tests pass. Binary dropped from ~12.3 MB to ~11.7 MB.
Interlude: Clearing the Noise
Before continuing to harder targets, we cleaned up sources of statistical noise in our coverage measurements:
- CLI tools removed from build.
redis-cli(5,933 lines),redis-benchmark(1,221 lines), and their shared librarycli_common.c(137 lines) were never part of the server binary, but their source files inflated our line-count denominator. We removed them from the Makefileall:target andALL_SOURCES. - pqsort.c emptied (69 lines). Its only caller was the SORT command, already excised.
- crc16.c stubbed (6 lines). Its only caller was redis-cli, no longer built.
These changes removed ~7,400 lines from the measurement denominator. Our operational coverage jumped from 17.5% to 22.3% — same code exercised, less dead weight counted.
Module 8: (next — surgical excision of debug.c)
The remaining zero-coverage modules (data types, debug, AOF, pub/sub, replication, tracking, defrag, latency) are entangled with four infrastructure files: aof.c, rdb.c, debug.c, and module.c. These files contain serialization and diagnostic code for every data type. Removing a data type’s internal functions breaks the linker because these four files still reference them.
The next phase attacks this entanglement directly, starting with debug.c — the lowest-coverage infrastructure file that we have the least operational need for.
The Scoreboard
We track dissolution progress across modules:
| Module | Lines removed | DDD.1 tests | Xfail | Floor Δ | Status |
|---|---|---|---|---|---|
| Lua scripting | ~900 | — | — | — | ✅ Excised (pre-DDD.1) |
| HyperLogLog | ~15 (stubbed) | — | — | — | ✅ Stubbed (pre-DDD.1) |
| Geo | 1,584 | 43 | 43/43 | baseline: 4,897 | ✅ Excised |
| Bitops | 1,267 | 30 | 30/30 | — | ✅ Excised |
| Sort | 316 | 10 | 10/10 | — | ✅ Excised |
| LOLWUT | 566 | 5 | 5/5 | — | ✅ Excised |
| Sentinel | 5,484 | 6 | 6/6 | 12.3 MB (↓1.5 MB) | ✅ Excised |
| String type | 951 | 30 | 30/30 | — | ✅ Excised |
| Cluster | 3,649 | 8 | 8/8 | 11.7 MB (↓0.6 MB) | ✅ Excised |
| Command table | 87 JSON files | — | — | 18 KB table shrink | ✅ Removed |
| CLI tools | 7,366 (from build) | — | — | — | ✅ Removed from build |
| pqsort + crc16 | 75 | — | — | 22.3% coverage | ✅ Emptied |
| Hash type | 646 | 16 (written) | — | — | ⏳ Entangled |
| List type | 721 | — | — | — | ⏳ Entangled |
| Set type | 904 | — | — | — | ⏳ Entangled |
| Sorted set type | 2,297 | — | — | — | ⏳ Entangled |
| debug.c | 976 | — | — | — | ⏳ Next target |
Running total: 42,031 executable lines remain (down from 57,753). Coverage: 22.3% (up from 14.9%). Binary: 11.6 MB (down from 13.8 MB). 132 DDD.1 xfail tests document every removal.
The remaining zero-coverage targets — data type internals (hash, list, set, sorted set) and low-coverage infrastructure (debug, AOF, pub/sub, replication, tracking, defrag, latency) — are entangled with four files that reference every type’s internal functions. The next phase attacks this entanglement surgically.
Ruach Tov is open-source AI infrastructure research. If this work is valuable to you, consider supporting the project.