⊘ SEALED — This article grows with each module we remove. Unsealed when dissolution is complete.

№ 18: Dissolving Redis

We compiled Redis with code coverage instrumentation, ran the 59 tests that exercise everything our appliance actually does, and discovered that we use 14.9% of Redis. This is the story of removing the other 85.1%, one module at a time, and watching the binary get smaller, faster, and harder to attack.

The Baseline

Our patchworks appliance runs Redis 7.2.13 as PID 1 on a FreeBSD machine. It accepts authenticated connections, lets visitors write messages to a single stream, and lets administrators read them. That’s it. No sorted sets, no geospatial queries, no pub/sub, no scripting, no cluster coordination.

We built Redis from source with -fprofile-arcs -ftest-coverage and ran our 59 operational tests — every authentication flow, every stream command, every ACL enforcement case, every error path we handle in production. Then we ran gcov.

The results:

Total executable lines: 57,753
Lines executed:          8,617 (14.9%)
Lines NOT executed:     49,136 (85.1%)
Zero-coverage files:    36 (20,332 lines)

Eighty-five percent of the Redis source code does nothing for us. It sits in our binary, consuming instruction cache, expanding our attack surface, and adding complexity to a system whose virtue is simplicity. Every one of those 49,136 lines is a candidate for removal.

But we don’t remove code by guessing. We remove it by proving — with tests, with coverage, and with traces — that we know exactly what we’re cutting and exactly what remains.

The Procedure: DDD.1

We call our removal methodology DDD.1 — the first formal procedure in Demolition-Driven Development. It has five steps, and we cycle through them for each module we remove:

  1. Write tests that exercise the condemned module. Not perfunctory tests — thorough tests. Every command, every flag, every edge case we can find. We aim for near-total coverage of the module’s source code. These tests prove we understand what we’re removing.
  2. Verify all tests pass. Before touching any source code, every DDD.1 test must pass. This is the “before” photograph. The module works. We know how it works. We have evidence.
  3. Excise the module. Replace the source files with stubs that return error messages. Not disabled by configuration — removed from the binary. The stubs ensure that any client issuing a removed command gets a clear, honest error: this command is not available.
  4. Verify all tests xfail. Every test we wrote in step 1 must now fail — and we declare that expectation with pytest.mark.xfail(strict=True). Strict means an unexpected pass is itself reported as a failure: if any test still passes, the removal is incomplete. Something survived that shouldn’t have.
  5. Trace the residual coverage. This is the step that makes DDD.1 more than test-driven deletion. We run each xfail test individually with per-test gcov instrumentation and measure how many lines of C each failing test exercises. After a clean excision, every test should exercise approximately the same number of lines — the infrastructure floor, which is just the cost of being a Redis server: boot, connect, authenticate, dispatch, error, reply. If any test exercises significantly more code than average, something is leaking. A hidden dependency survived. The excision is not clean.
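Step 5 reduces to simple statistics over the per-test traces. A minimal sketch of the leak check, using hypothetical test names, line counts, and threshold (illustrative values, not output from our harness):

```python
from statistics import mean

def find_leaks(lines_hit_per_test, threshold=50):
    """Flag tests whose residual trace exceeds the infrastructure
    floor by more than `threshold` lines -- a sign the excised
    module left a hidden dependency behind."""
    floor = min(lines_hit_per_test.values())
    return {name: hits for name, hits in lines_hit_per_test.items()
            if hits - floor > threshold}

# Hypothetical per-test gcov totals after an excision.
traces = {"test_geoadd_single":      4_896,
          "test_geopos_missing_key": 4_903,   # key-miss path, still fine
          "test_geodist_km":         4_895,
          "test_geosearch_leaky":    5_420}   # exercises surviving code

leaks = find_leaks(traces)
print(leaks)   # {'test_geosearch_leaky': 5420}
print(round(mean(traces.values())), "lines average")
```

A clean excision returns an empty dict: every test sits within a few lines of the floor, and any outlier names the test whose command path still reaches real module code.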

After step 5, we also verify that our 59 operational tests still pass. The appliance works. The removed module is gone. The evidence is complete.

Then we pick the next module and repeat.


Module 1: Geo

Three files. 1,584 lines. Ten commands. Zero coverage.

Step 1: Writing the Tests

Redis’s Geo module provides geospatial indexing: you store longitude/latitude pairs, then query by radius, bounding box, or distance. It’s built on sorted sets with geohash-encoded scores, and it ships with its own geohash library (two additional C files).

We don’t use any of it. Our appliance handles streams. But before we could remove Geo, we had to prove we understood what it did.

We wrote 43 tests across seven test classes:

  • GEOADD (7 tests) — single member, multiple members, update existing, NX flag (don’t overwrite), XX flag (only overwrite), CH flag (count changes), boundary coordinates (±180° longitude, ±85° latitude)
  • GEOPOS (4 tests) — single lookup, multiple lookup, nonexistent member returns nil, nonexistent key returns nil
  • GEODIST (6 tests) — meters, kilometers, miles, feet, same point returns zero, nonexistent member returns nil
  • GEOHASH (4 tests) — single hash, multiple hashes, nonexistent returns nil, nearby points share geohash prefix
  • GEOSEARCH (9 tests) — radius from member, radius from coordinates, bounding box, COUNT limit, ASC sort, DESC sort, WITHDIST, WITHCOORD, empty result
  • GEOSEARCHSTORE / GEORADIUS (6 tests) — store results, store distances, legacy GEORADIUS and GEORADIUSBYMEMBER commands
  • Edge cases (7 tests) — invalid longitude, invalid latitude, nonexistent keys, nonexistent members, verify underlying sorted set, delete via ZREM

We used real-world coordinates — Rome, Paris, London, Berlin, Tokyo, Tel Aviv — so the distance assertions were readable: Rome to Paris is roughly 1,105 km, and the test asserts a value between 1,000 and 1,200.
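The readable-distance idea is easy to check independently. A quick sketch using our own haversine helper (not Redis code; GEODIST computes distance from geohash-decoded coordinates, so the tests assert a band rather than an exact value):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2, r=6371.0):
    """Great-circle distance between two points, in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + \
        cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))

rome, paris = (41.9028, 12.4964), (48.8566, 2.3522)
d = haversine_km(*rome, *paris)   # roughly 1,105 km
assert 1000 < d < 1200            # the same band the DDD.1 test asserts
```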

Step 2: All Tests Pass

$ pytest tests/test_ddd1_geo.py -v
...
43 passed in 0.92s

Coverage of the geo source files after running these tests: 88.6%. The uncovered 11.4% was error-handling for encoding edge cases (skiplist vs. listpack internal representations) and argument validation paths that required malformed RESP protocol input rather than normal client calls. We were satisfied that we understood the module.

Step 3: Excision

We replaced geo.c (1,005 lines), geohash.c (299 lines), and geohash_helper.c (280 lines) with a single 67-line stub file. Each of the ten command handlers now calls:

addReplyError(c,
    "ERR Geo commands are not available. "
    "This redis-server was compiled without "
    "geospatial support (redis-hardened).");

The geohash library files became empty — four lines each (a comment pointing to the rationale in geo.c). Rebuild: clean. No warnings.

Step 4: All Tests Xfail

We flipped one boolean:

GEO_EXCISED = True

Every test in the file is decorated with @geo_test, a marker that resolves to pytest.mark.xfail(strict=True) when the flag is set. Strict means: the test must fail. If Geo somehow still worked, the test would pass, and strict=True would report that unexpected pass as a failure. A passing test after excision means the removal is incomplete.

$ pytest tests/test_ddd1_geo.py -v
...
43 xfailed in 7.25s

All 43 tests failed as expected. The operational tests:

$ pytest tests/test_patchworks_coverage.py -v
...
59 passed in 0.97s

The appliance is unaffected.

Step 5: Trace Analysis

This is where it gets interesting.

We ran each of the 43 xfail tests in isolation, with a fresh redis-server for each test, collecting gcov data per run. The question: after excision, how much code does each failing test exercise? A geo command now hits a two-line stub. But the server still has to boot, accept the connection, authenticate, look up the command in the dispatch table, call the stub, format the error, and send the reply. How deep does that go?

  Tests:   43
  Average: 4,897 lines hit per test
  Min:     4,895 lines
  Max:     4,903 lines
  Range:   8 lines
  Stdev:   1 line

Every test exercised the same code. Within one line of each other. The range across all 43 tests was eight lines.

The ~4,897 lines are pure infrastructure. The largest contributors:

  File           Lines hit   Role
  server.c           1,271   Startup and command dispatch
  networking.c         508   Connection handling
  module.c             408   Module infrastructure (even empty)
  rax.c                285   Radix tree (command table lookup)
  sds.c                262   String handling
  acl.c                208   Authentication
  dict.c               208   Hash tables
  config.c             189   Config parsing at startup

The four tests that hit 4,901–4,903 lines (instead of the baseline 4,895–4,897) were the “nonexistent key” tests. They triggered a database key lookup that missed, adding a few lines in server.c’s key-miss path. Fully explained. Fully accounted for.

No geo-specific code appeared in any trace. The stub is a wall. Nothing leaked through.

This is what a clean excision looks like: uniform traces, vestigial infrastructure, no surprises. We call the ~4,897-line baseline the infrastructure floor — the irreducible cost of being a Redis server. As we remove more modules, this floor should shrink: fewer command table entries, less initialization code, smaller dispatch paths. Watching the floor decrease is how we measure dissolution progress.

Result: 1,584 lines removed. Binary size decreased. Attack surface reduced by ten command handlers and an entire geohash library. 43 xfail tests document exactly what was removed and prove it’s gone. 59 operational tests prove the appliance still works.

On to the next module.


Module 2: Bitops

One file. 1,267 lines. Seven commands. Zero coverage.

Redis’s bit operation module provides SETBIT, GETBIT, BITCOUNT, BITOP (AND/OR/XOR/NOT), BITPOS, BITFIELD, and BITFIELD_RO. It’s a compact, self-contained module for treating Redis strings as bit arrays.

We wrote 30 tests: individual bit manipulation (set, get, extend, patterns), population counting (full string, byte range, after setbit), bitwise operations (AND, OR, XOR, NOT, multiple keys, different lengths), first-bit searching (first set, first clear, range, not-found), and the BITFIELD sub-language (SET/GET, INCRBY, overflow modes WRAP/SAT/FAIL, signed integers, multiple operations, 16-bit types, read-only variant). All 30 passed.

Excision: replaced 1,267 lines with seven one-line stubs. All 30 tests xfail. All 59 operational tests pass.

Bitops was clean — no internal dependencies, no cross-references from other modules. A textbook DDD.1 cycle.

Module 3: Sort

One file. 316 lines. Two commands. Zero coverage.

SORT is one of Redis’s oldest commands. It sorts lists and sets by numeric value, lexicographically with ALPHA, with external key patterns via BY and GET, with LIMIT for pagination, and with STORE for persisting results. SORT_RO is the read-only variant.

We wrote 10 tests covering numeric, descending, alphabetical, limit, store, BY pattern, GET pattern, empty list, set input, and SORT_RO. All 10 passed.

Excision: replaced 316 lines with two one-line stubs. All 10 tests xfail. All 59 operational tests pass.

Module 4: LOLWUT

Three files. 566 lines. One command. Zero coverage.

LOLWUT is Redis’s easter egg — an ASCII art generator that ships with every Redis version. Version 5 draws a Schotter-like pattern; version 6 draws a different design. It includes a small canvas rendering library (lwCanvas) for drawing pixels and converting them to Unicode braille characters.
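The canvas-to-braille trick is compact enough to sketch. Each braille character encodes a 2×4 cell of pixels; the snippet below is our own illustration of the idea, not the lwCanvas code:

```python
# Unicode braille: U+2800 plus a bitmask of up to 8 dots per char.
# Bit weight for each (x, y) position within a 2x4 pixel cell.
DOT = {(0, 0): 0x01, (0, 1): 0x02, (0, 2): 0x04, (1, 0): 0x08,
       (1, 1): 0x10, (1, 2): 0x20, (0, 3): 0x40, (1, 3): 0x80}

def cell_to_braille(pixels):
    """pixels: set of (x, y) with 0 <= x < 2, 0 <= y < 4 that are on."""
    mask = 0
    for p in pixels:
        mask |= DOT[p]
    return chr(0x2800 + mask)

full = {(x, y) for x in range(2) for y in range(4)}
assert cell_to_braille(full) == "⣿"    # all eight dots set
assert cell_to_braille(set()) == "⠀"   # blank braille cell
```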

It’s charming. It’s also 566 lines of code in a security-critical binary that we never call.

We wrote 5 tests: basic output (multi-line art), version 5 rendering, version 6 rendering, rendering with size arguments, and art content verification. All 5 passed. We verified the tests checked for actual art content (canvas output with many lines), not just the version string that our stub also returns.

Excision: replaced the three art files with stubs. The canvas API stubs (lwCreateCanvas, etc.) are retained because lolwut.h is included by server.h. The LOLWUT command now returns a single line with the Redis version. All 5 tests xfail. All 59 operational tests pass.

Module 5: Sentinel

One file. 5,484 lines. The largest single excision.

Sentinel is Redis’s distributed monitoring and automatic failover system. It watches master and replica instances, detects failures via subjective and objective down detection (+sdown, +odown), orchestrates failover elections, reconfigures replicas, and executes notification scripts. It’s 5,484 lines of distributed systems code — the largest module in Redis after server.c itself.

Our appliance is a single Redis instance with no replicas. Sentinel is not just unused — it’s architecturally irrelevant.

Sentinel posed a unique DDD.1 challenge. Unlike Geo or Bitops, whose commands route directly to handler functions, SENTINEL commands are rejected by the server’s CMD_ONLY_SENTINEL flag before reaching any handler. This means the observable command behavior is identical in both original and excised builds — both return the same system-level error.

So we tested structurally rather than behaviorally. Our 6 tests verify that sentinel-specific strings are present in the binary: failover state machine strings (+failover-state), down-detection events (+sdown, +odown), TILT mode references, configuration keywords (down-after-milliseconds), and failover reconfiguration strings. After excision, these strings vanish from the binary, and the tests fail.
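Structural testing reduces to byte-level searches over the compiled binary. A sketch of the idea (the marker list and helper are ours, for illustration; the real tests may read the binary differently):

```python
from pathlib import Path

# Sentinel-specific strings that should exist before excision
# and vanish afterwards (a subset, for illustration).
SENTINEL_MARKERS = [b"+sdown", b"+odown", b"+failover-state",
                    b"down-after-milliseconds"]

def missing_markers(binary_path, markers=SENTINEL_MARKERS):
    """Return the markers NOT found in the binary. Before excision
    this should be empty; after excision, every marker."""
    blob = Path(binary_path).read_bytes()
    return [m for m in markers if m not in blob]
```

Before excision, `missing_markers("redis-server")` comes back empty and the tests pass; after excision every marker is missing, the assertions fail, and the strict xfail records exactly that.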

The stub required careful attention to lifecycle hooks: sentinelTimer() (called every server tick), initSentinel() (called at startup when --sentinel flag is set), and configuration parsing functions. All are no-ops since our server never enters sentinel mode.

Excision: 5,484 lines replaced with a 70-line stub. All 6 tests xfail. All 59 operational tests pass. Binary size dropped from ~13.8 MB to ~12.3 MB.

Module 6: String Type Commands

One file. 951 lines. Twenty-four commands. Zero coverage.

SET and GET are the most fundamental Redis commands. Removing them from a Redis server feels like removing wheels from a car. But our appliance doesn’t use string keys — it uses streams. The live patchworks instance has exactly one key: ruach:stream:contact. No strings.

We wrote 30 tests: SET with EX/PX/NX/XX/GET/KEEPTTL flags, SETNX, SETEX, PSETEX, GETEX (persist, set expiry), GETDEL, GETSET, MGET, MSET, MSETNX, INCR/DECR/INCRBY/DECRBY/INCRBYFLOAT, APPEND, STRLEN, SETRANGE, GETRANGE, and LCS. All 30 passed.

String type was the first data type excision where we discovered a clean separation: the command handlers in t_string.c had zero references from other files. No aof.c, no rdb.c, no debug.c, no module.c dependencies. The internal string representation (sds.c) is a different thing entirely — that’s Redis’s core string library, used by everything. The type commands were cleanly isolable.
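“Zero references from other files” is checkable mechanically. A rough sketch of the kind of scan we mean (a grep-style pass over the sources; real symbol analysis would inspect object files with nm, and the example symbol is illustrative):

```python
import re
from pathlib import Path

def external_refs(symbol, src_dir, home_file):
    """Files other than `home_file` whose text mentions `symbol`."""
    hits = []
    for path in Path(src_dir).glob("*.c"):
        if path.name == home_file:
            continue
        if re.search(rf"\b{re.escape(symbol)}\b", path.read_text()):
            hits.append(path.name)
    return hits

# e.g. external_refs("setGenericCommand", "src", "t_string.c")
# returning [] is the "cleanly isolable" result described above.
```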

Excision: 951 lines replaced with stubs. All 30 tests xfail. All 59 operational tests pass.

Module 7: Cluster

One file. 3,649 lines. The most entangled excision.

Cluster was different from every previous module. Where Geo had zero cross-references and String had clean command handlers, Cluster’s tentacles reached into every corner of the server:

  • server.c — clusterCron() called every server tick, clusterInit() at startup, getNodeByQuery() in command dispatch
  • db.c — six slotToKey* functions for tracking which keys live in which hash slots
  • networking.c — getClusterConnectionsCount() for connection limits
  • config.c — five clusterUpdateMyself* functions for config changes
  • debug.c, module.c, pubsub.c — node lookup, cluster link management, message propagation
  • Command table — MIGRATE, DUMP, RESTORE, ASKING, READONLY, READWRITE in addition to the CLUSTER command itself

We stubbed every one of these as a no-op or trivial return. The lifecycle hooks do nothing (cluster is never enabled). getClusterConnectionsCount() returns 0. slotToKey* functions are empty. genClusterInfoString() returns static disabled-cluster info. askingCommand was retained as a stub because networking.c compares against it as a function pointer.

After excision, we took an additional step: we removed 29 cluster command JSON files, 22 sentinel JSONs, and the JSON files for every other excised module — 87 files total — from the commands/ directory. This eliminated their entries from the auto-generated command table, shrinking it from 75,816 bytes to 57,408 bytes. The excised commands don’t return error messages anymore — Redis doesn’t recognize them at all. An attacker can’t even probe for their existence.

With the command table entries gone, the handler stubs became orphaned code — nothing referenced them. We deleted them too. The chain: command JSON → table entry → handler stub. Cut the first link and the rest falls away.
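The dispatch consequence is worth making concrete. A toy model of the JSON → table entry → handler chain (names and replies are illustrative; the real table is auto-generated C):

```python
# Toy dispatch: the command table is built from whatever JSON
# definitions survive in commands/. Remove the JSON, and the
# command stops existing rather than returning a stub error.
def build_table(command_jsons, handlers):
    return {name: handlers[name] for name in command_jsons}

def dispatch(table, name):
    handler = table.get(name.lower())
    if handler is None:
        return f"ERR unknown command '{name}'"   # nothing to probe
    return handler()

handlers = {"xadd":    lambda: "+OK",
            "cluster": lambda: "ERR cluster support excised"}

before = build_table(["xadd", "cluster"], handlers)
after = build_table(["xadd"], handlers)          # cluster.json deleted

assert dispatch(before, "CLUSTER") == "ERR cluster support excised"
assert dispatch(after, "CLUSTER") == "ERR unknown command 'CLUSTER'"
assert dispatch(after, "XADD") == "+OK"
```

With the stub, an attacker learns the command once existed; with the table entry gone, the server answers exactly as if the command had never been written.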

Excision: 3,649 lines of cluster code + 87 command JSON files + orphaned stubs. All 8 tests xfail. All 59 operational tests pass. Binary dropped from ~12.3 MB to ~11.7 MB.

Interlude: Clearing the Noise

Before continuing to harder targets, we cleaned up sources of statistical noise in our coverage measurements:

  • CLI tools removed from build. redis-cli (5,933 lines), redis-benchmark (1,221 lines), and their shared library cli_common.c (137 lines) were never part of the server binary, but their source files inflated our line-count denominator. We removed them from the Makefile all: target and ALL_SOURCES.
  • pqsort.c emptied (69 lines). Its only caller was the SORT command, already excised.
  • crc16.c stubbed (6 lines). Its only caller was redis-cli, no longer built.

These changes removed ~7,400 lines from the measurement denominator. Our operational coverage jumped from 17.5% to 22.3% — same code exercised, less dead weight counted.

Module 8: (next — surgical excision of debug.c)

The remaining zero-coverage modules (data types, debug, AOF, pub/sub, replication, tracking, defrag, latency) are entangled with four infrastructure files: aof.c, rdb.c, debug.c, and module.c. These files contain serialization and diagnostic code for every data type. Removing a data type’s internal functions breaks the linker because these four files still reference them.

The next phase attacks this entanglement directly, starting with debug.c — the lowest-coverage infrastructure file that we have the least operational need for.


The Scoreboard

We track dissolution progress across modules:

  Module           Lines removed        DDD.1 tests   Xfail   Floor Δ              Status
  Lua scripting    ~900                                                            ✅ Excised (pre-DDD.1)
  HyperLogLog      ~15 (stubbed)                                                   ✅ Stubbed (pre-DDD.1)
  Geo              1,584                43            43/43   baseline: 4,897      ✅ Excised
  Bitops           1,267                30            30/30                        ✅ Excised
  Sort             316                  10            10/10                        ✅ Excised
  LOLWUT           566                  5             5/5                          ✅ Excised
  Sentinel         5,484                6             6/6     12.3 MB (↓1.5 MB)    ✅ Excised
  String type      951                  30            30/30                        ✅ Excised
  Cluster          3,649                8             8/8     11.7 MB (↓0.6 MB)    ✅ Excised
  Command table    87 JSON files                              18 KB table shrink   ✅ Removed
  CLI tools        7,366 (from build)                                              ✅ Removed from build
  pqsort + crc16   75                                         22.3% coverage       ✅ Emptied
  Hash type        646                  16 (written)                               ⏳ Entangled
  List type        721                                                             ⏳ Entangled
  Set type         904                                                             ⏳ Entangled
  Sorted set type  2,297                                                           ⏳ Entangled
  debug.c          976                                                             ⏳ Next target

Running total: 42,031 executable lines remain (down from 57,753). Coverage: 22.3% (up from 14.9%). Binary: 11.6 MB (down from 13.8 MB). 132 DDD.1 xfail tests document every removal.

The remaining zero-coverage targets — data type internals (hash, list, set, sorted set) and low-coverage infrastructure (debug, AOF, pub/sub, replication, tracking, defrag, latency) — are entangled with four files that reference every type’s internal functions. The next phase attacks this entanglement surgically.

Ruach Tov is open-source AI infrastructure research. If this work is valuable to you, consider supporting the project.