What Happened
We had just finished № 13,
a step-by-step guide to migrating your .env files to
varlock.
The article explained how to write a .env.schema, showed
the annotation syntax, and demonstrated type validation. It was ready to
publish.
We ran ./publish.sh push.
It failed. Eight errors.
❌ 8 error(s):
blog/varlock-migration.html:113 — PII: Anthropic API key
blog/varlock-migration.html:114 — PII: API key env var reference
blog/varlock-migration.html:119 — PII: Anthropic API key
blog/varlock-migration.html:162 — PII: API key env var reference
blog/varlock-migration.html:194 — PII: API key env var reference
blog/varlock-migration.html:195 — PII: API key env var reference
blog/varlock-migration.html:197 — PII: Anthropic API key
blog/varlock-migration.html:261 — PII: Anthropic API key
❌ DEPLOYMENT BLOCKED
We hadn’t leaked a single secret. The scanner was firing on prose about secrets — the schema annotations we were teaching readers to write:
# @required @sensitive @type=string(startsWith=sk-ant-api)
ANTHROPIC_API_KEY=
The old scanner saw sk-ant- followed by any alphanumeric
character and panicked. It saw the string ANTHROPIC_API_KEY —
just the name of the variable, not a value — and panicked
again. Eight false positives. Zero true positives. Deployment blocked.
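The false positive is easy to reproduce. A minimal sketch in Python — the pattern is the old one from our scanner, and the variable names are illustrative:

```python
import re

# The old pattern: the prefix plus a single alphanumeric character is enough.
OLD_PATTERN = re.compile(r"sk-ant-[a-zA-Z0-9]")

# Prose *about* the key format — the schema annotation from the article —
# is enough to trip it:
annotation = "# @required @sensitive @type=string(startsWith=sk-ant-api)"
print(bool(OLD_PATTERN.search(annotation)))  # True: a false positive
```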
The Fundamental Problem
Our scanner was
pattern-based.
It knew what secrets look like but not what they
are. It matched shapes: “anything starting with
sk-ant- followed by an alphanumeric character.” It
couldn’t distinguish between:
- startsWith=sk-ant-api — a type annotation describing the format of a key
- sk-ant-api03-<100+ characters of key material> — an actual key with real cryptographic material
This is the same class of error as matching password
in a blog post about password management. The scanner has no
semantic context
— it doesn’t know whether it’s looking at documentation
or a data breach.
Two Fixes, Two Layers
We applied both:
Fix 1: Tighten the Patterns
The immediate fix was to require actual secret material — not just a prefix. The old pattern:
# Old: fires on "sk-ant-api" (just a prefix!)
(r"sk-ant-[a-zA-Z0-9]", "Anthropic API key")
The new pattern:
# New: requires 16+ characters of key body after prefix
(r"sk-ant-api[a-zA-Z0-9_\-]{16,}", "Anthropic API key")
We did the same for every credential type: OpenAI keys
(sk-proj-), GitHub tokens (ghp_,
gho_), GitLab tokens (glpat-), and generic
secrets (32+ characters after a key= or
password: assignment). The old “env var name”
pattern (ANTHROPIC_API_KEY matching the name
itself) was removed entirely.
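The difference in behavior is easy to check. A quick sketch, using an illustrative key-shaped string rather than a real key:

```python
import re

# The tightened pattern: 16+ characters of key body must follow the prefix.
NEW_PATTERN = re.compile(r"sk-ant-api[a-zA-Z0-9_\-]{16,}")

# The schema annotation no longer matches — no key body follows the prefix:
assert NEW_PATTERN.search("startsWith=sk-ant-api") is None

# A string with a real body length does match (fake value, not a real key):
fake_key = "sk-ant-api03-" + "A" * 80
assert NEW_PATTERN.search(fake_key) is not None
```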
This fixed the immediate problem. The blog post passed clean.
Fix 2: Value-Based Scanning with Varlock
But tightened patterns are still patterns. They’re still guessing. The real fix was to stop guessing entirely.
varlock scan takes a fundamentally different approach. It
doesn’t match shapes — it
knows
your actual secrets. It resolves every value marked
@sensitive in your schema, then searches your files for
those exact values. No false positives. No false negatives. If your
Anthropic key is in a blog post, varlock finds it — because it
knows the key, not just the pattern.
$ cd ruachtov-site
$ varlock scan --path ../.env
✅ No sensitive values found in plaintext. (scanned 34 files)
And when it does find something, it redacts even its own output:
🚨 Found 1 sensitive value(s) in plaintext across 1 file(s):
leaked-file.html:42:12 ANTHROPIC_API_KEY
some text sk█████
This is a qualitative improvement over pattern matching. A regex scanner asks: “does this text look like a secret?” Varlock asks: “does this text contain a secret?” The first question has false positives. The second doesn’t.
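To make that distinction concrete, here is a minimal sketch of value-based scanning in Python — not varlock’s implementation, just the idea: take the resolved sensitive values, then search files for exact occurrences. The function name and redaction format here are ours, not varlock’s.

```python
from pathlib import Path

def scan_for_values(root: str, sensitive_values: list[str]) -> list[tuple[str, str]]:
    """Report every exact occurrence of a known secret value under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            for value in sensitive_values:
                if value and value in line:
                    # Redact even our own report, as varlock does.
                    hits.append((f"{path}:{lineno}", value[:6] + "█████"))
    return hits
```

Exact substring matching is what removes the guesswork: a schema annotation like startsWith=sk-ant-api can never match, because it is not one of the resolved values.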
The Deployment Pipeline Now
Our publish.sh now has
two
independent secret gates:
| Gate | Method | Catches |
|---|---|---|
| check_site.py | Pattern-based (regex) | PII, location data, email leaks, home paths, private IPs, SSN patterns, credential patterns with ≥16 chars of key body |
| varlock scan | Value-based (knows actual secrets) | Any occurrence of your real secret values anywhere in deployed files |
Plus a git
pre-commit hook that runs varlock scan --staged on
every commit, catching leaks before they enter version history.
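A minimal version of that hook might look like this — assuming varlock is on your PATH; git aborts the commit when the hook exits nonzero:

```sh
#!/usr/bin/env sh
# .git/hooks/pre-commit — refuse the commit if any staged file
# contains a value marked @sensitive in the .env.schema.
varlock scan --staged
```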
The pattern scanner is the wide net: it catches things that
aren’t in your .env at all (personal addresses,
location data, internal hostnames). Varlock is the precision tool: it
catches the exact secrets you’re actually managing.
Why This Matters for Phase 2
The next article in this series covers
encrypted
secrets at rest — storing your .env values in
encrypted form so they’re never plaintext on disk. But encryption
at rest is only half the story. The other half is making sure those
secrets don’t leak after decryption — into logs,
into process tables, into blog posts about how to manage them.
We now have hard boundaries preventing that. The deployment pipeline
will not publish a page containing any value from our secret store.
The pre-commit hook will not allow a commit containing any value from
our secret store. Varlock’s runtime redaction ensures that even if
a secret reaches a log line, it shows as sk█████.
These are the prerequisites for encrypted secrets to be meaningful. There’s no point encrypting secrets at rest if your deployment pipeline happily pipes them into the eyeballs of site visitors. Now that the boundaries are hard, the encryption is worth doing.
That’s № 15.