№ 14: Our Varlock Article Required Varlock

Our article about varlock turned out to require varlock itself: to publish remarks about the mere shape of secrets, we needed a scanner that could distinguish those remarks from actual secrets, instead of committing type I / II errors by conflating the two.

What Happened

We had just finished № 13, a step-by-step guide to migrating your .env files to varlock. The article explained how to write a .env.schema, showed the annotation syntax, and demonstrated type validation. It was ready to publish.

We ran ./publish.sh push.

It failed. Eight errors.

❌ 8 error(s):
  blog/varlock-migration.html:113 — PII: Anthropic API key
  blog/varlock-migration.html:114 — PII: API key env var reference
  blog/varlock-migration.html:119 — PII: Anthropic API key
  blog/varlock-migration.html:162 — PII: API key env var reference
  blog/varlock-migration.html:194 — PII: API key env var reference
  blog/varlock-migration.html:195 — PII: API key env var reference
  blog/varlock-migration.html:197 — PII: Anthropic API key
  blog/varlock-migration.html:261 — PII: Anthropic API key

❌ DEPLOYMENT BLOCKED

We hadn’t leaked a single secret. The scanner was firing on prose about secrets — the schema annotations we were teaching readers to write:

# @required @sensitive @type=string(startsWith=sk-ant-api)
ANTHROPIC_API_KEY=

The old scanner saw sk-ant- followed by a single alphanumeric character and panicked. It saw the string ANTHROPIC_API_KEY — just the name of the variable, not a value — and panicked again. Eight false positives. Zero true positives. Deployment blocked.

The Fundamental Problem

Our scanner was pattern-based. It knew what secrets look like but not what they are. It matched shapes: “anything starting with sk-ant- followed by an alphanumeric character.” It couldn’t distinguish between:

  • startsWith=sk-ant-api — a type annotation describing the format of a key
  • sk-ant-api03-<100+ characters of key material> — an actual key with real cryptographic material

This is the same class of error as matching password in a blog post about password management. The scanner has no semantic context — it doesn’t know whether it’s looking at documentation or a data breach.

Two Fixes, Two Layers

We applied both:

Fix 1: Tighten the Patterns

The immediate fix was to require actual secret material — not just a prefix. The old pattern:

# Old: fires on "sk-ant-api" (just a prefix!)
(r"sk-ant-[a-zA-Z0-9]", "Anthropic API key")

The new pattern:

# New: requires 16+ characters of key body after prefix
(r"sk-ant-api[a-zA-Z0-9_\-]{16,}", "Anthropic API key")

We did the same for every credential type: OpenAI keys (sk-proj-), GitHub tokens (ghp_, gho_), GitLab tokens (glpat-), and generic secrets (32+ characters after a key= or password: assignment). The old “env var name” pattern (ANTHROPIC_API_KEY matching the name itself) was removed entirely.

This fixed the immediate problem. The blog post passed clean.
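A minimal sketch of the difference, runnable in Python with the two patterns above (the fake key is invented for illustration):

```python
import re

# Old pattern: any single character of key body after the prefix
OLD = re.compile(r"sk-ant-[a-zA-Z0-9]")
# New pattern: at least 16 characters of key body after the prefix
NEW = re.compile(r"sk-ant-api[a-zA-Z0-9_\-]{16,}")

annotation = "# @required @sensitive @type=string(startsWith=sk-ant-api)"
fake_key = "sk-ant-api03-" + "x" * 90  # shaped like real key material

assert OLD.search(annotation)       # old: false positive on documentation
assert not NEW.search(annotation)   # new: a bare prefix no longer fires
assert NEW.search(fake_key)         # new: still catches key-shaped material
```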

Fix 2: Value-Based Scanning with Varlock

But tightened patterns are still patterns. They’re still guessing. The real fix was to stop guessing entirely.

varlock scan takes a fundamentally different approach. It doesn’t match shapes — it knows your actual secrets. It resolves every value marked @sensitive in your schema, then searches your files for those exact values. No false positives. No false negatives. If your Anthropic key is in a blog post, varlock finds it — because it knows the key, not just the pattern.

$ cd ruachtov-site
$ varlock scan --path ../.env
✅ No sensitive values found in plaintext. (scanned 34 files)

And when it does find something, it redacts even its own output:

🚨 Found 1 sensitive value(s) in plaintext across 1 file(s):

  leaked-file.html:42:12  ANTHROPIC_API_KEY
    some text sk█████

This is a qualitative improvement over pattern matching. A regex scanner asks: “does this text look like a secret?” Varlock asks: “does this text contain a secret?” The first question has false positives. The second doesn’t.
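The principle behind value-based scanning fits in a few lines of Python. This is not varlock’s implementation, just a sketch of the idea: resolve the real sensitive values, then search files for exact occurrences of those values.

```python
from pathlib import Path

def scan_for_values(root: str, sensitive: dict[str, str]) -> list[tuple[str, int, str]]:
    """Report (file, line_no, var_name) wherever a real secret value appears in plaintext."""
    hits = []
    for path in sorted(Path(root).rglob("*.html")):
        for line_no, line in enumerate(path.read_text().splitlines(), 1):
            for name, value in sensitive.items():
                if value and value in line:
                    hits.append((str(path), line_no, name))
    return hits
```

There is nothing to guess: a line either contains the resolved value or it does not, which is why the false-positive rate is zero by construction.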

The Deployment Pipeline Now

Our publish.sh now has two independent secret gates:

  • check_site.py — pattern-based (regex). Catches PII, location data, email leaks, home paths, private IPs, SSN patterns, and credential patterns with ≥16 chars of key body.
  • varlock scan — value-based (knows the actual secrets). Catches any occurrence of your real secret values anywhere in deployed files.

Plus a git pre-commit hook that runs varlock scan --staged on every commit, catching leaks before they enter version history.
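The hook itself is a few lines. A sketch, assuming varlock is on PATH (`varlock scan --staged` is the command mentioned above; the file path is git’s standard hook location):

```shell
#!/usr/bin/env sh
# .git/hooks/pre-commit
# Refuse any commit whose staged files contain a resolved sensitive value.
if ! varlock scan --staged; then
  echo "Commit blocked: staged files contain sensitive values." >&2
  exit 1
fi
```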

The pattern scanner is the wide net: it catches things that aren’t in your .env at all (personal addresses, location data, internal hostnames). Varlock is the precision tool: it catches the exact secrets you’re actually managing.

Why This Matters for Phase 2

The next article in this series covers encrypted secrets at rest — storing your .env values in encrypted form so they’re never plaintext on disk. But encryption at rest is only half the story. The other half is making sure those secrets don’t leak after decryption — into logs, into process tables, into blog posts about how to manage them.

We now have hard boundaries preventing that. The deployment pipeline will not publish a page containing any value from our secret store. The pre-commit hook will not allow a commit containing any value from our secret store. Varlock’s runtime redaction ensures that even if a secret reaches a log line, it shows as sk█████.
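That redaction is easy to reason about because it, too, is value-based. A hedged sketch (not varlock’s code) of how a known secret can be masked before it reaches a log line:

```python
def redact(text: str, sensitive: list[str]) -> str:
    """Mask every known secret value, keeping only a two-character hint."""
    for value in sensitive:
        if value:
            text = text.replace(value, value[:2] + "█" * 5)
    return text

print(redact("key=sk-ant-api03-FAKE", ["sk-ant-api03-FAKE"]))  # key=sk█████
```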

These are the prerequisites for encrypted secrets to be meaningful. There’s no point encrypting secrets at rest if your deployment pipeline happily pipes them into the eyeballs of site visitors. Now that the boundaries are hard, the encryption is worth doing.

That’s № 15.