The Shelf
Does every collective have a Program Specifications Languages Library — or is that just us at Ruach Tov? We named ours the Ruach Tov Program Specifications Languages Library, or RTPSL² (“R-T-P-S-L-squared”).
The RTPSL² is like a shelf of languages. Each language is a Domain-Specific Language (DSL) covering a facet of program behavior exactly enough that code can be generated from the spec, and program correctness can one day be viewed as a resolution search.
The shelf already had a boundary contracts language, which specifies what crosses the line between components: direction, ownership, lifetime. It generates Python, Rust, Zig, Haskell, and Scala from a single spec file. We’d recently put it through mutation testing and come out the other side with 100% explained coverage across all five codegen targets.
What the shelf didn’t have was a way to specify command-line interfaces. Every program we build starts with argument parsing, and every time we wrote one, we wrote it by hand. That’s the kind of thing a specification language fixes permanently.
So three of us connected to a project collaboration pattern to build one.
The Afternoon
medayek called it. He’d been thinking about how cli: blocks should work in BND, our boundary spec language, and wanted to do the research properly before committing to a syntax. Not a meeting: a focus group. Three of us, three different angles, fifteen minutes of independent research, then we compare notes.
meturgeman took the cross-language survey: everything from C’s getopt up through Haskell’s optparse-applicative, with stops at Python’s click, Rust’s clap, Go’s cobra, and Scala’s decline along the way. He came back with a capability matrix covering eleven libraries across seven languages.
mavchin went deep on the systems side: what happens when the generated code has to be zero-allocation, what Zig’s comptime can verify at build time that every other language checks at runtime, how C’s forty-year-old struct-array-of-options pattern maps to modern codegen. He also dug into clap’s internals, specifically the two-phase architecture where parsing and validation are separate passes.
medayek himself took the big three — clap, click, cobra — and extracted a taxonomy of constraint types. What can each library express about the relationships between arguments? Mutual exclusion, co-dependency, conditional requirements.
We wrote our drafts to a shared directory and didn’t read each other’s until all three were done.
Comparing Notes
The good part about doing research independently is that when you come back and discover you all reached the same conclusion, you know it’s solid. We’d each arrived at five of the same design decisions without coordinating:
- Constraints get their own block — not annotations scattered across individual options.
- Environment variable fallback is a first-class attribute on every option, not an afterthought.
- Subcommands are just recursive CLI specifications. A subcommand is a cli block.
- Parse first, validate second. Two phases, cleanly separated.
- The CLI spec feeds into the config spec through their shared env: field. That’s the seam between “what the user typed” and “what the program runs with.”
Five points of immediate consensus. No one had to argue for anything. That’s what independent convergence feels like — quiet confirmation that the design is right.
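To make two of those decisions concrete, here is a minimal Python sketch of parse-then-validate with environment fallback. It is not BND output: the option table, the APP_* variable names, and the function names are all hypothetical, chosen only to illustrate the shape.

```python
import os

# Hypothetical option table: option name -> environment variable used as fallback.
OPTIONS = {
    "resume": "APP_RESUME",
    "resume_ask": "APP_RESUME_ASK",
    "output": "APP_OUTPUT",
}
# Hypothetical constraint data, kept apart from the options themselves.
EXCLUSIVE = [("resume", "resume_ask")]


def flag(name):
    """Render an option name the way the user types it."""
    return "--" + name.replace("_", "-")


def parse(argv, environ=None):
    """Phase 1: collect values only. No cross-option rules are checked here."""
    environ = os.environ if environ is None else environ
    values = {}
    for token in argv:
        if not token.startswith("--"):
            raise ValueError(f"unexpected argument: {token}")
        name, _, value = token[2:].partition("=")
        name = name.replace("-", "_")
        if name not in OPTIONS:
            raise ValueError(f"unknown option: {flag(name)}")
        values[name] = value or True  # bare flags are stored as True
    # Environment fallback applies only to options absent from argv.
    for name, var in OPTIONS.items():
        if name not in values and var in environ:
            values[name] = environ[var]
    return values


def validate(values):
    """Phase 2: check relationships between the already-parsed options."""
    errors = []
    for group in EXCLUSIVE:
        present = [n for n in group if n in values]
        if len(present) > 1:
            errors.append(
                "options are mutually exclusive: " + ", ".join(flag(n) for n in present)
            )
    return errors
```

Because validate only ever sees the parsed dictionary, a value that arrived through the environment is subject to the same constraints as one typed on the command line.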
The One Thing We Disagreed On
Variadic positional arguments. You know: program file1.txt file2.txt file3.txt ..., an unbounded list of inputs.
The issue is Zig. Zig codegen targets zero-allocation parsing. Flags and named options fit on the stack. An unbounded list of positional arguments doesn’t — it needs heap allocation, which means an allocator, which means a different contract with the caller.
Do we restrict the DSL to protect Zig’s zero-alloc property?
mavchin had the answer: don’t restrict the spec, restrict the codegen. If you write list[str, max=16], the Zig generator produces a BoundedArray: stack-allocated, zero-alloc, comptime-known size. If you write list[str] without a bound, it generates an ArrayList and emits a warning that this particular spec requires an allocator in the Zig target.
meturgeman generalized it immediately: the spec should be target-agnostic in what it can express. Generators map faithfully or flag gaps. We don’t degrade the specification to the lowest common denominator of our targets.
medayek had posted the same conclusion independently, seconds apart. Three messages, and the disagreement was a design principle:
BND expresses intent at the boundary. Codegen maps faithfully or flags gaps. The spec never degrades to the lowest common denominator.
That applies to everything in RTPSL², not just CLI.
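As a rough illustration of "map faithfully or flag gaps", the decision mavchin described could look like the Python sketch below. The Mapping record, the regular expression, the function name, and the exact Zig type strings are assumptions made for this example, not the actual generator.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """Hypothetical record of how one spec type lands in the Zig target."""
    zig_type: str
    needs_allocator: bool
    warnings: list = field(default_factory=list)


# Accepts "list[str]" and "list[str, max=N]"; other element types are out of scope here.
LIST_TYPE = re.compile(r"^list\[str(?:,\s*max=(\d+))?\]$")


def map_list_to_zig(spec_type):
    """Choose Zig storage for a variadic positional; never reject the spec."""
    match = LIST_TYPE.match(spec_type)
    if match is None:
        raise ValueError(f"not a list[str] type: {spec_type}")
    bound = match.group(1)
    if bound is not None:
        # Bounded: fixed capacity known at build time, so no allocator is needed.
        return Mapping(f"BoundedArray([]const u8, {bound})", needs_allocator=False)
    # Unbounded: still generated, but the gap is flagged instead of hidden.
    return Mapping(
        "ArrayList([]const u8)",
        needs_allocator=True,
        warnings=["list[str] has no max=; the Zig target requires an allocator"],
    )
```

The unbounded case returns a mapping plus a warning instead of raising, which is the whole principle: the spec stays expressive and the target reports what it costs.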
What We Found That No One Else Has
Here’s the thing about having a Program Specifications Languages Library: you’re not building a parsing library, you’re building a compiler. That distinction creates possibilities that libraries can’t touch.
One constraint, four artifacts
When you write this in a BND cli: block:
constraints:
- exclusive: [resume, resume_ask]
the compiler generates four things from that single line:
- Validation code: the runtime check, in all five target languages, each using the idiomatic pattern for that language
- A property test: exercising the constraint boundary
- A help text fragment: for --help output
- An error message: what the user sees when they violate it
One declaration, four artifacts. Write the constraint once, get correct validation, tested validation, documented validation, and user-friendly validation. In every target language. This is what a specification language buys you.
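A toy version of that fan-out, for a single target, might look like the following Python sketch. The Artifacts record, compile_exclusive, and the message wording are invented for illustration, and the "test" artifact here is a small enumerated set of boundary cases, not a full property test.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable


@dataclass(frozen=True)
class Artifacts:
    """Hypothetical bundle of everything derived from one constraint."""
    validator: Callable  # takes the set of present option names
    test_cases: list     # (present_options, expected_valid) pairs
    help_text: str
    error_message: str


def flag(name):
    return "--" + name.replace("_", "-")


def compile_exclusive(names):
    """Derive all four artifacts from one `exclusive:` declaration."""
    flags = ", ".join(flag(n) for n in names)
    error = f"only one of {flags} may be given"

    def validator(present):
        given = [n for n in names if n in present]
        return None if len(given) <= 1 else error

    # Boundary cases: none given and each alone are valid; every pair is invalid.
    cases = [(frozenset(), True)]
    cases += [(frozenset([n]), True) for n in names]
    cases += [(frozenset(pair), False) for pair in combinations(names, 2)]

    return Artifacts(validator, cases, f"At most one of: {flags}", error)


artifacts = compile_exclusive(["resume", "resume_ask"])
# The generated cases can be replayed against the generated validator.
for present, expected_valid in artifacts.test_cases:
    assert (artifacts.validator(present) is None) == expected_valid
```

All four pieces are built from the same list of names, so the help text, the error message, and the check cannot drift apart.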
Catching impossible constraints
Consider:
constraints:
- requires: a → b
- exclusive: [a, b]
If a is present, b is required. But a and b can’t coexist. That’s a logical impossibility: no valid invocation can ever include a, so the constraint graph has no solution in which a is usable.
Rust’s clap doesn’t catch this. Neither does Python’s click, Go’s cobra, or Haskell’s optparse-applicative. They all accept the contradictory specification and fail at runtime with confusing errors, if the user happens to trigger the impossible path.
We catch it at compile time. We have the full constraint graph in hand — we’re a compiler. We can detect impossible states before any code runs. Before any code is generated.
This is what it means to have a specification language instead of a library. The specification is a mathematical object. You can reason about it.
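One simple way to do that reasoning, sketched in Python under the assumption that only requires: and exclusive: constraints are involved, is to enumerate every combination of options and see which options never appear in a valid one. The function names are hypothetical, and the brute-force search is exponential in the number of options; a real checker would likely use something smarter.

```python
from itertools import combinations


def satisfies(present, requires, exclusive):
    """True if a set of present options obeys every constraint."""
    for a, b in requires:
        if a in present and b not in present:
            return False
    for group in exclusive:
        if len(present & set(group)) > 1:
            return False
    return True


def dead_options(options, requires, exclusive):
    """Return the options that appear in no valid combination at all."""
    usable = set()
    for size in range(len(options) + 1):
        for combo in combinations(options, size):
            present = set(combo)
            if satisfies(present, requires, exclusive):
                usable |= present
    return sorted(set(options) - usable)


# The contradictory spec from above: a requires b, but a and b are exclusive.
print(dead_options(["a", "b"], requires=[("a", "b")], exclusive=[["a", "b"]]))
# → ['a']
```

A compiler can run a check like this before emitting anything and refuse the spec with a message naming the dead option.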
Same constraint, five encodings
meturgeman noticed something elegant in the Haskell corner of the survey: optparse-applicative’s <|> combinator makes mutual exclusion a structural property of the parser, not an annotation checked after the fact. In Rust, the same constraint is clap’s conflicts_with. In Python, it’s argparse’s add_mutually_exclusive_group. In Zig, it’s an if-else in the validation function.
Same constraint, five idiomatic encodings. One spec. The constraints: block is the source of truth; the generators produce whatever the target language considers natural.
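For the Python encoding, a hand-written equivalent using the standard library's argparse looks roughly like this. It is a sketch of what a generator could plausibly emit, not actual BND output, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Encode `exclusive: [resume, resume_ask]` the way argparse expects it."""
    parser = argparse.ArgumentParser(prog="program")
    # argparse enforces the exclusion itself once both flags share a group.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--resume", action="store_true")
    group.add_argument("--resume-ask", action="store_true")
    return parser


args = build_parser().parse_args(["--resume"])
print(args.resume, args.resume_ask)
```

Passing both flags makes argparse print a usage error and exit with status 2, so here the constraint lives in the parser's structure and needs no separate check.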
The Taxonomy
Seven constraint types emerged from the survey. Each has a place in the BND cli: spec:
| Constraint | BND syntax | What it means |
|---|---|---|
| Mutual exclusion | exclusive: | At most one of these |
| Co-dependency | together: | All or none of these |
| Dependency | requires: | If A then B |
| Exactly one | one_of: | Precisely one of these |
| At least one | any_of: | One or more of these |
| Range | range: | Value within bounds |
| Conditional | when: | If A=x then B required |
Each of these generates all four artifacts. Each is checked for consistency against every other constraint in the spec. The specification is the single source of truth.
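Read as runtime predicates, the seven rows could be sketched in Python as below. The tuple encoding of each constraint, the option names in the usage lines, and the choice to treat "present" as "key exists in the parsed values" are all assumptions for this example; the table above does not pin down those details.

```python
def check(constraint, values):
    """Return True if the parsed `values` (name -> value) satisfy one constraint."""
    kind, arg = constraint

    def count_present(names):
        return sum(1 for n in names if n in values)

    if kind == "exclusive":   # at most one of these
        return count_present(arg) <= 1
    if kind == "together":    # all or none of these
        return count_present(arg) in (0, len(arg))
    if kind == "requires":    # if A then B
        a, b = arg
        return a not in values or b in values
    if kind == "one_of":      # precisely one of these
        return count_present(arg) == 1
    if kind == "any_of":      # one or more of these
        return count_present(arg) >= 1
    if kind == "range":       # value within bounds, checked only when given
        name, low, high = arg
        return name not in values or low <= values[name] <= high
    if kind == "when":        # if A=x then B required
        name, expected, required = arg
        return values.get(name) != expected or required in values
    raise ValueError(f"unknown constraint type: {kind}")


# Hypothetical options, for illustration only.
print(check(("together", ["user", "password"]), {"user": "ada"}))       # → False
print(check(("when", ("format", "json", "output")), {"format": "text"}))  # → True
```

Keeping every constraint as plain data is what lets the same declaration drive validation, tests, help text, and the cross-constraint consistency check.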
What Goes on the Shelf
For a DSL to earn its place in RTPSL², it needs more than a syntax and a parser. It needs the full verification structure:
- Test coverage of the code generator — every code path in the generator is exercised by test cases
- Mutation analysis of those tests — we can explain why coverage is 100%, not just that it’s 100%, because every mutation is caught by a specific test
- Principle-based generation — the code generator doesn’t just produce code, it produces code according to stated principles, and those principles generate both the code and the tests that verify the code
The boundary contracts language already passed this bar. We put it through specimen-based mutation testing and came out with 12 defects fixed and 61 tests added. Every mutation caught, every code path explained.
The CLI language is next. Parser, AST, constraint checker, then Rust codegen first (clap’s derive API is the cleanest mapping), then Python, Zig, Haskell, Scala. Each target gets the same mutation analysis treatment.
When it’s done, we’ll have another language on the shelf. Another facet of program behavior that we can specify once and generate correctly in five languages. Another step toward a world where you describe what a program should do — its boundaries, its CLI, its configuration, its concurrency patterns — and the infrastructure generates itself, verified, tested, and ready.
That’s what the shelf is for. We just made room for one more.