VALUE.md - Keep a Value (the project)

Output is what you ship; value is the change your recipient would fight to keep.

Keep a Value is a one-page discipline you hold yourself to before you build, or before you build any more. This file is that discipline applied to this project: what changes for the builder who adopts it, and how a stranger would know it did. Read it in 60 seconds; when you're ready to write one for your own project, open the Runbook - it walks you to a conforming VALUE.md in under 90 minutes from a cold start.

Scope

This VALUE.md is for the Keep a Value standard - the contract for adopters of the standard itself, not for the index page in isolation (that has its own VALUE.md at standard/examples/IndexPage-VALUE.md) and not for any downstream VALUE.md a builder writes for their own project. Same recipient (the senior builder about to start work) across the entire file - the standard is what changes for them; the standard is what gets verified for them. The standard applied to the standard itself is one coherent contract, not a meta-frame.

Two valid deployments are in scope. The default deployment is a one-page VALUE.md file at the project root, written before the first commit. The second deployment, documented after a 10-day author self-observation (research/audits/2026-06-17-author-self-observation.md), is conversational: Q1 / Q2 / Q3 walked through in chat - with an LLM, a colleague, or silently - before any artifact is requested. The discipline is the questions and the six-part gate; the file is one artifact the discipline can produce. Both deployments face the same gate. The conversational deployment is what KAV looks like at LLM cadence; see Standard.md "Why this discipline now" for the cadence argument that motivates it.

Status taxonomy

Three states are used in this file. They are defined once here so a reader can act on the difference:

Proposed - the claim has not yet been verified in the field. Use it provisionally; expect the verification to either confirm or kill it.
Promised - the claim has been verified at least once against a dated artifact in research/. The verification path is named in the Check field.
Proven - the falsification gate has been run (eight paired-run pairs complete) and the claim survived. No promise is currently Proven; v1.0 is what gets cut when the first one is.

Value Statement

Keep a Value gives a builder who is about to start a project a one-page contract - one grounding question, three doctrinal questions, a six-part gate - that they fill in before the first commit; the builder gives back the strongest available evidence that the change the contract predicted for the named recipient actually happened - or did not (the contract failed, with the failure logged in the same file).

The anatomy, enumerated

So nothing in this file refers to machinery that is not on the page:

The grounding question (precondition, before the three): Q0 - what real moment did you observe your recipient in, that this project is meant to change?

The three doctrinal questions (the load-bearing contract): Q1 who it's for · Q2 what changes for them · Q3 how a stranger would know it landed.

The six-part gate (every VALUE.md must pass all six; the full text and walkthroughs live at Runbook §1):

No hedge-words (self) - no "maybe," "kind of," "somewhat," "possibly."
A stranger gets it (external - cannot be self-administered) - a real person who is not on the project answers Q1/Q2/Q3 from the sentences alone, in their own words.
Reads aloud without footnotes (self) - no silent "well, technically..." caveats.
One recipient across all three sentences (self) - Q1, Q2, and Q3 name the same person.
The subject test (self) - every promise's grammatical subject is you (the builder) or something you ship. Not the recipient's behavior, not a third party, not the world. (Burgess, Promise Theory, 2005.)
You named what you don't break (self) - one named thing that gets worse, or could get worse, if you optimize for the headline outcome.

The Q0 precondition plus any failed gate item = not ready to build. Stop until the failure is resolved or the project is reframed.

Q0: Grounding observation

On 2026-06-04 I watched a Claude Code session in which an operator built standard/index.html, asked for the next step within 90 seconds of the file landing, and never read the page back as a first-time visitor would. The artifact shipped. The artifact was not tested. The named-recipient cold-read did not happen. That is the build-trap shape this project exists to interrupt: a builder finishes the thing they were building, moves to the next thing, and no one ever sits in the recipient's seat with the artifact in front of them.

I have watched the same pattern in twelve more projects across my own workspace since (catalogued in research/audits/2026-06-04-stranger-test-10-rounds/case-studies/case-02-artifact-audit.md) - software, a fundraising speech, an industry-publication submission, this index page. Different surfaces, same shape. A one-page contract written before the build, naming the recipient and the verification, is the smallest intervention I have found across that catalogue that interrupts the pattern. "Smallest I have found" is exactly that scope - one author across thirteen artifacts. Whether it generalises is what the paired-run pilot tests.

Q1: Who it's for?

The builder who is about to start a piece of work - a feature, a doc, a talk, a submission, a campaign - and has not yet written the first line. They have a hunch about why the work matters and who it serves, but the hunch is not yet on paper. They are at the moment when the build trap closes: when the next twenty hours go into making the thing without ever naming who it is for or how anyone will know it landed.

Specifically: a senior-enough builder that they own the decision of what to build (not just how to build it), and reflective enough to recognize that "I'll know it when I see it" has burned them before. Junior builders who don't yet have the autonomy to refuse a brief are not the recipient; the standard cannot save them from their org. Builders who already have a written contract (PRD, brief, talk abstract) reviewed by the recipient also aren't the recipient - they have already solved the problem this exists for.

Other readers exist: the curious-but-skeptical evaluator (the index page is for them), the researcher doing a literature review (standard/lineage.md is for them), the LLM scraping the doctrine (llms.txt - an llms.txt-spec file pointing crawlers at the canonical sources - is for them). They ride along but they are not the headline recipient.

Q2: What changes for them?

Before: the builder starts the work with a hunch and a calendar entry. They name the recipient in their head ("engineers like me") but not on paper. They imagine the change the work produces ("they'll find the bug faster") but the imagining is not written, not testable, and not falsifiable. When they hand the work over, they look for surface signals (no one complained, the page got traffic, the talk got applause) and accept those as evidence that the work landed. The work-to-evidence gap is invisible because no one ever wrote what the evidence should be.

After: the builder runs the discipline before the next artifact ships. The default form is writing a one-page VALUE.md at the project root before the first commit; the LLM-cadence form is walking Q1 / Q2 / Q3 in conversation before any artifact is requested (see §Scope for both). Either way, Q0 forces a grounding observation - a moment they actually saw, not a thought-exercise. Q1 forces a specific named recipient - one person they could find, not a persona. Q2 forces a paired before/after the recipient could read aloud. Q3 forces a verification check that a stranger could run without asking the builder. The six-part gate catches the failures that bypass the questions (the six items enumerated above). The work that survives the discipline may be no more clever than the work that didn't - the standard predicts (Status: Proposed, see the paired-run pilot under §Q3) that it cannot smuggle its assumptions past the gate.

Adoption changes the order of operations before the build, not the build itself. The discipline is the smallest possible intervention that produces evidence; the rest of the work is still the builder's craft.

Q3: How will you know the change happened for them?

A builder who adopted the standard ships a piece of work, then runs the stranger-acceptance check at a fixed interval after delivery (30 days for software; for non-software, pick the interval at which the recipient's predicted action would naturally have arisen - one quarter for a process change, one campaign cycle for a fundraising piece, one editorial cycle for a publication submission - and name it in the VALUE.md). The check has two halves, both required:

The named recipient (a real person from Q1, not the builder, not a colleague who already knew the work was coming) takes the action the VALUE.md said the work would change. If the VALUE.md said "the on-call engineer finds the failing service in under 90 seconds during a real incident," the recipient is observed (logs, screen-share, or self-report with timestamps - ranked in that order of fidelity) doing exactly that during a real incident. If they didn't, the work failed. If the situation never arose within the interval, log the result as untested (a third terminal state - not pass, not fail); extend the interval once if you have reason to believe the situation will arise soon, then accept untested as the verdict.
A stranger (someone who has never met the builder and was not briefed) reads the VALUE.md cold and answers: "if you saw this work in the wild, what evidence would tell you it landed?" If their answer matches the Q3 check the builder wrote, the contract was paraphrase-stable. If their answer doesn't match, the contract was ambiguous and the verification was the builder's private invention.

Both pass = the standard worked: the contract was specific enough to bind the work and the work delivered what the contract said. Either fails = the standard failed for that piece of work. Log the failure in the project's own CHANGELOG under ### Learned (the adopter's CHANGELOG for their VALUE.md failures; this project's CHANGELOG for failures of the standard itself), name which gate item the contract should have caught, fix the standard if the failure is structural, leave the standard alone if the failure is the builder's craft. The dividing line: a failure is structural if a different builder with the same VALUE.md (or the same builder with the same intent in a different VALUE.md) would have produced the same failure - the gate did not catch a class of mistake. It is craft if the failure is specific to this builder's execution and a clean re-application of the same contract would not reproduce it. When in doubt, log it as structural; the paired-run pilot is what disambiguates ambiguous cases.

Breaks if: the cross-builder pilot at research/experiments/paired-runs/ shows that builders using a VALUE.md and builders using only a description, given the same brief and the same recipient, produce work that the recipient accepts at indistinguishable rates. "Indistinguishable" means the eight-pair difference falls inside a two-sided 90% confidence interval that includes zero; the pilot's protocol.md names the test and the minimum detectable effect (a 20-percentage-point recipient-acceptance gap). If recipient-acceptance is statistically indistinguishable with and without the contract, the discipline is ceremony. Keep a Value exists to be falsified by that experiment; the v1.0 release is gated on the experiment running.

Recruitment-bias control: the paired-run pilot recruits builders who have not previously read this standard (verified by self-report and a one-question screener), and randomises within-pair which builder gets the VALUE.md condition vs. the description-only control. The control is a free-form 100-200 word project description with no anatomy. Recruitment-bias details and the screener are in invite-template.md.

Check: Field - recipient observation per the protocol at research/experiments/paired-runs/protocol.md, paired against a description-only control, scored by rubric.md. Results logged in research/experiments/paired-runs/results-N.md for each pair (where N is the pair number 1..8). Once eight pairs have run, the cumulative result is either v1.0 or "the standard does not hold and is withdrawn." Pair 9+ is allowed but does not move the v1.0 / withdrawal call; the decision is at N=8.

Status: Proposed. Twelve self-applications by the standard's author have been case-studied (case-01-meta-evidence.md, case-02-artifact-audit.md). They confirm that I, the author, can write VALUE.md files that pass the six-part gate. They do not confirm that other builders can, and they do not confirm that builders outperform on recipient-acceptance - both require the paired-run pilot, which is at 0 of 8. The cross-builder pilot is open; recruitment is via the index page at keepavalue.com.

(On the folder name: 2026-06-04-stranger-test-10-rounds/ is the date the experiment started; it grew to twenty rounds. The folder name is preserved as a stable URL; the twenty-round trail and the twelve-application audit are both inside it. See the orientation block at the top of that folder's README.)

Last verified: not yet · Re-verify by: when the paired-run experiment reaches eight pairs (currently 0 of 8).

Promises

P1 - A builder can write a conforming VALUE.md in under 90 minutes from a cold start

Status: Proposed · Recipient: the builder who has never used the standard, sitting down with the brief in front of them for the first time.

Last verified: not yet · Re-verify by: when paired-run recruitment data shows a median-of-first-attempt time-to-conformance across at least five builders.

The runbook (standard/Runbook.md) walks the builder from "I'm not sure if I'm in the build trap" to "I have a conforming VALUE.md" in under 90 minutes, the first time they do it, without recourse to the author. Conformance is judged by the six-part gate (enumerated above) using the stranger-test prompt at standard/stranger-test-prompt.md run against any one major LLM (Claude, ChatGPT, or Gemini). Time includes reading the runbook, drafting, and one round of stranger-test critique + revision.

Breaks if: in the cross-builder pilot, the median first-attempt time-to-conformance across the first five builders exceeds 180 minutes. Or: more than two of the first five builders give up before reaching conformance.

Check: Field - timed observation during the paired-run pilot. The pilot's invite-template.md discloses that time-to-conformance is being measured.

P2 - The standard's own field-evidence is auditable, dated, and falsifiable

Status: Promised · Recipient: the builder evaluating whether to adopt the standard for their own work, who needs to know what the standard's own evidence is and how strong it is.

Last verified: 2026-06-04 (the two-case retrospective audit at case-01-meta-evidence.md and case-02-artifact-audit.md) · Re-verify by: when the paired-run pilot reaches its first published result, the audit gets updated with the field data.

Every claim the standard makes about itself ("twelve self-applications by one author," "two case studies confirm format-conformance by that author," "the cross-builder pilot is the v1.0 gate, at 0 of 8") points to a dated artifact in research/ that the reader can open and check. The audit at research/audits/2026-06-04-stranger-test-10-rounds/ is the current source of truth for the standard's own field state. The "Open gap" sentence on the index page is the one the audit makes load-bearing: twelve self-applications by one author confirm format; zero pairs confirm recipient-acceptance.

Breaks if: a builder evaluating the standard cannot, within five minutes of landing on keepavalue.com, find the field-evidence summary, the dated case studies, the open gap, and the scaffolded experiment. Or: the index page implies field-evidence that is not in the audit.

Check: Test - when a stranger lands on the page, they reach research/audits/ (or its summary on the page) within three clicks. Verified on each index-page revision via the cold-read.

P3 - Adopting the standard does not impose a tool, a vendor, or a working style

Status: Proposed · Recipient: the builder whose org already has a docs-as-code setup, a different VCS, a different IDE, a different deadline cadence.

Last verified: not yet for cross-org adoption · Re-verify by: when the paired-run pilot recruits across at least three orgs with different toolchains and reports the no-tooling-required claim survives without modification.

VALUE.md is a Markdown file at the project root. The standard does not require a build pipeline, a CI gate, a specific IDE, a SaaS account, a code review tool, or a workflow change. It requires the file to exist, the file to conform to the six-part gate, and the Q3 check to be run when the work is delivered. Anything else - make check, the interactive stranger-test widget on keepavalue.com, the CI tooling templates in the runbook - is optional convenience and is documented as such. Self-application across twelve of my own artifacts spans three toolchains (a static-site repo, a research repo, and a writing project with no repo at all) - which is consistent with the no-tooling claim but cannot verify cross-org adoption; that is what the pilot is for.

Breaks if: an evaluator skims the standard and concludes that adoption requires installing a tool, integrating with a specific platform, or changing their team's existing review process. Or: a builder reports they could not adopt the standard because their org forbade the dependency the standard appeared to require.

Check: Test - the runbook is read by a builder whose context the author does not know; they can adopt the standard without asking what to install. Stranger-test the runbook itself: "what does this require you to install or sign up for?" Expected answer: "nothing."

P4 - Every miss pays you back

Status: Proposed · Recipient: the builder who has just had a VALUE.md fail its Q3 check.

Last verified: not yet · Re-verify by: when the paired-run pilot has produced at least three failed Q3 checks and their ### Learned entries can be inspected for the gate-item attribution and the structural-vs-craft call.

When a VALUE.md fails its Q3 check, the failure becomes named knowledge: a CHANGELOG ### Learned entry stating which of the six gate items the failure exposed (using the enumeration above), whether the failure is structural or craft (using the criterion in Q3), and what specifically would have caught it. The builder loses the shipped artifact but gains the diagnostic; the standard loses the conformance claim for that artifact but gains the evidence to either close a gap (if structural) or leave the gate alone (if craft). Misses are not waste; they are the only path by which the standard improves.

Breaks if: an audit of ### Learned entries shows entries that name no gate item, do not classify structural-vs-craft, or attribute every failure to craft (never to structure). All-craft attribution would be the smoking gun for the unbounded-escape-hatch concern: it would mean the standard never updates against its own failures.

Check: Inspect - quarterly audit of ### Learned entries (in this repo's CHANGELOG and, where reported, in adopter repos). Each entry scored against: names a specific gate item? classifies structural-vs-craft? names a change to the standard or a clean reason no change is needed?

What we don't promise

We don't promise the standard prevents bad work. The standard catches a particular failure mode (the unnamed-recipient, untested-claim build trap). It does not catch "the recipient was named correctly but the work was sloppy," "the verification was specific but the builder lied," or "the contract was honest but the recipient changed their mind." Many failures bypass the contract. The standard is one gate, not all gates. (The "Either fails = the standard failed" line in Q3 is about recipient-acceptance of the predicted change, not about general work quality - a sloppy build can pass Q3 if the contract was loose enough; the gate is what makes the contract tight enough to catch this.)
We don't promise applicability outside the recipient-action-evidence triad. The standard works when there is a recipient, an action they would take if the work landed, and evidence of the action that a stranger could see. Pure research with no named user, infrastructure work with no end-user, and exploratory artistic work without a recipient-in-mind are out of scope. The runbook says so on page one; do not adopt the standard for those.
We don't promise it scales to multi-recipient artifacts. The standard's subject test (gate item 4 above - one recipient across all three sentences) requires one recipient across all three questions. Artifacts that genuinely serve different recipients in different sections need different VALUE.md files per section, or a different framework. We have not built that.
We don't promise translations of the doctrine. The doctrine is currently English-only. A standard/translations/ directory exists for community translations; we do not maintain them and do not gate v1.0 on them.
We don't promise the format itself is the final form. The standard's anatomy (one grounding question + three doctrinal questions + six-part gate + four promises) is what we expect to hold. The wording of the runbook, the syntax of the template, the layout of the index page - those move when field evidence tells us to move them. CHANGELOG ### Learned entries are how we report those moves.

What we don't break

We won't add a question or a gate item without a ### Learned entry naming a real failure it would have caught. The temptation to add "Q4: what could go wrong?" or a seventh gate item is constant. We won't ship one unless a failed VALUE.md exists in the public audit trail that the new item would have caught at write-time, and adding it doesn't break the 60-second read of the runbook for a first-time builder.
We won't change the standard's VALUE.md filename, the Standard.md filename, or the three-questions / six-part-gate vocabulary without a major-version bump and a 30-day deprecation window. These are the public contract surface; renaming them silently breaks every adopter's automation, every index, every link.
We won't ship a feature that requires adopters to use a hosted service we control. The standard is a Markdown file convention; if a future "Keep a Value Cloud" exists, adoption of the standard will not require it. The stranger-test prompt is a prompt, not an API; the index page is a static file, not an account; the cross-builder pilot recruits via an open form, not a dashboard.
We won't withdraw a published case study, audit, or experiment result because it embarrasses the standard. If the paired-run pilot shows the standard does not hold, the result ships in research/experiments/paired-runs/ and the standard is withdrawn or revised. We will not redact unfavorable evidence to keep the doctrine intact. That is the standard the standard holds itself to.