The stranger test - a prompt anyone can run against any LLM
This is the cold-read critique anyone can run on a VALUE.md with an off-the-shelf LLM (Claude, ChatGPT, Gemini, anything). It is the same shape of prompt the 10-round stranger-test arc used on the Keep a Value standard itself.
The prompt is the artifact. Copy it, paste your VALUE.md after it, send to your LLM of choice. The output is structured findings you can act on.
How to use it
- Copy everything between the two
BEGIN PROMPTandEND PROMPTlines below. - Replace the bracketed
[paste your VALUE.md here]line with your actual VALUE.md content. - Send to any LLM with a context window big enough for your file plus the prompt (any current Claude, ChatGPT, or Gemini model is fine).
- Read the findings. Apply or skip each, with a reason.
The standard's six-part gate is the lens. The LLM cannot run the gate for you - you still decide what to APPLY and what to SKIP, and the editorial decision is yours. The LLM's job is to surface what a cold reader would notice that you stopped noticing because you wrote it.
A cold-read LLM is not a substitute for a real stranger. The stranger-check ladder in Runbook §1 lists real humans first. Use this prompt when you cannot get a real stranger soon enough, or to pre-filter your VALUE.md before handing it to one.
The prompt
BEGIN PROMPT
You are a cold reader of a VALUE.md file written against the Keep a Value standard. You have not seen this project before. You have no context except the file. The author is not available to clarify.
The standard's six-part gate (plus the Q0 grounding precondition):
Q0 grounding precondition. The author has named a specific moment they observed the
recipient doing the thing the project is meant to change - a date, a place, behavior
watched. Not a persona. Not a survey response. A real moment.
1. No hedge-words. No "maybe", "kind of", "ish".
2. A stranger gets it. You can paraphrase the three sentences back in your own words.
3. The file reads aloud without footnotes. No silent caveats.
4. One recipient across Q1, Q2, Q3. The "who" does not drift.
5. Subject test. Every promise's subject is the builder's behavior, not the
recipient's or a third party's.
6. The author named what they don't break. One thing that could get worse if they
optimize hard for the headline outcome.
Your job is to surface three findings the author would benefit from seeing. Each
finding should be load-bearing - something the author probably stopped noticing
because they wrote it. Skip cosmetic concerns; the standard already rejects them
as recurring-skip categories (rename suggestions, tonal preferences, citation
relocations, "soften this", hard deadlines without infrastructure).
For each finding, output exactly:
### Finding N
- Location: <which section of the VALUE.md - Q0, Q1, Q2, Q3, P1, etc.>
- Issue: <one or two sentences naming what a cold reader notices>
- Proposed change: <a specific edit - paste the actual new sentence or
paragraph, not a vague direction>
- Why apply or skip: <one sentence the author can use to decide>
Output three findings. No preamble. No closing summary.
Here is the VALUE.md to critique:
---
[paste your VALUE.md here]
END PROMPT
What to expect
A good cold-read finding sharpens an existing claim, names a real unknown the author has not yet named, or points at where the recipient ("who") drifts between Q1, Q2, and Q3 (the most common failure mode). A bad finding asks you to rename a noun, soften a verb, or add an academic-style appendix.
Run the prompt against two or three different LLMs in parallel if you have access. If three different LLMs all surface the same finding, it is almost certainly load-bearing. If only one surfaces it, weigh it against the recurring-skip ledger in research/audits/2026-06-04-stranger-test-10-rounds/WRAPUP.md before applying.
When the LLM gets it wrong
LLMs trained to be helpful will sometimes generate plausible-sounding feedback rather than the genuine incomprehension a real stranger provides. Watch for:
- Findings that paraphrase the standard back at you (the LLM read the docs and is regurgitating them, not critiquing your file).
- Findings that propose a deadline without infrastructure (the recurring-skip ledger names this).
- Findings that ask you to rename the project, the standard, or a section header.
Discard those and read the rest.
What this prompt does NOT do
- It does not run the six-part gate for you. The LLM is a cold reader; the gate is a decision instrument you operate.
- It does not substitute for a real stranger (see Runbook §1 ladder).
- It does not validate the paired-run experiment. That is the v1.0 to v1.1 gate; LLM critique cannot stand in for it.
Variations
- Domain-specific lens. Add one sentence to the prompt naming your domain ("the project is a fundraising campaign"). The LLM will surface domain-specific failure modes.
- Recipient lens. Add one sentence naming the recipient ("the recipient is a section editor at a trade publication"). The LLM will surface recipient-perspective failure modes.
- Round-2. Run the prompt again after applying the findings. The cold reader is now reading a different file; the second round's findings are about what your edits exposed.
The prompt itself is CC BY 4.0. Fork, adapt, translate. The standard the prompt invokes is locked under CC BY-ND 4.0 (see LICENSE). If you publish a fork of this prompt that names a different gate or different recurring-skip categories, that fork is no longer "the Keep a Value stranger test" - it is your team's stranger test, derived from this one.