BrennerBot

Your agent can produce a beautiful artifact that still fails Brenner's standards. Use this checklist to catch the most common failure modes.

Hypotheses

4+ distinct hypotheses (not variants of one idea)

If missing: Ask for 2 additional hypotheses that would require different tests to distinguish.

Each hypothesis includes an explicit mechanism

If missing: Ask, for each hypothesis: 'State the mechanism in 1–2 sentences; if it is unknown, mark it as unknown and propose tests to identify it.'

At least one genuine third alternative is treated as first-class

If missing: Explicitly demand a third alternative that is neither A nor B, with its own predictions and tests.

Discriminative Tests

Each test makes different predictions across hypotheses

If missing: Rewrite tests as prediction tables: H1/H2/H3 predict different outcomes; remove tests that don't discriminate.

Each test states exclusion logic (what outcome rules out what)

If missing: Add an explicit 'If we observe X, we exclude H2 because…' for each hypothesis/test pair.

Each test includes a potency check (🎭) for null/ambiguous outcomes

If missing: Ask the agent: 'If null, what do we learn? If ambiguous, redesign the test until it's informative.'

Tests are ranked by discriminative power + feasibility

If missing: Require a ranked list, then pick the single test that eliminates the most hypotheses for a realistic amount of effort.
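
The checks in this section can be partly mechanized once the agent's tests are written as prediction tables. The following is a minimal sketch, not part of BrennerBot itself: the `CandidateTest` class, the 0-to-1 `feasibility` score, and the "worst-case exclusions times feasibility" ranking rule are all illustrative assumptions, and the example tests are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    # Prediction table: hypothesis label -> predicted outcome.
    predictions: dict[str, str]
    # Assumed score in [0, 1]; 1.0 means cheap and quick to run.
    feasibility: float


def discriminates(test: CandidateTest) -> bool:
    """True if at least two hypotheses predict different outcomes."""
    return len(set(test.predictions.values())) > 1


def excluded_by(test: CandidateTest, observed: str) -> list[str]:
    """Exclusion logic: hypotheses whose prediction contradicts the observation."""
    return sorted(h for h, p in test.predictions.items() if p != observed)


def discriminative_power(test: CandidateTest) -> int:
    """Worst-case number of hypotheses excluded across all predicted outcomes."""
    outcomes = set(test.predictions.values())
    return min(len(excluded_by(test, o)) for o in outcomes)


def rank_tests(tests: list[CandidateTest]) -> list[CandidateTest]:
    """Drop tests that do not discriminate, then sort by power * feasibility."""
    useful = [t for t in tests if discriminates(t)]
    return sorted(
        useful,
        key=lambda t: discriminative_power(t) * t.feasibility,
        reverse=True,
    )


# Hypothetical example tests, for illustration only.
candidates = [
    CandidateTest("knockout", {"H1": "loss", "H2": "no change", "H3": "gain"}, 0.5),
    CandidateTest("imaging", {"H1": "signal", "H2": "signal", "H3": "signal"}, 0.9),
    CandidateTest("dose series", {"H1": "linear", "H2": "linear", "H3": "threshold"}, 0.8),
]

ranked = rank_tests(candidates)
print([t.name for t in ranked])             # ['knockout', 'dose series']
print(excluded_by(candidates[0], "loss"))   # ['H2', 'H3']
```

Here "imaging" is dropped because every hypothesis predicts the same outcome, which is the failure the first check in this section is meant to catch. A script like this does not replace the potency check: it cannot tell you whether a null result would be informative.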

Assumptions & Scale

Assumption ledger includes theoretical + methodological + background assumptions

If missing: Ask for all three buckets and ensure 3–5 assumptions per hypothesis.

Scale checks include numbers or order-of-magnitude estimates (⊙)

If missing: Demand explicit numbers: 'Give rough magnitudes; if unknown, bound ranges and state what would falsify them.'
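
If the agent returns its assumption ledger in a structured form, the bucket and count checks can be automated. This is a rough sketch under stated assumptions: the ledger shape `{hypothesis: {bucket: [assumptions]}}`, the function name `ledger_gaps`, and the placeholder entries are hypothetical, and the minimum of 3 is simply the lower end of the 3–5 guideline.

```python
from __future__ import annotations

REQUIRED_BUCKETS = ("theoretical", "methodological", "background")
MIN_PER_HYPOTHESIS = 3  # lower end of the 3-5 guideline


def ledger_gaps(ledger: dict[str, dict[str, list[str]]]) -> list[str]:
    """Return readable gaps for a ledger shaped {hypothesis: {bucket: [assumptions]}}."""
    gaps = []
    for hypothesis, buckets in ledger.items():
        # Every hypothesis needs at least one assumption in each bucket.
        for bucket in REQUIRED_BUCKETS:
            if not buckets.get(bucket):
                gaps.append(f"{hypothesis}: no {bucket} assumptions")
        # Count assumptions across all buckets for this hypothesis.
        total = sum(len(items) for items in buckets.values())
        if total < MIN_PER_HYPOTHESIS:
            gaps.append(
                f"{hypothesis}: only {total} assumptions "
                f"(want at least {MIN_PER_HYPOTHESIS})"
            )
    return gaps


# Placeholder ledger: "A1", "A2", ... stand in for real assumption text.
example_ledger = {
    "H1": {"theoretical": ["A1"], "methodological": ["A2"], "background": ["A3"]},
    "H2": {"theoretical": ["A4", "A5"], "methodological": []},
}

for gap in ledger_gaps(example_ledger):
    print(gap)
```

Each reported gap can be pasted straight into the list of failed checks when you request a revision.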

Critique & Next Steps

Adversarial critique attacks the framing (not just details)

If missing: Ask for 1–2 alternative framings and why your current framing might be wrong.

Next steps are specific and test-first (not vague reading)

If missing: Ask for the single most discriminative next test + the minimum evidence needed to run it.

Request a Revision (Copy/Paste)

When you find failures, paste this to your agent with the failed items listed:

Revision prompt

Your previous artifact is close, but it fails some Brenner review checks:
 
[LIST THE FAILED CHECKS HERE]
 
Please revise the artifact, keeping the same section headings and structure. For each fix:
- Make the smallest change that resolves the failed check
- Explicitly mark additions with "NEW:" so I can spot them
- Ensure tests are discriminative (different predictions across hypotheses)
- Ensure every test has a potency check (what we learn if null)
 
Return the updated artifact in full.
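
If you review artifacts often, you may prefer to generate this prompt from a list of failed checks rather than editing it by hand. The sketch below mirrors the revision prompt above; the names `REVISION_TEMPLATE` and `build_revision_prompt` are hypothetical helpers, not part of any BrennerBot tooling.

```python
from __future__ import annotations

REVISION_TEMPLATE = """\
Your previous artifact is close, but it fails some Brenner review checks:

{failed}

Please revise the artifact, keeping the same section headings and structure. For each fix:
- Make the smallest change that resolves the failed check
- Explicitly mark additions with "NEW:" so I can spot them
- Ensure tests are discriminative (different predictions across hypotheses)
- Ensure every test has a potency check (what we learn if null)

Return the updated artifact in full."""


def build_revision_prompt(failed_checks: list[str]) -> str:
    """Fill the template with one bullet per failed checklist item."""
    if not failed_checks:
        raise ValueError("no failed checks: nothing to revise")
    bullets = "\n".join(f"- {check}" for check in failed_checks)
    return REVISION_TEMPLATE.format(failed=bullets)


prompt = build_revision_prompt([
    "Each test states exclusion logic (what outcome rules out what)",
    "Scale checks include numbers or order-of-magnitude estimates",
])
print(prompt)
```

Quoting the checklist items verbatim keeps the agent's revision tied to the same wording you reviewed against.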

Compare to Worked Examples

If you’re unsure what “good” looks like, compare your artifact to the canonical worked examples. Notice how hypotheses are genuinely different, predictions are contrastive, and tests are designed to exclude.

Browse all examples → Biology, Computer Science, Social Science

Agent-Assisted Tutorial Complete!

You've accomplished:

Set up an AI coding agent to internalize the Brenner method
Refined a research question using Brenner-style critique
Generated a hypothesis slate + assumption ledger
Produced discriminative tests ranked by potency
Reviewed the artifact with a failure-mode checklist

Coming up: Run another loop on a new question, or move on to the Multi-Agent Cockpit for parallel, role-separated orchestration.
