BrennerBot Lab Mode

Research Orchestration

BrennerBot: Three AI Minds. One Rigorous Method. Zero Blind Spots.

Coordinate Claude, GPT, and Gemini in structured debates. Run 11-phase research sessions that produce hypothesis slates, discriminative tests, and evidence trails. Prevent hindsight bias, unfalsifiable claims, and sloppy reasoning in one command.

Quick Install

curl -fsSL https://brennerbot.org/install.sh | bash

Options: --easy-mode, --verify, --system

Locked predictions / Structured debates / Auditable artifacts

Agent Constellation: 11-phase loop

Hypothesis Generator: GPT

Test Designer: Claude

Adversarial Critic: Gemini

Unified Artifact

Hypothesis slate / Discriminative tests / Evidence ledger

Threaded Agent Mail → deterministic merge → human approval gate

Ready to Apply the Method?

Three learning paths from quick start to multi-agent orchestration. Apply Brenner's scientific method to your own research questions.

~30 min Quick Start / Agent-Assisted / Multi-Agent Cockpit

What's Inside

A research toolkit for applying Brenner's epistemology to your own scientific questions.

Browse

Corpus

The complete Brenner transcript collection from Web of Stories, plus curated and .

Compare

Distillations

Three frontier-model distillations of Brenner's methodology. Compare perspectives from GPT-5.2, Opus 4.5, and Gemini 3.

Learn

Method

The , loop structure, and framework that operationalize Brenner's approach to scientific discovery.

Core Workflow

From Question to Conclusion: The Brenner Loop

Research sessions follow a rigorous, reproducible path. Every step is tracked, auditable, and reversible.

1. Intake

Frame the research question

2. Sharpening

Refine hypotheses and scope

3. Level-Split

Separate program from interpreter

4. Exclusion-Test

Design discriminative tests

5. Object-Transpose

Choose the optimal system

6. Scale-Check

Validate against physics

7. Agent-Dispatch

Convene the tribunal

8. Synthesis

Merge agent outputs

9. Evidence

Gather external signals

10. Revision

Update hypotheses

11. Complete

Publish artifacts
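The eleven phases above run in a fixed order. A minimal sketch of how that ordering could be modeled, assuming a strictly linear progression; the `PHASES` array and `nextPhase` helper are hypothetical names for illustration, not BrennerBot's actual API:

```typescript
// Hypothetical model of the 11-phase loop; identifiers mirror the list above.
const PHASES = [
  "intake", "sharpening", "level-split", "exclusion-test",
  "object-transpose", "scale-check", "agent-dispatch",
  "synthesis", "evidence", "revision", "complete",
] as const;

type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once the session is complete.
function nextPhase(current: Phase): Phase | null {
  const i = PHASES.indexOf(current);
  return PHASES[i + 1] ?? null;
}
```

Typing the phases as a literal union means a misspelled phase name fails at compile time rather than mid-session.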

Undo / Redo

Every action is reversible. Explore without fear.

Session Replay

Reproduce any session exactly for audit and learning.

Error Recovery

Graceful checkpoints when things go wrong.

Research Lab

Apply the Method to Your Research

The is an interactive research framework that helps you develop using four cognitive operators. Run structured sessions and track your evolving understanding.

⊘ Level-Split

✂ Exclusion-Test

⟂ Object-Transpose

⊞ Scale-Check

Multi-Agent Orchestration

Your Research Team: AI Agents That Debate, Challenge, and Synthesize

Each agent has a precise mandate. Together they sharpen hypotheses, design lethal tests, and merge evidence into auditable artifacts - without surrendering control.

"What if you could have Claude, GPT, and Gemini debate your hypothesis - challenging each other until only the strongest ideas survive?"

Hypothesis Generator

Hunt paradoxes, propose hypotheses

Creative, divergent thinking

Signature stance

"What if both established models are wrong?"

Test Designer

Design discriminative tests with potency controls

Rigorous, detail-oriented

Signature stance

"This test will eliminate half our hypotheses in one observation."

Adversarial Critic

Attack framing, check scale constraints

Skeptical, thorough

Signature stance

"Have you considered that the entire premise might be wrong?"

Oxford Style

Debate Mode

Proposition vs opposition with a judge

Best for: Testing hypothesis strength

Socratic

Debate Mode

Probing questions to surface hidden assumptions

Best for: Finding weak links fast

Steelman Contest

Debate Mode

Build the strongest case, then dismantle it

Best for: Exploring the hypothesis space

Terminal

brenner-cli

# Start a debate session
brenner session start --thread-id RS-20260105 \
  --format oxford \
  --question "Does the morphogen gradient model explain cell fate?"

# Watch agents debate in real-time
brenner session status --thread-id RS-20260105 --watch

# See the merged artifact
brenner session compile --thread-id RS-20260105

Coordination Visualization

Deterministic Merge

Thread ID: RS-20260106-001 / Ack tracking enabled

Kickoff

Threaded prompt goes to each agent role

Deltas

Structured responses return with citations

Merge

Deterministic compiler reconciles evidence

Human

You decide what ships and what dies
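One way a merge can be made deterministic is to impose a total order on the incoming deltas before applying them, so the artifact does not depend on which agent replied first. The sketch below assumes a hypothetical `Delta` shape and `mergeDeltas` helper; the real Agent Mail payload and compiler are not specified on this page.

```typescript
// Hypothetical delta shape; field names are assumptions for illustration.
interface Delta {
  agent: string;   // e.g. "hypothesis-generator"
  section: string; // artifact section the delta targets
  seq: number;     // per-agent sequence number
  text: string;
}

// Plain code-unit comparison, so ordering does not depend on locale settings.
const cmp = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Sort by (section, agent, seq), then group texts by section.
// Any arrival order of the same deltas yields the same artifact.
function mergeDeltas(deltas: readonly Delta[]): Record<string, string[]> {
  const ordered = [...deltas].sort(
    (a, b) => cmp(a.section, b.section) || cmp(a.agent, b.agent) || a.seq - b.seq,
  );
  const artifact: Record<string, string[]> = {};
  for (const d of ordered) {
    (artifact[d.section] ??= []).push(d.text);
  }
  return artifact;
}
```

The human approval step would then operate on the returned artifact; nothing in this sketch ships anything on its own.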

Coordination Without Chaos

Agent Mail keeps every exchange auditable

Every message lands in a thread, every response is acknowledged, and every delta is preserved. You stay in the loop with human approval gates at every step.

Built with thread IDs, ack receipts, and merge-safe deltas.

Kickoff sent: 3 agents live

Deltas merged: 1 artifact ready

Human approval: Required

Research Hygiene

Built-In Guardrails for Rigorous Science

The system blocks common failure modes: hindsight bias, unfalsifiable hypotheses, ignored confounds, and overconfidence. Rigor is enforced before you waste a week.

Coach Mode

Guided checkpoints, inline explanations, and Brenner quotes as you work.

Beginner → Expert / Contextual feedback

Prediction Lock

Lock outcomes before results arrive to eliminate hindsight bias.

Immutable predictions / Audit trail

Calibration Tracking

Brier score, overconfidence bias, and domain-level accuracy trends.

Confidence scorecard / Bias alerts
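The Brier score is the mean squared difference between stated probabilities and what actually happened: 0 is perfect, and always answering 50% scores 0.25. A minimal sketch follows; the `Forecast` type and the simple overconfidence measure (mean confidence minus hit rate) are assumptions for illustration, not necessarily how BrennerBot computes its scorecard.

```typescript
interface Forecast {
  probability: number; // stated confidence that the prediction holds, 0..1
  outcome: 0 | 1;      // 1 if it held, 0 if it did not
}

// Mean squared error between stated probabilities and outcomes (lower is better).
function brierScore(forecasts: readonly Forecast[]): number {
  if (forecasts.length === 0) throw new Error("need at least one forecast");
  const total = forecasts.reduce((sum, f) => sum + (f.probability - f.outcome) ** 2, 0);
  return total / forecasts.length;
}

// Positive values mean stated confidence exceeded the observed hit rate.
function overconfidence(forecasts: readonly Forecast[]): number {
  if (forecasts.length === 0) throw new Error("need at least one forecast");
  const meanP = forecasts.reduce((s, f) => s + f.probability, 0) / forecasts.length;
  const hitRate = forecasts.reduce((s, f) => s + f.outcome, 0) / forecasts.length;
  return meanP - hitRate;
}

const history: Forecast[] = [
  { probability: 0.9, outcome: 1 },
  { probability: 0.8, outcome: 0 },
  { probability: 0.6, outcome: 1 },
];
const score = brierScore(history);    // ≈ 0.27
const bias = overconfidence(history); // ≈ 0.10
```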

Confound Detection

Domain-specific confounds flagged with targeted prompting questions.

8 research domains / Automatic prompts

Artifact Linting

50+ rules enforcing third alternatives, potency controls, and citation hygiene.

Structural checks / Citation validation

Prediction Lock Timeline

No hindsight

Design test

Enter predictions

Lock outcomes

Locked

Run experiment

Compare results
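The timeline above boils down to one rule: predictions can change until they are locked, and results can only be compared against a locked prediction. A minimal sketch of that rule, assuming a hypothetical `PredictionLock` class (the names and fields are illustrative, not BrennerBot's API):

```typescript
// Hypothetical prediction record; field names are illustrative.
interface Prediction {
  readonly hypothesis: string;
  readonly expected: string;
  readonly lockedAt: Date;
}

class PredictionLock {
  private locked: Prediction | null = null;
  private draft: { hypothesis: string; expected: string };

  constructor(hypothesis: string, expected: string) {
    this.draft = { hypothesis, expected };
  }

  // Drafts may be edited freely until lock() is called.
  revise(expected: string): void {
    if (this.locked) throw new Error("prediction is locked; cannot revise");
    this.draft.expected = expected;
  }

  // Freezes the current draft together with the lock time.
  lock(now: Date = new Date()): Prediction {
    if (this.locked) throw new Error("already locked");
    this.locked = Object.freeze({ ...this.draft, lockedAt: now });
    return this.locked;
  }

  // Results are only accepted once a locked prediction exists.
  compare(observed: string): { prediction: Prediction; observed: string; match: boolean } {
    if (!this.locked) throw new Error("lock the prediction before recording results");
    return { prediction: this.locked, observed, match: observed === this.locked.expected };
  }
}
```

Because `revise` throws after `lock`, a prediction cannot be quietly adjusted once results are known.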

Confound Detection

8 domains

Psychology, Epidemiology, Economics, Biology, Sociology, Neuroscience, Computer Science, General

Selection bias detected - how will you ensure random sampling?

Reverse causation possible - can you establish temporal order?

Calibration + Linting

Scorecard

Calibration curve (last 10 tests)

Third alternative present: Pass

Potency control defined: Pass

Citation anchors: Review

Without Guardrails

Predictions revised after results are known

Confounds discovered in peer review

Vague hypotheses survive unchanged

Overconfidence goes untracked

With BrennerBot

Predictions locked before execution

Confounds flagged during design

Unfalsifiable language is blocked

Calibration metrics stay visible

Discovery & Intelligence

Intelligence Built In: Search, Simulate, Score

Connect to prior work instantly, model evidence impact before you test, and track which hypotheses survive pressure. This is research intelligence, not a chat log.

Hypothesis Similarity Search

Find related work across sessions with offline embeddings and clusters.

Client-side only / Duplicate detection

What-If Scenarios

Simulate outcomes before running tests and prioritize high-impact experiments.

Info gain ranked / Scenario builder

Robustness Scoring

Evidence-weighted survival scores reveal fragile vs battle-tested ideas.

Support vs challenge / Robustness meter

Anomaly Detection

Track contradictions and spawn new hypotheses instead of burying them.

Anomaly register / Paradigm alerts

Similarity Search

Offline

Query: "morphogen gradient cell fate"

Morphogen gradient (RS-20251230): 82%

Statement 0.8 / Mechanism 0.6 / Domain 0.9

Timing gate model (RS-20250112): 71%

Statement 0.7 / Mechanism 0.5 / Domain 0.8

Signal relay chain (RS-20241018): 64%

Statement 0.6 / Mechanism 0.4 / Domain 0.9

Runs entirely client-side - your hypotheses never leave your machine.

What-If Scenario

Info gain

Starting confidence: 60%

If supports: 78%

If challenges: 35%

Expected information gain: 0.42

Best next test: Perturb gradient + checkpoint timing
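One common way to define expected information gain is the expected drop in Shannon entropy of your belief once the test result is in. The sketch below uses that definition with the confidences from the scenario above. The page does not say how its 0.42 figure is computed, so this should not be expected to reproduce it; the function names are hypothetical.

```typescript
// Binary Shannon entropy in bits for a belief with probability p.
function entropy(p: number): number {
  if (p <= 0 || p >= 1) return 0;
  return -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p));
}

// Expected entropy reduction for a test with two possible outcomes.
function expectedInfoGain(prior: number, ifSupports: number, ifChallenges: number): number {
  if (ifSupports === ifChallenges) return 0; // the test cannot move the belief
  // Probability of a supporting outcome implied by
  // prior = w * ifSupports + (1 - w) * ifChallenges.
  const w = (prior - ifChallenges) / (ifSupports - ifChallenges);
  const expectedPosterior = w * entropy(ifSupports) + (1 - w) * entropy(ifChallenges);
  return entropy(prior) - expectedPosterior;
}

const gain = expectedInfoGain(0.6, 0.78, 0.35); // ≈ 0.14 bits under this definition
```

Ranking candidate tests by a measure like this favors experiments whose outcomes would move confidence the most in either direction.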

Robustness

Survival score

H1: Morphogen gradient: 72%

3 supporting / 1 challenging (survived)

H2: Timing mechanism: 35%

1 supporting / 2 inconclusive

Anomaly Register

Quarantine

X-001: Active

Oscillating fate markers

Conflicts with H1 + H2

X-014: Deferred

Late-stage inversion

Waiting on potency control

Deep Dive

The Operator Algebra: Brenner's Methods as Executable Code

Sydney Brenner's breakthrough wasn't just his discoveries - it was his method. We've encoded his cognitive patterns as composable operators that you can apply systematically.

The Brenner Method in 4 Steps

Split the levels

Separate the 'what' from the 'how'

Design killing tests

Find experiments that eliminate possibilities

Choose your system

Pick the easiest organism/model to test with

Check the physics

Make sure it's physically possible

Want the precise notation? See the operators below.

⊘

Level-Split

"Separate program from interpreter"

Message vs machine, genotype vs phenotype. Includes the 'chastity vs impotence' diagnostic.

Template

"What is the information? What is the mechanism?"

✂

Exclusion-Test

"Design tests that eliminate, not confirm"

Forbidden patterns: what cannot occur if H is true. Rated by discriminative power.

Template

"If H1 is true, we should NEVER see..."
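The forbidden-pattern idea can be sketched as a predicate per hypothesis: an observation eliminates every hypothesis that forbids it. The types, the two example observations, and the "fraction eliminated" measure of discriminative power below are all assumptions made for illustration; the page does not define how BrennerBot rates tests.

```typescript
// Hypothetical observation record for a single experiment.
interface Observation {
  fateUnchangedAfterGradientPerturbation: boolean;
  fateUnchangedAfterTimingShift: boolean;
}

interface Hypothesis {
  id: string;
  // True when the observation is something that must never occur if the hypothesis holds.
  forbids: (obs: Observation) => boolean;
}

// Keep only the hypotheses the observation does not rule out.
function survivors(hypotheses: readonly Hypothesis[], obs: Observation): Hypothesis[] {
  return hypotheses.filter((h) => !h.forbids(obs));
}

// Illustrative measure: fraction of hypotheses eliminated by this observation.
function discriminativePower(hypotheses: readonly Hypothesis[], obs: Observation): number {
  if (hypotheses.length === 0) return 0;
  return 1 - survivors(hypotheses, obs).length / hypotheses.length;
}

const slate: Hypothesis[] = [
  { id: "H1", forbids: (o) => o.fateUnchangedAfterGradientPerturbation },
  { id: "H2", forbids: (o) => o.fateUnchangedAfterTimingShift },
];
const observed: Observation = {
  fateUnchangedAfterGradientPerturbation: true,
  fateUnchangedAfterTimingShift: false,
};
const remaining = survivors(slate, observed).map((h) => h.id); // ["H2"]
```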

⟂

Object-Transpose

"Change the system until the test is easy"

Choose organism or model strategically. The experimental object is a design variable.

Template

"What system would make this test cheap and unambiguous?"

⊞

Scale-Check

"Stay imprisoned in physics"

Validate against physical constraints. Calculate timescales, length scales, energy scales.

Template

"Is this physically possible at the relevant scale?"
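As a worked example of a scale check, one-dimensional diffusion covers a distance L in roughly t ≈ L² / (2D). The sketch below asks whether diffusion alone could carry a signal across a tissue in the time available. The diffusion coefficient and distances are illustrative placeholders, not measured values, and the function names are hypothetical.

```typescript
// Characteristic 1D diffusion time: t ≈ L² / (2D).
// lengthMicrons in µm, diffusionCoeff in µm²/s, result in seconds.
function diffusionTimeSeconds(lengthMicrons: number, diffusionCoeff: number): number {
  if (diffusionCoeff <= 0) throw new Error("diffusion coefficient must be positive");
  return lengthMicrons ** 2 / (2 * diffusionCoeff);
}

// True if diffusion alone can cover the distance within the available time.
function passesScaleCheck(
  lengthMicrons: number,
  diffusionCoeff: number,
  availableSeconds: number,
): boolean {
  return diffusionTimeSeconds(lengthMicrons, diffusionCoeff) <= availableSeconds;
}

const t = diffusionTimeSeconds(100, 10);        // 500 s for 100 µm at 10 µm²/s
const shortRange = passesScaleCheck(100, 10, 3600);  // true
const longRange = passesScaleCheck(1000, 10, 3600);  // false: 50,000 s needed
```

Because the time grows with the square of the distance, a tenfold longer range costs a hundredfold more time, which is the kind of constraint this operator is meant to surface early.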

The Core Composition

(⌂ ∘ ✂ ∘ ≡ ∘ ⊘) powered by (↑ ∘ ⟂ ∘ 🔧) constrained by (⊞) kept honest by (ΔE ∘ †)

- Start from a paradox (◊), split levels (⊘), extract invariants (≡)

- Design exclusion tests (✂), materialize as decision procedure (⌂)

- Power by amplification (↑) in well-chosen system (⟂) you build yourself (🔧)

- Constrain by physics (⊞), keep honest with exception handling (ΔE) and theory killing (†)

Extended Operators: 6 more patterns

↑

Amplify

Use selection, dominance, regime switches

◊

Paradox-hunt

Use contradictions as beacons

⊕

Cross-domain

Import tools from other fields

∿

Dephase

Work out of phase with fashion

†

Theory-kill

Drop hypotheses when the world says no

⌂

Materialize

What would I see if this were true?

TypeScript

brenner-loop/operators

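import { pipe } from "@/lib/brenner-loop/operators/framework";
// levelSplit, invariantExtract, exclusionTest, and materialize are assumed to be
// operator functions already in scope; their import path is not shown here.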

const brennerPipeline = pipe(
  levelSplit,        // Separate levels
  invariantExtract,  // Find what survives
  exclusionTest,     // Design killing experiments
  materialize,       // Compile to decision procedure
);

const result = brennerPipeline(hypothesis, context);
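The framework's own `pipe` is not shown here. As a rough sketch of what left-to-right operator composition could look like, assuming each operator takes the working state plus a shared context and returns a new state (the `Operator` type, `pipeSketch`, and the toy operators are all hypothetical):

```typescript
// Hypothetical operator signature: (state, context) => new state.
type Operator<S, C> = (state: S, context: C) => S;

// Compose operators left to right; each one receives the previous result.
function pipeSketch<S, C>(...operators: Operator<S, C>[]): Operator<S, C> {
  return (state, context) => operators.reduce((acc, op) => op(acc, context), state);
}

// Toy state: a log of the steps applied so far.
type Trace = readonly string[];
interface Ctx { domain: string }

// Builds a toy operator that records its own name and the context domain.
const step = (name: string): Operator<Trace, Ctx> =>
  (trace, ctx) => [...trace, `${name}@${ctx.domain}`];

const toyPipeline = pipeSketch(step("levelSplit"), step("exclusionTest"));
const trace = toyPipeline([], { domain: "biology" });
// trace: ["levelSplit@biology", "exclusionTest@biology"]
```

Keeping every operator to the same signature is what makes them freely composable: any sequence of operators is itself an operator.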

"I think many fields of science could do a great deal better if they went back to the classical approach of studying a problem, rather than following the latest fashion."
Sydney Brenner, Nobel Laureate in Physiology or Medicine, 2002
