Unbiased AI Bench (UAB): AI model rankings with source links.
Every score links back to its source.
Live · updated continuously
TERMINAL-BENCH · benchmark platform

Terminal-Bench

Agent benchmark for hard, realistic multi-step tasks completed inside terminal environments.
Verification status: verified (last checked May 13, 2026)

Evidence ledger

Modalities: code
Cadence: release-based
API: not public
Evaluations: 31
Verification: verified
Verified runtime: 28
Manual verified: 0
Relay / mirrored: 0
Backfilled: 3

Relay sources mirror another provider's public page; manual rows are checked against the cited page; backfilled rows are historical inserts; seeded rows are demo fixtures. Relay rows are supporting evidence, not first-party measurements.
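The ledger counts above can be cross-checked with a small tally. The following is a minimal sketch, not the site's actual code: the kind labels (`verified_runtime`, `manual_verified`, `relay`, `backfilled`) and the `tally` helper are hypothetical names chosen for illustration, and the rows are rebuilt from the published counts rather than pulled from any API.

```python
from collections import Counter

# Hypothetical kind labels, named after the ledger categories above.
KINDS = ("verified_runtime", "manual_verified", "relay", "backfilled")

# Illustrative rows rebuilt from the ledger counts:
# 28 verified runtime rows and 3 backfilled rows.
rows = ["verified_runtime"] * 28 + ["backfilled"] * 3


def tally(kinds):
    """Count rows per kind, reporting zero for kinds that do not occur."""
    counts = Counter(kinds)
    return {kind: counts[kind] for kind in KINDS}


ledger = tally(rows)
total = sum(ledger.values())

# Relay rows are supporting evidence, not first-party measurements,
# so they are tracked separately from the other kinds.
supporting = ledger["relay"]

print(ledger)
print("total evaluations:", total)
print("supporting (relay) rows:", supporting)
```

Under these assumptions the four categories sum to the 31 evaluations listed in the ledger, and the relay count is zero, so no row on this page is supporting-only evidence.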

Operational state

snapshot · Latest pull · May 13, 2026 · json

parser · Loaded 28 Terminal-Bench 2.0 benchmark records from verified rows. · 0.1.0 · ok

verify · terminal-bench verification finished with status verified. · May 13, 2026 · verified

Benchmarks from this source

Terminal-Bench 2.0 · Agentic terminal coding · Accuracy

Latest change explanation

terminal-bench matched terminal-bench-20260513T010704Z; no notable causes of change were detected.