LB · benchmark platform
LiveBench
Contamination-aware, objective benchmark with recent questions and verifiable answers.
verification status
verified
Last checked May 13, 2026
Evidence ledger
Modalities: text, code
Cadence: release-based
API: available
Evaluations: 773
Verification: verified
Verified runtime: 767
Manual verified: 0
Relay / mirrored: 0
Backfilled: 6
Relay sources mirror another provider's public page; manual rows are checked against the cited page; backfilled rows are historical inserts; seeded rows are demo fixtures. Relay rows are supporting evidence, not first-party measurements.
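The ledger counts can be cross-checked with simple arithmetic. The sketch below assumes that the Evaluations total is the sum of the four provenance categories, which the page does not state but which the figures are consistent with. The dictionary keys are illustrative names, not fields from the platform.

```python
# Provenance counts copied from the evidence ledger.
# Key names are hypothetical labels chosen for this sketch.
ledger = {
    "verified_runtime": 767,
    "manual_verified": 0,
    "relay_mirrored": 0,
    "backfilled": 6,
}
evaluations = 773  # "Evaluations" figure from the ledger

# Assumption: every evaluation row falls into exactly one category.
total = sum(ledger.values())
first_party_share = ledger["verified_runtime"] / evaluations

print(f"categories sum to {total}, reported total is {evaluations}")
print(f"verified-runtime share: {first_party_share:.1%}")
```

If the two totals ever diverge, rows are either uncategorised or counted in more than one category under this assumption.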
Operational state
snapshot
Latest pull
parquet · May 13, 2026
parser
Fetched 60372 LiveBench judgment rows, deduped to 59388 latest question-model pairs, and mapped 767 benchmark records into the app catalog.
ok · 0.4.0
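The parser step above reduces raw judgment rows to the latest row per question-model pair. A minimal sketch of that kind of deduplication follows. It is not the platform's parser: the `JudgmentRow` type, its field names, and the use of a timestamp to decide which row is "latest" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgmentRow:
    # All field names are hypothetical; the real parquet schema is not shown on this page.
    question_id: str
    model: str
    score: float
    judged_at: str  # ISO 8601 UTC timestamp, same format on every row


def dedupe_latest(rows):
    """Keep one row per (question_id, model) pair, preferring the newest judged_at."""
    latest = {}
    for row in rows:
        key = (row.question_id, row.model)
        kept = latest.get(key)
        # ISO 8601 strings in one fixed format sort chronologically as plain strings.
        if kept is None or row.judged_at > kept.judged_at:
            latest[key] = row
    return list(latest.values())


rows = [
    JudgmentRow("q1", "model-a", 0.0, "2026-05-01T00:00:00Z"),
    JudgmentRow("q1", "model-a", 1.0, "2026-05-12T00:00:00Z"),  # newer duplicate
    JudgmentRow("q1", "model-b", 0.5, "2026-05-10T00:00:00Z"),
]
latest = dedupe_latest(rows)
by_key = {(r.question_id, r.model): r for r in latest}
```

Here three input rows collapse to two pairs, and the `q1` / `model-a` pair keeps the newer judgment. The same shape of reduction would take 60372 fetched rows down to 59388 pairs if 984 rows were superseded duplicates.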
verify
livebench verification finished with status verified.
verified · May 13, 2026
Benchmarks from this source
Coding
Objective coding
Score
Reasoning
Objective reasoning
Score
Language
Objective language puzzles
Score
Instruction following
Objective instruction following
Score
Coding generation
Objective code generation
Score
Coding completion
Objective code completion
Score
Latest change explanation
livebench matched livebench-20260513T010648Z with no notable change causes detected.