Model profile · Google

Gemma 4 26B A4B

Open weights · mid · registry tag: 2026 open multimodal

Thin verified coverage

This model currently reads as thin verified coverage across the resolved source data.

Visible coverage: 17.8%
Verified coverage: 17.8%
Spread: 76.5%
Last verified: Jun 20, 2026

50% bench fit

text · vision · code · document · 5 aliases · 37 official source links


Data version

Current snapshot.

Data version: Jun 20, 2026
Model list checked: 9 providers · 1081 tracked models
Page refreshed: Jul 5, 2026

The registry snapshot and page stamp are shown so a stale deploy is visible at a glance.
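As a rough illustration of the staleness check described here, the gap between the two stamps can be computed directly. This is a minimal sketch; the function name `snapshot_lag_days` is hypothetical and not part of the registry.

```python
from datetime import date

# Stamps copied from the page: data version and page refresh.
data_version = date(2026, 6, 20)
page_refreshed = date(2026, 7, 5)


def snapshot_lag_days(data_version: date, page_refreshed: date) -> int:
    """Return how many days the page refresh trails the data version."""
    return (page_refreshed - data_version).days


print(snapshot_lag_days(data_version, page_refreshed))  # 15
```

A large or negative gap would be a cue to check whether the deploy or the registry snapshot is out of date.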

Source-linked scores by benchmark

Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.
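One way to picture a row is as a small record with the four fields named above. The sketch below is an assumption for illustration only: the class name `BenchmarkRow`, the field names, and the reading of "Combined" as the source type are not confirmed by the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRow:
    """Hypothetical record for one source-linked score."""
    benchmark: str      # row title, e.g. "Intelligence Index"
    source: str         # benchmark source
    source_type: str    # assumed to be the last tag on the row's label line
    raw_metric: str     # raw value kept as text, since units vary by row
    percentile: float   # percentile inside the fair comparison set


# Values copied from the Intelligence Index row on this page.
row = BenchmarkRow(
    benchmark="Intelligence Index",
    source="Artificial Analysis",
    source_type="Combined",
    raw_metric="20",
    percentile=69.4,
)
print(row)
```

Keeping the raw metric as text avoids guessing at units, which differ between rows (percentages, prices, seconds, Arena scores).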


Chat / text · 29 benchmarks · 74.5%

Intelligence Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #122 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 20
Percentile: 69.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.


AA-Omniscience accuracy

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #178 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 15.7%
Percentile: 40.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.


AA-Omniscience non-hallucination

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #245 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 7.8%
Percentile: 18.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.


IFBench

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #118 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 45.4%
Percentile: 62.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `ifbench`.


Blended price

AA · Chat / text · Speed / cost

Reports the blended price per 1M tokens; a cost measure rather than a test of conversational quality.

Rank #71 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.2 /1M tokens
Percentile: 74.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.


Input price

AA · Chat / text · Speed / cost

Reports the price per 1M input tokens; a cost measure rather than a test of conversational quality.

Rank #72 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.1 /1M input tokens
Percentile: 74.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.


Output price

AA · Chat / text · Speed / cost

Reports the price per 1M output tokens; a cost measure rather than a test of conversational quality.

Rank #72 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.4 /1M output tokens
Percentile: 75.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.

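The three price rows can be cross-checked against each other. The sketch below assumes, based only on the field name `price1mBlended0To3To1`, that the blended figure weights input and output tokens 3:1; the page does not state the formula, and the displayed prices are themselves rounded, so this is a consistency check rather than a derivation.

```python
# Prices copied from the Input price and Output price rows (USD per 1M tokens).
input_price = 0.1
output_price = 0.4

# Assumed weighting: 3 parts input to 1 part output (inferred from the field name).
INPUT_WEIGHT = 3
OUTPUT_WEIGHT = 1


def blended_price(input_price: float, output_price: float) -> float:
    """Weighted average price per 1M tokens under the assumed 3:1 mix."""
    total_weight = INPUT_WEIGHT + OUTPUT_WEIGHT
    return (INPUT_WEIGHT * input_price + OUTPUT_WEIGHT * output_price) / total_weight


blended = blended_price(input_price, output_price)
print(round(blended, 3))  # 0.175
print(round(blended, 1))  # 0.2, matching the displayed blended price
```

Under this assumption the result rounds to the $0.2 shown in the Blended price row.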

Output Speed

AA · Chat / text · Speed / cost

Reports median output speed in tokens per second; a throughput measure rather than a test of conversational quality.

Rank #192 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 41.4 tokens/s
Percentile: 9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.


Time to first token

AA · Chat / text · Speed / cost

Reports the median time to first token in seconds; a latency measure rather than a test of conversational quality.

Rank #90 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 1.99s
Percentile: 57.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.


Time to first answer token

AA · Chat / text · Speed / cost

Reports the median time to first answer token in seconds; a latency measure rather than a test of conversational quality.

Rank #57 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 1.99s
Percentile: 73.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.

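The speed and latency rows can be combined into a back-of-the-envelope response time. This is a simplified sketch, assuming a constant decode rate equal to the median output speed and a fixed start-up delay equal to the median time to first token; real requests will vary, and the function name `estimated_response_seconds` is hypothetical.

```python
# Medians copied from the Output Speed and Time to first token rows.
MEDIAN_TTFT_SECONDS = 1.99
MEDIAN_TOKENS_PER_SECOND = 41.4


def estimated_response_seconds(output_tokens: int) -> float:
    """Start-up delay plus decode time at a constant median rate."""
    return MEDIAN_TTFT_SECONDS + output_tokens / MEDIAN_TOKENS_PER_SECOND


# A 500-token answer under these assumptions.
print(round(estimated_response_seconds(500), 2))  # 14.07
```

At this throughput the decode time dominates for anything beyond short replies, which is consistent with the low Output Speed percentile on this page.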

Openness Index

AA · Chat / text · Combined

Reports the Artificial Analysis Openness Index for the model; a measure of release openness rather than a test of conversational quality.

Rank #101 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39
Percentile: 54.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.


Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,438
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #61. Votes: 5813. Organization: google. License: Apache 2.0.

CI: 1,430 - 1,446
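The Arena provenance line follows a regular "Key: value." pattern, so it can be split into fields mechanically. The parser below is a minimal sketch written for the line format shown on this page; the function name `parse_arena_provenance` and the output keys are hypothetical.

```python
import re


def parse_arena_provenance(line: str) -> dict[str, str]:
    """Split an Arena 'Parsed from ...' line into a dict of fields."""
    fields: dict[str, str] = {}

    # The dataset row id is the text between backticks.
    row = re.search(r"`([^`]+)`", line)
    if row:
        fields["row"] = row.group(1)

    # Drop the final period, then treat each later sentence as "Key: value".
    for part in line.rstrip(".").split(". ")[1:]:
        key, sep, value = part.partition(": ")
        if sep:
            fields[key.lower().replace(" ", "_")] = value
    return fields


line = (
    "Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. "
    "Category: overall. Source rank: #61. Votes: 5813. "
    "Organization: google. License: Apache 2.0."
)
parsed = parse_arena_provenance(line)
print(parsed["category"], parsed["source_rank"], parsed["votes"], parsed["license"])
```

Splitting on a period followed by a space keeps "Apache 2.0" intact, because the period inside the version number is not followed by a space.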

Text Arena · Creative Writing

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #55

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,402
Percentile: 83.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing. Source rank: #69. Votes: 955. Organization: google. License: Apache 2.0.

CI: 1,383 - 1,421

Text Arena · English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #60. Votes: 2519. Organization: google. License: Apache 2.0.

CI: 1,437 - 1,460

Text Arena · Exclude Ties

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,435
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: exclude_ties. Source rank: #61. Votes: 3982. Organization: google. License: Apache 2.0.

CI: 1,424 - 1,446

Text Arena · Hard Prompts

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,461
Percentile: 86.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts. Source rank: #57. Votes: 3277. Organization: google. License: Apache 2.0.

CI: 1,451 - 1,471

Text Arena · Hard Prompts English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 86.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts_english. Source rank: #57. Votes: 1486. Organization: google. License: Apache 2.0.

CI: 1,453 - 1,482

Text Arena · Instruction Following

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #37

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,439
Percentile: 88.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: instruction_following. Source rank: #48. Votes: 1611. Organization: google. License: Apache 2.0.

CI: 1,425 - 1,453

Text Arena · Longer Query

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: longer_query. Source rank: #56. Votes: 1561. Organization: google. License: Apache 2.0.

CI: 1,434 - 1,463

Text Arena · Multi Turn

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,446
Percentile: 85.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: multi_turn. Source rank: #62. Votes: 1090. Organization: google. License: Apache 2.0.

CI: 1,429 - 1,464

Text Arena · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 86.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #55. Votes: 5813. Organization: google. License: Apache 2.0.

CI: 1,427 - 1,442
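The confidence intervals reported with each Arena score help judge whether two scores are meaningfully different. A minimal sketch of an overlap check follows, using intervals copied from the rows above. Treating non-overlap as "clearly different" is a common rule of thumb assumed here, not a method stated by the page, and the function name `intervals_overlap` is hypothetical.

```python
Interval = tuple[int, int]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Text Arena overall: default ranking vs. No Style Control.
overall = (1430, 1446)
overall_no_style_control = (1427, 1442)

# Text Arena categories: Creative Writing vs. Hard Prompts.
creative_writing = (1383, 1421)
hard_prompts = (1451, 1471)

print(intervals_overlap(overall, overall_no_style_control))  # True
print(intervals_overlap(creative_writing, hard_prompts))     # False
```

On this reading, style control barely moves the overall score, while the gap between Creative Writing and Hard Prompts sits outside both intervals.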

Text Arena · Creative Writing · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #47

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,403
Percentile: 85.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing. Source rank: #60. Votes: 955. Organization: google. License: Apache 2.0.

CI: 1,384 - 1,422

Text Arena · English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,441
Percentile: 86.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #57. Votes: 2519. Organization: google. License: Apache 2.0.

CI: 1,430 - 1,453

Text Arena · Exclude Ties · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,429
Percentile: 86.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: exclude_ties. Source rank: #54. Votes: 3982. Organization: google. License: Apache 2.0.

CI: 1,418 - 1,440

Text Arena · Hard Prompts · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #47

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,439
Percentile: 85.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts. Source rank: #59. Votes: 3277. Organization: google. License: Apache 2.0.

CI: 1,428 - 1,449

Text Arena · Hard Prompts English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #52

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 84.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts_english. Source rank: #61. Votes: 1486. Organization: google. License: Apache 2.0.

CI: 1,429 - 1,458

Text Arena · Instruction Following · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,421
Percentile: 88%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: instruction_following. Source rank: #48. Votes: 1611. Organization: google. License: Apache 2.0.

CI: 1,407 - 1,435

Text Arena · Longer Query · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,429
Percentile: 85.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: longer_query. Source rank: #56. Votes: 1561. Organization: google. License: Apache 2.0.

CI: 1,415 - 1,444

Text Arena · Multi Turn · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,439
Percentile: 86.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: multi_turn. Source rank: #56. Votes: 1090. Organization: google. License: Apache 2.0.

CI: 1,421 - 1,456

Coding · 10 benchmarks · 50.2%

Terminal-Bench Hard

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #130 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 13.6%
Percentile: 57.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `terminalbenchHard`.


SciCode

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #93 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 37.3%
Percentile: 75%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `scicode`.


Coding Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #37 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39
Percentile: 52%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `codingIndex`.


Agentic Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #37 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 11
Percentile: 21.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `agenticIndex`.


Code Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,359
Percentile: 34.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #58. Votes: 1505. Organization: google. License: Apache 2.0.

CI: 1,343 - 1,375

WebDev Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,359
Percentile: 34.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: webdev. Source rank: #58. Votes: 1505. Organization: google. License: Apache 2.0.

CI: 1,343 - 1,375

Code Arena · Webdev Html

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #50

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,355
Percentile: 32.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: webdev-html. Source rank: #59. Votes: 204. Organization: google. License: Apache 2.0.

CI: 1,311 - 1,399

Code Arena · Webdev React

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #43

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,354
Percentile: 28.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: webdev-react. Source rank: #54. Votes: 1299. Organization: google. License: Apache 2.0.

CI: 1,337 - 1,371

Text Arena · Coding

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #53

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,480
Percentile: 83.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: coding. Source rank: #67. Votes: 1367. Organization: google. License: Apache 2.0.

CI: 1,464 - 1,495

Text Arena · Coding · No Style Control

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #59

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 81.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: coding. Source rank: #71. Votes: 1367. Organization: google. License: Apache 2.0.

CI: 1,428 - 1,459

Reasoning / math / science · 5 benchmarks · 78.7%

Humanity's Last Exam

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #90 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 10.7%
Percentile: 75.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `hle`.


GPQA

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #132 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 71.4%
Percentile: 65%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gpqa`.


CritPt

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #195 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 0%
Percentile: 65.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `critpt`.


Text Arena · Math

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #23

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 93%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: math. Source rank: #27. Votes: 372. Organization: google. License: Apache 2.0.

CI: 1,438 - 1,495

Text Arena · Math · No Style Control

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #18

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,466
Percentile: 94.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: math. Source rank: #22. Votes: 372. Organization: google. License: Apache 2.0.

CI: 1,438 - 1,495
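The percentage in each category header appears consistent with an unweighted mean of that category's row percentiles. The page does not state the formula, so the sketch below is an observation checked against the listed rows rather than the registry's documented method; the function name `category_score` is hypothetical.

```python
from statistics import fmean

# Percentiles copied from the rows of each category, in page order.
reasoning_percentiles = [75.9, 65.0, 65.2, 93.0, 94.6]
coding_percentiles = [57.3, 75.0, 52.0, 21.7, 34.2, 34.2, 32.9, 28.8, 83.8, 81.9]


def category_score(percentiles: list[float]) -> float:
    """Unweighted mean of row percentiles, rounded to one decimal place."""
    return round(fmean(percentiles), 1)


print(category_score(reasoning_percentiles))  # 78.7
print(category_score(coding_percentiles))     # 50.2
```

Both results match the header figures for Reasoning / math / science and Coding. One consequence of an unweighted mean is that closely related rows, such as the many Text Arena variants, each count as much as an independent benchmark.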

Professional reasoning · 19 benchmarks · 82.5%

GDPval-AA

AA · Professional reasoning · Rubric

Agentic performance on economically valuable work tasks.

Rank #36 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 715
Percentile: 23.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gdpvalBreakdown.elo`.


Text Arena · Expert

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #41

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,475
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: expert. Source rank: #50. Votes: 408. Organization: google. License: Apache 2.0.

CI: 1,448 - 1,502

Text Arena · Industry Business And Management And Financial Operations

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #37

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,449
Percentile: 88.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_business_and_management_and_financial_operations. Source rank: #48. Votes: 1050. Organization: google. License: Apache 2.0.

CI: 1,431 - 1,466

Text Arena · Industry Entertainment And Sports And Media

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,411
Percentile: 86.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_entertainment_and_sports_and_media. Source rank: #56. Votes: 1059. Organization: google. License: Apache 2.0.

CI: 1,393 - 1,429

Text Arena · Industry Legal And Government

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #37

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,460
Percentile: 87.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_legal_and_government. Source rank: #48. Votes: 388. Organization: google. License: Apache 2.0.

CI: 1,431 - 1,489

Text Arena · Industry Life And Physical And Social Science

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #57

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 82.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_life_and_physical_and_social_science. Source rank: #71. Votes: 875. Organization: google. License: Apache 2.0.

82.7% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,432 - 1,470)

Text Arena · Industry Mathematical

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #21

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,475
Percentile: 93.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_mathematical. Source rank: #24. Votes: 294. Organization: google. License: Apache 2.0.

93.5% percentile inside its fair comparison set

Raw benchmark value: 1,475 (CI 1,442 - 1,507)

Text Arena · Industry Medicine And Healthcare

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #81

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 72.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_medicine_and_healthcare. Source rank: #100. Votes: 370. Organization: google. License: Apache 2.0.

72.9% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,404 - 1,464)

Text Arena · Industry Software And It Services

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,475
Percentile: 85.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_software_and_it_services. Source rank: #61. Votes: 2095. Organization: google. License: Apache 2.0.

85.2% percentile inside its fair comparison set

Raw benchmark value: 1,475 (CI 1,462 - 1,487)

Text Arena · Industry Writing And Literature And Language

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #55

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,410
Percentile: 83.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_writing_and_literature_and_language. Source rank: #70. Votes: 1326. Organization: google. License: Apache 2.0.

83.3% percentile inside its fair comparison set

Raw benchmark value: 1,410 (CI 1,394 - 1,425)

Text Arena · Expert · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #41

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: expert. Source rank: #49. Votes: 408. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,448 (CI 1,421 - 1,475)

Text Arena · Industry Business And Management And Financial Operations · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #31

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 90.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_business_and_management_and_financial_operations. Source rank: #36. Votes: 1050. Organization: google. License: Apache 2.0.

90.6% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,417 - 1,452)

Text Arena · Industry Entertainment And Sports And Media · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #38

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,406
Percentile: 88.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_entertainment_and_sports_and_media. Source rank: #48. Votes: 1059. Organization: google. License: Apache 2.0.

88.5% percentile inside its fair comparison set

Raw benchmark value: 1,406 (CI 1,388 - 1,424)

Text Arena · Industry Legal And Government · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #30

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,453
Percentile: 90.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_legal_and_government. Source rank: #36. Votes: 388. Organization: google. License: Apache 2.0.

90.3% percentile inside its fair comparison set

Raw benchmark value: 1,453 (CI 1,425 - 1,482)

Text Arena · Industry Life And Physical And Social Science · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #55

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 83.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_life_and_physical_and_social_science. Source rank: #64. Votes: 875. Organization: google. License: Apache 2.0.

83.3% percentile inside its fair comparison set

Raw benchmark value: 1,443 (CI 1,424 - 1,462)

Text Arena · Industry Mathematical · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #20

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,469
Percentile: 93.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_mathematical. Source rank: #24. Votes: 294. Organization: google. License: Apache 2.0.

93.8% percentile inside its fair comparison set

Raw benchmark value: 1,469 (CI 1,437 - 1,501)

Text Arena · Industry Medicine And Healthcare · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #76

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,426
Percentile: 74.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_medicine_and_healthcare. Source rank: #90. Votes: 370. Organization: google. License: Apache 2.0.

74.6% percentile inside its fair comparison set

Raw benchmark value: 1,426 (CI 1,396 - 1,456)

Text Arena · Industry Software And It Services · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,453
Percentile: 85.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_software_and_it_services. Source rank: #58. Votes: 2095. Organization: google. License: Apache 2.0.

85.2% percentile inside its fair comparison set

Raw benchmark value: 1,453 (CI 1,441 - 1,465)
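
Each Arena row reports a rating together with a confidence interval, and most categories appear twice, with and without style control. One rough way to read such a pair is to check whether the two intervals overlap. The sketch below does that for the Mathematical and Software And It Services rows above. The `ArenaScore` type and both helper functions are hypothetical names, the page does not state the confidence level of the intervals, and interval overlap is only a heuristic, not a significance test.

```python
from typing import NamedTuple


class ArenaScore(NamedTuple):
    """One Arena rating with its reported confidence interval (illustrative type)."""
    rating: float
    ci_low: float
    ci_high: float


def intervals_overlap(a: ArenaScore, b: ArenaScore) -> bool:
    """True when the two confidence intervals share at least one point."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high


def overlap_width(a: ArenaScore, b: ArenaScore) -> float:
    """Length of the shared part of the two intervals, 0 if they are disjoint."""
    return max(0.0, min(a.ci_high, b.ci_high) - max(a.ci_low, b.ci_low))


# Values copied from the rows on this page.
math_style = ArenaScore(1475, 1442, 1507)
math_no_style = ArenaScore(1469, 1437, 1501)
software_style = ArenaScore(1475, 1462, 1487)
software_no_style = ArenaScore(1453, 1441, 1465)

for name, a, b in [
    ("mathematical", math_style, math_no_style),
    ("software_and_it_services", software_style, software_no_style),
]:
    print(name, a.rating - b.rating, intervals_overlap(a, b), overlap_width(a, b))
```

For the Mathematical pair the intervals share most of their range, while the Software And It Services pair (2,095 votes, hence tighter intervals) overlaps by only a few points despite a 22-point rating gap.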

Text Arena · Industry Writing And Literature And Language · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #51

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,409
Percentile: 84.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_writing_and_literature_and_language. Source rank: #63. Votes: 1326. Organization: google. License: Apache 2.0.

84.6% percentile inside its fair comparison set

Raw benchmark value: 1,409 (CI 1,393 - 1,424)

Search / tool use · 1 benchmark · 51.8%

Tau2-Bench Telecom

AA · Search / tool use · Objective

It matters when the model must browse, call tools, and recover useful answers from external systems.

Rank #150 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 40.4%
Percentile: 51.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `tau2`.

51.8% percentile inside its fair comparison set

Raw benchmark value: 40.4%

Long context · 1 benchmark · 62.2%

Long Context Reasoning

AA · Long context · Objective

It checks whether long-context claims survive contact with retrieval, memory, or long-document tasks.

Rank #120 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39.7%
Percentile: 62.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `lcr`.

62.2% percentile inside its fair comparison set

Raw benchmark value: 39.7%

Vision understanding · 17 benchmarks · 62.2%

MMMU-Pro

AA · Vision understanding · Objective

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #46 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 66.7%
Percentile: 66.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `mmmuPro`.

66.7% percentile inside its fair comparison set

Raw benchmark value: 66.7%

Vision Arena

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,238
Percentile: 76.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #35. Votes: 16970. Organization: google. License: Apache 2.0.

76.1% percentile inside its fair comparison set

Raw benchmark value: 1,238 (CI 1,231 - 1,246)

Vision Arena · Creative Writing Vision

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #26

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,238
Percentile: 54.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing_vision. Source rank: #33. Votes: 983. Organization: google. License: Apache 2.0.

54.5% percentile inside its fair comparison set

Raw benchmark value: 1,238 (CI 1,218 - 1,258)

Vision Arena · Diagram

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #29

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,260
Percentile: 60%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: diagram. Source rank: #38. Votes: 4370. Organization: google. License: Apache 2.0.

60% percentile inside its fair comparison set

Raw benchmark value: 1,260 (CI 1,250 - 1,271)

Vision Arena · English

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #31

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,229
Percentile: 72.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #40. Votes: 6911. Organization: google. License: Apache 2.0.

72.5% percentile inside its fair comparison set

Raw benchmark value: 1,229 (CI 1,219 - 1,239)

Vision Arena · Entity Recognition

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #28

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,169
Percentile: 15.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: entity_recognition. Source rank: #31. Votes: 99. Organization: google. License: Apache 2.0.

15.6% percentile inside its fair comparison set

Raw benchmark value: 1,169 (CI 1,113 - 1,225)

Vision Arena · Homework

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #19

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,285
Percentile: 73.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: homework. Source rank: #25. Votes: 2961. Organization: google. License: Apache 2.0.

73.5% percentile inside its fair comparison set

Raw benchmark value: 1,285 (CI 1,273 - 1,297)

Vision Arena · Humor

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #21

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,243
Percentile: 59.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: humor. Source rank: #26. Votes: 663. Organization: google. License: Apache 2.0.

59.2% percentile inside its fair comparison set

Raw benchmark value: 1,243 (CI 1,220 - 1,267)

Vision Arena · Ocr

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,252
Percentile: 62.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: ocr. Source rank: #35. Votes: 11958. Organization: google. License: Apache 2.0.

62.9% percentile inside its fair comparison set

Raw benchmark value: 1,252 (CI 1,245 - 1,260)

Vision Arena · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #23

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,256
Percentile: 79.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #30. Votes: 16970. Organization: google. License: Apache 2.0.

79.8% percentile inside its fair comparison set

Raw benchmark value: 1,256 (CI 1,249 - 1,263)

Vision Arena · Creative Writing Vision · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #20

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 65.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing_vision. Source rank: #25. Votes: 983. Organization: google. License: Apache 2.0.

65.5% percentile inside its fair comparison set

Raw benchmark value: 1,263 (CI 1,244 - 1,283)

Vision Arena · Diagram · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 62.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: diagram. Source rank: #34. Votes: 4370. Organization: google. License: Apache 2.0.

62.9% percentile inside its fair comparison set

Raw benchmark value: 1,263 (CI 1,252 - 1,273)

Vision Arena · English · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #30

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,247
Percentile: 73.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #37. Votes: 6911. Organization: google. License: Apache 2.0.

73.4% percentile inside its fair comparison set

Raw benchmark value: 1,247 (CI 1,237 - 1,257)

Vision Arena · Entity Recognition · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,197
Percentile: 18.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: entity_recognition. Source rank: #29. Votes: 99. Organization: google. License: Apache 2.0.

18.8% percentile inside its fair comparison set

Raw benchmark value: 1,197 (CI 1,140 - 1,254)

Vision Arena · Homework · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,293
Percentile: 76.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: homework. Source rank: #22. Votes: 2961. Organization: google. License: Apache 2.0.

76.5% percentile inside its fair comparison set

Raw benchmark value: 1,293 (CI 1,281 - 1,305)

Vision Arena · Humor · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #16

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,270
Percentile: 69.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: humor. Source rank: #19. Votes: 663. Organization: google. License: Apache 2.0.

69.4% percentile inside its fair comparison set

Raw benchmark value: 1,270 (CI 1,247 - 1,294)

Vision Arena · Ocr · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #22

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 70%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: ocr. Source rank: #28. Votes: 11958. Organization: google. License: Apache 2.0.

70% percentile inside its fair comparison set

Raw benchmark value: 1,263 (CI 1,255 - 1,270)

Multilingual · 8 benchmarks · 84.9%

Text Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #52

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,477
Percentile: 82.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #64. Votes: 387. Organization: google. License: Apache 2.0.

82.7% percentile inside its fair comparison set

Raw benchmark value: 1,477 (CI 1,447 - 1,507)

Text Arena · French

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #28

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,478
Percentile: 87.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: french. Source rank: #36. Votes: 215. Organization: google. License: Apache 2.0.

87.5% percentile inside its fair comparison set

Raw benchmark value: 1,478 (CI 1,437 - 1,519)

Text Arena · Russian

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #41

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,447
Percentile: 86.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: russian. Source rank: #52. Votes: 685. Organization: google. License: Apache 2.0.

86.2% percentile inside its fair comparison set

Raw benchmark value: 1,447 (CI 1,426 - 1,468)

Text Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #35

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,479
Percentile: 88.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #41. Votes: 387. Organization: google. License: Apache 2.0.

88.5% percentile inside its fair comparison set

Raw benchmark value: 1,479 (CI 1,449 - 1,509)

Text Arena · French · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #21

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 90.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: french. Source rank: #26. Votes: 215. Organization: google. License: Apache 2.0.

90.7% percentile inside its fair comparison set

Raw benchmark value: 1,467 (CI 1,426 - 1,508)

Text Arena · Russian · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #28

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 90.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: russian. Source rank: #36. Votes: 685. Organization: google. License: Apache 2.0.

90.7% percentile inside its fair comparison set

Raw benchmark value: 1,443 (CI 1,422 - 1,463)

Vision Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #20

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,300
Percentile: 75.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #25. Votes: 1062. Organization: google. License: Apache 2.0.

75.3% percentile inside its fair comparison set

Raw benchmark value: 1,300 (CI 1,277 - 1,323)

Vision Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #18

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,321
Percentile: 77.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #22. Votes: 1062. Organization: google. License: Apache 2.0.

77.9% percentile inside its fair comparison set

Raw benchmark value: 1,321 (CI 1,298 - 1,344)
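
The page does not say how a category's headline percentage is derived from its rows. As a hedged check, the sketch below takes the row percentiles listed above for the Multilingual and Vision understanding categories and computes their unweighted mean. The `category_score` helper is a hypothetical name, and the unweighted mean is an assumption that happens to reproduce the displayed 84.9% and 62.2%; it is not a documented aggregation rule.

```python
from statistics import fmean

# Row percentiles copied from this page, in the order the rows appear.
multilingual = [82.7, 87.5, 86.2, 88.5, 90.7, 90.7, 75.3, 77.9]
vision = [
    66.7, 76.1, 54.5, 60.0, 72.5, 15.6, 73.5, 59.2, 62.9,
    79.8, 65.5, 62.9, 73.4, 18.8, 76.5, 69.4, 70.0,
]


def category_score(percentiles: list[float]) -> float:
    """Unweighted mean of row percentiles, rounded to one decimal (assumed rule)."""
    return round(fmean(percentiles), 1)


print(len(multilingual), category_score(multilingual))  # 8 84.9
print(len(vision), category_score(vision))              # 17 62.2
```

Under this reading the two Entity Recognition rows (15.6% and 18.8%, from 99 votes) pull the Vision understanding figure down noticeably, since every row counts equally regardless of vote count.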

Source links and registry checks

official · Arena · Jun 20, 2026 (36 source links)

official · Artificial Analysis · Jun 20, 2026 (1 source link)

AA-Omniscience non-hallucination

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #245 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtimeexact alias

Raw row drilldownsource row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 7.8%
Percentile: 18.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.

18.1% percentile inside its fair comparison set

7.8%Raw benchmark value

IFBench

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #118 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtimeexact alias

Raw row drilldownsource row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 45.4%
Percentile: 62.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `ifbench`.

62.9% percentile inside its fair comparison set

45.4%Raw benchmark value

Blended price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #71 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtimeexact alias

Raw row drilldownsource row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.2 /1M tokens
Percentile: 74.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.

74.6% percentile inside its fair comparison set

$0.2 /1M tokensRaw benchmark value

Input price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #72 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtimeexact alias

Raw row drilldownsource row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.1 /1M input tokens
Percentile: 74.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.

74.3% percentile inside its fair comparison set

$0.1 /1M input tokensRaw benchmark value

Output price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #72 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtimeexact alias

Raw row drilldownsource row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.4 /1M output tokens
Percentile: 75.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.

75.4% percentile inside its fair comparison set

Raw benchmark value: $0.4 /1M output tokens
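
The field name `price1mBlended0To3To1` suggests the blended price weights input and output tokens 3:1. That weighting is an assumption read off the field name, not something this page states. Under it, the input and output prices listed above give a figure consistent with the displayed blended price after rounding. A minimal sketch (the function name and defaults are illustrative):

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default is an assumption inferred from the field name
    `price1mBlended0To3To1`; it is not confirmed by the page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices as displayed on the page (USD per 1M tokens, already rounded).
price = blended_price(input_price=0.1, output_price=0.4)
print(f"${price:.3f} per 1M tokens")
```

Because the page shows prices rounded to one decimal place, the result is only approximate; it is in line with the displayed $0.2 /1M tokens.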

Output Speed

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #192 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 41.4 tokens/s
Percentile: 9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.

9% percentile inside its fair comparison set

Raw benchmark value: 41.4 tokens/s

Time to first token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #90 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 1.99s
Percentile: 57.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.

57.6% percentile inside its fair comparison set

Raw benchmark value: 1.99s

Time to first answer token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #57 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 1.99s
Percentile: 73.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.

73.3% percentile inside its fair comparison set

Raw benchmark value: 1.99s
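
The median output speed (41.4 tokens/s) and median time to first token (1.99s) can be combined into a rough end-to-end latency estimate. The sketch below assumes a simple linear model, first-token delay plus steady-state generation. That model is an illustration, not how the source measures anything, and real latency varies by provider, load, and prompt length.

```python
def estimated_response_seconds(output_tokens: int,
                               tokens_per_second: float = 41.4,
                               time_to_first_token: float = 1.99) -> float:
    """Rough latency estimate: first-token delay plus generation time.

    Defaults are the median values shown on the page; the linear
    model itself is an assumption made for illustration.
    """
    return time_to_first_token + output_tokens / tokens_per_second


for n in (100, 500, 2000):
    print(f"{n} output tokens: about {estimated_response_seconds(n):.1f}s")
```

Under these assumptions, generation time dominates for anything beyond a short reply, so the low output-speed percentile matters more in practice than the time to first token.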

Openness Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #101 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39
Percentile: 54.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.

54.8% percentile inside its fair comparison set

Raw benchmark value: 39

Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,438
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #61. Votes: 5813. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,438 (CI 1,430 - 1,446)

Text Arena · Creative Writing

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #55

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,402
Percentile: 83.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing. Source rank: #69. Votes: 955. Organization: google. License: Apache 2.0.

83.3% percentile inside its fair comparison set

Raw benchmark value: 1,402 (CI 1,383 - 1,421)

Text Arena · English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #60. Votes: 2519. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,448 (CI 1,437 - 1,460)

Text Arena · Exclude Ties

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,435
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: exclude_ties. Source rank: #61. Votes: 3982. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,435 (CI 1,424 - 1,446)

Text Arena · Hard Prompts

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,461
Percentile: 86.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts. Source rank: #57. Votes: 3277. Organization: google. License: Apache 2.0.

86.5% percentile inside its fair comparison set

Raw benchmark value: 1,461 (CI 1,451 - 1,471)

Text Arena · Hard Prompts English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 86.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts_english. Source rank: #57. Votes: 1486. Organization: google. License: Apache 2.0.

86.4% percentile inside its fair comparison set

Raw benchmark value: 1,467 (CI 1,453 - 1,482)

Text Arena · Instruction Following

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #37

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,439
Percentile: 88.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: instruction_following. Source rank: #48. Votes: 1611. Organization: google. License: Apache 2.0.

88.9% percentile inside its fair comparison set

Raw benchmark value: 1,439 (CI 1,425 - 1,453)

Text Arena · Longer Query

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: longer_query. Source rank: #56. Votes: 1561. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,448 (CI 1,434 - 1,463)

Text Arena · Multi Turn

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #48

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,446
Percentile: 85.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: multi_turn. Source rank: #62. Votes: 1090. Organization: google. License: Apache 2.0.

85.4% percentile inside its fair comparison set

Raw benchmark value: 1,446 (CI 1,429 - 1,464)

Text Arena · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 86.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #55. Votes: 5813. Organization: google. License: Apache 2.0.

86.8% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,427 - 1,442)

Text Arena · Creative Writing · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #47

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,403
Percentile: 85.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing. Source rank: #60. Votes: 955. Organization: google. License: Apache 2.0.

85.8% percentile inside its fair comparison set

Raw benchmark value: 1,403 (CI 1,384 - 1,422)

Text Arena · English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,441
Percentile: 86.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #57. Votes: 2519. Organization: google. License: Apache 2.0.

86.5% percentile inside its fair comparison set

Raw benchmark value: 1,441 (CI 1,430 - 1,453)

Text Arena · Exclude Ties · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,429
Percentile: 86.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: exclude_ties. Source rank: #54. Votes: 3982. Organization: google. License: Apache 2.0.

86.8% percentile inside its fair comparison set

Raw benchmark value: 1,429 (CI 1,418 - 1,440)

Text Arena · Hard Prompts · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #47

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,439
Percentile: 85.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts. Source rank: #59. Votes: 3277. Organization: google. License: Apache 2.0.

85.8% percentile inside its fair comparison set

Raw benchmark value: 1,439 (CI 1,428 - 1,449)
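
Each Arena card reports a score with a confidence interval, so one rough way to compare two cards is to check whether their intervals overlap. The sketch below parses the interval strings as displayed on this page and compares the Hard Prompts card with its No Style Control variant. The helper names are illustrative, the page does not state the confidence level, and non-overlap is only a heuristic, not a formal significance test.

```python
def parse_ci(text: str) -> tuple[int, int]:
    """Parse an interval such as '1,451 - 1,471' into (low, high)."""
    low, high = (int(part.strip().replace(",", "")) for part in text.split(" - "))
    return low, high


def intervals_overlap(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Interval strings copied from the two Hard Prompts cards on this page.
style_controlled = parse_ci("1,451 - 1,471")
no_style_control = parse_ci("1,428 - 1,449")

print(intervals_overlap(style_controlled, no_style_control))
```

For these two cards the intervals do not overlap, which suggests the gap between the two Hard Prompts scores is larger than the reported uncertainty.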

Text Arena · Hard Prompts English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #52

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 84.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts_english. Source rank: #61. Votes: 1486. Organization: google. License: Apache 2.0.

84.3% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,429 - 1,458)

Text Arena · Instruction Following · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,421
Percentile: 88%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: instruction_following. Source rank: #48. Votes: 1611. Organization: google. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,421 (CI 1,407 - 1,435)

Text Arena · Longer Query · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,429
Percentile: 85.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: longer_query. Source rank: #56. Votes: 1561. Organization: google. License: Apache 2.0.

85.9% percentile inside its fair comparison set

Raw benchmark value: 1,429 (CI 1,415 - 1,444)

Text Arena · Multi Turn · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,439
Percentile: 86.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: multi_turn. Source rank: #56. Votes: 1090. Organization: google. License: Apache 2.0.

86.4% percentile inside its fair comparison set

Raw benchmark value: 1,439 (CI 1,421 - 1,456)

Coding · 10 benchmarks · 50.2%

Terminal-Bench Hard

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #130 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 13.6%
Percentile: 57.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `terminalbenchHard`.

57.3% percentile inside its fair comparison set

Raw benchmark value: 13.6%

SciCode

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #93 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 37.3%
Percentile: 75%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `scicode`.

75% percentile inside its fair comparison set

Raw benchmark value: 37.3%

Coding Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #37 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39
Percentile: 52%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `codingIndex`.

52% percentile inside its fair comparison set

Raw benchmark value: 39

Agentic Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #37 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 11
Percentile: 21.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `agenticIndex`.

21.7% percentile inside its fair comparison set

Raw benchmark value: 11

Code Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,359
Percentile: 34.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #58. Votes: 1505. Organization: google. License: Apache 2.0.

34.2% percentile inside its fair comparison set

Raw benchmark value: 1,359 (CI 1,343 - 1,375)

WebDev Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,359
Percentile: 34.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: webdev. Source rank: #58. Votes: 1505. Organization: google. License: Apache 2.0.

34.2% percentile inside its fair comparison set

Raw benchmark value: 1,359 (CI 1,343 - 1,375)

Code Arena · Webdev Html

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #50

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,355
Percentile: 32.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: webdev-html. Source rank: #59. Votes: 204. Organization: google. License: Apache 2.0.

32.9% percentile inside its fair comparison set

Raw benchmark value: 1,355 (CI 1,311 - 1,399)

Code Arena · Webdev React

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #43

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,354
Percentile: 28.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: webdev-react. Source rank: #54. Votes: 1299. Organization: google. License: Apache 2.0.

28.8% percentile inside its fair comparison set

Raw benchmark value: 1,354 (CI 1,337 - 1,371)

Text Arena · Coding

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #53

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,480
Percentile: 83.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: coding. Source rank: #67. Votes: 1367. Organization: google. License: Apache 2.0.

83.8% percentile inside its fair comparison set

Raw benchmark value: 1,480 (CI 1,464 - 1,495)

Text Arena · Coding · No Style Control

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #59

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 81.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: coding. Source rank: #71. Votes: 1367. Organization: google. License: Apache 2.0.

81.9% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,428 - 1,459)

Reasoning / math / science · 5 benchmarks · 78.7%

Humanity's Last Exam

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #90 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 10.7%
Percentile: 75.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `hle`.

75.9% percentile inside its fair comparison set

Raw benchmark value: 10.7%

GPQA

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #132 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 71.4%
Percentile: 65%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gpqa`.

65% percentile inside its fair comparison set

Raw benchmark value: 71.4%

CritPt

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #195 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 0%
Percentile: 65.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `critpt`.

65.2% percentile inside its fair comparison set

Raw benchmark value: 0%

Text Arena · Math

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #23

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 93%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: math. Source rank: #27. Votes: 372. Organization: google. License: Apache 2.0.

93% percentile inside its fair comparison set

Raw benchmark value: 1,467 (CI 1,438 - 1,495)

Text Arena · Math · No Style Control

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #18

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,466
Percentile: 94.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: math. Source rank: #22. Votes: 372. Organization: google. License: Apache 2.0.

94.6% percentile inside its fair comparison set

Raw benchmark value: 1,466 (CI 1,438 - 1,495)

Professional reasoning · 19 benchmarks · 82.5%

GDPval-AA

AA · Professional reasoning · Rubric

Agentic performance on economically valuable work tasks.

Rank #36 · Source label: Gemma 4 26B A4B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 715
Percentile: 23.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gdpvalBreakdown.elo`.

23.9% percentile inside its fair comparison set

Raw benchmark value: 715

Text Arena · Expert

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #41

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,475
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: expert. Source rank: #50. Votes: 408. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,475 (CI 1,448 - 1,502)

Text Arena · Industry Business And Management And Financial Operations

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #37

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,449
Percentile: 88.7%
Last updated: recent
Eligibility: headline eligible

88.7% percentile inside its fair comparison set

Raw benchmark value: 1,449 (CI 1,431 - 1,466)

Text Arena · Industry Entertainment And Sports And Media

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #44

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,411
Percentile: 86.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_entertainment_and_sports_and_media. Source rank: #56. Votes: 1059. Organization: google. License: Apache 2.0.

86.7% percentile inside its fair comparison set

Raw benchmark value: 1,411 (CI 1,393 - 1,429)

Text Arena · Industry Legal And Government

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #37

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,460
Percentile: 87.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_legal_and_government. Source rank: #48. Votes: 388. Organization: google. License: Apache 2.0.

87.9% percentile inside its fair comparison set

Raw benchmark value: 1,460 (CI 1,431 - 1,489)

Text Arena · Industry Life And Physical And Social Science

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #57

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 82.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_life_and_physical_and_social_science. Source rank: #71. Votes: 875. Organization: google. License: Apache 2.0.

82.7% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,432 - 1,470)

Text Arena · Industry Mathematical

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #21

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,475
Percentile: 93.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_mathematical. Source rank: #24. Votes: 294. Organization: google. License: Apache 2.0.

93.5% percentile inside its fair comparison set

Raw benchmark value: 1,475 (CI 1,442 - 1,507)

Text Arena · Industry Medicine And Healthcare

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #81

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 72.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_medicine_and_healthcare. Source rank: #100. Votes: 370. Organization: google. License: Apache 2.0.

72.9% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,404 - 1,464)

Text Arena · Industry Software And It Services

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,475
Percentile: 85.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_software_and_it_services. Source rank: #61. Votes: 2095. Organization: google. License: Apache 2.0.

85.2% percentile inside its fair comparison set

Raw benchmark value: 1,475 (CI 1,462 - 1,487)

Text Arena · Industry Writing And Literature And Language

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #55

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,410
Percentile: 83.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_writing_and_literature_and_language. Source rank: #70. Votes: 1326. Organization: google. License: Apache 2.0.

83.3% percentile inside its fair comparison set

Raw benchmark value: 1,410 (CI 1,394 - 1,425)

Text Arena · Expert · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #41

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 85.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: expert. Source rank: #49. Votes: 408. Organization: google. License: Apache 2.0.

85.5% percentile inside its fair comparison set

Raw benchmark value: 1,448 (CI 1,421 - 1,475)

Text Arena · Industry Business And Management And Financial Operations · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #31

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 90.6%
Last updated: recent
Eligibility: headline eligible

90.6% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,417 - 1,452)

Text Arena · Industry Entertainment And Sports And Media · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #38

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,406
Percentile: 88.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_entertainment_and_sports_and_media. Source rank: #48. Votes: 1059. Organization: google. License: Apache 2.0.

88.5% percentile inside its fair comparison set

Raw benchmark value: 1,406 (CI 1,388 - 1,424)

Text Arena · Industry Legal And Government · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #30

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,453
Percentile: 90.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_legal_and_government. Source rank: #36. Votes: 388. Organization: google. License: Apache 2.0.

90.3% percentile inside its fair comparison set

Raw benchmark value: 1,453 (CI 1,425 - 1,482)

Text Arena · Industry Life And Physical And Social Science · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #55

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 83.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_life_and_physical_and_social_science. Source rank: #64. Votes: 875. Organization: google. License: Apache 2.0.

83.3% percentile inside its fair comparison set

Raw benchmark value: 1,443 · CI 1,424 - 1,462

Text Arena · Industry Mathematical · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #20

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,469
Percentile: 93.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_mathematical. Source rank: #24. Votes: 294. Organization: google. License: Apache 2.0.

93.8% percentile inside its fair comparison set

Raw benchmark value: 1,469 · CI 1,437 - 1,501

Text Arena · Industry Medicine And Healthcare · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #76

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,426
Percentile: 74.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_medicine_and_healthcare. Source rank: #90. Votes: 370. Organization: google. License: Apache 2.0.

74.6% percentile inside its fair comparison set

Raw benchmark value: 1,426 · CI 1,396 - 1,456

Text Arena · Industry Software And IT Services · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #49

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,453
Percentile: 85.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_software_and_it_services. Source rank: #58. Votes: 2095. Organization: google. License: Apache 2.0.

85.2% percentile inside its fair comparison set

Raw benchmark value: 1,453 · CI 1,441 - 1,465

Text Arena · Industry Writing And Literature And Language · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #51

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,409
Percentile: 84.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: industry_writing_and_literature_and_language. Source rank: #63. Votes: 1326. Organization: google. License: Apache 2.0.

84.6% percentile inside its fair comparison set

Raw benchmark value: 1,409 · CI 1,393 - 1,424
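
The Arena rows above each carry a provenance line of the form "Parsed from Arena leaderboard dataset row ...", listing category, source rank, votes, organization, and license in a fixed order. As a minimal sketch of how such a line could be split into fields, assuming the wording stays exactly as shown on this page (the function name `parse_provenance` and the dictionary keys are hypothetical and are not part of any registry or Arena API):

```python
import re

# Pattern mirrors the provenance wording as it appears on this page.
# This is an assumption about the format, not a documented schema.
PROVENANCE = re.compile(
    r"Parsed from Arena leaderboard dataset row `(?P<row>[^`]+)`\. "
    r"Category: (?P<category>\w+)\. "
    r"Source rank: #(?P<source_rank>\d+)\. "
    r"Votes: (?P<votes>\d+)\. "
    r"Organization: (?P<organization>[^.]+)\. "
    r"License: (?P<license>.+)\.$"
)


def parse_provenance(line: str) -> dict:
    """Split one provenance line into named fields (hypothetical helper)."""
    match = PROVENANCE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized provenance line: {line!r}")
    fields = match.groupdict()
    # Rank and vote count are numeric on the page, so convert them.
    fields["source_rank"] = int(fields["source_rank"])
    fields["votes"] = int(fields["votes"])
    return fields


example = (
    "Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. "
    "Category: industry_mathematical. Source rank: #24. Votes: 294. "
    "Organization: google. License: Apache 2.0."
)
print(parse_provenance(example))
```

Rows without a provenance line, or with different wording, raise `ValueError` rather than returning partial data, so gaps stay visible.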

Search / tool use · 1 benchmark · 51.8%

Tau2-Bench Telecom

AA · Search / tool use · Objective

It matters when the model must browse, call tools, and recover useful answers from external systems.

Rank #150 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 40.4%
Percentile: 51.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `tau2`.

51.8% percentile inside its fair comparison set

Raw benchmark value: 40.4%

Long context · 1 benchmark · 62.2%

Long Context Reasoning

AA · Long context · Objective

It checks whether long-context claims survive contact with retrieval, memory, or long-document tasks.

Rank #120 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39.7%
Percentile: 62.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `lcr`.

62.2% percentile inside its fair comparison set

Raw benchmark value: 39.7%

Vision understanding · 17 benchmarks · 62.2%

MMMU-Pro

AA · Vision understanding · Objective

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #46 · Source label: Gemma 4 26B A4B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 66.7%
Percentile: 66.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `mmmuPro`.

66.7% percentile inside its fair comparison set

Raw benchmark value: 66.7%

Vision Arena

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,238
Percentile: 76.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #35. Votes: 16970. Organization: google. License: Apache 2.0.

76.1% percentile inside its fair comparison set

Raw benchmark value: 1,238 · CI 1,231 - 1,246

Vision Arena · Creative Writing Vision

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #26

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,238
Percentile: 54.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing_vision. Source rank: #33. Votes: 983. Organization: google. License: Apache 2.0.

54.5% percentile inside its fair comparison set

Raw benchmark value: 1,238 · CI 1,218 - 1,258

Vision Arena · Diagram

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #29

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,260
Percentile: 60%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: diagram. Source rank: #38. Votes: 4370. Organization: google. License: Apache 2.0.

60% percentile inside its fair comparison set

Raw benchmark value: 1,260 · CI 1,250 - 1,271

Vision Arena · English

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #31

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,229
Percentile: 72.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #40. Votes: 6911. Organization: google. License: Apache 2.0.

72.5% percentile inside its fair comparison set

Raw benchmark value: 1,229 · CI 1,219 - 1,239

Vision Arena · Entity Recognition

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #28

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,169
Percentile: 15.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: entity_recognition. Source rank: #31. Votes: 99. Organization: google. License: Apache 2.0.

15.6% percentile inside its fair comparison set

Raw benchmark value: 1,169 · CI 1,113 - 1,225

Vision Arena · Homework

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #19

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,285
Percentile: 73.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: homework. Source rank: #25. Votes: 2961. Organization: google. License: Apache 2.0.

73.5% percentile inside its fair comparison set

Raw benchmark value: 1,285 · CI 1,273 - 1,297

Vision Arena · Humor

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #21

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,243
Percentile: 59.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: humor. Source rank: #26. Votes: 663. Organization: google. License: Apache 2.0.

59.2% percentile inside its fair comparison set

Raw benchmark value: 1,243 · CI 1,220 - 1,267

Vision Arena · OCR

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,252
Percentile: 62.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: ocr. Source rank: #35. Votes: 11958. Organization: google. License: Apache 2.0.

62.9% percentile inside its fair comparison set

Raw benchmark value: 1,252 · CI 1,245 - 1,260

Vision Arena · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #23

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,256
Percentile: 79.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #30. Votes: 16970. Organization: google. License: Apache 2.0.

79.8% percentile inside its fair comparison set

Raw benchmark value: 1,256 · CI 1,249 - 1,263

Vision Arena · Creative Writing Vision · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #20

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 65.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing_vision. Source rank: #25. Votes: 983. Organization: google. License: Apache 2.0.

65.5% percentile inside its fair comparison set

Raw benchmark value: 1,263 · CI 1,244 - 1,283

Vision Arena · Diagram · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 62.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: diagram. Source rank: #34. Votes: 4370. Organization: google. License: Apache 2.0.

62.9% percentile inside its fair comparison set

Raw benchmark value: 1,263 · CI 1,252 - 1,273

Vision Arena · English · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #30

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,247
Percentile: 73.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #37. Votes: 6911. Organization: google. License: Apache 2.0.

73.4% percentile inside its fair comparison set

Raw benchmark value: 1,247 · CI 1,237 - 1,257

Vision Arena · Entity Recognition · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #27

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,197
Percentile: 18.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: entity_recognition. Source rank: #29. Votes: 99. Organization: google. License: Apache 2.0.

18.8% percentile inside its fair comparison set

Raw benchmark value: 1,197 · CI 1,140 - 1,254

Vision Arena · Homework · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,293
Percentile: 76.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: homework. Source rank: #22. Votes: 2961. Organization: google. License: Apache 2.0.

76.5% percentile inside its fair comparison set

Raw benchmark value: 1,293 · CI 1,281 - 1,305

Vision Arena · Humor · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #16

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,270
Percentile: 69.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: humor. Source rank: #19. Votes: 663. Organization: google. License: Apache 2.0.

69.4% percentile inside its fair comparison set

Raw benchmark value: 1,270 · CI 1,247 - 1,294

Vision Arena · OCR · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #22

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 70%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: ocr. Source rank: #28. Votes: 11958. Organization: google. License: Apache 2.0.

70% percentile inside its fair comparison set

Raw benchmark value: 1,263 · CI 1,255 - 1,270

Multilingual · 8 benchmarks · 84.9%

Text Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #52

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,477
Percentile: 82.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #64. Votes: 387. Organization: google. License: Apache 2.0.

82.7% percentile inside its fair comparison set

Raw benchmark value: 1,477 · CI 1,447 - 1,507

Text Arena · French

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #28

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,478
Percentile: 87.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: french. Source rank: #36. Votes: 215. Organization: google. License: Apache 2.0.

87.5% percentile inside its fair comparison set

Raw benchmark value: 1,478 · CI 1,437 - 1,519

Text Arena · Russian

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #41

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,447
Percentile: 86.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: russian. Source rank: #52. Votes: 685. Organization: google. License: Apache 2.0.

86.2% percentile inside its fair comparison set

Raw benchmark value: 1,447 · CI 1,426 - 1,468

Text Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #35

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,479
Percentile: 88.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #41. Votes: 387. Organization: google. License: Apache 2.0.

88.5% percentile inside its fair comparison set

Raw benchmark value: 1,479 · CI 1,449 - 1,509

Text Arena · French · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #21

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 90.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: french. Source rank: #26. Votes: 215. Organization: google. License: Apache 2.0.

90.7% percentile inside its fair comparison set

Raw benchmark value: 1,467 · CI 1,426 - 1,508

Text Arena · Russian · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #28

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 90.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: russian. Source rank: #36. Votes: 685. Organization: google. License: Apache 2.0.

90.7% percentile inside its fair comparison set

Raw benchmark value: 1,443 · CI 1,422 - 1,463

Vision Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #20

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,300
Percentile: 75.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #25. Votes: 1062. Organization: google. License: Apache 2.0.

75.3% percentile inside its fair comparison set

Raw benchmark value: 1,300 · CI 1,277 - 1,323

Vision Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #18

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,321
Percentile: 77.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: chinese. Source rank: #22. Votes: 1062. Organization: google. License: Apache 2.0.

77.9% percentile inside its fair comparison set

Raw benchmark value: 1,321 · CI 1,298 - 1,344