Intelligence Index
AA · Chat / text · Combined
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #122 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 20
- Percentile: 69.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.
AA-Omniscience accuracy
AA · Chat / text · Objective
Rank #178 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 15.7%
- Percentile: 40.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.
AA-Omniscience non-hallucination
AA · Chat / text · Objective
Rank #245 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 7.8%
- Percentile: 18.1% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.
IFBench
AA · Chat / text · Objective
Rank #118 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 45.4%
- Percentile: 62.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `ifbench`.
Blended price
AA · Chat / text · Speed / cost
Rank #71 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: $0.2 / 1M tokens
- Percentile: 74.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.
Input price
AA · Chat / text · Speed / cost
Rank #72 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: $0.1 / 1M input tokens
- Percentile: 74.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.
Output price
AA · Chat / text · Speed / cost
Rank #72 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: $0.4 / 1M output tokens
- Percentile: 75.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.
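The three price cards above can be cross-checked with a little arithmetic. The sketch below assumes the blended price is a 3:1 input-to-output weighted average, which is only a reading of the field name `price1mBlended0To3To1` and is not stated on this page. The function name and weights are hypothetical.

```python
# Prices in USD per 1M tokens, copied from the Input price and Output price cards.
INPUT_PRICE = 0.1
OUTPUT_PRICE = 0.4


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption inferred from the field name,
    not a documented formula.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"{blended:.3f} USD per 1M tokens")  # 0.175 USD per 1M tokens
```

Under that assumption the result, 0.175, rounds to the $0.2 shown on the Blended price card, so the three cards are at least consistent with each other.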
Output Speed
AA · Chat / text · Speed / cost
Rank #192 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 41.4 tokens/s
- Percentile: 9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.
Time to first token
AA · Chat / text · Speed / cost
Rank #90 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 1.99 s
- Percentile: 57.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.
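The Output Speed and Time to first token figures can be combined into a rough end-to-end latency estimate. This is a minimal sketch that assumes a response takes the median time to first token plus the output length divided by the median output speed. Real latency varies with provider, load and prompt length, and the function name is hypothetical.

```python
# Median values copied from the Output Speed and Time to first token cards.
TTFT_SECONDS = 1.99
OUTPUT_TOKENS_PER_SECOND = 41.4


def estimated_response_seconds(output_tokens):
    """Rough total time: wait for the first token, then stream the rest.

    Assumes constant streaming speed at the median rate, which is a
    simplification.
    """
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


for n in (100, 500, 1000):
    print(f"{n} output tokens: about {estimated_response_seconds(n):.1f} s")
```

For a 500-token answer this comes to roughly 14 seconds, most of it streaming time rather than the initial wait.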
Time to first answer token
AA · Chat / text · Speed / cost
Rank #57 · Source label: Gemma 4 26B A4B (Non-reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 1.99 s
- Percentile: 73.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.
Openness Index
AA · Chat / text · Combined
Rank #101 · Source label: Gemma 4 26B A4B (Reasoning)
verified runtime · exact alias
Raw row drilldown:
- Source: Artificial Analysis
- Raw value: 39
- Percentile: 54.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.
Text Arena
AR · Chat / text · Human
Rank #48
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,438
- CI: 1,430 - 1,446
- Percentile: 85.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #61. Votes: 5813. Organization: google. License: Apache 2.0.
Text Arena · Creative Writing
AR · Chat / text · Human
Rank #55
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,402
- CI: 1,383 - 1,421
- Percentile: 83.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing. Source rank: #69. Votes: 955. Organization: google. License: Apache 2.0.
Text Arena · English
AR · Chat / text · Human
Rank #48
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,448
- CI: 1,437 - 1,460
- Percentile: 85.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #60. Votes: 2519. Organization: google. License: Apache 2.0.
Text Arena · Exclude Ties
AR · Chat / text · Human
Rank #48
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,435
- CI: 1,424 - 1,446
- Percentile: 85.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: exclude_ties. Source rank: #61. Votes: 3982. Organization: google. License: Apache 2.0.
Text Arena · Hard Prompts
AR · Chat / text · Human
Rank #45
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,461
- CI: 1,451 - 1,471
- Percentile: 86.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts. Source rank: #57. Votes: 3277. Organization: google. License: Apache 2.0.
Text Arena · Hard Prompts English
AR · Chat / text · Human
Rank #45
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,467
- CI: 1,453 - 1,482
- Percentile: 86.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts_english. Source rank: #57. Votes: 1486. Organization: google. License: Apache 2.0.
Text Arena · Instruction Following
AR · Chat / text · Human
Rank #37
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,439
- CI: 1,425 - 1,453
- Percentile: 88.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: instruction_following. Source rank: #48. Votes: 1611. Organization: google. License: Apache 2.0.
Text Arena · Longer Query
AR · Chat / text · Human
Rank #45
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,448
- CI: 1,434 - 1,463
- Percentile: 85.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: longer_query. Source rank: #56. Votes: 1561. Organization: google. License: Apache 2.0.
Text Arena · Multi Turn
AR · Chat / text · Human
Rank #48
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,446
- CI: 1,429 - 1,464
- Percentile: 85.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: multi_turn. Source rank: #62. Votes: 1090. Organization: google. License: Apache 2.0.
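The confidence intervals on the Text Arena cards above give a quick way to see which category scores plausibly differ from the overall score. The sketch below treats non-overlapping intervals as a rough signal only. It is a heuristic, not a formal significance test, and it assumes the intervals are comparable across categories. The names `OVERALL`, `CATEGORIES` and `intervals_overlap` are hypothetical.

```python
# (low, high) confidence intervals copied from the style-controlled Text Arena cards.
OVERALL = (1430, 1446)
CATEGORIES = {
    "creative_writing": (1383, 1421),
    "english": (1437, 1460),
    "exclude_ties": (1424, 1446),
    "hard_prompts": (1451, 1471),
    "hard_prompts_english": (1453, 1482),
    "instruction_following": (1425, 1453),
    "longer_query": (1434, 1463),
    "multi_turn": (1429, 1464),
}


def intervals_overlap(a, b):
    """True when closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Categories whose interval lies entirely above or below the overall interval.
distinct = sorted(
    name for name, ci in CATEGORIES.items() if not intervals_overlap(ci, OVERALL)
)
print(distinct)  # ['creative_writing', 'hard_prompts', 'hard_prompts_english']
```

By this rough measure, creative writing sits below the overall score and the two hard-prompt categories sit above it, while the remaining categories are indistinguishable from overall.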
Text Arena · No Style Control
AR · Chat / text · Human
Rank #44
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,434
- CI: 1,427 - 1,442
- Percentile: 86.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: overall. Source rank: #55. Votes: 5813. Organization: google. License: Apache 2.0.
Text Arena · Creative Writing · No Style Control
AR · Chat / text · Human
Rank #47
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,403
- CI: 1,384 - 1,422
- Percentile: 85.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: creative_writing. Source rank: #60. Votes: 955. Organization: google. License: Apache 2.0.
Text Arena · English · No Style Control
AR · Chat / text · Human
Rank #45
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,441
- CI: 1,430 - 1,453
- Percentile: 86.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: english. Source rank: #57. Votes: 2519. Organization: google. License: Apache 2.0.
Text Arena · Exclude Ties · No Style Control
AR · Chat / text · Human
Rank #44
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,429
- CI: 1,418 - 1,440
- Percentile: 86.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: exclude_ties. Source rank: #54. Votes: 3982. Organization: google. License: Apache 2.0.
Text Arena · Hard Prompts · No Style Control
AR · Chat / text · Human
Rank #47
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,439
- CI: 1,428 - 1,449
- Percentile: 85.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts. Source rank: #59. Votes: 3277. Organization: google. License: Apache 2.0.
Text Arena · Hard Prompts English · No Style Control
AR · Chat / text · Human
Rank #52
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,444
- CI: 1,429 - 1,458
- Percentile: 84.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: hard_prompts_english. Source rank: #61. Votes: 1486. Organization: google. License: Apache 2.0.
Text Arena · Instruction Following · No Style Control
AR · Chat / text · Human
Rank #40
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,421
- CI: 1,407 - 1,435
- Percentile: 88% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: instruction_following. Source rank: #48. Votes: 1611. Organization: google. License: Apache 2.0.
Text Arena · Longer Query · No Style Control
AR · Chat / text · Human
Rank #44
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,429
- CI: 1,415 - 1,444
- Percentile: 85.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: longer_query. Source rank: #56. Votes: 1561. Organization: google. License: Apache 2.0.
Text Arena · Multi Turn · No Style Control
AR · Chat / text · Human
Rank #45
verified runtime · exact alias
Raw row drilldown:
- Source: Arena
- Raw value: 1,439
- CI: 1,421 - 1,456
- Percentile: 86.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `gemma-4-26b-a4b`. Category: multi_turn. Source rank: #56. Votes: 1090. Organization: google. License: Apache 2.0.