MMMU-Pro
AA · Vision understanding · Objective
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #38 · Source label: Grok 4
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 68.8%
- Percentile: 72.6% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `mmmuPro`.
Vision Arena
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #55 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,182
- CI: 1,174 - 1,189
- Percentile: 50.5% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: overall. Source rank: #68. Votes: 32725. Organization: xai. License: Proprietary.
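The Arena provenance note above follows a regular "Key: value." pattern, so it can be split into typed fields. The following is a minimal sketch only: `parse_arena_note` is a hypothetical helper, not part of any Arena or leaderboard tooling, and it assumes that no field value contains a period.

```python
import re

# Example note copied from the card above.
NOTE = (
    "Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: overall. "
    "Source rank: #68. Votes: 32725. Organization: xai. License: Proprietary."
)


def parse_arena_note(note: str) -> dict:
    """Hypothetical helper: split a provenance note into typed fields."""
    row = re.search(r"row `([^`]+)`", note)
    if row is None:
        raise ValueError("no backticked row name found")
    # Collect every "Key: value." pair that follows the row name.
    pairs = dict(re.findall(r"(\w[\w ]*?): ([^.]+)\.", note))
    return {
        "row": row.group(1),
        "category": pairs.get("Category"),
        "source_rank": int(pairs["Source rank"].lstrip("#")),
        "votes": int(pairs["Votes"]),
        "organization": pairs.get("Organization"),
        "license": pairs.get("License"),
    }


print(parse_arena_note(NOTE))
```

Keeping the source rank separate from the card's own rank matters here, because the two differ on most cards (for example, Rank #55 against Source rank #68 on this one).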
Vision Arena · Captioning
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #21 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,164
- CI: 1,129 - 1,199
- Percentile: 23.1% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: captioning. Source rank: #22. Votes: 375. Organization: xai. License: Proprietary.
Vision Arena · Creative Writing Vision
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #35 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,211
- CI: 1,192 - 1,231
- Percentile: 38.2% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: creative_writing_vision. Source rank: #45. Votes: 1262. Organization: xai. License: Proprietary.
Vision Arena · Diagram
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #52 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,192
- CI: 1,179 - 1,205
- Percentile: 27.1% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: diagram. Source rank: #66. Votes: 2528. Organization: xai. License: Proprietary.
Vision Arena · English
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #55 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,186
- CI: 1,176 - 1,196
- Percentile: 50.5% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: english. Source rank: #68. Votes: 16057. Organization: xai. License: Proprietary.
Vision Arena · Entity Recognition
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #13 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,238
- CI: 1,204 - 1,272
- Percentile: 62.5% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: entity_recognition. Source rank: #12. Votes: 341. Organization: xai. License: Proprietary.
Vision Arena · Homework
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #55 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,171
- CI: 1,153 - 1,189
- Percentile: 20.6% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: homework. Source rank: #71. Votes: 1123. Organization: xai. License: Proprietary.
Vision Arena · Humor
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #38 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,171
- CI: 1,151 - 1,191
- Percentile: 24.5% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: humor. Source rank: #50. Votes: 1223. Organization: xai. License: Proprietary.
Vision Arena · OCR
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #53 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,175
- CI: 1,166 - 1,184
- Percentile: 25.7% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: ocr. Source rank: #69. Votes: 9997. Organization: xai. License: Proprietary.
Vision Arena · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #43 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,209
- CI: 1,201 - 1,216
- Percentile: 61.5% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: overall. Source rank: #55. Votes: 32725. Organization: xai. License: Proprietary.
Vision Arena · Captioning · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #10 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,239
- CI: 1,208 - 1,271
- Percentile: 65.4% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: captioning. Source rank: #11. Votes: 375. Organization: xai. License: Proprietary.
Vision Arena · Creative Writing Vision · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #29 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,239
- CI: 1,220 - 1,259
- Percentile: 49.1% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: creative_writing_vision. Source rank: #37. Votes: 1262. Organization: xai. License: Proprietary.
Vision Arena · Diagram · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #46 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,204
- CI: 1,191 - 1,217
- Percentile: 35.7% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: diagram. Source rank: #58. Votes: 2528. Organization: xai. License: Proprietary.
Vision Arena · English · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #43 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,218
- CI: 1,208 - 1,228
- Percentile: 61.5% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: english. Source rank: #55. Votes: 16057. Organization: xai. License: Proprietary.
Vision Arena · Entity Recognition · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #7 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,287
- CI: 1,255 - 1,320
- Percentile: 81.3% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: entity_recognition. Source rank: #8. Votes: 341. Organization: xai. License: Proprietary.
Vision Arena · Homework · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #56 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,171
- CI: 1,153 - 1,189
- Percentile: 19.1% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: homework. Source rank: #71. Votes: 1123. Organization: xai. License: Proprietary.
Vision Arena · Humor · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #35 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,203
- CI: 1,184 - 1,223
- Percentile: 30.6% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: humor. Source rank: #46. Votes: 1223. Organization: xai. License: Proprietary.
Vision Arena · OCR · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #50 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,195
- CI: 1,186 - 1,203
- Percentile: 30% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: ocr. Source rank: #62. Votes: 9997. Organization: xai. License: Proprietary.
MMMU Pro
VALS-AI · Vision understanding · Objective
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #34 · Source label: grok/grok-4-0709
verified runtime · exact alias
- Source: Vals AI
- Raw value: 76.3%
- CI: 74.3% - 78.3%
- Percentile: 43.1% (within its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Vals AI BenchmarkView overall scores. Vals slug: mmmu; provider: xAI.
VISTA
SL · Vision understanding · Rubric
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #13
verified runtime · exact direct
- Source: Scale Labs
- Raw value: 73%
- Percentile: 14.3% (within its fair comparison set)
- Last updated: aging
- Eligibility: headline eligible
Vision Arena · Creative Writing
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #9 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,237
- CI: 1,220 - 1,253
- Percentile: 75% (within its fair comparison set)
- Last updated: archived
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: creative_writing. Source rank: #10. Votes: 1461. Organization: xai. License: Proprietary.
Vision Arena · Creative Writing · No Style Control
AR · Vision understanding · Human
It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.
Rank #7 · Source label: grok-4-0709
verified runtime · exact alias
- Source: Arena
- Raw value: 1,265
- CI: 1,249 - 1,282
- Percentile: 81.3% (within its fair comparison set)
- Last updated: archived
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `grok-4-0709`. Category: creative_writing. Source rank: #8. Votes: 1461. Organization: xai. License: Proprietary.
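The page does not state how the percentile is derived from the rank. As a rough sketch, the listed figures are consistent with a rank-based percentile of the form (N - rank) / (N - 1), where N is the number of models in the fair comparison set. Both the formula and the set sizes used below are assumptions: the sizes are not given anywhere on the page and were chosen only because they reproduce the listed percentiles. `percentile_from_rank` is a hypothetical name.

```python
def percentile_from_rank(rank: int, set_size: int) -> float:
    """Assumed formula: share of the comparison set ranked below this model.

    rank is 1-based (1 = best); set_size is the number of models compared.
    """
    if set_size < 2:
        raise ValueError("need at least two models to compare")
    if not 1 <= rank <= set_size:
        raise ValueError("rank must lie between 1 and set_size")
    return 100.0 * (set_size - rank) / (set_size - 1)


# Set sizes here are assumed, not taken from the page.
examples = {
    "Vision Arena · Captioning (rank 21)": (21, 27),
    "Vision Arena · Entity Recognition (rank 13)": (13, 33),
    "VISTA (rank 13)": (13, 15),
}
for label, (rank, size) in examples.items():
    print(f"{label}: {percentile_from_rank(rank, size):.1f}%")
```

Under these assumed sizes the three cards come out at 23.1%, 62.5%, and 14.3%, matching the values listed above. The same rank can therefore map to very different percentiles depending on how many models share the comparison set.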