Intelligence Index
AA · Chat / text · Combined
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #156 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 16
- Percentile: 60.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.
AA-Omniscience accuracy
AA · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #71 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 24.1%
- Percentile: 76.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.
AA-Omniscience non-hallucination
AA · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #140 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 16.3%
- Percentile: 53.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.
IFBench
AA · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #209 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 36.2%
- Percentile: 34.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `ifbench`.
Blended price
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #144 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: $0.8 /1M tokens
- Percentile: 48.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.
Input price
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #162 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: $0.5 /1M input tokens
- Percentile: 42.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.
Output price
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #141 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: $1.5 /1M output tokens
- Percentile: 50% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.
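The three price cards appear to be related: the field name `price1mBlended0To3To1` suggests a 3:1 weighting of input to output tokens. That weighting is an assumption inferred from the field name, not something this page states. Under that assumption, a minimal sketch that recomputes the blend from the listed input and output prices (the function name `blended_price` is hypothetical):

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption inferred from the
    field name `price1mBlended0To3To1`, not confirmed by the page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Listed prices: $0.5 per 1M input tokens, $1.5 per 1M output tokens.
blended = blended_price(0.5, 1.5)
print(f"${blended:.2f} per 1M tokens")
```

With the listed prices the result is $0.75, which is consistent with the displayed blended value of $0.8 if the page rounds to one decimal place.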
Output Speed
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #161 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 53.3 tokens/s
- Percentile: 23.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.
Time to first token
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #47 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 1.14s
- Percentile: 78.1% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.
Time to first answer token
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #28 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 1.14s
- Percentile: 87.1% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.
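The speed and latency figures can be combined into a rough end-to-end estimate. The sketch below assumes a simple linear model (time to first token plus output length divided by median output speed); the function name `estimate_response_seconds` and the 500-token example are hypothetical, and real responses will vary because both inputs are medians.

```python
def estimate_response_seconds(output_tokens: int,
                              time_to_first_token_s: float = 1.14,
                              tokens_per_second: float = 53.3) -> float:
    """Rough total response time under a linear model.

    Defaults are the median values listed on this page; the linear
    model itself is an illustrative assumption.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return time_to_first_token_s + output_tokens / tokens_per_second


# Hypothetical 500-token answer.
estimate = estimate_response_seconds(500)
print(f"{estimate:.1f} s")
```

Under these assumptions a 500-token answer takes roughly 10.5 seconds, most of it spent streaming output rather than waiting for the first token.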
Openness Index
AA · Chat / text · Combined
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #104 · Source label: Mistral Large 3
verified runtime · exact alias
- Source: Artificial Analysis
- Raw value: 39
- Percentile: 54.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.
Text Arena
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #81
verified runtime · exact alias
- Source: Arena
- Raw value: 1,416 (CI 1,412 - 1,419)
- Percentile: 75.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: overall. Source rank: #100. Votes: 44094. Organization: mistral. License: Apache 2.0.
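Arena raw values are ratings on an Elo-like scale, so a gap between two ratings can be read as an expected head-to-head win rate. The sketch below assumes the conventional Elo logistic with a 400-point scale, which this page does not confirm; the function name `expected_win_rate` and the opponent rating of 1,450 are hypothetical.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of wins for A against B under the standard Elo logistic.

    The 400-point scale is the conventional Elo choice and is assumed
    here; the page does not state which model produced the ratings.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Listed overall rating (1,416) against a hypothetical 1,450-rated opponent.
p = expected_win_rate(1416, 1450)
print(f"{p:.1%}")
```

Under this assumption a gap of a few dozen points corresponds to a win rate only modestly away from 50%, which is worth keeping in mind when comparing neighbouring ranks.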
Text Arena · Creative Writing
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #88
verified runtime · exact alias
- Source: Arena
- Raw value: 1,376 (CI 1,369 - 1,384)
- Percentile: 73.1% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: creative_writing. Source rank: #109. Votes: 6732. Organization: mistral. License: Apache 2.0.
Text Arena · English
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #77
verified runtime · exact alias
- Source: Arena
- Raw value: 1,429 (CI 1,424 - 1,434)
- Percentile: 76.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: english. Source rank: #96. Votes: 20659. Organization: mistral. License: Apache 2.0.
Text Arena · Exclude Ties
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #79
verified runtime · exact alias
- Source: Arena
- Raw value: 1,403 (CI 1,398 - 1,408)
- Percentile: 76% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: exclude_ties. Source rank: #98. Votes: 31217. Organization: mistral. License: Apache 2.0.
Text Arena · Hard Prompts
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #82
verified runtime · exact alias
- Source: Arena
- Raw value: 1,432 (CI 1,427 - 1,437)
- Percentile: 75.1% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: hard_prompts. Source rank: #102. Votes: 24756. Organization: mistral. License: Apache 2.0.
Text Arena · Hard Prompts English
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #78
verified runtime · exact alias
- Source: Arena
- Raw value: 1,441 (CI 1,435 - 1,447)
- Percentile: 76.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: hard_prompts_english. Source rank: #97. Votes: 12119. Organization: mistral. License: Apache 2.0.
Text Arena · Instruction Following
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #78
verified runtime · exact alias
- Source: Arena
- Raw value: 1,404 (CI 1,399 - 1,410)
- Percentile: 76.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: instruction_following. Source rank: #97. Votes: 12420. Organization: mistral. License: Apache 2.0.
Text Arena · Longer Query
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #86
verified runtime · exact alias
- Source: Arena
- Raw value: 1,417 (CI 1,411 - 1,423)
- Percentile: 72% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: longer_query. Source rank: #107. Votes: 12819. Organization: mistral. License: Apache 2.0.
Text Arena · Multi Turn
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #72
verified runtime · exact alias
- Source: Arena
- Raw value: 1,421 (CI 1,414 - 1,428)
- Percentile: 78% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: multi_turn. Source rank: #90. Votes: 7648. Organization: mistral. License: Apache 2.0.
Text Arena · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #45
verified runtime · exact alias
- Source: Arena
- Raw value: 1,430 (CI 1,427 - 1,434)
- Percentile: 86.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: overall. Source rank: #57. Votes: 44094. Organization: mistral. License: Apache 2.0.
Text Arena · Creative Writing · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #65
verified runtime · exact alias
- Source: Arena
- Raw value: 1,392 (CI 1,385 - 1,400)
- Percentile: 80.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: creative_writing. Source rank: #78. Votes: 6732. Organization: mistral. License: Apache 2.0.
Text Arena · English · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #47
verified runtime · exact alias
- Source: Arena
- Raw value: 1,441 (CI 1,436 - 1,445)
- Percentile: 85.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: english. Source rank: #59. Votes: 20659. Organization: mistral. License: Apache 2.0.
Text Arena · Exclude Ties · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #46
verified runtime · exact alias
- Source: Arena
- Raw value: 1,423 (CI 1,418 - 1,428)
- Percentile: 86.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: exclude_ties. Source rank: #58. Votes: 31217. Organization: mistral. License: Apache 2.0.
Text Arena · Hard Prompts · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #57
verified runtime · exact alias
- Source: Arena
- Raw value: 1,431 (CI 1,426 - 1,435)
- Percentile: 82.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: hard_prompts. Source rank: #71. Votes: 24756. Organization: mistral. License: Apache 2.0.
Text Arena · Hard Prompts English · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #62
verified runtime · exact alias
- Source: Arena
- Raw value: 1,437 (CI 1,431 - 1,443)
- Percentile: 81.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: hard_prompts_english. Source rank: #73. Votes: 12119. Organization: mistral. License: Apache 2.0.
Text Arena · Instruction Following · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #60
verified runtime · exact alias
- Source: Arena
- Raw value: 1,406 (CI 1,400 - 1,412)
- Percentile: 81.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: instruction_following. Source rank: #74. Votes: 12420. Organization: mistral. License: Apache 2.0.
Text Arena · Longer Query · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #70
verified runtime · exact alias
- Source: Arena
- Raw value: 1,413 (CI 1,407 - 1,419)
- Percentile: 77.3% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: longer_query. Source rank: #83. Votes: 12819. Organization: mistral. License: Apache 2.0.
Text Arena · Multi Turn · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #53
verified runtime · exact alias
- Source: Arena
- Raw value: 1,430 (CI 1,423 - 1,437)
- Percentile: 83.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: headline eligible
Parsed from Arena leaderboard dataset row `mistral-large-3`. Category: multi_turn. Source rank: #67. Votes: 7648. Organization: mistral. License: Apache 2.0.