Text Arena
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #115 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,384
- Percentile: 64.9%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: overall. Source rank: #140. Votes: 17128. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 64.9% · Raw benchmark value: 1,384 · CI: 1,379 - 1,390
Text Arena · Creative Writing
AR · Chat / text · Human
Rank #115 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,346
- Percentile: 64.7%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: creative_writing. Source rank: #140. Votes: 2716. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 64.7% · Raw benchmark value: 1,346 · CI: 1,334 - 1,357
Text Arena · English
AR · Chat / text · Human
Rank #113 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,402
- Percentile: 65.5%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: english. Source rank: #136. Votes: 7641. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 65.5% · Raw benchmark value: 1,402 · CI: 1,395 - 1,409
Text Arena · Exclude Ties
AR · Chat / text · Human
Rank #118 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,357
- Percentile: 64%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: exclude_ties. Source rank: #143. Votes: 12077. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 64% · Raw benchmark value: 1,357 · CI: 1,350 - 1,365
Text Arena · Hard Prompts
AR · Chat / text · Human
Rank #111 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,408
- Percentile: 66.2%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: hard_prompts. Source rank: #135. Votes: 9069. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 66.2% · Raw benchmark value: 1,408 · CI: 1,401 - 1,414
Text Arena · Hard Prompts English
AR · Chat / text · Human
Rank #103 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,425
- Percentile: 68.5%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: hard_prompts_english. Source rank: #123. Votes: 4200. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 68.5% · Raw benchmark value: 1,425 · CI: 1,416 - 1,434
Text Arena · Instruction Following
AR · Chat / text · Human
Rank #100 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,386
- Percentile: 69.5%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: instruction_following. Source rank: #123. Votes: 4473. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 69.5% · Raw benchmark value: 1,386 · CI: 1,377 - 1,395
Text Arena · Longer Query
AR · Chat / text · Human
Rank #92 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,410
- Percentile: 70.1%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: longer_query. Source rank: #116. Votes: 4256. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 70.1% · Raw benchmark value: 1,410 · CI: 1,401 - 1,419
Text Arena · Multi Turn
AR · Chat / text · Human
Rank #109 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,391
- Percentile: 66.6%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: multi_turn. Source rank: #132. Votes: 2872. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 66.6% · Raw benchmark value: 1,391 · CI: 1,380 - 1,402
Text Arena · No Style Control
AR · Chat / text · Human
Rank #102 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,391
- Percentile: 68.9%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: overall. Source rank: #123. Votes: 17128. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 68.9% · Raw benchmark value: 1,391 · CI: 1,386 - 1,396
Text Arena · Creative Writing · No Style Control
AR · Chat / text · Human
Rank #92 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,362
- Percentile: 71.8%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: creative_writing. Source rank: #113. Votes: 2716. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 71.8% · Raw benchmark value: 1,362 · CI: 1,351 - 1,374
Text Arena · English · No Style Control
AR · Chat / text · Human
Rank #99 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,406
- Percentile: 69.8%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: english. Source rank: #120. Votes: 7641. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 69.8% · Raw benchmark value: 1,406 · CI: 1,399 - 1,413
Text Arena · Exclude Ties · No Style Control
AR · Chat / text · Human
Rank #102 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,367
- Percentile: 68.9%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: exclude_ties. Source rank: #123. Votes: 12077. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 68.9% · Raw benchmark value: 1,367 · CI: 1,359 - 1,374
Text Arena · Hard Prompts · No Style Control
AR · Chat / text · Human
Rank #89 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,410
- Percentile: 72.9%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: hard_prompts. Source rank: #106. Votes: 9069. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 72.9% · Raw benchmark value: 1,410 · CI: 1,404 - 1,417
Text Arena · Hard Prompts English · No Style Control
AR · Chat / text · Human
Rank #80 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,423
- Percentile: 75.6%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: hard_prompts_english. Source rank: #96. Votes: 4200. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 75.6% · Raw benchmark value: 1,423 · CI: 1,414 - 1,432
Text Arena · Instruction Following · No Style Control
AR · Chat / text · Human
Rank #75 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,398
- Percentile: 77.2%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: instruction_following. Source rank: #91. Votes: 4473. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 77.2% · Raw benchmark value: 1,398 · CI: 1,390 - 1,407
Text Arena · Longer Query · No Style Control
AR · Chat / text · Human
Rank #64 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,418
- Percentile: 79.3%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: longer_query. Source rank: #77. Votes: 4256. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 79.3% · Raw benchmark value: 1,418 · CI: 1,409 - 1,427
Text Arena · Multi Turn · No Style Control
AR · Chat / text · Human
Rank #95 · Source label: minimax-m2.1-preview
verified runtime · exact alias · Background only
Raw row drilldown: source row, percentile, last updated, eligibility
- Source: Arena
- Raw value: 1,394
- Percentile: 70.9%
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `minimax-m2.1-preview`. Category: multi_turn. Source rank: #116. Votes: 2872. Organization: minimax. License: MIT.
Percentile inside its fair comparison set: 70.9% · Raw benchmark value: 1,394 · CI: 1,383 - 1,405