Text Arena
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #30 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,455
- Percentile: 91.1%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: overall. Source rank: #41. Votes: 50401. Organization: bytedance. License: Proprietary.
91.1% percentile inside its fair comparison set · Raw benchmark value: 1,455 · CI 1,451 - 1,459
Text Arena · Creative Writing
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #54 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,402
- Percentile: 83.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: creative_writing. Source rank: #68. Votes: 7983. Organization: bytedance. License: Proprietary.
83.6% percentile inside its fair comparison set · Raw benchmark value: 1,402 · CI 1,395 - 1,410
Text Arena · English
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #35 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,462
- Percentile: 89.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: english. Source rank: #45. Votes: 23580. Organization: bytedance. License: Proprietary.
89.5% percentile inside its fair comparison set · Raw benchmark value: 1,462 · CI 1,457 - 1,468
Text Arena · Exclude Ties
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #29 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,458
- Percentile: 91.4%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: exclude_ties. Source rank: #40. Votes: 38007. Organization: bytedance. License: Proprietary.
91.4% percentile inside its fair comparison set · Raw benchmark value: 1,458 · CI 1,453 - 1,464
Text Arena · Hard Prompts
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #27 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,479
- Percentile: 92%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: hard_prompts. Source rank: #36. Votes: 31614. Organization: bytedance. License: Proprietary.
92% percentile inside its fair comparison set · Raw benchmark value: 1,479 · CI 1,475 - 1,484
Text Arena · Hard Prompts English
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #32 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,480
- Percentile: 90.4%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: hard_prompts_english. Source rank: #42. Votes: 15560. Organization: bytedance. License: Proprietary.
90.4% percentile inside its fair comparison set · Raw benchmark value: 1,480 · CI 1,474 - 1,486
Text Arena · Instruction Following
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #43 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,433
- Percentile: 87.1%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: instruction_following. Source rank: #54. Votes: 15921. Organization: bytedance. License: Proprietary.
87.1% percentile inside its fair comparison set · Raw benchmark value: 1,433 · CI 1,427 - 1,439
Text Arena · Longer Query
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #41 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,450
- Percentile: 86.8%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: longer_query. Source rank: #52. Votes: 19039. Organization: bytedance. License: Proprietary.
86.8% percentile inside its fair comparison set · Raw benchmark value: 1,450 · CI 1,445 - 1,456
Text Arena · Multi Turn
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #51 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,444
- Percentile: 84.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: multi_turn. Source rank: #67. Votes: 8804. Organization: bytedance. License: Proprietary.
84.5% percentile inside its fair comparison set · Raw benchmark value: 1,444 · CI 1,436 - 1,451
Text Arena · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #25 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,448
- Percentile: 92.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: overall. Source rank: #31. Votes: 50401. Organization: bytedance. License: Proprietary.
92.6% percentile inside its fair comparison set · Raw benchmark value: 1,448 · CI 1,444 - 1,452
Text Arena · Creative Writing · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #42 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,407
- Percentile: 87.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: creative_writing. Source rank: #53. Votes: 7983. Organization: bytedance. License: Proprietary.
87.3% percentile inside its fair comparison set · Raw benchmark value: 1,407 · CI 1,400 - 1,415
Text Arena · English · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #33 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,450
- Percentile: 90.2%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: english. Source rank: #42. Votes: 23580. Organization: bytedance. License: Proprietary.
90.2% percentile inside its fair comparison set · Raw benchmark value: 1,450 · CI 1,445 - 1,455
Text Arena · Exclude Ties · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #24 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,447
- Percentile: 92.9%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: exclude_ties. Source rank: #30. Votes: 38007. Organization: bytedance. License: Proprietary.
92.9% percentile inside its fair comparison set · Raw benchmark value: 1,447 · CI 1,442 - 1,452
Text Arena · Hard Prompts · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #28 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,455
- Percentile: 91.7%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: hard_prompts. Source rank: #35. Votes: 31614. Organization: bytedance. License: Proprietary.
91.7% percentile inside its fair comparison set · Raw benchmark value: 1,455 · CI 1,450 - 1,459
Text Arena · Hard Prompts English · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #41 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,451
- Percentile: 87.7%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: hard_prompts_english. Source rank: #48. Votes: 15560. Organization: bytedance. License: Proprietary.
87.7% percentile inside its fair comparison set · Raw benchmark value: 1,451 · CI 1,445 - 1,456
Text Arena · Instruction Following · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #47 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,413
- Percentile: 85.8%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: instruction_following. Source rank: #60. Votes: 15921. Organization: bytedance. License: Proprietary.
85.8% percentile inside its fair comparison set · Raw benchmark value: 1,413 · CI 1,407 - 1,419
Text Arena · Longer Query · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #47 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,428
- Percentile: 84.9%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: longer_query. Source rank: #59. Votes: 19039. Organization: bytedance. License: Proprietary.
84.9% percentile inside its fair comparison set · Raw benchmark value: 1,428 · CI 1,423 - 1,434
Text Arena · Multi Turn · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #46 · Source label: dola-seed-2.0-pro
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,438
- Percentile: 86.1%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `dola-seed-2.0-pro`. Category: multi_turn. Source rank: #57. Votes: 8804. Organization: bytedance. License: Proprietary.
86.1% percentile inside its fair comparison set · Raw benchmark value: 1,438 · CI 1,431 - 1,445