Model profile · Qwen

Qwen3.5 122B A10B

Open weights · mid · registry tag 2026 open flagship

Thin verified coverage

The model reads as having thin verified coverage across the resolved source data.

Visible coverage: 17.8%
Verified coverage: 17.8%
Spread: 56.3%
Last verified: Jun 20, 2026

56% bench fit

text · code · vision · document · 6 aliases · 40 official source links

Data version

Current snapshot.

Data version: Jun 20, 2026
Model list checked: 9 providers · 1081 tracked models
Page refreshed: Jul 5, 2026

The registry snapshot and page stamp are shown so a stale deploy is visible at a glance.
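
As a rough illustration of how the two stamps can be compared, the sketch below parses both dates and reports the gap in days. The function names and the 30-day threshold are assumptions made for illustration; they are not part of the registry.

```python
from datetime import datetime

DATE_FORMAT = "%b %d, %Y"  # matches stamps such as "Jun 20, 2026"

def snapshot_age_days(data_version: str, page_refreshed: str) -> int:
    """Return how many days the data snapshot lags behind the page refresh."""
    snapshot = datetime.strptime(data_version, DATE_FORMAT)
    refreshed = datetime.strptime(page_refreshed, DATE_FORMAT)
    return (refreshed - snapshot).days

def is_stale(age_days: int, max_age_days: int = 30) -> bool:
    """Hypothetical staleness rule: flag snapshots older than max_age_days."""
    return age_days > max_age_days

age = snapshot_age_days("Jun 20, 2026", "Jul 5, 2026")
print(age, is_stale(age))  # prints "15 False"
```

With the stamps shown on this page the snapshot is 15 days older than the page refresh, which the assumed 30-day rule would not flag.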

Source-linked scores by benchmark

Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.
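
One way to picture such a row is as a small record. The sketch below is an assumed shape inferred from the fields listed on this page; the class and field names are hypothetical and do not describe the registry's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class BenchmarkRow:
    """Hypothetical record mirroring the fields shown for each benchmark row."""
    benchmark: str            # e.g. "Intelligence Index"
    source: str               # e.g. "Artificial Analysis" or "Arena"
    source_type: str          # e.g. "Combined", "Objective", "Human"
    raw_value: str            # kept as text because units differ per row
    percentile: float         # position inside the fair comparison set, 0-100
    rank: int
    eligibility: str = "headline eligible"
    ci: Optional[Tuple[int, int]] = None  # only Arena rows report an interval

row = BenchmarkRow(
    benchmark="Intelligence Index",
    source="Artificial Analysis",
    source_type="Combined",
    raw_value="28",
    percentile=84.1,
    rank=64,
)
print(f"{row.benchmark}: {row.raw_value} ({row.percentile}% percentile, rank #{row.rank})")
```

Keeping the raw value as text avoids forcing prices, percentages, and ratings into one numeric unit.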

Chat / text · 29 benchmarks · 67.2%

Intelligence Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #64 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 28
Percentile: 84.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.

AA-Omniscience accuracy

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #128 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 18.6%
Percentile: 57.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.

AA-Omniscience non-hallucination

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #199 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 10.9%
Percentile: 33.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.

IFBench

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #90 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 50.8%
Percentile: 71.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `ifbench`.

Blended price

AA · Chat / text · Speed / cost

It reports the blended price per 1M tokens, a cost figure rather than a measure of conversational quality.

Rank #176 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: $1.1 /1M tokens
Percentile: 36.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.

Input price

AA · Chat / text · Speed / cost

It reports the price per 1M input tokens, a cost figure rather than a measure of conversational quality.

Rank #148 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: $0.4 /1M input tokens
Percentile: 47.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.

Output price

AA · Chat / text · Speed / cost

It reports the price per 1M output tokens, a cost figure rather than a measure of conversational quality.

Rank #191 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: $3.2 /1M output tokens
Percentile: 31.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.
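
The three price rows above are consistent with each other if the blended price weights input and output tokens 3:1. That ratio is an inference from the field name `price1mBlended0To3To1`, not something this page states, so the sketch below is a check under that assumption. The `request_cost` helper is likewise hypothetical.

```python
INPUT_PRICE_PER_M = 0.4   # USD per 1M input tokens, from the Input price row
OUTPUT_PRICE_PER_M = 3.2  # USD per 1M output tokens, from the Output price row

def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens (assumed 3:1 input:output mix)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

print(round(blended_price(INPUT_PRICE_PER_M, OUTPUT_PRICE_PER_M), 2))  # 1.1
print(round(request_cost(3_000, 1_000), 6))
```

Under the assumed weighting, (3 × 0.4 + 1 × 3.2) / 4 reproduces the $1.1 shown in the Blended price row.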

Output Speed

AA · Chat / text · Speed / cost

It reports median output tokens per second, a throughput figure rather than a measure of conversational quality.

Rank #60 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 140.4 tokens/s
Percentile: 71.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.

Time to first token

AA · Chat / text · Speed / cost

It reports the median time to first token in seconds, a latency figure rather than a measure of conversational quality.

Rank #127 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 2.59s
Percentile: 40%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.
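
As a rough reading aid, time to first token and output speed can be combined into an estimated response time: first-token latency plus output length divided by throughput. This is a back-of-the-envelope sketch, not a figure from the source. Note that the 140.4 tokens/s row carries the (Reasoning) label while this row is (Non-reasoning), so mixing them is an assumption. The function name is hypothetical.

```python
def estimated_response_seconds(ttft_s: float, tokens_per_s: float,
                               output_tokens: int) -> float:
    """First-token latency plus streaming time for the output tokens."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s

# Values from the rows on this page: 2.59 s to first token, 140.4 tokens/s.
estimate = estimated_response_seconds(2.59, 140.4, output_tokens=500)
print(f"{estimate:.2f} s")  # about 6.15 s for a 500-token answer
```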

Time to first answer token

AA · Chat / text · Speed / cost

It reports the median time to first answer token in seconds, a latency figure rather than a measure of conversational quality.

Rank #118 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 16.63s
Percentile: 44.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.

Openness Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #113 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 39
Percentile: 54.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.

Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,417
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #94. Votes: 28575. Organization: alibaba. License: Apache 2.0.

CI: 1,413 - 1,422

Text Arena · Creative Writing

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #96 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,367
Percentile: 70.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing. Source rank: #119. Votes: 4361. Organization: alibaba. License: Apache 2.0.

CI: 1,357 - 1,376

Text Arena · English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #73 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,431
Percentile: 77.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #91. Votes: 13479. Organization: alibaba. License: Apache 2.0.

CI: 1,425 - 1,437

Text Arena · Exclude Ties

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,404
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: exclude_ties. Source rank: #94. Votes: 20913. Organization: alibaba. License: Apache 2.0.

CI: 1,398 - 1,410

Text Arena · Hard Prompts

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #80 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,433
Percentile: 75.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts. Source rank: #100. Votes: 17878. Organization: alibaba. License: Apache 2.0.

CI: 1,427 - 1,438

Text Arena · Hard Prompts English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,444
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts_english. Source rank: #94. Votes: 8868. Organization: alibaba. License: Apache 2.0.

CI: 1,437 - 1,451

Text Arena · Instruction Following

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,407
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: instruction_following. Source rank: #94. Votes: 9045. Organization: alibaba. License: Apache 2.0.

CI: 1,400 - 1,413

Text Arena · Longer Query

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #85 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,418
Percentile: 72.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: longer_query. Source rank: #106. Votes: 10944. Organization: alibaba. License: Apache 2.0.

CI: 1,411 - 1,424

Text Arena · Multi Turn

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #78 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,417
Percentile: 76.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: multi_turn. Source rank: #97. Votes: 4953. Organization: alibaba. License: Apache 2.0.

CI: 1,408 - 1,426

Text Arena · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #73 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,418
Percentile: 77.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #86. Votes: 28575. Organization: alibaba. License: Apache 2.0.

CI: 1,413 - 1,422

Text Arena · Creative Writing · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #83 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,369
Percentile: 74.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing. Source rank: #103. Votes: 4361. Organization: alibaba. License: Apache 2.0.

CI: 1,359 - 1,378

Text Arena · English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #71 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,429
Percentile: 78.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #83. Votes: 13479. Organization: alibaba. License: Apache 2.0.

CI: 1,423 - 1,435

Text Arena · Exclude Ties · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,404
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: exclude_ties. Source rank: #89. Votes: 20913. Organization: alibaba. License: Apache 2.0.

CI: 1,398 - 1,410

Text Arena · Hard Prompts · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #74 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,422
Percentile: 77.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts. Source rank: #91. Votes: 17878. Organization: alibaba. License: Apache 2.0.

CI: 1,416 - 1,427

Text Arena · Hard Prompts English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #71 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,431
Percentile: 78.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts_english. Source rank: #86. Votes: 8868. Organization: alibaba. License: Apache 2.0.

CI: 1,424 - 1,438

Text Arena · Instruction Following · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #73 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,399
Percentile: 77.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: instruction_following. Source rank: #88. Votes: 9045. Organization: alibaba. License: Apache 2.0.

CI: 1,393 - 1,406

Text Arena · Longer Query · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #74 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,410
Percentile: 76%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: longer_query. Source rank: #90. Votes: 10944. Organization: alibaba. License: Apache 2.0.

CI: 1,403 - 1,416

Text Arena · Multi Turn · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,414
Percentile: 76.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: multi_turn. Source rank: #92. Votes: 4953. Organization: alibaba. License: Apache 2.0.

CI: 1,405 - 1,423

Coding · 10 benchmarks · 53.8%

Terminal-Bench Hard

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #66 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 29.5%
Percentile: 78.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `terminalbenchHard`.

SciCode

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #127 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 35.6%
Percentile: 65.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `scicode`.

Coding Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #30 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 46
Percentile: 61.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `codingIndex`.

Agentic Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #30 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 21
Percentile: 37%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `agenticIndex`.

Code Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #46 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,364
Percentile: 38.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #55. Votes: 8152. Organization: alibaba. License: Apache 2.0.

CI: 1,357 - 1,371

WebDev Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #46 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,364
Percentile: 38.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: webdev. Source rank: #55. Votes: 8152. Organization: alibaba. License: Apache 2.0.

CI: 1,357 - 1,371

Code Arena · Webdev Html

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #52 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,348
Percentile: 30.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: webdev-html. Source rank: #62. Votes: 992. Organization: alibaba. License: Apache 2.0.

CI: 1,328 - 1,367

Code Arena · Webdev React

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #40 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,359
Percentile: 33.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: webdev-react. Source rank: #51. Votes: 7131. Organization: alibaba. License: Apache 2.0.

CI: 1,352 - 1,367

Text Arena · Coding

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #80 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,459
Percentile: 75.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: coding. Source rank: #100. Votes: 7816. Organization: alibaba. License: Apache 2.0.

CI: 1,452 - 1,466

Text Arena · Coding · No Style Control

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,436
Percentile: 79.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: coding. Source rank: #83. Votes: 7816. Organization: alibaba. License: Apache 2.0.
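
The 53.8% shown in the Coding heading matches a plain arithmetic mean of the ten row percentiles in this category. That is an observation from the numbers on this page rather than a documented formula, so the sketch below is a consistency check under that assumption.

```python
from statistics import mean

# Percentiles of the ten Coding rows, in page order.
coding_percentiles = [78.5, 65.8, 61.3, 37.0, 38.4, 38.4, 30.1, 33.9, 75.3, 79.4]

category_score = mean(coding_percentiles)
print(round(category_score, 1))  # 53.8
```

The same mean over the 29 Chat / text percentiles gives 67.2%, which also matches that category's heading.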

CI: 1,429 - 1,443

Reasoning / math / science · 5 benchmarks · 83%

Humanity's Last Exam

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #58 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 14.8%
Percentile: 84.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `hle`.

GPQA

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #48 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 82.7%
Percentile: 87.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gpqa`.

CritPt

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #65 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 0.6%
Percentile: 80.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `critpt`.

Text Arena · Math

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #63 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,424
Percentile: 80.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: math. Source rank: #80. Votes: 1779. Organization: alibaba. License: Apache 2.0.

CI: 1,410 - 1,438

Text Arena · Math · No Style Control

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #55 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,428
Percentile: 82.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: math. Source rank: #67. Votes: 1779. Organization: alibaba. License: Apache 2.0.
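
The intervals on Arena rows widen as the vote count drops. The sketch below tabulates interval width against votes for three rows earlier on this page; it only restates numbers from those rows, and the helper name is hypothetical.

```python
# (row, votes, CI low, CI high) copied from rows on this page.
rows = [
    ("Text Arena", 28575, 1413, 1422),
    ("Text Arena · Math", 1779, 1410, 1438),
    ("Code Arena · Webdev Html", 992, 1328, 1367),
]

def ci_width(low: int, high: int) -> int:
    """Full width of the reported interval in rating points."""
    return high - low

widths = [ci_width(low, high) for _, _, low, high in rows]

for (name, votes, _, _), width in zip(rows, widths):
    print(f"{name}: votes={votes}, CI width={width}")
```

The widths come out as 9, 28, and 39 points, so rows with fewer votes deserve a more cautious reading of small rating differences.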

CI: 1,414 - 1,442

Professional reasoning · 19 benchmarks · 75.2%

GDPval-AA

AA · Professional reasoning · Rubric

Agentic performance on economically valuable work tasks.

Rank #26 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown

Source: Artificial Analysis
Raw value: 979
Percentile: 45.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gdpvalBreakdown.elo`.

Text Arena · Expert

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,445
Percentile: 76%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: expert. Source rank: #85. Votes: 2473. Organization: alibaba. License: Apache 2.0.

CI: 1,433 - 1,457

Text Arena · Industry Business And Management And Financial Operations

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #63 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,421
Percentile: 80.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_business_and_management_and_financial_operations. Source rank: #79. Votes: 5448. Organization: alibaba. License: Apache 2.0.

CI: 1,413 - 1,430

Text Arena · Industry Entertainment And Sports And Media

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #96 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown

Source: Arena
Raw value: 1,367
Percentile: 70.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_entertainment_and_sports_and_media. Source rank: #119. Votes: 5504. Organization: alibaba. License: Apache 2.0.

CI: 1,358 - 1,375

Text Arena · Industry Legal And Government

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,423
Percentile: 75.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_legal_and_government. Source rank: #95. Votes: 2216. Organization: alibaba. License: Apache 2.0.

75.2% percentile inside its fair comparison set

Raw benchmark value: 1,423 (CI 1,410 - 1,437)

Text Arena · Industry Life And Physical And Social Science

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #74 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,436
Percentile: 77.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_life_and_physical_and_social_science. Source rank: #92. Votes: 4641. Organization: alibaba. License: Apache 2.0.

77.4% percentile inside its fair comparison set

Raw benchmark value: 1,436 (CI 1,427 - 1,445)

Text Arena · Industry Mathematical

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,430
Percentile: 77.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_mathematical. Source rank: #84. Votes: 1581. Organization: alibaba. License: Apache 2.0.

77.9% percentile inside its fair comparison set

Raw benchmark value: 1,430 (CI 1,414 - 1,445)

Text Arena · Industry Medicine And Healthcare

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #85 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,433
Percentile: 71.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_medicine_and_healthcare. Source rank: #105. Votes: 2072. Organization: alibaba. License: Apache 2.0.

71.5% percentile inside its fair comparison set

Raw benchmark value: 1,433 (CI 1,419 - 1,447)

Text Arena · Industry Software And IT Services

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #79 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 76%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_software_and_it_services. Source rank: #98. Votes: 11212. Organization: alibaba. License: Apache 2.0.

76% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,445 - 1,457)

Text Arena · Industry Writing And Literature And Language

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #84 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,389
Percentile: 74.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_writing_and_literature_and_language. Source rank: #104. Votes: 6476. Organization: alibaba. License: Apache 2.0.

74.4% percentile inside its fair comparison set

Raw benchmark value: 1,389 (CI 1,382 - 1,397)

Text Arena · Expert · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #59 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 78.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: expert. Source rank: #70. Votes: 2473. Organization: alibaba. License: Apache 2.0.

78.9% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,422 - 1,447)
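The Expert category appears twice, once with style control (1,445, CI 1,433 - 1,457) and once without (1,434, CI 1,422 - 1,447). A minimal sketch of one rough way to read such a pair is to check whether the two intervals overlap. The helper name `intervals_overlap` is hypothetical, and interval overlap is only a coarse heuristic, not a formal significance test and not something the page or Arena states it uses.

```python
def intervals_overlap(a, b):
    """Return True when closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Confidence intervals copied from the two Expert rows on this page.
expert_style_control = (1433, 1457)
expert_no_style_control = (1422, 1447)

overlap = intervals_overlap(expert_style_control, expert_no_style_control)
print(overlap)  # True: the 11-point gap sits inside the reported uncertainty
```

Under this reading, the difference between the two Expert scores is small relative to the reported intervals.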

Text Arena · Industry Business And Management And Financial Operations · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #61 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,415
Percentile: 81.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_business_and_management_and_financial_operations. Source rank: #75. Votes: 5448. Organization: alibaba. License: Apache 2.0.

81.1% percentile inside its fair comparison set

Raw benchmark value: 1,415 (CI 1,406 - 1,423)

Text Arena · Industry Entertainment And Sports And Media · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #86 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,369
Percentile: 73.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_entertainment_and_sports_and_media. Source rank: #104. Votes: 5504. Organization: alibaba. License: Apache 2.0.

73.7% percentile inside its fair comparison set

Raw benchmark value: 1,369 (CI 1,360 - 1,377)

Text Arena · Industry Legal And Government · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,423
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_legal_and_government. Source rank: #84. Votes: 2216. Organization: alibaba. License: Apache 2.0.

77.2% percentile inside its fair comparison set

Raw benchmark value: 1,423 (CI 1,410 - 1,436)

Text Arena · Industry Life And Physical And Social Science · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,433
Percentile: 79.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_life_and_physical_and_social_science. Source rank: #81. Votes: 4641. Organization: alibaba. License: Apache 2.0.

79.6% percentile inside its fair comparison set

Raw benchmark value: 1,433 (CI 1,424 - 1,442)

Text Arena · Industry Mathematical · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #62 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,430
Percentile: 80.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_mathematical. Source rank: #74. Votes: 1581. Organization: alibaba. License: Apache 2.0.

80.2% percentile inside its fair comparison set

Raw benchmark value: 1,430 (CI 1,415 - 1,446)

Text Arena · Industry Medicine And Healthcare · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,430
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_medicine_and_healthcare. Source rank: #80. Votes: 2072. Organization: alibaba. License: Apache 2.0.

76.9% percentile inside its fair comparison set

Raw benchmark value: 1,430 (CI 1,416 - 1,443)

Text Arena · Industry Software And IT Services · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,438
Percentile: 79.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_software_and_it_services. Source rank: #83. Votes: 11212. Organization: alibaba. License: Apache 2.0.

79.1% percentile inside its fair comparison set

Raw benchmark value: 1,438 (CI 1,431 - 1,444)

Text Arena · Industry Writing And Literature And Language · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #78 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,389
Percentile: 76.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_writing_and_literature_and_language. Source rank: #95. Votes: 6476. Organization: alibaba. License: Apache 2.0.

76.2% percentile inside its fair comparison set

Raw benchmark value: 1,389 (CI 1,381 - 1,396)

Search / tool use · 1 benchmark · 79.6%

Tau2-Bench Telecom

AA · Search / tool use · Objective

It matters when the model must browse, call tools, and recover useful answers from external systems.

Rank #64 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 84.5%
Percentile: 79.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `tau2`.

79.6% percentile inside its fair comparison set

Raw benchmark value: 84.5%

Long context · 1 benchmark · 79.7%

Long Context Reasoning

AA · Long context · Objective

It checks whether long-context claims survive contact with retrieval, memory, or long-document tasks.

Rank #65 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 56%
Percentile: 79.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `lcr`.

79.7% percentile inside its fair comparison set

Raw benchmark value: 56%

Vision understanding · 15 benchmarks · 57.7%

MMMU-Pro

AA · Vision understanding · Objective

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #30 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 70.3%
Percentile: 78.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `mmmuPro`.

78.5% percentile inside its fair comparison set

Raw benchmark value: 70.3%

Vision Arena

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #32 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,228
Percentile: 71.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #42. Votes: 12571. Organization: alibaba. License: Apache 2.0.

71.6% percentile inside its fair comparison set

Raw benchmark value: 1,228 (CI 1,220 - 1,235)

Vision Arena · Creative Writing Vision

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #36 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,210
Percentile: 36.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing_vision. Source rank: #46. Votes: 741. Organization: alibaba. License: Apache 2.0.

36.4% percentile inside its fair comparison set

Raw benchmark value: 1,210 (CI 1,187 - 1,232)

Vision Arena · Diagram

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #35 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,245
Percentile: 51.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: diagram. Source rank: #46. Votes: 3251. Organization: alibaba. License: Apache 2.0.

51.4% percentile inside its fair comparison set

Raw benchmark value: 1,245 (CI 1,234 - 1,256)

Vision Arena · English

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,226
Percentile: 69.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #44. Votes: 5328. Organization: alibaba. License: Apache 2.0.

69.7% percentile inside its fair comparison set

Raw benchmark value: 1,226 (CI 1,215 - 1,236)

Vision Arena · Homework

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #33 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,253
Percentile: 52.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: homework. Source rank: #43. Votes: 1902. Organization: alibaba. License: Apache 2.0.

52.9% percentile inside its fair comparison set

Raw benchmark value: 1,253 (CI 1,239 - 1,267)

Vision Arena · Humor

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #26 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,229
Percentile: 49%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: humor. Source rank: #33. Votes: 419. Organization: alibaba. License: Apache 2.0.

49% percentile inside its fair comparison set

Raw benchmark value: 1,229 (CI 1,201 - 1,258)

Vision Arena · OCR

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,240
Percentile: 52.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: ocr. Source rank: #43. Votes: 8775. Organization: alibaba. License: Apache 2.0.

52.9% percentile inside its fair comparison set

Raw benchmark value: 1,240 (CI 1,233 - 1,248)

Vision Arena · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #32 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,246
Percentile: 71.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #40. Votes: 12571. Organization: alibaba. License: Apache 2.0.

71.6% percentile inside its fair comparison set

Raw benchmark value: 1,246 (CI 1,239 - 1,254)

Vision Arena · Creative Writing Vision · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,230
Percentile: 40%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing_vision. Source rank: #42. Votes: 741. Organization: alibaba. License: Apache 2.0.

40% percentile inside its fair comparison set

Raw benchmark value: 1,230 (CI 1,208 - 1,252)

Vision Arena · Diagram · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #36 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,251
Percentile: 50%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: diagram. Source rank: #44. Votes: 3251. Organization: alibaba. License: Apache 2.0.

50% percentile inside its fair comparison set

Raw benchmark value: 1,251 (CI 1,239 - 1,262)

Vision Arena · English · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,245
Percentile: 69.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #41. Votes: 5328. Organization: alibaba. License: Apache 2.0.

69.7% percentile inside its fair comparison set

Raw benchmark value: 1,245 (CI 1,235 - 1,255)

Vision Arena · Homework · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #32 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,260
Percentile: 54.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: homework. Source rank: #41. Votes: 1902. Organization: alibaba. License: Apache 2.0.

54.4% percentile inside its fair comparison set

Raw benchmark value: 1,260 (CI 1,246 - 1,274)

Vision Arena · Humor · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #21 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,254
Percentile: 59.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: humor. Source rank: #27. Votes: 419. Organization: alibaba. License: Apache 2.0.

59.2% percentile inside its fair comparison set

Raw benchmark value: 1,254 (CI 1,225 - 1,283)

Vision Arena · OCR · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #30 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,252
Percentile: 58.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: ocr. Source rank: #38. Votes: 8775. Organization: alibaba. License: Apache 2.0.

58.6% percentile inside its fair comparison set

Raw benchmark value: 1,252 (CI 1,245 - 1,260)
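The Vision understanding header reports 57.7% across 15 benchmarks. A quick sketch, assuming the category figure is an equal-weight mean of the per-row percentiles listed above, reproduces that number. The page does not state its aggregation method, so the averaging rule here is an assumption, and the variable names are illustrative only.

```python
# Percentiles of the 15 Vision understanding rows, in page order.
vision_percentiles = [
    78.5,  # MMMU-Pro
    71.6, 36.4, 51.4, 69.7, 52.9, 49.0, 52.9,  # Vision Arena, style control
    71.6, 40.0, 50.0, 69.7, 54.4, 59.2, 58.6,  # Vision Arena, no style control
]

# Assumed aggregation: plain arithmetic mean, one vote per benchmark row.
category_score = sum(vision_percentiles) / len(vision_percentiles)
print(round(category_score, 1))  # 57.7
```

The same rule gives about 72.2% for the 16 Multilingual rows against a displayed 72.1%, so the site may average unrounded values or round differently.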

Multilingual · 16 benchmarks · 72.1%

Text Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #68 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,464
Percentile: 77.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #82. Votes: 1639. Organization: alibaba. License: Apache 2.0.

77.3% percentile inside its fair comparison set

Raw benchmark value: 1,464 (CI 1,448 - 1,479)

Text Arena · French

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #54 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 75.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: french. Source rank: #69. Votes: 872. Organization: alibaba. License: Apache 2.0.

75.5% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,429 - 1,473)

Text Arena · German

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #64 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,413
Percentile: 73.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: german. Source rank: #82. Votes: 423. Organization: alibaba. License: Apache 2.0.

73.4% percentile inside its fair comparison set

Raw benchmark value: 1,413 (CI 1,385 - 1,442)

Text Arena · Japanese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,362
Percentile: 67.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: japanese. Source rank: #86. Votes: 229. Organization: alibaba. License: Apache 2.0.

67.5% percentile inside its fair comparison set

Raw benchmark value: 1,362 (CI 1,322 - 1,402)
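The Japanese row rests on 229 votes and has a wide interval (1,322 - 1,402), while the Software And IT Services row rests on 11,212 votes and has a narrow one (1,445 - 1,457). A rough sketch, assuming interval width shrinks roughly with the square root of the vote count (a common rule of thumb, not a method documented on this page or by Arena), compares the two ratios.

```python
import math

# Votes and confidence intervals copied from the two rows on this page.
japanese = {"votes": 229, "ci": (1322, 1402)}
software_it = {"votes": 11212, "ci": (1445, 1457)}


def ci_width(row):
    """Width of the reported confidence interval in rating points."""
    return row["ci"][1] - row["ci"][0]


# Observed: how much wider the low-vote interval is.
width_ratio = ci_width(japanese) / ci_width(software_it)

# Expected under the assumed 1/sqrt(votes) scaling.
expected_ratio = math.sqrt(software_it["votes"] / japanese["votes"])

print(round(width_ratio, 2), round(expected_ratio, 2))
```

The two ratios land close together (about 6.7 observed against about 7.0 expected), which suggests that low-vote language rows such as Japanese, German, and Korean should be read with more caution than high-vote rows.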

Text Arena · Korean

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #78 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,349
Percentile: 63%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: korean. Source rank: #97. Votes: 451. Organization: alibaba. License: Apache 2.0.

63% percentile inside its fair comparison set

Raw benchmark value: 1,349 (CI 1,320 - 1,378)

Text Arena · Russian

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #90 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,397
Percentile: 69.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: russian. Source rank: #111. Votes: 3007. Organization: alibaba. License: Apache 2.0.

69.2% percentile inside its fair comparison set

Raw benchmark value: 1,397 (CI 1,386 - 1,408)

Text Arena · Spanish

AR · Multilingual · Human

Observed user preference in Arena's Text Arena spanish leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,412
Percentile: 69.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: spanish. Source rank: #84. Votes: 854. Organization: alibaba. License: Apache 2.0.

69.2% percentile inside its fair comparison set

Raw benchmark value: 1,412 (CI 1,391 - 1,433)

Text Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #56 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,465
Percentile: 81.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #67. Votes: 1639. Organization: alibaba. License: Apache 2.0.

81.4% percentile inside its fair comparison set

Raw benchmark value: 1,465 (CI 1,449 - 1,480)

Text Arena · French · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #51 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,445
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: french. Source rank: #62. Votes: 872. Organization: alibaba. License: Apache 2.0.

76.9% percentile inside its fair comparison set

Raw benchmark value: 1,445 (CI 1,423 - 1,467)

Text Arena · German · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #51 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,421
Percentile: 78.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: german. Source rank: #64. Votes: 423. Organization: alibaba. License: Apache 2.0.

78.9% percentile inside its fair comparison set

Raw benchmark value: 1,421 (CI 1,392 - 1,449)

Text Arena · Japanese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #59 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,367
Percentile: 71.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: japanese. Source rank: #75. Votes: 229. Organization: alibaba. License: Apache 2.0.

71.4% percentile inside its fair comparison set

Raw benchmark value: 1,367 (CI 1,328 - 1,405)

Text Arena · Korean · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,357
Percentile: 63.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: korean. Source rank: #92. Votes: 451. Organization: alibaba. License: Apache 2.0.

63.9% percentile inside its fair comparison set

Raw benchmark value: 1,357 (CI 1,328 - 1,386)

Text Arena · Russian · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #82 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,397
Percentile: 72%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: russian. Source rank: #102. Votes: 3007. Organization: alibaba. License: Apache 2.0.

72% percentile inside its fair comparison set

Raw benchmark value: 1,397 (CI 1,386 - 1,409)

Text Arena · Spanish · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena spanish leaderboard.

Rank #58 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,419
Percentile: 73.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: spanish. Source rank: #71. Votes: 854. Organization: alibaba. License: Apache 2.0.

73.4% percentile inside its fair comparison set

Raw benchmark value: 1,419 (CI 1,399 - 1,440)

Vision Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #25 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,282
Percentile: 68.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #32. Votes: 739. Organization: alibaba. License: Apache 2.0.

68.8% percentile inside its fair comparison set

Raw benchmark value: 1,282 (CI 1,256 - 1,308)

Vision Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #22 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,304
Percentile: 72.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #27. Votes: 739. Organization: alibaba. License: Apache 2.0.

72.7% percentile inside its fair comparison set

Raw benchmark value: 1,304 (CI 1,279 - 1,330)

Source links and registry checks

official · Arena · Jun 20, 2026 (39 source links)

official · Artificial Analysis · Jun 20, 2026 (1 source link)


Source-linked scores by benchmark

Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.

Thin verified coverage: This model currently reads as thin verified coverage across the resolved source data.

Chat / text · 29 benchmarks · 67.2%

Intelligence Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #64 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 28
Percentile: 84.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.

84.1% percentile inside its fair comparison set

28 · Raw benchmark value

AA-Omniscience accuracy

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #128 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 18.6%
Percentile: 57.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.

57.4% percentile inside its fair comparison set

18.6% · Raw benchmark value

AA-Omniscience non-hallucination

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #199 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 10.9%
Percentile: 33.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.

33.6% percentile inside its fair comparison set

10.9% · Raw benchmark value

IFBench

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #90 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 50.8%
Percentile: 71.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `ifbench`.

71.7% percentile inside its fair comparison set

50.8% · Raw benchmark value

Blended price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #176 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $1.1 /1M tokens
Percentile: 36.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.

36.6% percentile inside its fair comparison set

$1.1 /1M tokens · Raw benchmark value

Input price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #148 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.4 /1M input tokens
Percentile: 47.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.

47.5% percentile inside its fair comparison set

$0.4 /1M input tokens · Raw benchmark value

Output price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #191 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $3.2 /1M output tokens
Percentile: 31.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.

31.2% percentile inside its fair comparison set

$3.2 /1M output tokens · Raw benchmark value
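The three price rows above are consistent with a 3:1 input-to-output token blend. That ratio is an assumption read from the field name `price1mBlended0To3To1`, not something the page states, and `blended_price` is a hypothetical helper. A minimal sketch of the arithmetic:

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_weight: float = 3.0,   # assumed 3 parts input ...
    output_weight: float = 1.0,  # ... to 1 part output
) -> float:
    """Weighted average price per 1M tokens for an assumed token mix."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight

# Prices per 1M tokens, copied from the Input price and Output price rows.
INPUT_PRICE = 0.4
OUTPUT_PRICE = 3.2

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"${blended:.1f} /1M tokens")
```

Under that assumption, (3 × 0.4 + 1 × 3.2) / 4 works out to the $1.1 shown in the Blended price row.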

Output Speed

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #60 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 140.4 tokens/s
Percentile: 71.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.

71.9% percentile inside its fair comparison set

140.4 tokens/s · Raw benchmark value

Time to first token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #127 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 2.59s
Percentile: 40%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.

40% percentile inside its fair comparison set

2.59s · Raw benchmark value

Time to first answer token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #118 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 16.63s
Percentile: 44.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.

44.3% percentile inside its fair comparison set

16.63s · Raw benchmark value
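As a rough illustration of how the two Reasoning-labeled rows (Output Speed and Time to first answer token) might combine, the sketch below estimates the wall-clock time to stream an answer of a given length. It assumes both medians describe the same kind of request and that decoding speed stays constant, neither of which the page claims; `estimate_answer_seconds` is a hypothetical helper.

```python
MEDIAN_TTFAT_SECONDS = 16.63      # from the Time to first answer token row
MEDIAN_TOKENS_PER_SECOND = 140.4  # from the Output Speed row

def estimate_answer_seconds(
    answer_tokens: int,
    ttfat_seconds: float = MEDIAN_TTFAT_SECONDS,
    tokens_per_second: float = MEDIAN_TOKENS_PER_SECOND,
) -> float:
    """Time until the last answer token, under a constant-speed assumption."""
    if answer_tokens < 0:
        raise ValueError("answer_tokens must be non-negative")
    return ttfat_seconds + answer_tokens / tokens_per_second

# Example: a 500-token answer.
estimate = estimate_answer_seconds(500)
print(f"{estimate:.1f}s")
```

For a 500-token answer this gives roughly 20 seconds, most of it spent before the first answer token arrives.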

Openness Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #113 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39
Percentile: 54.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.

54.8% percentile inside its fair comparison set

39 · Raw benchmark value

Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,417
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #94. Votes: 28575. Organization: alibaba. License: Apache 2.0.

76.9% percentile inside its fair comparison set

1,417 · Raw benchmark value · CI 1,413 - 1,422

Text Arena · Creative Writing

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #96 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,367
Percentile: 70.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing. Source rank: #119. Votes: 4361. Organization: alibaba. License: Apache 2.0.

70.6% percentile inside its fair comparison set

1,367 · Raw benchmark value · CI 1,357 - 1,376

Text Arena · English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #73 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,431
Percentile: 77.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #91. Votes: 13479. Organization: alibaba. License: Apache 2.0.

77.8% percentile inside its fair comparison set

1,431 · Raw benchmark value · CI 1,425 - 1,437

Text Arena · Exclude Ties

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,404
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: exclude_ties. Source rank: #94. Votes: 20913. Organization: alibaba. License: Apache 2.0.

77.2% percentile inside its fair comparison set

1,404 · Raw benchmark value · CI 1,398 - 1,410

Text Arena · Hard Prompts

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #80 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,433
Percentile: 75.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts. Source rank: #100. Votes: 17878. Organization: alibaba. License: Apache 2.0.

75.7% percentile inside its fair comparison set

1,433 · Raw benchmark value · CI 1,427 - 1,438

Text Arena · Hard Prompts English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts_english. Source rank: #94. Votes: 8868. Organization: alibaba. License: Apache 2.0.

76.9% percentile inside its fair comparison set

1,444 · Raw benchmark value · CI 1,437 - 1,451

Text Arena · Instruction Following

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,407
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: instruction_following. Source rank: #94. Votes: 9045. Organization: alibaba. License: Apache 2.0.

77.2% percentile inside its fair comparison set

1,407 · Raw benchmark value · CI 1,400 - 1,413

Text Arena · Longer Query

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #85 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,418
Percentile: 72.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: longer_query. Source rank: #106. Votes: 10944. Organization: alibaba. License: Apache 2.0.

72.4% percentile inside its fair comparison set

1,418 · Raw benchmark value · CI 1,411 - 1,424

Text Arena · Multi Turn

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #78 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,417
Percentile: 76.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: multi_turn. Source rank: #97. Votes: 4953. Organization: alibaba. License: Apache 2.0.

76.2% percentile inside its fair comparison set

1,417 · Raw benchmark value · CI 1,408 - 1,426

Text Arena · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #73 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,418
Percentile: 77.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #86. Votes: 28575. Organization: alibaba. License: Apache 2.0.

77.8% percentile inside its fair comparison set

1,418 · Raw benchmark value · CI 1,413 - 1,422

Text Arena · Creative Writing · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #83 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,369
Percentile: 74.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing. Source rank: #103. Votes: 4361. Organization: alibaba. License: Apache 2.0.

74.6% percentile inside its fair comparison set

1,369 · Raw benchmark value · CI 1,359 - 1,378

Text Arena · English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #71 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,429
Percentile: 78.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #83. Votes: 13479. Organization: alibaba. License: Apache 2.0.

78.5% percentile inside its fair comparison set

1,429 · Raw benchmark value · CI 1,423 - 1,435

Text Arena · Exclude Ties · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,404
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: exclude_ties. Source rank: #89. Votes: 20913. Organization: alibaba. License: Apache 2.0.

77.2% percentile inside its fair comparison set

1,404 · Raw benchmark value · CI 1,398 - 1,410

Text Arena · Hard Prompts · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #74 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,422
Percentile: 77.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts. Source rank: #91. Votes: 17878. Organization: alibaba. License: Apache 2.0.

77.5% percentile inside its fair comparison set

1,422 · Raw benchmark value · CI 1,416 - 1,427

Text Arena · Hard Prompts English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #71 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,431
Percentile: 78.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: hard_prompts_english. Source rank: #86. Votes: 8868. Organization: alibaba. License: Apache 2.0.

78.4% percentile inside its fair comparison set

1,431 · Raw benchmark value · CI 1,424 - 1,438

Text Arena · Instruction Following · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #73 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,399
Percentile: 77.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: instruction_following. Source rank: #88. Votes: 9045. Organization: alibaba. License: Apache 2.0.

77.8% percentile inside its fair comparison set

1,399 · Raw benchmark value · CI 1,393 - 1,406

Text Arena · Longer Query · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #74 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,410
Percentile: 76%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: longer_query. Source rank: #90. Votes: 10944. Organization: alibaba. License: Apache 2.0.

76% percentile inside its fair comparison set

1,410 · Raw benchmark value · CI 1,403 - 1,416

Text Arena · Multi Turn · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,414
Percentile: 76.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: multi_turn. Source rank: #92. Votes: 4953. Organization: alibaba. License: Apache 2.0.

76.8% percentile inside its fair comparison set

1,414 · Raw benchmark value · CI 1,405 - 1,423

Coding · 10 benchmarks · 53.8%

Terminal-Bench Hard

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #66 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 29.5%
Percentile: 78.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `terminalbenchHard`.

78.5% percentile inside its fair comparison set

29.5% · Raw benchmark value

SciCode

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #127 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 35.6%
Percentile: 65.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `scicode`.

65.8% percentile inside its fair comparison set

35.6% · Raw benchmark value

Coding Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #30 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 46
Percentile: 61.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `codingIndex`.

61.3% percentile inside its fair comparison set

46 · Raw benchmark value

Agentic Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #30 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 21
Percentile: 37%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `agenticIndex`.

37% percentile inside its fair comparison set

21 · Raw benchmark value

Code Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #46 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,364
Percentile: 38.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #55. Votes: 8152. Organization: alibaba. License: Apache 2.0.

38.4% percentile inside its fair comparison set

1,364 · Raw benchmark value · CI 1,357 - 1,371

WebDev Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #46 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,364
Percentile: 38.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: webdev. Source rank: #55. Votes: 8152. Organization: alibaba. License: Apache 2.0.

38.4% percentile inside its fair comparison set

1,364 · Raw benchmark value · CI 1,357 - 1,371

Code Arena · Webdev Html

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #52 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,348
Percentile: 30.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: webdev-html. Source rank: #62. Votes: 992. Organization: alibaba. License: Apache 2.0.

30.1% percentile inside its fair comparison set

1,348 · Raw benchmark value · CI 1,328 - 1,367

Code Arena · Webdev React

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #40 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,359
Percentile: 33.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: webdev-react. Source rank: #51. Votes: 7131. Organization: alibaba. License: Apache 2.0.

33.9% percentile inside its fair comparison set

1,359 · Raw benchmark value · CI 1,352 - 1,367

Text Arena · Coding

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #80 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,459
Percentile: 75.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: coding. Source rank: #100. Votes: 7816. Organization: alibaba. License: Apache 2.0.

75.3% percentile inside its fair comparison set

1,459 · Raw benchmark value · CI 1,452 - 1,466

Text Arena · Coding · No Style Control

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,436
Percentile: 79.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: coding. Source rank: #83. Votes: 7816. Organization: alibaba. License: Apache 2.0.

79.4% percentile inside its fair comparison set

1,436 · Raw benchmark value · CI 1,429 - 1,443

Reasoning / math / science · 5 benchmarks · 83%

Humanity's Last Exam

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #58 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 14.8%
Percentile: 84.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `hle`.

84.6% percentile inside its fair comparison set

14.8% · Raw benchmark value

GPQA

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #48 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 82.7%
Percentile: 87.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gpqa`.

87.4% percentile inside its fair comparison set

82.7% · Raw benchmark value

CritPt

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #65 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 0.6%
Percentile: 80.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `critpt`.

80.1% percentile inside its fair comparison set

0.6% · Raw benchmark value

Text Arena · Math

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #63 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,424
Percentile: 80.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: math. Source rank: #80. Votes: 1779. Organization: alibaba. License: Apache 2.0.

80.3% percentile inside its fair comparison set

1,424 · Raw benchmark value · CI 1,410 - 1,438

Text Arena · Math · No Style Control

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #55 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,428
Percentile: 82.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: math. Source rank: #67. Votes: 1779. Organization: alibaba. License: Apache 2.0.

82.8% percentile inside its fair comparison set

1,428 · Raw benchmark value · CI 1,414 - 1,442

Professional reasoning · 19 benchmarks · 75.2%

GDPval-AA

AA · Professional reasoning · Rubric

Agentic performance on economically valuable work tasks.

Rank #26 · Source label: Qwen3.5 122B A10B (Reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 979
Percentile: 45.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `gdpvalBreakdown.elo`.

45.7% percentile inside its fair comparison set

979 · Raw benchmark value

Text Arena · Expert

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,445
Percentile: 76%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: expert. Source rank: #85. Votes: 2473. Organization: alibaba. License: Apache 2.0.

76% percentile inside its fair comparison set

1,445 · Raw benchmark value · CI 1,433 - 1,457

Text Arena · Industry Business And Management And Financial Operations

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #63 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,421
Percentile: 80.5%
Last updated: recent
Eligibility: headline eligible

80.5% percentile inside its fair comparison set

Raw benchmark value: 1,421 (CI 1,413 - 1,430)

Text Arena · Industry Entertainment And Sports And Media

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #96 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,367
Percentile: 70.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_entertainment_and_sports_and_media. Source rank: #119. Votes: 5504. Organization: alibaba. License: Apache 2.0.

70.6% percentile inside its fair comparison set

Raw benchmark value: 1,367 (CI 1,358 - 1,375)

Text Arena · Industry Legal And Government

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #75 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,423
Percentile: 75.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_legal_and_government. Source rank: #95. Votes: 2216. Organization: alibaba. License: Apache 2.0.

75.2% percentile inside its fair comparison set

Raw benchmark value: 1,423 (CI 1,410 - 1,437)

Text Arena · Industry Life And Physical And Social Science

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #74 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,436
Percentile: 77.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_life_and_physical_and_social_science. Source rank: #92. Votes: 4641. Organization: alibaba. License: Apache 2.0.

77.4% percentile inside its fair comparison set

Raw benchmark value: 1,436 (CI 1,427 - 1,445)

Text Arena · Industry Mathematical

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,430
Percentile: 77.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_mathematical. Source rank: #84. Votes: 1581. Organization: alibaba. License: Apache 2.0.

77.9% percentile inside its fair comparison set

Raw benchmark value: 1,430 (CI 1,414 - 1,445)

Text Arena · Industry Medicine And Healthcare

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #85 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,433
Percentile: 71.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_medicine_and_healthcare. Source rank: #105. Votes: 2072. Organization: alibaba. License: Apache 2.0.

71.5% percentile inside its fair comparison set

Raw benchmark value: 1,433 (CI 1,419 - 1,447)

Text Arena · Industry Software And IT Services

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #79 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 76%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_software_and_it_services. Source rank: #98. Votes: 11212. Organization: alibaba. License: Apache 2.0.

76% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,445 - 1,457)

Text Arena · Industry Writing And Literature And Language

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #84 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,389
Percentile: 74.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_writing_and_literature_and_language. Source rank: #104. Votes: 6476. Organization: alibaba. License: Apache 2.0.

74.4% percentile inside its fair comparison set

Raw benchmark value: 1,389 (CI 1,382 - 1,397)

Text Arena · Expert · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #59 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 78.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: expert. Source rank: #70. Votes: 2473. Organization: alibaba. License: Apache 2.0.

78.9% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,422 - 1,447)
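Each Arena card reports a raw value with a confidence interval, and the same category appears both with and without style control. As a rough sketch, the snippet below checks whether two reported intervals overlap, using the Expert figures from this page (1,445 with CI 1,433 - 1,457, and 1,434 with CI 1,422 - 1,447 under No Style Control). The `Score` type and `intervals_overlap` function are hypothetical names, and interval overlap is only an informal screen, not a significance test.

```python
from typing import NamedTuple


class Score(NamedTuple):
    """A raw benchmark value with its reported confidence interval."""
    value: int
    ci_low: int
    ci_high: int


def intervals_overlap(a: Score, b: Score) -> bool:
    """True when the two confidence intervals share at least one point."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high


# Figures copied from the Text Arena Expert cards on this page.
expert_style_control = Score(1445, 1433, 1457)
expert_no_style_control = Score(1434, 1422, 1447)

print(intervals_overlap(expert_style_control, expert_no_style_control))  # True
```

Here the intervals overlap, so the 11-point gap between the two Expert variants sits within the reported uncertainty.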

Text Arena · Industry Business And Management And Financial Operations · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #61 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,415
Percentile: 81.1%
Last updated: recent
Eligibility: headline eligible

81.1% percentile inside its fair comparison set

Raw benchmark value: 1,415 (CI 1,406 - 1,423)

Text Arena · Industry Entertainment And Sports And Media · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #86 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,369
Percentile: 73.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_entertainment_and_sports_and_media. Source rank: #104. Votes: 5504. Organization: alibaba. License: Apache 2.0.

73.7% percentile inside its fair comparison set

Raw benchmark value: 1,369 (CI 1,360 - 1,377)

Text Arena · Industry Legal And Government · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,423
Percentile: 77.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_legal_and_government. Source rank: #84. Votes: 2216. Organization: alibaba. License: Apache 2.0.

77.2% percentile inside its fair comparison set

Raw benchmark value: 1,423 (CI 1,410 - 1,436)

Text Arena · Industry Life And Physical And Social Science · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,433
Percentile: 79.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_life_and_physical_and_social_science. Source rank: #81. Votes: 4641. Organization: alibaba. License: Apache 2.0.

79.6% percentile inside its fair comparison set

Raw benchmark value: 1,433 (CI 1,424 - 1,442)

Text Arena · Industry Mathematical · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #62 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,430
Percentile: 80.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_mathematical. Source rank: #74. Votes: 1581. Organization: alibaba. License: Apache 2.0.

80.2% percentile inside its fair comparison set

Raw benchmark value: 1,430 (CI 1,415 - 1,446)

Text Arena · Industry Medicine And Healthcare · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,430
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_medicine_and_healthcare. Source rank: #80. Votes: 2072. Organization: alibaba. License: Apache 2.0.

76.9% percentile inside its fair comparison set

Raw benchmark value: 1,430 (CI 1,416 - 1,443)

Text Arena · Industry Software And IT Services · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #69 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,438
Percentile: 79.1%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_software_and_it_services. Source rank: #83. Votes: 11212. Organization: alibaba. License: Apache 2.0.

79.1% percentile inside its fair comparison set

Raw benchmark value: 1,438 (CI 1,431 - 1,444)

Text Arena · Industry Writing And Literature And Language · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #78 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,389
Percentile: 76.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: industry_writing_and_literature_and_language. Source rank: #95. Votes: 6476. Organization: alibaba. License: Apache 2.0.

76.2% percentile inside its fair comparison set

Raw benchmark value: 1,389 (CI 1,381 - 1,396)

Search / tool use · 1 benchmark · 79.6%

Tau2-Bench Telecom

AA · Search / tool use · Objective

It matters when the model must browse, call tools, and recover useful answers from external systems.

Rank #64 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 84.5%
Percentile: 79.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `tau2`.

79.6% percentile inside its fair comparison set

Raw benchmark value: 84.5%

Long context · 1 benchmark · 79.7%

Long Context Reasoning

AA · Long context · Objective

It checks whether long-context claims survive contact with retrieval, memory, or long-document tasks.

Rank #65 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 56%
Percentile: 79.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `lcr`.

79.7% percentile inside its fair comparison set

Raw benchmark value: 56%

Vision understanding · 15 benchmarks · 57.7%

MMMU-Pro

AA · Vision understanding · Objective

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #30 · Source label: Qwen3.5 122B A10B (Non-reasoning)

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 70.3%
Percentile: 78.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Artificial Analysis public leaderboard field `mmmuPro`.

78.5% percentile inside its fair comparison set

Raw benchmark value: 70.3%

Vision Arena

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #32 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,228
Percentile: 71.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #42. Votes: 12571. Organization: alibaba. License: Apache 2.0.

71.6% percentile inside its fair comparison set

Raw benchmark value: 1,228 (CI 1,220 - 1,235)

Vision Arena · Creative Writing Vision

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #36 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,210
Percentile: 36.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing_vision. Source rank: #46. Votes: 741. Organization: alibaba. License: Apache 2.0.

36.4% percentile inside its fair comparison set

Raw benchmark value: 1,210 (CI 1,187 - 1,232)

Vision Arena · Diagram

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #35 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,245
Percentile: 51.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: diagram. Source rank: #46. Votes: 3251. Organization: alibaba. License: Apache 2.0.

51.4% percentile inside its fair comparison set

Raw benchmark value: 1,245 (CI 1,234 - 1,256)

Vision Arena · English

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,226
Percentile: 69.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #44. Votes: 5328. Organization: alibaba. License: Apache 2.0.

69.7% percentile inside its fair comparison set

Raw benchmark value: 1,226 (CI 1,215 - 1,236)

Vision Arena · Homework

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #33 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,253
Percentile: 52.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: homework. Source rank: #43. Votes: 1902. Organization: alibaba. License: Apache 2.0.

52.9% percentile inside its fair comparison set

Raw benchmark value: 1,253 (CI 1,239 - 1,267)

Vision Arena · Humor

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #26 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,229
Percentile: 49%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: humor. Source rank: #33. Votes: 419. Organization: alibaba. License: Apache 2.0.

49% percentile inside its fair comparison set

Raw benchmark value: 1,229 (CI 1,201 - 1,258)
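To give a feel for how wide a low-vote interval is, the sketch below converts a rating gap into an expected head-to-head preference rate. It assumes Arena scores sit on the conventional Elo scale (logistic, base 10, 400-point divisor); the page does not state the scale, so treat this as an assumption. `expected_win_rate` is a hypothetical helper name.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected share of pairwise preferences for A over B.

    Assumption: ratings follow the conventional Elo logistic,
    1 / (1 + 10 ** ((rating_b - rating_a) / 400)).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# The Vision Arena Humor interval on this page spans 1,201 to 1,258 (419 votes).
ci_low, ci_high = 1201, 1258

# How different would two models at the ends of that interval look head to head?
spread = expected_win_rate(ci_high, ci_low)
print(f"Expected preference rate across the CI: {spread:.3f}")
```

Under that assumption, the two ends of the Humor interval differ by roughly the gap between a 58% and a 50% preference rate, which is one reason low-vote categories are read with more caution.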

Vision Arena · OCR

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,240
Percentile: 52.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: ocr. Source rank: #43. Votes: 8775. Organization: alibaba. License: Apache 2.0.

52.9% percentile inside its fair comparison set

Raw benchmark value: 1,240 (CI 1,233 - 1,248)

Vision Arena · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #32 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,246
Percentile: 71.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: overall. Source rank: #40. Votes: 12571. Organization: alibaba. License: Apache 2.0.

71.6% percentile inside its fair comparison set

Raw benchmark value: 1,246 (CI 1,239 - 1,254)

Vision Arena · Creative Writing Vision · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,230
Percentile: 40%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: creative_writing_vision. Source rank: #42. Votes: 741. Organization: alibaba. License: Apache 2.0.

40% percentile inside its fair comparison set

Raw benchmark value: 1,230 (CI 1,208 - 1,252)

Vision Arena · Diagram · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #36 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,251
Percentile: 50%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: diagram. Source rank: #44. Votes: 3251. Organization: alibaba. License: Apache 2.0.

50% percentile inside its fair comparison set

Raw benchmark value: 1,251 (CI 1,239 - 1,262)

Vision Arena · English · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #34 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,245
Percentile: 69.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: english. Source rank: #41. Votes: 5328. Organization: alibaba. License: Apache 2.0.

69.7% percentile inside its fair comparison set

Raw benchmark value: 1,245 (CI 1,235 - 1,255)

Vision Arena · Homework · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #32 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,260
Percentile: 54.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: homework. Source rank: #41. Votes: 1902. Organization: alibaba. License: Apache 2.0.

54.4% percentile inside its fair comparison set

Raw benchmark value: 1,260 (CI 1,246 - 1,274)

Vision Arena · Humor · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #21 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,254
Percentile: 59.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: humor. Source rank: #27. Votes: 419. Organization: alibaba. License: Apache 2.0.

59.2% percentile inside its fair comparison set

Raw benchmark value: 1,254 (CI 1,225 - 1,283)

Vision Arena · OCR · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #30 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,252
Percentile: 58.6%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: ocr. Source rank: #38. Votes: 8775. Organization: alibaba. License: Apache 2.0.

58.6% percentile inside its fair comparison set

Raw benchmark value: 1,252 (CI 1,245 - 1,260)

Multilingual · 16 benchmarks · 72.1%

Text Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #68 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,464
Percentile: 77.3%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #82. Votes: 1639. Organization: alibaba. License: Apache 2.0.

77.3% percentile inside its fair comparison set

Raw benchmark value: 1,464 (CI 1,448 - 1,479)

Text Arena · French

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #54 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 75.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: french. Source rank: #69. Votes: 872. Organization: alibaba. License: Apache 2.0.

75.5% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,429 - 1,473)

Text Arena · German

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #64 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,413
Percentile: 73.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: german. Source rank: #82. Votes: 423. Organization: alibaba. License: Apache 2.0.

73.4% percentile inside its fair comparison set

Raw benchmark value: 1,413 (CI 1,385 - 1,442)

Text Arena · Japanese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,362
Percentile: 67.5%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: japanese. Source rank: #86. Votes: 229. Organization: alibaba. License: Apache 2.0.

67.5% percentile inside its fair comparison set

Raw benchmark value: 1,362 (CI 1,322 - 1,402)

Text Arena · Korean

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #78 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,349
Percentile: 63%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: korean. Source rank: #97. Votes: 451. Organization: alibaba. License: Apache 2.0.

63% percentile inside its fair comparison set

Raw benchmark value: 1,349 (CI 1,320 - 1,378)

Text Arena · Russian

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #90 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,397
Percentile: 69.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: russian. Source rank: #111. Votes: 3007. Organization: alibaba. License: Apache 2.0.

69.2% percentile inside its fair comparison set

Raw benchmark value: 1,397 (CI 1,386 - 1,408)

Text Arena · Spanish

AR · Multilingual · Human

Observed user preference in Arena's Text Arena spanish leaderboard.

Rank #67 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,412
Percentile: 69.2%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: spanish. Source rank: #84. Votes: 854. Organization: alibaba. License: Apache 2.0.

69.2% percentile inside its fair comparison set

Raw benchmark value: 1,412 (CI 1,391 - 1,433)

Text Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #56 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,465
Percentile: 81.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #67. Votes: 1639. Organization: alibaba. License: Apache 2.0.

81.4% percentile inside its fair comparison set

Raw benchmark value: 1,465 (CI 1,449 - 1,480)

Text Arena · French · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #51 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,445
Percentile: 76.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: french. Source rank: #62. Votes: 872. Organization: alibaba. License: Apache 2.0.

76.9% percentile inside its fair comparison set

Raw benchmark value: 1,445 (CI 1,423 - 1,467)

Text Arena · German · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #51 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,421
Percentile: 78.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: german. Source rank: #64. Votes: 423. Organization: alibaba. License: Apache 2.0.

78.9% percentile inside its fair comparison set

Raw benchmark value: 1,421 (CI 1,392 - 1,449)

Text Arena · Japanese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #59 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,367
Percentile: 71.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: japanese. Source rank: #75. Votes: 229. Organization: alibaba. License: Apache 2.0.

71.4% percentile inside its fair comparison set

Raw benchmark value: 1,367 (CI 1,328 - 1,405)

Text Arena · Korean · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #76 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,357
Percentile: 63.9%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: korean. Source rank: #92. Votes: 451. Organization: alibaba. License: Apache 2.0.

63.9% percentile inside its fair comparison set

Raw benchmark value: 1,357 (CI 1,328 - 1,386)

Text Arena · Russian · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #82 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,397
Percentile: 72%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: russian. Source rank: #102. Votes: 3007. Organization: alibaba. License: Apache 2.0.

72% percentile inside its fair comparison set

Raw benchmark value: 1,397 (CI 1,386 - 1,409)

Text Arena · Spanish · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena spanish leaderboard.

Rank #58 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,419
Percentile: 73.4%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: spanish. Source rank: #71. Votes: 854. Organization: alibaba. License: Apache 2.0.

73.4% percentile inside its fair comparison set

Raw benchmark value: 1,419 (CI 1,399 - 1,440)

Vision Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #25 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,282
Percentile: 68.8%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #32. Votes: 739. Organization: alibaba. License: Apache 2.0.

68.8% percentile inside its fair comparison set

Raw benchmark value: 1,282 (CI 1,256 - 1,308)

Vision Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena chinese leaderboard.

Rank #22 · Source label: qwen3.5-122b-a10b

verified runtime · exact alias

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,304
Percentile: 72.7%
Last updated: recent
Eligibility: headline eligible

Parsed from Arena leaderboard dataset row `qwen3.5-122b-a10b`. Category: chinese. Source rank: #27. Votes: 739. Organization: alibaba. License: Apache 2.0.

72.7% percentile inside its fair comparison set

Raw benchmark value: 1,304 (CI 1,279 - 1,330)