Model profile · Qwen

Qwen3.5 397B A17B

Open weights · frontier · registry tag 2026 hosted flagship

Thin verified coverage

Reads as thin verified coverage across the resolved source data.

Visible coverage: 18.5%
Verified coverage: 18.5%
Spread: n/a
Last verified: Jun 20, 2026

text · code · vision · document · 6 aliases · 41 official source links

Data version

Current snapshot.

Data version: Jun 20, 2026
Model list checked: 9 providers · 1081 tracked models
Page refreshed: Jul 5, 2026

The registry snapshot and page stamp are shown so a stale deploy is visible at a glance.
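
As a rough illustration of the staleness check these stamps enable, the sketch below compares the two dates shown on this page. The 30-day threshold and the function names are assumptions made for the example; the page does not define either.

```python
from datetime import date, datetime, timedelta

def parse_stamp(text: str) -> date:
    """Parse a page stamp such as 'Jun 20, 2026' (English month abbreviations)."""
    return datetime.strptime(text, "%b %d, %Y").date()

def is_stale(data_version: date, page_refreshed: date, max_age: timedelta) -> bool:
    """Return True when the data snapshot lags the page refresh by more than max_age."""
    return page_refreshed - data_version > max_age

data_version = parse_stamp("Jun 20, 2026")
page_refreshed = parse_stamp("Jul 5, 2026")
gap = page_refreshed - data_version

# Hypothetical threshold; the page does not state one.
MAX_AGE = timedelta(days=30)
print(gap.days, is_stale(data_version, page_refreshed, MAX_AGE))
```

With the stamps above, the snapshot is 15 days older than the page refresh, which falls under the assumed threshold.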

Source-linked scores by benchmark

Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.
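
One way to picture such a row is as a small record. The sketch below is illustrative only: the class name, field names, and the tag-line parser are assumptions, and the sample values are taken from the Intelligence Index card on this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchmarkRow:
    benchmark: str      # card title
    source: str         # e.g. "Artificial Analysis" or "Arena"
    source_type: str    # last field of the card's tag line, e.g. "Combined"
    raw_metric: str     # kept as text because units vary (%, $, s, tokens/s)
    percentile: float   # percent, within the row's fair comparison set

def parse_tag_line(line: str) -> tuple[str, str, str]:
    """Split a tag line like 'AA · Chat / text · Combined' into its three fields."""
    source_code, category, source_type = (part.strip() for part in line.split("·"))
    return source_code, category, source_type

source_code, category, source_type = parse_tag_line("AA · Chat / text · Combined")
row = BenchmarkRow("Intelligence Index", "Artificial Analysis", source_type, "32", 89.4)
print(row)
```

Keeping the raw metric as text avoids guessing at units; a real schema might split value and unit instead.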

Chat / text · 29 benchmarks · 71.8%

Intelligence Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #43 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 32
Percentile: 89.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.

89.4% percentile inside its fair comparison set

Raw benchmark value: 32

AA-Omniscience accuracy

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #64 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 24.3%
Percentile: 78.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.

78.9% percentile inside its fair comparison set

Raw benchmark value: 24.3%

AA-Omniscience non-hallucination

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #198 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 10.9%
Percentile: 33.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.

33.9% percentile inside its fair comparison set

Raw benchmark value: 10.9%

IFBench

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #86 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 51.6%
Percentile: 73%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `ifbench`.

73% percentile inside its fair comparison set

Raw benchmark value: 51.6%

Blended price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #187 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $1.4 /1M tokens
Percentile: 33%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.

33% percentile inside its fair comparison set

Raw benchmark value: $1.4 /1M tokens
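
The field name `price1mBlended0To3To1` suggests a 3:1 input-to-output token weighting. Under that assumption, the blended figure can be reproduced from the input price ($0.6) and output price ($3.6) reported on this page. The function name and the half-up rounding are choices made for this sketch, not something the source confirms.

```python
from decimal import Decimal, ROUND_HALF_UP

def blended_price(input_price: Decimal, output_price: Decimal,
                  input_parts: int = 3, output_parts: int = 1) -> Decimal:
    """Weighted average price per 1M tokens for a given input:output token mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts

# Prices per 1M tokens as shown on the Input price and Output price cards.
blended = blended_price(Decimal("0.6"), Decimal("3.6"))

# Assumed display rounding: one decimal place, half-up.
shown = blended.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
print(blended, shown)
```

`Decimal` is used instead of floats so the 1.35 midpoint rounds predictably to the displayed $1.4.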

Input price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #175 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $0.6 /1M input tokens
Percentile: 38%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.

38% percentile inside its fair comparison set

Raw benchmark value: $0.6 /1M input tokens

Output price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #195 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: $3.6 /1M output tokens
Percentile: 29.7%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.

29.7% percentile inside its fair comparison set

Raw benchmark value: $3.6 /1M output tokens

Output Speed

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #174 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 49.7 tokens/s
Percentile: 17.6%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.

17.6% percentile inside its fair comparison set

Raw benchmark value: 49.7 tokens/s

Time to first token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #130 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 2.68s
Percentile: 38.6%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.

38.6% percentile inside its fair comparison set

Raw benchmark value: 2.68s

Time to first answer token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #200 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 66.79s
Percentile: 5.2%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.

5.2% percentile inside its fair comparison set

Raw benchmark value: 66.79s

Openness Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #93 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 39
Percentile: 54.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.

54.8% percentile inside its fair comparison set

Raw benchmark value: 39

Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 86.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #57. Votes: 43048. Organization: alibaba. License: Apache 2.0.

86.5% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,440 - 1,448)
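
The Arena provenance sentence on this card follows a regular pattern, so it can be split back into fields. The sketch below is a minimal parser written against the sentence as it appears on this page; the pattern and function name are assumptions for illustration and are not the registry's own parser.

```python
import re

# Pattern mirrors the provenance sentence shown on Arena cards on this page.
PROVENANCE = re.compile(
    r"Parsed from Arena leaderboard dataset row `(?P<row>[^`]+)`\. "
    r"Category: (?P<category>[\w-]+)\. "
    r"Source rank: #(?P<source_rank>\d+)\. "
    r"Votes: (?P<votes>\d+)\. "
    r"Organization: (?P<organization>[^.]+)\. "
    r"License: (?P<license>.+)\.$"
)

def parse_provenance(line: str) -> dict:
    """Return the provenance fields as a dict, with rank and votes as integers."""
    match = PROVENANCE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized provenance line: {line!r}")
    fields = match.groupdict()
    fields["source_rank"] = int(fields["source_rank"])
    fields["votes"] = int(fields["votes"])
    return fields

line = ("Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. "
        "Category: overall. Source rank: #57. Votes: 43048. "
        "Organization: alibaba. License: Apache 2.0.")
parsed = parse_provenance(line)
print(parsed)
```

The category pattern accepts both underscores and hyphens, since the cards on this page use both (for example `hard_prompts` and `webdev-html`).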

Text Arena · Creative Writing

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #43 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,413
Percentile: 87%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing. Source rank: #56. Votes: 6614. Organization: alibaba. License: Apache 2.0.

87% percentile inside its fair comparison set

Raw benchmark value: 1,413 (CI 1,405 - 1,420)

Text Arena · English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,453
Percentile: 87.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #53. Votes: 20748. Organization: alibaba. License: Apache 2.0.

87.7% percentile inside its fair comparison set

Raw benchmark value: 1,453 (CI 1,447 - 1,458)

Text Arena · Exclude Ties

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 86.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: exclude_ties. Source rank: #56. Votes: 31275. Organization: alibaba. License: Apache 2.0.

86.8% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,438 - 1,449)

Text Arena · Hard Prompts

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,465
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts. Source rank: #52. Votes: 27400. Organization: alibaba. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,465 (CI 1,461 - 1,470)

Text Arena · Hard Prompts English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,472
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts_english. Source rank: #52. Votes: 13873. Organization: alibaba. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,472 (CI 1,466 - 1,478)

Text Arena · Instruction Following

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,435
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: instruction_following. Source rank: #51. Votes: 13770. Organization: alibaba. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,435 (CI 1,429 - 1,441)

Text Arena · Longer Query

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #37 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,455
Percentile: 88.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: longer_query. Source rank: #48. Votes: 17022. Organization: alibaba. License: Apache 2.0.

88.2% percentile inside its fair comparison set

Raw benchmark value: 1,455 (CI 1,449 - 1,461)

Text Arena · Multi Turn

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,454
Percentile: 87.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: multi_turn. Source rank: #53. Votes: 7329. Organization: alibaba. License: Apache 2.0.

87.6% percentile inside its fair comparison set

Raw benchmark value: 1,454 (CI 1,446 - 1,461)

Text Arena · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,440
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #45. Votes: 43048. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,440 (CI 1,436 - 1,444)
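
Because Arena rows carry a confidence interval, two scores are only clearly separated when their intervals do not overlap. The sketch below applies that coarse check to three rows from this page. The page does not state the confidence level, and interval overlap is a heuristic rather than a significance test; the type and function names are illustrative.

```python
from typing import NamedTuple

class ArenaScore(NamedTuple):
    label: str
    value: int
    ci_low: int
    ci_high: int

def intervals_overlap(a: ArenaScore, b: ArenaScore) -> bool:
    """True when the two confidence intervals share at least one point."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high

# Values copied from the corresponding cards on this page.
overall = ArenaScore("Text Arena", 1444, 1440, 1448)
no_style = ArenaScore("Text Arena · No Style Control", 1440, 1436, 1444)
hard = ArenaScore("Text Arena · Hard Prompts", 1465, 1461, 1470)

print(intervals_overlap(overall, no_style))
print(intervals_overlap(overall, hard))
```

Here the overall and No Style Control intervals overlap, while the Hard Prompts interval sits entirely above the overall one.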

Text Arena · Creative Writing · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,409
Percentile: 87.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing. Source rank: #52. Votes: 6614. Organization: alibaba. License: Apache 2.0.

87.6% percentile inside its fair comparison set

Raw benchmark value: 1,409 (CI 1,401 - 1,417)

Text Arena · English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #38 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,448
Percentile: 88.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #47. Votes: 20748. Organization: alibaba. License: Apache 2.0.

88.6% percentile inside its fair comparison set

Raw benchmark value: 1,448 (CI 1,443 - 1,453)

Text Arena · Exclude Ties · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #35 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,437
Percentile: 89.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: exclude_ties. Source rank: #44. Votes: 31275. Organization: alibaba. License: Apache 2.0.

89.5% percentile inside its fair comparison set

Raw benchmark value: 1,437 (CI 1,432 - 1,442)

Text Arena · Hard Prompts · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 89.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts. Source rank: #41. Votes: 27400. Organization: alibaba. License: Apache 2.0.

89.8% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,446 - 1,456)

Text Arena · Hard Prompts English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #32 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,456
Percentile: 90.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts_english. Source rank: #39. Votes: 13873. Organization: alibaba. License: Apache 2.0.

90.4% percentile inside its fair comparison set

Raw benchmark value: 1,456 (CI 1,450 - 1,462)

Text Arena · Instruction Following · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,425
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: instruction_following. Source rank: #44. Votes: 13770. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,425 (CI 1,419 - 1,431)

Text Arena · Longer Query · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #31 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,444
Percentile: 90.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: longer_query. Source rank: #39. Votes: 17022. Organization: alibaba. License: Apache 2.0.

90.1% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,439 - 1,450)

Text Arena · Multi Turn · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,447
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: multi_turn. Source rank: #46. Votes: 7329. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,447 (CI 1,439 - 1,455)

Coding · 10 benchmarks · 66.2%

Terminal-Bench Hard

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #34 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 35.6%
Percentile: 89.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `terminalbenchHard`.

89.4% percentile inside its fair comparison set

Raw benchmark value: 35.6%

SciCode

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #46 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 41.1%
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `scicode`.

88% percentile inside its fair comparison set

Raw benchmark value: 41.1%

Coding Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #27 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 48
Percentile: 65.3%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `codingIndex`.

65.3% percentile inside its fair comparison set

Raw benchmark value: 48

Agentic Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #31 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 20
Percentile: 34.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `agenticIndex`.

34.8% percentile inside its fair comparison set

Raw benchmark value: 20

Code Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,395
Percentile: 54.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #42. Votes: 11997. Organization: alibaba. License: Apache 2.0.

54.8% percentile inside its fair comparison set

Raw benchmark value: 1,395 (CI 1,388 - 1,401)

WebDev Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,395
Percentile: 54.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: webdev. Source rank: #42. Votes: 11997. Organization: alibaba. License: Apache 2.0.

54.8% percentile inside its fair comparison set

Raw benchmark value: 1,395 (CI 1,388 - 1,401)

Code Arena · Webdev Html

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #37 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,400
Percentile: 50.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: webdev-html. Source rank: #44. Votes: 1539. Organization: alibaba. License: Apache 2.0.

50.7% percentile inside its fair comparison set

Raw benchmark value: 1,400 (CI 1,385 - 1,416)

Code Arena · Webdev React

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #33 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,387
Percentile: 45.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: webdev-react. Source rank: #41. Votes: 10419. Organization: alibaba. License: Apache 2.0.

45.8% percentile inside its fair comparison set

Raw benchmark value: 1,387 (CI 1,380 - 1,393)

Text Arena · Coding

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,494
Percentile: 87.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: coding. Source rank: #50. Votes: 12124. Organization: alibaba. License: Apache 2.0.

87.8% percentile inside its fair comparison set

Raw benchmark value: 1,494 (CI 1,488 - 1,501)

Text Arena · Coding · No Style Control

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #30 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,469
Percentile: 90.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: coding. Source rank: #38. Votes: 12124. Organization: alibaba. License: Apache 2.0.

90.9% percentile inside its fair comparison set

Raw benchmark value: 1,469 (CI 1,462 - 1,475)

Reasoning / math / science · 5 benchmarks · 89%

Humanity's Last Exam

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #46 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 18.8%
Percentile: 87.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `hle`.

87.8% percentile inside its fair comparison set

Raw benchmark value: 18.8%

GPQA

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #24 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 86.1%
Percentile: 93.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `gpqa`.

93.9% percentile inside its fair comparison set

Raw benchmark value: 86.1%

CritPt

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #53 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 0.9%
Percentile: 83.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `critpt`.

83.4% percentile inside its fair comparison set

Raw benchmark value: 0.9%

Text Arena · Math

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,447
Percentile: 88.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: math. Source rank: #46. Votes: 2614. Organization: alibaba. License: Apache 2.0.

88.9% percentile inside its fair comparison set

Raw benchmark value: 1,447 (CI 1,435 - 1,458)

Text Arena · Math · No Style Control

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #29 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,449
Percentile: 91.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: math. Source rank: #36. Votes: 2614. Organization: alibaba. License: Apache 2.0.

91.1% percentile inside its fair comparison set

Raw benchmark value: 1,449 (CI 1,437 - 1,461)
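
The section headline for Reasoning / math / science (89% over 5 benchmarks) matches the arithmetic mean of the five row percentiles listed above. The page does not state its aggregation rule, so treating the headline as a plain mean is an inference from the numbers; the sketch below only checks that the arithmetic agrees.

```python
from statistics import mean

# Row percentiles from the five cards in this section, in page order:
# Humanity's Last Exam, GPQA, CritPt, Text Arena · Math,
# Text Arena · Math · No Style Control.
reasoning_percentiles = [87.8, 93.9, 83.4, 88.9, 91.1]

# Assumed aggregation: unweighted mean, rounded to one decimal place.
section_score = round(mean(reasoning_percentiles), 1)
print(section_score)
```

An unweighted mean gives each card equal influence, so the two Arena math rows (style-controlled and not) together count for two fifths of the section figure.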

Professional reasoning · 20 benchmarks · 83.8%

GDPval-AA

AA · Professional reasoning · Rubric

Agentic performance on economically valuable work tasks.

Rank #28 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 955
Percentile: 41.3%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `gdpvalBreakdown.elo`.

41.3% percentile inside its fair comparison set

Raw benchmark value: 955

APEX-Agents-AA

AA · Professional reasoning · Objective

Long-horizon agentic task completion.

Rank #14 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 15.3%
Percentile: 45.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `apexAgents`.

45.8% percentile inside its fair comparison set

Raw benchmark value: 15.3%

Text Arena · Expert

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #33 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,482
Percentile: 88.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: expert. Source rank: #40. Votes: 3928. Organization: alibaba. License: Apache 2.0.

88.4% percentile inside its fair comparison set

Raw benchmark value: 1,482 (CI 1,472 - 1,492)

Text Arena · Industry Business And Management And Financial Operations

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #43 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 86.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_business_and_management_and_financial_operations. Source rank: #56. Votes: 8474. Organization: alibaba. License: Apache 2.0.

86.8% percentile inside its fair comparison set

Raw benchmark value: 1,443 (CI 1,436 - 1,451)

Text Arena · Industry Entertainment And Sports And Media

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,408
Percentile: 86.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_entertainment_and_sports_and_media. Source rank: #57. Votes: 8527. Organization: alibaba. License: Apache 2.0.

86.4% percentile inside its fair comparison set

Raw benchmark value: 1,408 (CI 1,401 - 1,416)

Text Arena · Industry Legal And Government

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #56 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,441
Percentile: 81.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_legal_and_government. Source rank: #74. Votes: 3139. Organization: alibaba. License: Apache 2.0.

81.5% percentile inside its fair comparison set

Raw benchmark value: 1,441 (CI 1,430 - 1,452)

Text Arena · Industry Life And Physical And Social Science

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #38 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,468
Percentile: 88.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_life_and_physical_and_social_science. Source rank: #49. Votes: 7026. Organization: alibaba. License: Apache 2.0.

88.5% percentile inside its fair comparison set

Raw benchmark value: 1,468 (CI 1,460 - 1,475)

Text Arena · Industry Mathematical

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #30 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,461
Percentile: 90.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_mathematical. Source rank: #36. Votes: 2381. Organization: alibaba. License: Apache 2.0.

90.6% percentile inside its fair comparison set

Raw benchmark value: 1,461 (CI 1,449 - 1,474)

Text Arena · Industry Medicine And Healthcare

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,466
Percentile: 85.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_medicine_and_healthcare. Source rank: #56. Votes: 3164. Organization: alibaba. License: Apache 2.0.

85.1% percentile inside its fair comparison set

Raw benchmark value: 1,466 (CI 1,455 - 1,477)

Text Arena · Industry Software And IT Services

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,484
Percentile: 87.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_software_and_it_services. Source rank: #53. Votes: 17164. Organization: alibaba. License: Apache 2.0.

87.7% percentile inside its fair comparison set

Raw benchmark value: 1,484 (CI 1,478 - 1,489)

Text Arena · Industry Writing And Literature And Language

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #42 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,425
Percentile: 87.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_writing_and_literature_and_language. Source rank: #54. Votes: 9857. Organization: alibaba. License: Apache 2.0.

87.3% percentile inside its fair comparison set

Raw benchmark value: 1,425 (CI 1,418 - 1,432)

Text Arena · Expert · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #25 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,468
Percentile: 91.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: expert. Source rank: #33. Votes: 3928. Organization: alibaba. License: Apache 2.0.

91.3% percentile inside its fair comparison set

Raw benchmark value: 1,468 (CI 1,458 - 1,478)
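
The Expert category appears twice on this page, once in the default listing and once under No Style Control, each with its own confidence interval. A rough way to read the pair is to check whether the two intervals overlap. The sketch below assumes the CI bounds printed on this page are inclusive. `ArenaScore` is a hypothetical helper type, and an overlap check is a quick reading aid, not a significance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArenaScore:
    """One Arena rating with the confidence-interval bounds shown on the page."""
    value: int
    ci_low: int
    ci_high: int

    def overlaps(self, other: "ArenaScore") -> bool:
        # Two closed intervals overlap when each one starts before the other ends.
        return self.ci_low <= other.ci_high and other.ci_low <= self.ci_high


# Values copied from the two Text Arena · Expert entries on this page.
expert_default = ArenaScore(value=1482, ci_low=1472, ci_high=1492)
expert_no_style = ArenaScore(value=1468, ci_low=1458, ci_high=1478)

gap = expert_default.value - expert_no_style.value
intervals_overlap = expert_default.overlaps(expert_no_style)
print(f"gap={gap}, intervals overlap={intervals_overlap}")
```

The same check can be applied to any other pair of default and No Style Control entries by substituting their printed values.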

Text Arena · Industry Business And Management And Financial Operations · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #32 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 90.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_business_and_management_and_financial_operations. Source rank: #37. Votes: 8474. Organization: alibaba. License: Apache 2.0.

90.3% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,427 - 1,441)

Text Arena · Industry Entertainment And Sports And Media · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,405
Percentile: 87.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_entertainment_and_sports_and_media. Source rank: #52. Votes: 8527. Organization: alibaba. License: Apache 2.0.

87.6% percentile inside its fair comparison set

Raw benchmark value: 1,405 (CI 1,398 - 1,412)

Text Arena · Industry Legal And Government · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #51 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,437
Percentile: 83.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_legal_and_government. Source rank: #63. Votes: 3139. Organization: alibaba. License: Apache 2.0.

83.2% percentile inside its fair comparison set

Raw benchmark value: 1,437 (CI 1,426 - 1,448)

Text Arena · Industry Life And Physical And Social Science · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #25 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,461
Percentile: 92.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_life_and_physical_and_social_science. Source rank: #30. Votes: 7026. Organization: alibaba. License: Apache 2.0.

92.6% percentile inside its fair comparison set

Raw benchmark value: 1,461 (CI 1,454 - 1,469)

Text Arena · Industry Mathematical · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #27 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,460
Percentile: 91.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_mathematical. Source rank: #32. Votes: 2381. Organization: alibaba. License: Apache 2.0.

91.6% percentile inside its fair comparison set

Raw benchmark value: 1,460 (CI 1,447 - 1,472)

Text Arena · Industry Medicine And Healthcare · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #33 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,457
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_medicine_and_healthcare. Source rank: #36. Votes: 3164. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,457 (CI 1,446 - 1,468)

Text Arena · Industry Software And IT Services · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #28 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 91.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_software_and_it_services. Source rank: #35. Votes: 17164. Organization: alibaba. License: Apache 2.0.

91.7% percentile inside its fair comparison set

Raw benchmark value: 1,467 (CI 1,462 - 1,473)

Text Arena · Industry Writing And Literature And Language · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,420
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_writing_and_literature_and_language. Source rank: #46. Votes: 9857. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,420 (CI 1,413 - 1,426)

Search / tool use · 1 benchmark · 79%

Tau2-Bench Telecom

AA · Search / tool use · Objective

It matters when the model must browse, call tools, and recover useful answers from external systems.

Rank #66 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 83.9%
Percentile: 79%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `tau2`.

79% percentile inside its fair comparison set

Raw benchmark value: 83.9%

Long context · 1 benchmark · 81.9%

Long Context Reasoning

AA · Long context · Objective

It checks whether long-context claims survive contact with retrieval, memory, or long-document tasks.

Rank #58 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 58%
Percentile: 81.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `lcr`.

81.9% percentile inside its fair comparison set

Raw benchmark value: 58%

Vision understanding · 17 benchmarks · 74.8%

MMMU-Pro

AA · Vision understanding · Objective

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #95 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 52.7%
Percentile: 30.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `mmmuPro`.

30.4% percentile inside its fair comparison set

Raw benchmark value: 52.7%

Vision Arena

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #21 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,249
Percentile: 81.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #27. Votes: 17536. Organization: alibaba. License: Apache 2.0.

81.7% percentile inside its fair comparison set

Raw benchmark value: 1,249 (CI 1,243 - 1,256)

Vision Arena · Creative Writing Vision

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #13 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 78.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing_vision. Source rank: #17. Votes: 997. Organization: alibaba. License: Apache 2.0.

78.2% percentile inside its fair comparison set

Raw benchmark value: 1,263 (CI 1,243 - 1,282)

Vision Arena · Diagram

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #19 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,278
Percentile: 74.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: diagram. Source rank: #25. Votes: 4610. Organization: alibaba. License: Apache 2.0.

74.3% percentile inside its fair comparison set

Raw benchmark value: 1,278 (CI 1,268 - 1,288)

Vision Arena · English

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,253
Percentile: 84.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #23. Votes: 7466. Organization: alibaba. License: Apache 2.0.

84.4% percentile inside its fair comparison set

Raw benchmark value: 1,253 (CI 1,244 - 1,263)

Vision Arena · Entity Recognition

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #12 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,241
Percentile: 65.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: entity_recognition. Source rank: #11. Votes: 117. Organization: alibaba. License: Apache 2.0.

65.6% percentile inside its fair comparison set

Raw benchmark value: 1,241 (CI 1,190 - 1,292)

Vision Arena · Homework

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #20 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,283
Percentile: 72.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: homework. Source rank: #26. Votes: 2443. Organization: alibaba. License: Apache 2.0.

72.1% percentile inside its fair comparison set

Raw benchmark value: 1,283 (CI 1,271 - 1,296)

Vision Arena · Humor

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #16 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,246
Percentile: 69.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: humor. Source rank: #21. Votes: 556. Organization: alibaba. License: Apache 2.0.

69.4% percentile inside its fair comparison set

Raw benchmark value: 1,246 (CI 1,221 - 1,272)

Vision Arena · OCR

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,267
Percentile: 77.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: ocr. Source rank: #23. Votes: 12291. Organization: alibaba. License: Apache 2.0.

77.1% percentile inside its fair comparison set

Raw benchmark value: 1,267 (CI 1,261 - 1,274)

Vision Arena · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,266
Percentile: 85.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #22. Votes: 17536. Organization: alibaba. License: Apache 2.0.

85.3% percentile inside its fair comparison set

Raw benchmark value: 1,266 (CI 1,259 - 1,273)

Vision Arena · Creative Writing Vision · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #11 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,278
Percentile: 81.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing_vision. Source rank: #15. Votes: 997. Organization: alibaba. License: Apache 2.0.

81.8% percentile inside its fair comparison set

Raw benchmark value: 1,278 (CI 1,259 - 1,298)

Vision Arena · Diagram · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #16 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,281
Percentile: 78.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: diagram. Source rank: #21. Votes: 4610. Organization: alibaba. License: Apache 2.0.

78.6% percentile inside its fair comparison set

Raw benchmark value: 1,281 (CI 1,272 - 1,291)

Vision Arena · English · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,272
Percentile: 85.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #22. Votes: 7466. Organization: alibaba. License: Apache 2.0.

85.3% percentile inside its fair comparison set

Raw benchmark value: 1,272 (CI 1,263 - 1,281)

Vision Arena · Entity Recognition · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #5 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,293
Percentile: 87.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: entity_recognition. Source rank: #6. Votes: 117. Organization: alibaba. License: Apache 2.0.

87.5% percentile inside its fair comparison set

Raw benchmark value: 1,293 (CI 1,242 - 1,343)

Vision Arena · Homework · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,289
Percentile: 75%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: homework. Source rank: #24. Votes: 2443. Organization: alibaba. License: Apache 2.0.

75% percentile inside its fair comparison set

Raw benchmark value: 1,289 (CI 1,276 - 1,301)

Vision Arena · Humor · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,264
Percentile: 65.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: humor. Source rank: #23. Votes: 556. Organization: alibaba. License: Apache 2.0.

65.3% percentile inside its fair comparison set

Raw benchmark value: 1,264 (CI 1,239 - 1,290)

Vision Arena · OCR · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #15 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,277
Percentile: 80%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: ocr. Source rank: #20. Votes: 12291. Organization: alibaba. License: Apache 2.0.

80% percentile inside its fair comparison set

Raw benchmark value: 1,277 (CI 1,271 - 1,284)
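
The 74.8% shown in the Vision understanding section header is consistent with an unweighted mean of the 17 percentiles listed in that section. The page does not state how the figure is computed, so the sketch below is an observation under that assumption rather than the site's documented method. The variable names are illustrative.

```python
from statistics import mean

# Percentiles copied from the 17 Vision understanding entries, in page order:
# MMMU-Pro, then the eight Vision Arena categories, then their No Style Control variants.
vision_percentiles = [
    30.4,                                             # MMMU-Pro
    81.7, 78.2, 74.3, 84.4, 65.6, 72.1, 69.4, 77.1,   # Vision Arena (default)
    85.3, 81.8, 78.6, 85.3, 87.5, 75.0, 65.3, 80.0,   # Vision Arena (No Style Control)
]

# Assumption: the header figure is a plain arithmetic mean rounded to one decimal.
section_mean = round(mean(vision_percentiles), 1)
print(f"{len(vision_percentiles)} benchmarks, mean percentile {section_mean}%")
```

Because the mean is unweighted, the single MMMU-Pro row at 30.4% pulls the section figure down as much as any Arena row with thousands of votes would.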

Multilingual · 16 benchmarks · 85.6%

Text Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #23 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,504
Percentile: 92.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #28. Votes: 2130. Organization: alibaba. License: Apache 2.0.

92.5% percentile inside its fair comparison set

Raw benchmark value: 1,504 (CI 1,490 - 1,517)

Text Arena · French

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #47 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,462
Percentile: 78.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: french. Source rank: #60. Votes: 1348. Organization: alibaba. License: Apache 2.0.

78.7% percentile inside its fair comparison set

Raw benchmark value: 1,462 (CI 1,444 - 1,480)

Text Arena · German

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #38 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,440
Percentile: 84.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: german. Source rank: #51. Votes: 675. Organization: alibaba. License: Apache 2.0.

84.4% percentile inside its fair comparison set

Raw benchmark value: 1,440 (CI 1,416 - 1,463)

Text Arena · Japanese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #27 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,422
Percentile: 87.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: japanese. Source rank: #37. Votes: 355. Organization: alibaba. License: Apache 2.0.

87.2% percentile inside its fair comparison set

Raw benchmark value: 1,422 (CI 1,388 - 1,455)

Text Arena · Korean

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,398
Percentile: 80.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: korean. Source rank: #52. Votes: 647. Organization: alibaba. License: Apache 2.0.

80.8% percentile inside its fair comparison set

Raw benchmark value: 1,398 (CI 1,374 - 1,422)

Text Arena · Russian

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,438
Percentile: 84.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: russian. Source rank: #59. Votes: 4534. Organization: alibaba. License: Apache 2.0.

84.8% percentile inside its fair comparison set

Raw benchmark value: 1,438 (CI 1,429 - 1,447)

Text Arena · Spanish

AR · Multilingual · Human

Observed user preference in Arena's Text Arena spanish leaderboard.

Rank #31 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 86%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: spanish. Source rank: #40. Votes: 1346. Organization: alibaba. License: Apache 2.0.

86% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,434 - 1,468)

Text Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #16 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,504
Percentile: 94.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #21. Votes: 2130. Organization: alibaba. License: Apache 2.0.

94.9% percentile inside its fair comparison set

Raw benchmark value: 1,504 (CI 1,490 - 1,518)

Text Arena · French · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,456
Percentile: 81.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: french. Source rank: #47. Votes: 1348. Organization: alibaba. License: Apache 2.0.

81.9% percentile inside its fair comparison set

Raw benchmark value: 1,456 (CI 1,438 - 1,474)

Text Arena · German · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #24 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 90.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: german. Source rank: #29. Votes: 675. Organization: alibaba. License: Apache 2.0.

90.3% percentile inside its fair comparison set

Raw benchmark value: 1,443 (CI 1,420 - 1,467)

Text Arena · Japanese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #19 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,423
Percentile: 91.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: japanese. Source rank: #23. Votes: 355. Organization: alibaba. License: Apache 2.0.

91.1% percentile inside its fair comparison set

Raw benchmark value: 1,423 (CI 1,390 - 1,457)

Text Arena · Korean · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,398
Percentile: 84.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: korean. Source rank: #41. Votes: 647. Organization: alibaba. License: Apache 2.0.

84.1% percentile inside its fair comparison set

Raw benchmark value: 1,398 (CI 1,374 - 1,422)

Text Arena · Russian · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Russian leaderboard.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,432
Percentile: 87.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: russian. Source rank: #45. Votes: 4534. Organization: alibaba. License: Apache 2.0.

87.9% percentile inside its fair comparison set

Raw benchmark value: 1,432 (CI 1,423 - 1,441)

Text Arena · Spanish · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Spanish leaderboard.

Rank #26 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,453
Percentile: 88.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: spanish. Source rank: #29. Votes: 1346. Organization: alibaba. License: Apache 2.0.

88.3% percentile inside its fair comparison set

Raw benchmark value: 1,453 (CI 1,436 - 1,470)

Vision Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena Chinese leaderboard.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,301
Percentile: 77.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #23. Votes: 877. Organization: alibaba. License: Apache 2.0.

77.9% percentile inside its fair comparison set

Raw benchmark value: 1,301 (CI 1,277 - 1,325)

Vision Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena Chinese leaderboard.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,323
Percentile: 79.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #21. Votes: 877. Organization: alibaba. License: Apache 2.0.

79.2% percentile inside its fair comparison set

Raw benchmark value: 1,323 (CI 1,299 - 1,347)
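The multilingual rows above pair each score with a confidence interval and a vote count, and the intervals are visibly wider where votes are fewer. A minimal sketch of that relationship follows, assuming the interval half-width shrinks roughly with the square root of the vote count. That is a common statistical rule of thumb, not something the page or Arena states here. The figures are copied from the rows above; the function names are hypothetical.

```python
import math

# (category, ci_low, ci_high, votes) copied from the rows above.
ROWS = [
    ("german", 1420, 1467, 675),
    ("japanese", 1390, 1457, 355),
    ("korean", 1374, 1422, 647),
    ("russian", 1423, 1441, 4534),
    ("spanish", 1436, 1470, 1346),
    ("chinese", 1277, 1325, 877),  # Vision Arena row
]


def half_width(ci_low: float, ci_high: float) -> float:
    """Half of the confidence-interval span, in score points."""
    return (ci_high - ci_low) / 2


def scaled_width(ci_low: float, ci_high: float, votes: int) -> float:
    """Half-width multiplied by sqrt(votes).

    If the half-width really shrinks like 1/sqrt(votes), this product
    should be roughly similar across categories.
    """
    return half_width(ci_low, ci_high) * math.sqrt(votes)


for name, lo, hi, votes in ROWS:
    print(
        f"{name:9s} half-width={half_width(lo, hi):5.1f} "
        f"votes={votes:5d} scaled={scaled_width(lo, hi, votes):6.1f}"
    )
```

On these six rows the scaled value stays in a fairly narrow band, which is consistent with the rule of thumb but does not confirm how Arena computes its intervals.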

Source links and registry checks

official · Arena · Jun 20, 2026 (40 source links)

official · Artificial Analysis · Jun 20, 2026 (1 source link)

Source-linked scores by benchmark

Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.

Thin verified coverage: this model currently reads as thin verified coverage across the resolved source data.

Chat / text · 29 benchmarks · 71.8%

Intelligence Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #43 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 32
Percentile: 89.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.

89.4% percentile inside its fair comparison set

Raw benchmark value: 32

AA-Omniscience accuracy

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #64 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 24.3%
Percentile: 78.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.

78.9% percentile inside its fair comparison set

Raw benchmark value: 24.3%

AA-Omniscience non-hallucination

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #198 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 10.9%
Percentile: 33.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.

33.9% percentile inside its fair comparison set

Raw benchmark value: 10.9%

IFBench

AA · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #86 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 51.6%
Percentile: 73%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `ifbench`.

73% percentile inside its fair comparison set

Raw benchmark value: 51.6%

Blended price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #187 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: $1.4 /1M tokens
Percentile: 33%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.

33% percentile inside its fair comparison set

Raw benchmark value: $1.4 /1M tokens

Input price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #175 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: $0.6 /1M input tokens
Percentile: 38%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.

38% percentile inside its fair comparison set

Raw benchmark value: $0.6 /1M input tokens

Output price

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #195 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: $3.6 /1M output tokens
Percentile: 29.7%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.

29.7% percentile inside its fair comparison set

Raw benchmark value: $3.6 /1M output tokens
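The three price rows above can be cross-checked against each other. The blended field name `price1mBlended0To3To1` suggests a 3:1 weighting of input to output tokens. That reading is an assumption taken from the field name, not something this page states. A minimal sketch under that assumption, with a hypothetical function name:

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3, output_weight: float = 1) -> float:
    """Weighted average price per 1M tokens.

    The default 3:1 input-to-output weighting is an assumption inferred
    from the field name `price1mBlended0To3To1`.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Input and output prices as listed in the rows above (USD per 1M tokens).
blended = blended_price(0.6, 3.6)
print(f"blended = ${blended:.2f} per 1M tokens")
```

With $0.6 input and $3.6 output this comes to about $1.35 per 1M tokens, which is consistent with the $1.4 shown once rounded to one decimal place. The listed prices may themselves be rounded.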

Output Speed

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #174 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 49.7 tokens/s
Percentile: 17.6%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.

17.6% percentile inside its fair comparison set

Raw benchmark value: 49.7 tokens/s

Time to first token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #130 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 2.68s
Percentile: 38.6%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.

38.6% percentile inside its fair comparison set

Raw benchmark value: 2.68s

Time to first answer token

AA · Chat / text · Speed / cost

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #200 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 66.79s
Percentile: 5.2%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.

5.2% percentile inside its fair comparison set

Raw benchmark value: 66.79s
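The gap between time to first token (2.68s) and time to first answer token (66.79s) is large for this reasoning-labelled row. The sketch below gives a rough sense of scale by multiplying that gap by the median output speed listed above (49.7 tokens/s). It assumes the whole gap is spent generating tokens at the median speed. That is a simplification introduced here, not a figure reported by the source, and the function name is hypothetical.

```python
def estimated_pre_answer_tokens(ttft_s: float, ttfat_s: float,
                                tokens_per_s: float) -> float:
    """Rough count of tokens emitted before the first answer token.

    Assumes (for illustration only) that generation runs at a constant
    `tokens_per_s` for the entire gap between the two timings.
    """
    gap_s = ttfat_s - ttft_s
    if gap_s < 0:
        raise ValueError("first answer token cannot precede the first token")
    return gap_s * tokens_per_s


# Values copied from the three rows above.
estimate = estimated_pre_answer_tokens(ttft_s=2.68, ttfat_s=66.79, tokens_per_s=49.7)
print(f"gap = {66.79 - 2.68:.2f} s, roughly {estimate:.0f} tokens before the answer starts")
```

Under that assumption the gap corresponds to roughly three thousand tokens, which helps explain the low percentile on this latency row.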

Openness Index

AA · Chat / text · Combined

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #93 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 39
Percentile: 54.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.

54.8% percentile inside its fair comparison set

Raw benchmark value: 39

Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,444
Percentile: 86.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #57. Votes: 43048. Organization: alibaba. License: Apache 2.0.

86.5% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,440 - 1,448)
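Every Arena row carries a provenance line of the same shape ("Parsed from Arena leaderboard dataset row ... Category ... Source rank ... Votes ... Organization ... License ..."). A minimal sketch of turning that line into a structured record follows. The `ArenaProvenance` class, the `parse_provenance` helper, and the regular expression are illustrative assumptions based only on the wording visible on this page, not part of any Arena or registry API.

```python
import re
from dataclasses import dataclass


@dataclass
class ArenaProvenance:
    row: str
    category: str
    source_rank: int
    votes: int
    organization: str
    license: str


# Pattern written against the provenance wording shown on this page.
PROVENANCE_RE = re.compile(
    r"Parsed from Arena leaderboard dataset row `(?P<row>[^`]+)`\. "
    r"Category: (?P<category>[\w-]+)\. "
    r"Source rank: #(?P<rank>\d+)\. "
    r"Votes: (?P<votes>\d+)\. "
    r"Organization: (?P<org>[^.]+)\. "
    r"License: (?P<license>.+?)\.$"
)


def parse_provenance(line: str) -> ArenaProvenance:
    """Parse one provenance line; raise ValueError if it does not match."""
    match = PROVENANCE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised provenance line: {line!r}")
    return ArenaProvenance(
        row=match["row"],
        category=match["category"],
        source_rank=int(match["rank"]),
        votes=int(match["votes"]),
        organization=match["org"],
        license=match["license"],
    )


record = parse_provenance(
    "Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. "
    "Category: overall. Source rank: #57. Votes: 43048. "
    "Organization: alibaba. License: Apache 2.0."
)
print(record)
```

Keeping the source rank and vote count as integers makes it straightforward to compare them with the registry rank shown on each row.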

Text Arena · Creative Writing

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #43 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,413
Percentile: 87%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing. Source rank: #56. Votes: 6614. Organization: alibaba. License: Apache 2.0.

87% percentile inside its fair comparison set

Raw benchmark value: 1,413 (CI 1,405 - 1,420)

Text Arena · English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,453
Percentile: 87.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #53. Votes: 20748. Organization: alibaba. License: Apache 2.0.

87.7% percentile inside its fair comparison set

Raw benchmark value: 1,453 (CI 1,447 - 1,458)

Text Arena · Exclude Ties

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #44 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,444
Percentile: 86.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: exclude_ties. Source rank: #56. Votes: 31275. Organization: alibaba. License: Apache 2.0.

86.8% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,438 - 1,449)

Text Arena · Hard Prompts

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,465
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts. Source rank: #52. Votes: 27400. Organization: alibaba. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,465 (CI 1,461 - 1,470)

Text Arena · Hard Prompts English

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,472
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts_english. Source rank: #52. Votes: 13873. Organization: alibaba. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,472 (CI 1,466 - 1,478)

Text Arena · Instruction Following

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,435
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: instruction_following. Source rank: #51. Votes: 13770. Organization: alibaba. License: Apache 2.0.

88% percentile inside its fair comparison set

Raw benchmark value: 1,435 (CI 1,429 - 1,441)

Text Arena · Longer Query

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #37 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,455
Percentile: 88.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: longer_query. Source rank: #48. Votes: 17022. Organization: alibaba. License: Apache 2.0.

88.2% percentile inside its fair comparison set

Raw benchmark value: 1,455 (CI 1,449 - 1,461)

Text Arena · Multi Turn

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,454
Percentile: 87.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: multi_turn. Source rank: #53. Votes: 7329. Organization: alibaba. License: Apache 2.0.

87.6% percentile inside its fair comparison set

Raw benchmark value: 1,454 (CI 1,446 - 1,461)

Text Arena · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,440
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #45. Votes: 43048. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,440 (CI 1,436 - 1,444)

Text Arena · Creative Writing · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,409
Percentile: 87.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing. Source rank: #52. Votes: 6614. Organization: alibaba. License: Apache 2.0.

87.6% percentile inside its fair comparison set

Raw benchmark value: 1,409 (CI 1,401 - 1,417)

Text Arena · English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #38 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,448
Percentile: 88.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #47. Votes: 20748. Organization: alibaba. License: Apache 2.0.

88.6% percentile inside its fair comparison set

Raw benchmark value: 1,448 (CI 1,443 - 1,453)

Text Arena · Exclude Ties · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #35 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,437
Percentile: 89.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: exclude_ties. Source rank: #44. Votes: 31275. Organization: alibaba. License: Apache 2.0.

89.5% percentile inside its fair comparison set

Raw benchmark value: 1,437 (CI 1,432 - 1,442)

Text Arena · Hard Prompts · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,451
Percentile: 89.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts. Source rank: #41. Votes: 27400. Organization: alibaba. License: Apache 2.0.

89.8% percentile inside its fair comparison set

Raw benchmark value: 1,451 (CI 1,446 - 1,456)

Text Arena · Hard Prompts English · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #32 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,456
Percentile: 90.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts_english. Source rank: #39. Votes: 13873. Organization: alibaba. License: Apache 2.0.

90.4% percentile inside its fair comparison set

Raw benchmark value: 1,456 (CI 1,450 - 1,462)

Text Arena · Instruction Following · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,425
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: instruction_following. Source rank: #44. Votes: 13770. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,425 (CI 1,419 - 1,431)

Text Arena · Longer Query · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #31 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,444
Percentile: 90.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: longer_query. Source rank: #39. Votes: 17022. Organization: alibaba. License: Apache 2.0.

90.1% percentile inside its fair comparison set

Raw benchmark value: 1,444 (CI 1,439 - 1,450)

Text Arena · Multi Turn · No Style Control

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,447
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: multi_turn. Source rank: #46. Votes: 7329. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,447 (CI 1,439 - 1,455)

Coding · 10 benchmarks · 66.2%

Terminal-Bench Hard

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #34 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 35.6%
Percentile: 89.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `terminalbenchHard`.

89.4% percentile inside its fair comparison set

Raw benchmark value: 35.6%

SciCode

AA · Coding · Objective

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #46 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 41.1%
Percentile: 88%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `scicode`.

88% percentile inside its fair comparison set

Raw benchmark value: 41.1%

Coding Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #27 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 48
Percentile: 65.3%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `codingIndex`.

65.3% percentile inside its fair comparison set

Raw benchmark value: 48

Agentic Index

AA · Coding · Combined

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #31 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 20
Percentile: 34.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `agenticIndex`.

34.8% percentile inside its fair comparison set

Raw benchmark value: 20

Code Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,395
Percentile: 54.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #42. Votes: 11997. Organization: alibaba. License: Apache 2.0.

54.8% percentile inside its fair comparison set

Raw benchmark value: 1,395 (CI 1,388 - 1,401)

WebDev Arena

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,395
Percentile: 54.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: webdev. Source rank: #42. Votes: 11997. Organization: alibaba. License: Apache 2.0.

54.8% percentile inside its fair comparison set

Raw benchmark value: 1,395 (CI 1,388 - 1,401)

Code Arena · Webdev Html

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #37 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,400
Percentile: 50.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: webdev-html. Source rank: #44. Votes: 1539. Organization: alibaba. License: Apache 2.0.

50.7% percentile inside its fair comparison set

Raw benchmark value: 1,400 (CI 1,385 - 1,416)

Code Arena · Webdev React

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #33 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,387
Percentile: 45.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: webdev-react. Source rank: #41. Votes: 10419. Organization: alibaba. License: Apache 2.0.

45.8% percentile inside its fair comparison set

Raw benchmark value: 1,387 (CI 1,380 - 1,393)

Text Arena · Coding

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,494
Percentile: 87.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: coding. Source rank: #50. Votes: 12124. Organization: alibaba. License: Apache 2.0.

87.8% percentile inside its fair comparison set

Raw benchmark value: 1,494 (CI 1,488 - 1,501)

Text Arena · Coding · No Style Control

AR · Coding · Human

It tells you whether the model can generate, repair, and reason over code under evaluator pressure rather than marketing examples.

Rank #30 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Source: Arena
Raw value: 1,469
Percentile: 90.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: coding. Source rank: #38. Votes: 12124. Organization: alibaba. License: Apache 2.0.

90.9% percentile inside its fair comparison set

Raw benchmark value: 1,469 (CI 1,462 - 1,475)

Reasoning / math / science · 5 benchmarks · 89%

Humanity's Last Exam

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #46 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 18.8%
Percentile: 87.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `hle`.

87.8% percentile inside its fair comparison set

Raw benchmark value: 18.8%

GPQA

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #24 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Source: Artificial Analysis
Raw value: 86.1%
Percentile: 93.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `gpqa`.

93.9% percentile inside its fair comparison set

Raw benchmark value: 86.1%

CritPt

AA · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #53 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 0.9%
Percentile: 83.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `critpt`.

83.4% percentile inside its fair comparison set

Raw benchmark value: 0.9%

Text Arena · Math

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,447
Percentile: 88.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: math. Source rank: #46. Votes: 2614. Organization: alibaba. License: Apache 2.0.

88.9% percentile inside its fair comparison set

Raw benchmark value: 1,447 (CI 1,435 - 1,458)

Text Arena · Math · No Style Control

AR · Reasoning / math / science · Human

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

Rank #29 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,449
Percentile: 91.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: math. Source rank: #36. Votes: 2614. Organization: alibaba. License: Apache 2.0.

91.1% percentile inside its fair comparison set

Raw benchmark value: 1,449 (CI 1,437 - 1,461)
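
The two Math rows above (with and without style control) report scores two points apart with wide intervals. A minimal sketch of checking whether the reported intervals overlap follows. `ArenaScore` and `intervals_overlap` are hypothetical names introduced here for illustration, and interval overlap is only a rough screen, not a formal significance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArenaScore:
    """Hypothetical container for one Arena row: score plus reported CI bounds."""
    value: int
    ci_low: int
    ci_high: int


def intervals_overlap(a: ArenaScore, b: ArenaScore) -> bool:
    """Return True when the two reported intervals share at least one point."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high


# Values copied from the Text Arena Math rows above.
math_style_control = ArenaScore(value=1447, ci_low=1435, ci_high=1458)
math_no_style_control = ArenaScore(value=1449, ci_low=1437, ci_high=1461)

print(intervals_overlap(math_style_control, math_no_style_control))
```

When the intervals overlap like this, the rank difference between the two views (#36 versus #29) probably says more about how other models shift than about a change in this model's own score.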

Professional reasoning · 20 benchmarks · 83.8%

GDPval-AA

AA · Professional reasoning · Rubric

Agentic performance on economically valuable work tasks.

Rank #28 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 955
Percentile: 41.3%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `gdpvalBreakdown.elo`.

41.3% percentile inside its fair comparison set

Raw benchmark value: 955

APEX-Agents-AA

AA · Professional reasoning · Objective

Long-horizon agentic task completion.

Rank #14 · Source label: Qwen3.5 397B A17B (Reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 15.3%
Percentile: 45.8%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `apexAgents`.

45.8% percentile inside its fair comparison set

Raw benchmark value: 15.3%

Text Arena · Expert

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #33 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,482
Percentile: 88.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: expert. Source rank: #40. Votes: 3928. Organization: alibaba. License: Apache 2.0.

88.4% percentile inside its fair comparison set

Raw benchmark value: 1,482 (CI 1,472 - 1,492)
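
The "Parsed from Arena leaderboard dataset row" sentence in these rows follows a fixed field order (row, category, source rank, votes, organization, license). A minimal sketch of splitting it into fields is shown below. The pattern and the `parse_provenance` helper are assumptions made for illustration, not the site's actual parser, and the pattern only covers the sentence shape seen on this page.

```python
import re
from typing import Optional

# Hypothetical pattern matching the provenance sentence as it appears on this page.
PROVENANCE_RE = re.compile(
    r"Parsed from Arena leaderboard dataset row `(?P<row>[^`]+)`\. "
    r"Category: (?P<category>\w+)\. "
    r"Source rank: #(?P<source_rank>\d+)\. "
    r"Votes: (?P<votes>\d+)\. "
    r"Organization: (?P<organization>[^.]+)\. "
    r"License: (?P<license>.+)\.$"
)


def parse_provenance(line: str) -> Optional[dict]:
    """Split one provenance sentence into fields, or return None if it does not match."""
    match = PROVENANCE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Rank and vote counts are numeric on the page, so convert them to int.
    fields["source_rank"] = int(fields["source_rank"])
    fields["votes"] = int(fields["votes"])
    return fields


# Sentence copied from the Text Arena Expert row above.
example = (
    "Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. "
    "Category: expert. Source rank: #40. Votes: 3928. "
    "Organization: alibaba. License: Apache 2.0."
)
print(parse_provenance(example))
```

Rows that omit the provenance sentence simply have nothing to parse, and lines that do not match the pattern return `None`, so a caller can tell "missing" apart from "malformed" by checking whether the line exists first.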

Text Arena · Industry Business And Management And Financial Operations

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #43 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 86.8%
Last updated: recent
Eligibility: preview_model

86.8% percentile inside its fair comparison set

Raw benchmark value: 1,443 (CI 1,436 - 1,451)

Text Arena · Industry Entertainment And Sports And Media

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,408
Percentile: 86.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_entertainment_and_sports_and_media. Source rank: #57. Votes: 8527. Organization: alibaba. License: Apache 2.0.

86.4% percentile inside its fair comparison set

Raw benchmark value: 1,408 (CI 1,401 - 1,416)

Text Arena · Industry Legal And Government

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #56 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,441
Percentile: 81.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_legal_and_government. Source rank: #74. Votes: 3139. Organization: alibaba. License: Apache 2.0.

81.5% percentile inside its fair comparison set

Raw benchmark value: 1,441 (CI 1,430 - 1,452)

Text Arena · Industry Life And Physical And Social Science

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #38 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,468
Percentile: 88.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_life_and_physical_and_social_science. Source rank: #49. Votes: 7026. Organization: alibaba. License: Apache 2.0.

88.5% percentile inside its fair comparison set

Raw benchmark value: 1,468 (CI 1,460 - 1,475)

Text Arena · Industry Mathematical

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #30 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,461
Percentile: 90.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_mathematical. Source rank: #36. Votes: 2381. Organization: alibaba. License: Apache 2.0.

90.6% percentile inside its fair comparison set

Raw benchmark value: 1,461 (CI 1,449 - 1,474)

Text Arena · Industry Medicine And Healthcare

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,466
Percentile: 85.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_medicine_and_healthcare. Source rank: #56. Votes: 3164. Organization: alibaba. License: Apache 2.0.

85.1% percentile inside its fair comparison set

Raw benchmark value: 1,466 (CI 1,455 - 1,477)

Text Arena · Industry Software And IT Services

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,484
Percentile: 87.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_software_and_it_services. Source rank: #53. Votes: 17164. Organization: alibaba. License: Apache 2.0.

87.7% percentile inside its fair comparison set

Raw benchmark value: 1,484 (CI 1,478 - 1,489)

Text Arena · Industry Writing And Literature And Language

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #42 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,425
Percentile: 87.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_writing_and_literature_and_language. Source rank: #54. Votes: 9857. Organization: alibaba. License: Apache 2.0.

87.3% percentile inside its fair comparison set

Raw benchmark value: 1,425 (CI 1,418 - 1,432)

Text Arena · Expert · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena expert leaderboard.

Rank #25 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,468
Percentile: 91.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: expert. Source rank: #33. Votes: 3928. Organization: alibaba. License: Apache 2.0.

91.3% percentile inside its fair comparison set

Raw benchmark value: 1,468 (CI 1,458 - 1,478)

Text Arena · Industry Business And Management And Financial Operations · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_business_and_management_and_financial_operations leaderboard.

Rank #32 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,434
Percentile: 90.3%
Last updated: recent
Eligibility: preview_model

90.3% percentile inside its fair comparison set

Raw benchmark value: 1,434 (CI 1,427 - 1,441)

Text Arena · Industry Entertainment And Sports And Media · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_entertainment_and_sports_and_media leaderboard.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,405
Percentile: 87.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_entertainment_and_sports_and_media. Source rank: #52. Votes: 8527. Organization: alibaba. License: Apache 2.0.

87.6% percentile inside its fair comparison set

Raw benchmark value: 1,405 (CI 1,398 - 1,412)

Text Arena · Industry Legal And Government · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_legal_and_government leaderboard.

Rank #51 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,437
Percentile: 83.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_legal_and_government. Source rank: #63. Votes: 3139. Organization: alibaba. License: Apache 2.0.

83.2% percentile inside its fair comparison set

Raw benchmark value: 1,437 (CI 1,426 - 1,448)

Text Arena · Industry Life And Physical And Social Science · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_life_and_physical_and_social_science leaderboard.

Rank #25 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,461
Percentile: 92.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_life_and_physical_and_social_science. Source rank: #30. Votes: 7026. Organization: alibaba. License: Apache 2.0.

92.6% percentile inside its fair comparison set

Raw benchmark value: 1,461 (CI 1,454 - 1,469)

Text Arena · Industry Mathematical · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_mathematical leaderboard.

Rank #27 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,460
Percentile: 91.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_mathematical. Source rank: #32. Votes: 2381. Organization: alibaba. License: Apache 2.0.

91.6% percentile inside its fair comparison set

Raw benchmark value: 1,460 (CI 1,447 - 1,472)

Text Arena · Industry Medicine And Healthcare · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_medicine_and_healthcare leaderboard.

Rank #33 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,457
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_medicine_and_healthcare. Source rank: #36. Votes: 3164. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,457 (CI 1,446 - 1,468)

Text Arena · Industry Software And IT Services · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_software_and_it_services leaderboard.

Rank #28 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,467
Percentile: 91.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_software_and_it_services. Source rank: #35. Votes: 17164. Organization: alibaba. License: Apache 2.0.

91.7% percentile inside its fair comparison set

Raw benchmark value: 1,467 (CI 1,462 - 1,473)

Text Arena · Industry Writing And Literature And Language · No Style Control

AR · Professional reasoning · Human

Observed user preference in Arena's Text Arena industry_writing_and_literature_and_language leaderboard.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,420
Percentile: 89.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: industry_writing_and_literature_and_language. Source rank: #46. Votes: 9857. Organization: alibaba. License: Apache 2.0.

89.2% percentile inside its fair comparison set

Raw benchmark value: 1,420 (CI 1,413 - 1,426)

Search / tool use · 1 benchmark · 79%

Tau2-Bench Telecom

AA · Search / tool use · Objective

It matters when the model must browse, call tools, and recover useful answers from external systems.

Rank #66 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 83.9%
Percentile: 79%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `tau2`.

79% percentile inside its fair comparison set

Raw benchmark value: 83.9%

Long context · 1 benchmark · 81.9%

Long Context Reasoning

AA · Long context · Objective

It checks whether long-context claims survive contact with retrieval, memory, or long-document tasks.

Rank #58 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 58%
Percentile: 81.9%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `lcr`.

81.9% percentile inside its fair comparison set

Raw benchmark value: 58%

Vision understanding · 17 benchmarks · 74.8%

MMMU-Pro

AA · Vision understanding · Objective

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #95 · Source label: Qwen3.5 397B A17B (Non-reasoning)

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Artificial Analysis
Raw value: 52.7%
Percentile: 30.4%
Last updated: recent
Eligibility: preview_model

Parsed from Artificial Analysis public leaderboard field `mmmuPro`.

30.4% percentile inside its fair comparison set

Raw benchmark value: 52.7%

Vision Arena

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #21 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,249
Percentile: 81.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #27. Votes: 17536. Organization: alibaba. License: Apache 2.0.

81.7% percentile inside its fair comparison set

Raw benchmark value: 1,249 (CI 1,243 - 1,256)

Vision Arena · Creative Writing Vision

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #13 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,263
Percentile: 78.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing_vision. Source rank: #17. Votes: 997. Organization: alibaba. License: Apache 2.0.

78.2% percentile inside its fair comparison set

Raw benchmark value: 1,263 (CI 1,243 - 1,282)

Vision Arena · Diagram

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #19 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,278
Percentile: 74.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: diagram. Source rank: #25. Votes: 4610. Organization: alibaba. License: Apache 2.0.

74.3% percentile inside its fair comparison set

Raw benchmark value: 1,278 (CI 1,268 - 1,288)

Vision Arena · English

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,253
Percentile: 84.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #23. Votes: 7466. Organization: alibaba. License: Apache 2.0.

84.4% percentile inside its fair comparison set

Raw benchmark value: 1,253 (CI 1,244 - 1,263)

Vision Arena · Entity Recognition

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #12 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,241
Percentile: 65.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: entity_recognition. Source rank: #11. Votes: 117. Organization: alibaba. License: Apache 2.0.

65.6% percentile inside its fair comparison set

Raw benchmark value: 1,241 (CI 1,190 - 1,292)

Vision Arena · Homework

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #20 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,283
Percentile: 72.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: homework. Source rank: #26. Votes: 2443. Organization: alibaba. License: Apache 2.0.

72.1% percentile inside its fair comparison set

Raw benchmark value: 1,283 (CI 1,271 - 1,296)

Vision Arena · Humor

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #16 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,246
Percentile: 69.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: humor. Source rank: #21. Votes: 556. Organization: alibaba. License: Apache 2.0.

69.4% percentile inside its fair comparison set

Raw benchmark value: 1,246 (CI 1,221 - 1,272)

Vision Arena · OCR

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,267
Percentile: 77.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: ocr. Source rank: #23. Votes: 12291. Organization: alibaba. License: Apache 2.0.

77.1% percentile inside its fair comparison set

Raw benchmark value: 1,267 (CI 1,261 - 1,274)

Vision Arena · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,266
Percentile: 85.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #22. Votes: 17536. Organization: alibaba. License: Apache 2.0.

85.3% percentile inside its fair comparison set

Raw benchmark value: 1,266 (CI 1,259 - 1,273)

Vision Arena · Creative Writing Vision · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #11 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,278
Percentile: 81.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing_vision. Source rank: #15. Votes: 997. Organization: alibaba. License: Apache 2.0.

81.8% percentile inside its fair comparison set

Raw benchmark value: 1,278 (CI 1,259 - 1,298)

Vision Arena · Diagram · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #16 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,281
Percentile: 78.6%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: diagram. Source rank: #21. Votes: 4610. Organization: alibaba. License: Apache 2.0.

78.6% percentile inside its fair comparison set

Raw benchmark value: 1,281 (CI 1,272 - 1,291)

Vision Arena · English · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,272
Percentile: 85.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #22. Votes: 7466. Organization: alibaba. License: Apache 2.0.

85.3% percentile inside its fair comparison set

Raw benchmark value: 1,272 (CI 1,263 - 1,281)

Vision Arena · Entity Recognition · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #5 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,293
Percentile: 87.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: entity_recognition. Source rank: #6. Votes: 117. Organization: alibaba. License: Apache 2.0.

87.5% percentile inside its fair comparison set

Raw benchmark value: 1,293 (CI 1,242 - 1,343)

Vision Arena · Homework · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,289
Percentile: 75%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: homework. Source rank: #24. Votes: 2443. Organization: alibaba. License: Apache 2.0.

75% percentile inside its fair comparison set

Raw benchmark value: 1,289 (CI 1,276 - 1,301)

Vision Arena · Humor · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,264
Percentile: 65.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: humor. Source rank: #23. Votes: 556. Organization: alibaba. License: Apache 2.0.

65.3% percentile inside its fair comparison set

Raw benchmark value: 1,264 (CI 1,239 - 1,290)

Vision Arena · OCR · No Style Control

AR · Vision understanding · Human

It is useful when the model must read charts, UI, screenshots, or visual scenes rather than text alone.

Rank #15 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,277
Percentile: 80%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: ocr. Source rank: #20. Votes: 12291. Organization: alibaba. License: Apache 2.0.

80% percentile inside its fair comparison set

Raw benchmark value: 1,277 (CI 1,271 - 1,284)

Multilingual · 16 benchmarks · 85.6%

Text Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena chinese leaderboard.

Rank #23 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,504
Percentile: 92.5%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #28. Votes: 2130. Organization: alibaba. License: Apache 2.0.

92.5% percentile inside its fair comparison set

Raw benchmark value: 1,504 (CI 1,490 - 1,517)

Text Arena · French

AR · Multilingual · Human

Observed user preference in Arena's Text Arena french leaderboard.

Rank #47 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,462
Percentile: 78.7%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: french. Source rank: #60. Votes: 1348. Organization: alibaba. License: Apache 2.0.

78.7% percentile inside its fair comparison set

Raw benchmark value: 1,462 (CI 1,444 - 1,480)

Text Arena · German

AR · Multilingual · Human

Observed user preference in Arena's Text Arena german leaderboard.

Rank #38 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,440
Percentile: 84.4%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: german. Source rank: #51. Votes: 675. Organization: alibaba. License: Apache 2.0.

84.4% percentile inside its fair comparison set

Raw benchmark value: 1,440 (CI 1,416 - 1,463)

Text Arena · Japanese

AR · Multilingual · Human

Observed user preference in Arena's Text Arena japanese leaderboard.

Rank #27 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,422
Percentile: 87.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: japanese. Source rank: #37. Votes: 355. Organization: alibaba. License: Apache 2.0.

87.2% percentile inside its fair comparison set

Raw benchmark value: 1,422 (CI 1,388 - 1,455)

Text Arena · Korean

AR · Multilingual · Human

Observed user preference in Arena's Text Arena korean leaderboard.

Rank #41 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,398
Percentile: 80.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: korean. Source rank: #52. Votes: 647. Organization: alibaba. License: Apache 2.0.

80.8% percentile inside its fair comparison set

Raw benchmark value: 1,398 (CI 1,374 - 1,422)

Text Arena · Russian

AR · Multilingual · Human

Observed user preference in Arena's Text Arena russian leaderboard.

Rank #45 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,438
Percentile: 84.8%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: russian. Source rank: #59. Votes: 4534. Organization: alibaba. License: Apache 2.0.

84.8% percentile inside its fair comparison set

Raw benchmark value: 1,438 (CI 1,429 - 1,447)

Text Arena · Spanish

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Spanish leaderboard.

Rank #31 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,451
Percentile: 86%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: spanish. Source rank: #40. Votes: 1346. Organization: alibaba. License: Apache 2.0.

86% percentile inside its fair comparison set

Raw benchmark value: 1,451 · CI 1,434 - 1,468

Text Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Chinese leaderboard.

Rank #16 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,504
Percentile: 94.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #21. Votes: 2130. Organization: alibaba. License: Apache 2.0.

94.9% percentile inside its fair comparison set

Raw benchmark value: 1,504 · CI 1,490 - 1,518

Text Arena · French · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena French leaderboard.

Rank #40 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,456
Percentile: 81.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: french. Source rank: #47. Votes: 1348. Organization: alibaba. License: Apache 2.0.

81.9% percentile inside its fair comparison set

Raw benchmark value: 1,456 · CI 1,438 - 1,474

Text Arena · German · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena German leaderboard.

Rank #24 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,443
Percentile: 90.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: german. Source rank: #29. Votes: 675. Organization: alibaba. License: Apache 2.0.

90.3% percentile inside its fair comparison set

Raw benchmark value: 1,443 · CI 1,420 - 1,467

Text Arena · Japanese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Japanese leaderboard.

Rank #19 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,423
Percentile: 91.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: japanese. Source rank: #23. Votes: 355. Organization: alibaba. License: Apache 2.0.

91.1% percentile inside its fair comparison set

Raw benchmark value: 1,423 · CI 1,390 - 1,457

Text Arena · Korean · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Korean leaderboard.

Rank #34 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,398
Percentile: 84.1%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: korean. Source rank: #41. Votes: 647. Organization: alibaba. License: Apache 2.0.

84.1% percentile inside its fair comparison set

Raw benchmark value: 1,398 · CI 1,374 - 1,422

Text Arena · Russian · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Russian leaderboard.

Rank #36 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,432
Percentile: 87.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: russian. Source rank: #45. Votes: 4534. Organization: alibaba. License: Apache 2.0.

87.9% percentile inside its fair comparison set

Raw benchmark value: 1,432 · CI 1,423 - 1,441

Text Arena · Spanish · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Text Arena Spanish leaderboard.

Rank #26 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,453
Percentile: 88.3%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: spanish. Source rank: #29. Votes: 1346. Organization: alibaba. License: Apache 2.0.

88.3% percentile inside its fair comparison set

Raw benchmark value: 1,453 · CI 1,436 - 1,470

Vision Arena · Chinese

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena Chinese leaderboard.

Rank #18 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,301
Percentile: 77.9%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #23. Votes: 877. Organization: alibaba. License: Apache 2.0.

77.9% percentile inside its fair comparison set

Raw benchmark value: 1,301 · CI 1,277 - 1,325

Vision Arena · Chinese · No Style Control

AR · Multilingual · Human

Observed user preference in Arena's Vision Arena Chinese leaderboard.

Rank #17 · Source label: qwen3.5-397b-a17b

verified runtime · exact alias · Background only

Raw row drilldown: source row, percentile, last updated, eligibility

Source: Arena
Raw value: 1,323
Percentile: 79.2%
Last updated: recent
Eligibility: preview_model

Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: chinese. Source rank: #21. Votes: 877. Organization: alibaba. License: Apache 2.0.

79.2% percentile inside its fair comparison set

Raw benchmark value: 1,323 · CI 1,299 - 1,347