Intelligence Index
AA · Chat / text · Combined
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #43 · Source label: Qwen3.5 397B A17B (Non-reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 32
- Percentile: 89.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.
AA-Omniscience accuracy
AA · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #64 · Source label: Qwen3.5 397B A17B (Non-reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 24.3%
- Percentile: 78.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.
AA-Omniscience non-hallucination
AA · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #198 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 10.9%
- Percentile: 33.9% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.
IFBench
AA · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #86 · Source label: Qwen3.5 397B A17B (Non-reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 51.6%
- Percentile: 73% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `ifbench`.
Blended price
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #187 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: $1.4 /1M tokens
- Percentile: 33% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.
Input price
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #175 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: $0.6 /1M input tokens
- Percentile: 38% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.
Output price
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #195 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: $3.6 /1M output tokens
- Percentile: 29.7% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.
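The three price cards above look mutually consistent. Assuming the `0To3To1` suffix in `price1mBlended0To3To1` means a 3:1 input-to-output token weighting (a reading of the field name, not something this page states), the blended figure can be approximately reproduced from the input and output prices. The sketch below is illustrative only; `blended_price` is a hypothetical helper, not part of any Artificial Analysis API.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption inferred from the
    field name `price1mBlended0To3To1`.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Input and output prices as listed on the cards ($ per 1M tokens).
price = blended_price(0.6, 3.6)
print(f"${price:.2f} /1M tokens")
```

Under that assumption the result is about $1.35 per 1M tokens, which is in line with the $1.4 shown on the Blended price card once rounded to one decimal place.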
Output Speed
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #174 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 49.7 tokens/s
- Percentile: 17.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.
Time to first token
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #130 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 2.68s
- Percentile: 38.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.
Time to first answer token
AA · Chat / text · Speed / cost
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #200 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 66.79s
- Percentile: 5.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.
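Comparing this card with the Time to first token card gives a rough sense of how long the Reasoning variant runs before it starts answering. The sketch below assumes the gap between the two medians is mostly reasoning time; the page does not say so, and medians of two distributions do not subtract exactly, so treat the result as an estimate. `reasoning_gap_seconds` is a hypothetical helper.

```python
def reasoning_gap_seconds(first_token_s: float, first_answer_token_s: float) -> float:
    """Approximate delay between the first emitted token and the first answer token.

    Assumption: the difference is dominated by the model's reasoning phase.
    """
    if first_answer_token_s < first_token_s:
        raise ValueError("first answer token cannot precede the first token")
    return first_answer_token_s - first_token_s


# Median values as listed on the two latency cards (seconds).
gap = reasoning_gap_seconds(2.68, 66.79)
print(f"{gap:.2f}s")
```

With the listed values the gap comes to roughly 64 seconds, which would account for nearly all of the time to first answer token.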
Openness Index
AA · Chat / text · Combined
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #93 · Source label: Qwen3.5 397B A17B (Reasoning)
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Artificial Analysis
- Raw value: 39
- Percentile: 54.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.
Text Arena
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #45 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,444 (CI 1,440 - 1,448)
- Percentile: 86.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #57. Votes: 43048. Organization: alibaba. License: Apache 2.0.
Text Arena · Creative Writing
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #43 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,413 (CI 1,405 - 1,420)
- Percentile: 87% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing. Source rank: #56. Votes: 6614. Organization: alibaba. License: Apache 2.0.
Text Arena · English
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #41 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,453 (CI 1,447 - 1,458)
- Percentile: 87.7% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #53. Votes: 20748. Organization: alibaba. License: Apache 2.0.
Text Arena · Exclude Ties
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #44 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,444 (CI 1,438 - 1,449)
- Percentile: 86.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: exclude_ties. Source rank: #56. Votes: 31275. Organization: alibaba. License: Apache 2.0.
Text Arena · Hard Prompts
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #40 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,465 (CI 1,461 - 1,470)
- Percentile: 88% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts. Source rank: #52. Votes: 27400. Organization: alibaba. License: Apache 2.0.
Text Arena · Hard Prompts English
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #40 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,472 (CI 1,466 - 1,478)
- Percentile: 88% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts_english. Source rank: #52. Votes: 13873. Organization: alibaba. License: Apache 2.0.
Text Arena · Instruction Following
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #40 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,435 (CI 1,429 - 1,441)
- Percentile: 88% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: instruction_following. Source rank: #51. Votes: 13770. Organization: alibaba. License: Apache 2.0.
Text Arena · Longer Query
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #37 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,455 (CI 1,449 - 1,461)
- Percentile: 88.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: longer_query. Source rank: #48. Votes: 17022. Organization: alibaba. License: Apache 2.0.
Text Arena · Multi Turn
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #41 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,454 (CI 1,446 - 1,461)
- Percentile: 87.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: multi_turn. Source rank: #53. Votes: 7329. Organization: alibaba. License: Apache 2.0.
Text Arena · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #36 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,440 (CI 1,436 - 1,444)
- Percentile: 89.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: overall. Source rank: #45. Votes: 43048. Organization: alibaba. License: Apache 2.0.
Text Arena · Creative Writing · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #41 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,409 (CI 1,401 - 1,417)
- Percentile: 87.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: creative_writing. Source rank: #52. Votes: 6614. Organization: alibaba. License: Apache 2.0.
Text Arena · English · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #38 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,448 (CI 1,443 - 1,453)
- Percentile: 88.6% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: english. Source rank: #47. Votes: 20748. Organization: alibaba. License: Apache 2.0.
Text Arena · Exclude Ties · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #35 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,437 (CI 1,432 - 1,442)
- Percentile: 89.5% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: exclude_ties. Source rank: #44. Votes: 31275. Organization: alibaba. License: Apache 2.0.
Text Arena · Hard Prompts · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #34 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,451 (CI 1,446 - 1,456)
- Percentile: 89.8% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts. Source rank: #41. Votes: 27400. Organization: alibaba. License: Apache 2.0.
Text Arena · Hard Prompts English · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #32 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,456 (CI 1,450 - 1,462)
- Percentile: 90.4% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: hard_prompts_english. Source rank: #39. Votes: 13873. Organization: alibaba. License: Apache 2.0.
Text Arena · Instruction Following · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #36 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,425 (CI 1,419 - 1,431)
- Percentile: 89.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: instruction_following. Source rank: #44. Votes: 13770. Organization: alibaba. License: Apache 2.0.
Text Arena · Longer Query · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #31 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,444 (CI 1,439 - 1,450)
- Percentile: 90.1% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: longer_query. Source rank: #39. Votes: 17022. Organization: alibaba. License: Apache 2.0.
Text Arena · Multi Turn · No Style Control
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #36 · Source label: qwen3.5-397b-a17b
verified runtime · exact alias · Background only
Raw row drilldown
- Source: Arena
- Raw value: 1,447 (CI 1,439 - 1,455)
- Percentile: 89.2% (inside its fair comparison set)
- Last updated: recent
- Eligibility: preview_model
Parsed from Arena leaderboard dataset row `qwen3.5-397b-a17b`. Category: multi_turn. Source rank: #46. Votes: 7329. Organization: alibaba. License: Apache 2.0.