Intelligence Index
AA · Chat / text · Combined
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
Rank #136 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 18
- Percentile: 65.8%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `intelligenceIndex`.
65.8% percentile inside its fair comparison set · Raw benchmark value: 18
AA-Omniscience accuracy
AA · Chat / text · Objective
Rank #160 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 16.8%
- Percentile: 46.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `omniscienceAccuracy`.
46.6% percentile inside its fair comparison set · Raw benchmark value: 16.8%
AA-Omniscience non-hallucination
AA · Chat / text · Objective
Rank #71 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 29.2%
- Percentile: 76.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `omniscienceNonHallucination`.
76.5% percentile inside its fair comparison set · Raw benchmark value: 29.2%
IFBench
AA · Chat / text · Objective
Rank #112 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 46.1%
- Percentile: 64.8%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `ifbench`.
64.8% percentile inside its fair comparison set · Raw benchmark value: 46.1%
Blended price
AA · Chat / text · Speed / cost
Rank #102 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: $0.4 /1M tokens
- Percentile: 63.4%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `price1mBlended0To3To1`.
63.4% percentile inside its fair comparison set · Raw benchmark value: $0.4 /1M tokens
Input price
AA · Chat / text · Speed / cost
Rank #105 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: $0.2 /1M input tokens
- Percentile: 66.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `price1mInputTokens`.
66.3% percentile inside its fair comparison set · Raw benchmark value: $0.2 /1M input tokens
Output price
AA · Chat / text · Speed / cost
Rank #107 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: $0.8 /1M output tokens
- Percentile: 61.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `price1mOutputTokens`.
61.6% percentile inside its fair comparison set · Raw benchmark value: $0.8 /1M output tokens
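The three price rows can be cross-checked against each other. The field name `price1mBlended0To3To1` suggests a 3:1 input-to-output blend. Assuming that weighting (an assumption, not stated on this page), a minimal Python sketch recomputes the blended figure from the listed input and output prices. The function name `blended_price` and its weight parameters are illustrative only.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The 3:1 default weighting is an assumption inferred from the
    field name `price1mBlended0To3To1`, not confirmed by this page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices listed above for this model, in USD per 1M tokens.
blended = blended_price(input_price=0.2, output_price=0.8)
print(f"${blended:.2f} /1M tokens")
```

Under that assumption the result is about $0.35 per 1M tokens, which is consistent with the displayed $0.4 if the page rounds to one decimal place.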
Output Speed
AA · Chat / text · Speed / cost
Rank #141 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 63.2 tokens/s
- Percentile: 33.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `medianOutputTokensPerSecond`.
33.3% percentile inside its fair comparison set · Raw benchmark value: 63.2 tokens/s
Time to first token
AA · Chat / text · Speed / cost
Rank #112 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 2.39s
- Percentile: 47.1%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstTokenSeconds`.
47.1% percentile inside its fair comparison set · Raw benchmark value: 2.39s
Time to first answer token
AA · Chat / text · Speed / cost
Rank #66 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 2.39s
- Percentile: 69%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `medianTimeToFirstAnswerTokenSeconds`.
69% percentile inside its fair comparison set · Raw benchmark value: 2.39s
Openness Index
AA · Chat / text · Combined
Rank #76 · Source label: Qwen3 235B A22B 2507 Instruct
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Artificial Analysis
- Raw value: 44
- Percentile: 71%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Artificial Analysis public leaderboard field `opennessBreakdown.opennessIndex`.
71% percentile inside its fair comparison set · Raw benchmark value: 44
Text Arena
AR · Chat / text · Human
Rank #66
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,423
- Percentile: 80%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: overall. Source rank: #84. Votes: 97241. Organization: alibaba. License: Apache 2.0.
80% percentile inside its fair comparison set · Raw benchmark value: 1,423 · CI 1,421 - 1,426
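Each Arena card's provenance line packs several fields into one sentence (dataset row, category, source rank, votes, organization, license). As a rough illustration of how such a line could be turned into a structured record, here is a minimal Python sketch. The function name `parse_provenance` and the output key names are hypothetical, and the parsing rule (split on a period followed by whitespace) is an assumption that happens to fit the lines shown in this section.

```python
import re


def parse_provenance(line: str) -> dict:
    """Parse an Arena provenance line into a dict (illustrative only)."""
    record = {}
    # The dataset row id is the text between backticks.
    row = re.search(r"`([^`]+)`", line)
    if row:
        record["row"] = row.group(1)
    # Fields are "Key: value" sentences separated by ". ";
    # "Apache 2.0" survives because its dot is not followed by whitespace.
    for part in re.split(r"\.\s+", line.strip()):
        key, sep, value = part.partition(": ")
        if sep:
            name = key.strip().lower().replace(" ", "_")
            record[name] = value.strip().rstrip(".")
    # Convert the numeric fields.
    if "source_rank" in record:
        record["source_rank"] = int(record["source_rank"].lstrip("#"))
    if "votes" in record:
        record["votes"] = int(record["votes"])
    return record


line = ("Parsed from Arena leaderboard dataset row "
        "`qwen3-235b-a22b-instruct-2507`. Category: overall. "
        "Source rank: #84. Votes: 97241. Organization: alibaba. "
        "License: Apache 2.0.")
record = parse_provenance(line)
print(record)
```

The same function applies unchanged to the other Arena cards, since they differ only in category, source rank, and vote count.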
Text Arena · Creative Writing
AR · Chat / text · Human
Rank #84
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,381
- Percentile: 74.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: creative_writing. Source rank: #105. Votes: 13593. Organization: alibaba. License: Apache 2.0.
74.3% percentile inside its fair comparison set · Raw benchmark value: 1,381 · CI 1,375 - 1,386
Text Arena · English
AR · Chat / text · Human
Rank #74
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,431
- Percentile: 77.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: english. Source rank: #92. Votes: 45078. Organization: alibaba. License: Apache 2.0.
77.5% percentile inside its fair comparison set · Raw benchmark value: 1,431 · CI 1,427 - 1,434
Text Arena · Exclude Ties
AR · Chat / text · Human
Rank #65
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,415
- Percentile: 80.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: exclude_ties. Source rank: #83. Votes: 69277. Organization: alibaba. License: Apache 2.0.
80.3% percentile inside its fair comparison set · Raw benchmark value: 1,415 · CI 1,411 - 1,419
Text Arena · Hard Prompts
AR · Chat / text · Human
Rank #58
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,448
- Percentile: 82.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: hard_prompts. Source rank: #74. Votes: 52312. Organization: alibaba. License: Apache 2.0.
82.5% percentile inside its fair comparison set · Raw benchmark value: 1,448 · CI 1,445 - 1,451
Text Arena · Hard Prompts English
AR · Chat / text · Human
Rank #67
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,451
- Percentile: 79.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: hard_prompts_english. Source rank: #84. Votes: 25618. Organization: alibaba. License: Apache 2.0.
79.6% percentile inside its fair comparison set · Raw benchmark value: 1,451 · CI 1,447 - 1,455
Text Arena · Instruction Following
AR · Chat / text · Human
Rank #64
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,416
- Percentile: 80.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: instruction_following. Source rank: #80. Votes: 26979. Organization: alibaba. License: Apache 2.0.
80.6% percentile inside its fair comparison set · Raw benchmark value: 1,416 · CI 1,412 - 1,421
Text Arena · Longer Query
AR · Chat / text · Human
Rank #64
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,435
- Percentile: 79.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: longer_query. Source rank: #80. Votes: 25800. Organization: alibaba. License: Apache 2.0.
79.3% percentile inside its fair comparison set · Raw benchmark value: 1,435 · CI 1,431 - 1,440
Text Arena · Multi Turn
AR · Chat / text · Human
Rank #54
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,438
- Percentile: 83.6%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: multi_turn. Source rank: #71. Votes: 17023. Organization: alibaba. License: Apache 2.0.
83.6% percentile inside its fair comparison set · Raw benchmark value: 1,438 · CI 1,433 - 1,443
Text Arena · No Style Control
AR · Chat / text · Human
Rank #67
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,419
- Percentile: 79.7%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: overall. Source rank: #80. Votes: 97241. Organization: alibaba. License: Apache 2.0.
79.7% percentile inside its fair comparison set · Raw benchmark value: 1,419 · CI 1,417 - 1,422
Text Arena · Creative Writing · No Style Control
AR · Chat / text · Human
Rank #79
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,376
- Percentile: 75.9%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: creative_writing. Source rank: #96. Votes: 13593. Organization: alibaba. License: Apache 2.0.
75.9% percentile inside its fair comparison set · Raw benchmark value: 1,376 · CI 1,371 - 1,382
Text Arena · English · No Style Control
AR · Chat / text · Human
Rank #75
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,425
- Percentile: 77.2%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: english. Source rank: #88. Votes: 45078. Organization: alibaba. License: Apache 2.0.
77.2% percentile inside its fair comparison set · Raw benchmark value: 1,425 · CI 1,422 - 1,428
Text Arena · Exclude Ties · No Style Control
AR · Chat / text · Human
Rank #67
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,408
- Percentile: 79.7%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: exclude_ties. Source rank: #80. Votes: 69277. Organization: alibaba. License: Apache 2.0.
79.7% percentile inside its fair comparison set · Raw benchmark value: 1,408 · CI 1,404 - 1,411
Text Arena · Hard Prompts · No Style Control
AR · Chat / text · Human
Rank #52
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,434
- Percentile: 84.3%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: hard_prompts. Source rank: #66. Votes: 52312. Organization: alibaba. License: Apache 2.0.
84.3% percentile inside its fair comparison set · Raw benchmark value: 1,434 · CI 1,430 - 1,437
Text Arena · Hard Prompts English · No Style Control
AR · Chat / text · Human
Rank #61
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,437
- Percentile: 81.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: hard_prompts_english. Source rank: #72. Votes: 25618. Organization: alibaba. License: Apache 2.0.
81.5% percentile inside its fair comparison set · Raw benchmark value: 1,437 · CI 1,432 - 1,441
Text Arena · Instruction Following · No Style Control
AR · Chat / text · Human
Rank #54
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,409
- Percentile: 83.7%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: instruction_following. Source rank: #68. Votes: 26979. Organization: alibaba. License: Apache 2.0.
83.7% percentile inside its fair comparison set · Raw benchmark value: 1,409 · CI 1,405 - 1,413
Text Arena · Longer Query · No Style Control
AR · Chat / text · Human
Rank #48
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,427
- Percentile: 84.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: longer_query. Source rank: #60. Votes: 25800. Organization: alibaba. License: Apache 2.0.
84.5% percentile inside its fair comparison set · Raw benchmark value: 1,427 · CI 1,423 - 1,432
Text Arena · Multi Turn · No Style Control
AR · Chat / text · Human
Rank #51
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: Arena
- Raw value: 1,433
- Percentile: 84.5%
- Last updated: recent
- Eligibility: benchmark_derived_model
Parsed from Arena leaderboard dataset row `qwen3-235b-a22b-instruct-2507`. Category: multi_turn. Source rank: #63. Votes: 17023. Organization: alibaba. License: Apache 2.0.
84.5% percentile inside its fair comparison set · Raw benchmark value: 1,433 · CI 1,428 - 1,438
Instruction following
LB · Chat / text · Objective
Rank #97
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 21.7%
- Percentile: 11.1%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Category: IF. Tasks scored: 4.
11.1% percentile inside its fair comparison set · Raw benchmark value: 21.7%
Language
LB · Chat / text · Objective
Rank #74
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 66.1%
- Percentile: 32.4%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Category: Language. Tasks scored: 3.
32.4% percentile inside its fair comparison set · Raw benchmark value: 66.1%
Paraphrase
LB · Chat / text · Objective
Rank #93
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 18%
- Percentile: 14.8%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: paraphrase. Category: IF.
14.8% percentile inside its fair comparison set · Raw benchmark value: 18%
Simplify
LB · Chat / text · Objective
Rank #95
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 21.9%
- Percentile: 13%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: simplify. Category: IF.
13% percentile inside its fair comparison set · Raw benchmark value: 21.9%
Story generation
LB · Chat / text · Objective
Rank #98
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 19.4%
- Percentile: 10.2%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: story_generation. Category: IF.
10.2% percentile inside its fair comparison set · Raw benchmark value: 19.4%
Summarize
LB · Chat / text · Objective
Rank #91
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 27.7%
- Percentile: 16.7%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: summarize. Category: IF.
16.7% percentile inside its fair comparison set · Raw benchmark value: 27.7%
Connections
LB · Chat / text · Objective
Rank #50
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 91%
- Percentile: 57.4%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: connections. Category: Language.
57.4% percentile inside its fair comparison set · Raw benchmark value: 91%
Plot unscrambling
LB · Chat / text · Objective
Rank #86
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 39.2%
- Percentile: 21.3%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: plot_unscrambling. Category: Language.
21.3% percentile inside its fair comparison set · Raw benchmark value: 39.2%
Typos
LB · Chat / text · Objective
Rank #78
verified runtime · exact alias · Background only
Raw row drilldown (source row, percentile, last updated, eligibility):
- Source: LiveBench
- Raw value: 68%
- Percentile: 29.9%
- Last updated: archived
- Eligibility: benchmark_derived_model
Derived from the official LiveBench website leaderboard table. Task: typos. Category: Language.
29.9% percentile inside its fair comparison set · Raw benchmark value: 68%
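The LiveBench category rows ("Category: IF. Tasks scored: 4." and "Category: Language. Tasks scored: 3.") can be sanity-checked against the per-task rows in this section. The sketch below assumes each category score is the unweighted mean of its task scores; that aggregation rule is an assumption, not something this page states. The variable names are illustrative.

```python
from statistics import mean

# Task scores (percent) as listed in the LiveBench cards of this section.
if_tasks = {
    "paraphrase": 18.0,
    "simplify": 21.9,
    "story_generation": 19.4,
    "summarize": 27.7,
}
language_tasks = {
    "connections": 91.0,
    "plot_unscrambling": 39.2,
    "typos": 68.0,
}

# Assumed aggregation: unweighted mean over the tasks in each category.
if_mean = mean(if_tasks.values())
language_mean = mean(language_tasks.values())

print(f"IF category mean: {if_mean:.2f}%")
print(f"Language category mean: {language_mean:.2f}%")
```

The means come out at roughly 21.75% and 66.07%, close to the listed category values of 21.7% and 66.1%. The small gap for IF is plausible given that the displayed task scores are themselves rounded.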