| Text Arena · AR · rating · Text · Chat / text | 1,462 · n/a | 1,478 · n/a | n/a |
| Code Arena · AR · rating · Code · Coding | 1,439 · n/a | 1,559 · n/a | n/a |
| Vision Arena · AR · rating · Vision · Vision understanding | 1,282 · n/a | 1,303 · n/a | n/a |
| WebDev Arena · AR · rating · Code · Coding | 1,439 · n/a | 1,559 · n/a | n/a |
| Search Arena · AR · rating · Search · Search / tool use | 1,239 · n/a | 1,237 · n/a | n/a |
| Document Arena · AR · rating · Document · Document understanding | 1,492 · n/a | 1,509 · n/a | n/a |
| Intelligence Index · AA · index · Text · Chat / text | 41 · n/a | 52 · n/a | n/a |
| Time to first token · AA · s · Text · Chat / text | 88.88 s · n/a | 11.36 s · n/a | n/a |
| Debugging · BB · % · Code · Coding | 77.5% · n/a | 86.2% · n/a | n/a |
| Security · BB · % · Code · Coding | 85.3% · n/a | 76% · n/a | n/a |
| BS pushback · BB · % · Text · Professional reasoning | 88% · n/a | 75.5% · n/a | n/a |
| Speed throughput · BB · t/s · Code · Coding | 152.3 t/s · n/a | 116.4 t/s · n/a | n/a |
| Speed TTFT · BB · ms · Code · Coding | 930.00 ms · n/a | 852.00 ms · n/a | n/a |
| Terminal-Bench 2.0 · TERMINAL-BENCH · % · Code · Coding | 82% · n/a | 38% · n/a | n/a |
| HiL-Bench · SL · % · Code · Coding | 29.1% · n/a | 27.7% · 80% | n/a |