UAB · Unbiased AI Bench
AI model rankings with source links.
Every score links back to its source.
Live · updated continuously
Benchmarks · /benchmarks/mteb-retrieval-en-v2

Retrieval

MTEB retrieval slice for embeddings.
Source · MTEB
Version · mteb snapshot 2026-05-13
Scores · 11

Passport

Visible tradeoffs

This is a retrieval signal, so it is best read as search-stack quality rather than broad model capability.

source · MTEB
metric · NDCG@10 (ndcg)
judge · Retrieval
direction · higher better
group id · mteb_retrieval_en_v2
domain · Embeddings / retrieval
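
For readers unfamiliar with the NDCG@10 metric named in the passport, here is a minimal sketch of the textbook definition with linear gain. It assumes `relevances` holds the graded relevance labels of all judged documents for one query, in the order the system ranked them. The function names are illustrative, and the benchmark's own evaluation code may differ in detail (for example in the gain function, or in scaling results to a 0 to 100 range as the scores on this page appear to be).

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # Rank 0 is discounted by log2(2) = 1, rank 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    # A query with no relevant documents contributes 0 by convention here.
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# A perfectly ordered ranking versus one that puts the only relevant hit second.
perfect = ndcg_at_k([3, 2, 1])
one_slot_late = ndcg_at_k([0, 1])
```

A benchmark-level score would then be the mean of `ndcg_at_k` across queries, which is why the number reflects ranking quality of the retrieved list and nothing about generation or latency.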

What it measures vs what it misses

✓ Measures

Embedding quality for retrieval tasks.

✗ Misses

Chat quality, generation, latency.

Why this counts

It is one of the few direct signals for retrieval stacks, where embedding quality matters more than chat style.

Comparable-group rule

This percentile only compares models inside the exact benchmark/version group shown here. It is not a universal score.

What it misses

It does not tell you whether the same model is strong at generation, ranking policy, or final answer quality.
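
The comparable-group rule can be sketched as a percentile computed only over scores that share one group id. This is an illustration, not the site's actual code: the `Score` type, the `percentile_in_group` function, and the percentile formula (share of models in the group scoring at or below the target) are all assumptions. Three scores are taken from this page's leaderboard, and one entry from a made-up group is included to show that it is ignored.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Score:
    model: str
    group_id: str  # exact benchmark/version group
    value: float


def percentile_in_group(
    scores: Iterable[Score], model: str, group_id: str, higher_better: bool = True
) -> float:
    """Percent of models in the same group that the target ties or beats.

    Assumed formula for illustration only.
    """
    group = [s for s in scores if s.group_id == group_id]
    target = next((s for s in group if s.model == model), None)
    if target is None:
        raise KeyError(f"{model!r} has no score in group {group_id!r}")
    if higher_better:
        tied_or_beaten = sum(1 for s in group if s.value <= target.value)
    else:
        tied_or_beaten = sum(1 for s in group if s.value >= target.value)
    return 100.0 * tied_or_beaten / len(group)


scores = [
    Score("GPT-5", "mteb_retrieval_en_v2", 58.8),
    Score("Gemini 2.5 Pro", "mteb_retrieval_en_v2", 57.9),
    Score("deepseek-r1-0528", "mteb_retrieval_en_v2", 48.4),
    # Hypothetical entry from a different group; it must not affect the result.
    Score("example-model", "some_other_group_v1", 99.0),
]

pct = percentile_in_group(scores, "Gemini 2.5 Pro", "mteb_retrieval_en_v2")
```

Filtering by `group_id` before counting is the whole point: a score from another benchmark or snapshot never enters the denominator, so the percentile cannot be read as a universal ranking.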

Leaderboard · this benchmark version

#1 · GPT-5 · MTEB · Mar 19, 2026 · 58.8 ndcg
#2 · GPT-5.4 · MTEB · Mar 19, 2026 · 58.8 ndcg
#3 · GPT-5.4 mini · MTEB · Mar 19, 2026 · 58.8 ndcg
#4 · GPT-5.4 nano · MTEB · Mar 19, 2026 · 58.8 ndcg
#5 · Gemini 2.5 Pro · MTEB · Mar 19, 2026 · 57.9 ndcg
#6 · Qwen3 235B A22B · MTEB · Mar 19, 2026 · 56.2 ndcg
#7 · Grok 4 · MTEB · Mar 19, 2026 · 52.1 ndcg
#8 · Mistral Medium 3 · MTEB · Mar 19, 2026 · 51.3 ndcg
#9 · Llama 4 Maverick · MTEB · Mar 19, 2026 · 49.6 ndcg
#10 · BAAI bge-large-en-v1.5 · MTEB · May 13, 2026 · 49.3 ndcg
#11 · deepseek-r1-0528 · MTEB · Mar 19, 2026 · 48.4 ndcg