gemma-3-4b-it
Google · Open weights · mid · registry tag 2026 benchmark-derived
text · code · 1 alias · 2 official source links
Data version
Read this before trusting a headline.
Data version: May 13, 2026
Model list checked: 9 providers · 800 tracked models
Page refreshed: May 18, 2026
Model pages expose the current registry snapshot and page stamp so stale deployments are visible without reading the code.
Source-linked scores by benchmark
Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.
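As a rough illustration of the row layout described above, here is a minimal Python sketch. The `ScoreRow` field names and the `percentile_rank` helper are hypothetical, and the percentile definition used (share of comparison-set scores strictly below the model's score) is an assumption; the page does not state how its percentiles are actually computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRow:
    """One benchmark row; field names are illustrative, not the registry's schema."""
    benchmark: str      # e.g. "Text Arena"
    source: str         # source label, e.g. "AR" or "LB"
    source_type: str    # e.g. "Human" or "Objective"
    raw_metric: float   # raw benchmark value as reported by the source
    percentile: float   # 0-100, position inside the fair comparison set


def percentile_rank(value: float, comparison_set: list[float]) -> float:
    """Percent of scores in comparison_set strictly below value (assumed definition)."""
    if not comparison_set:
        raise ValueError("comparison set is empty")
    below = sum(1 for score in comparison_set if score < value)
    return 100.0 * below / len(comparison_set)
```

For example, with a made-up comparison set `[1100.0, 1200.0, 1291.0, 1350.0]`, a raw value of `1291.0` has two scores strictly below it out of four, so `percentile_rank` returns `50.0`. Other conventions (counting ties, interpolating) would give slightly different numbers.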
Thin verified coverage
This model currently reads as thin verified coverage across the resolved evidence surface.
Text Arena
AR · Chat / text · Human
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
47.9th percentile inside its fair comparison set
Raw benchmark value: 1,291
Reasoning
LB · Reasoning / math / science · Objective
It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.
4.6th percentile inside its fair comparison set
Raw benchmark value: 15.1%
Language
LB · Chat / text · Objective
It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.
24.6th percentile inside its fair comparison set
Raw benchmark value: 15.1%