gemma-3-4b-it

Google · Open weights · mid · registry tag 2026 benchmark-derived

text · code · 1 alias · 2 official source links

Last verified · May 13, 2026
Visible coverage · 6.5%
Verified coverage · 6.5%
Benchmark fit · n/a
Benchmark spread · n/a

Data version

Read this before trusting a headline.

Data version: May 13, 2026 · Model list checked: 9 providers, 800 tracked models · Page refreshed: May 18, 2026

Model pages expose the current registry snapshot and page stamp so stale deployments are visible without reading the code.

Source-linked scores by benchmark

Each row keeps the benchmark source, source type, raw metric, and percentile inside its fair comparison set.
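As a rough illustration of what such a row might hold, the sketch below models one row as a Python dataclass and computes a percentile rank within a comparison set. The field names, the `percentile_in_set` helper, the peer scores, and the "share of peers scoring strictly lower" definition are all assumptions made for illustration; this page does not state the registry's actual schema or percentile formula.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BenchmarkRow:
    """Hypothetical shape of one source-linked score row (not the registry's schema)."""
    benchmark: str      # e.g. "Text Arena"
    source: str         # e.g. "Arena"
    source_type: str    # e.g. "Human" or "Objective"
    raw_metric: float   # raw benchmark value as published by the source
    percentile: float   # 0-100, relative to the fair comparison set


def percentile_in_set(value: float, peers: Sequence[float]) -> float:
    """Percent of peer scores strictly below `value` (an assumed definition)."""
    if not peers:
        raise ValueError("comparison set is empty")
    below = sum(1 for p in peers if p < value)
    return 100.0 * below / len(peers)


# Made-up peer scores, used only to show the mechanics.
peers = [1100.0, 1200.0, 1291.0, 1350.0, 1400.0]
row = BenchmarkRow(
    benchmark="Text Arena",
    source="Arena",
    source_type="Human",
    raw_metric=1291.0,
    percentile=percentile_in_set(1291.0, peers),
)
print(row)
```

Keeping the raw metric next to the percentile means a reader can check the ranking against the original source value instead of trusting the derived number alone.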

Thin verified coverage

This model currently reads as thin verified coverage across the resolved evidence surface.

Text Arena

AR · Chat / text · Human

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

47.9th percentile inside its fair comparison set

Raw benchmark value: 1,291

Reasoning

LB · Reasoning / math / science · Objective

It is one of the cleaner reads on deliberate reasoning strength rather than style or popularity.

4.6th percentile inside its fair comparison set

Raw benchmark value: 15.1%

Language

LB · Chat / text · Objective

It tests whether the model is actually useful in normal conversational turns, not just on narrow correctness tasks.

24.6th percentile inside its fair comparison set

Raw benchmark value: 15.1%

Source links and registry checks

Arena · official · May 13, 2026

LiveBench · official · May 13, 2026

Radar · current evidence

Axes without current public source data stay blank rather than being treated as zero. Current gaps: Coding, Professional reasoning, Search / tool use, Long context, Vision understanding, Document understanding.
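The difference between leaving an axis blank and scoring it as zero can be sketched in a few lines of Python. The axis values and the helper names (`mean_of_present`, `mean_with_zero_fill`, `gaps`) are hypothetical and only illustrate the rule; they are not the registry's implementation.

```python
from typing import Dict, List, Optional

# Illustrative axis scores; None marks an axis with no current public source data.
axes: Dict[str, Optional[float]] = {
    "Chat / text": 47.9,
    "Reasoning": 4.6,
    "Coding": None,
    "Long context": None,
}


def mean_of_present(scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Average only the axes that have data; blank axes are skipped."""
    present = [v for v in scores.values() if v is not None]
    return sum(present) / len(present) if present else None


def mean_with_zero_fill(scores: Dict[str, Optional[float]]) -> float:
    """Average with missing axes counted as zero (the behavior being avoided)."""
    filled = [0.0 if v is None else v for v in scores.values()]
    return sum(filled) / len(filled)


def gaps(scores: Dict[str, Optional[float]]) -> List[str]:
    """Names of axes that should be drawn blank on the radar."""
    return [name for name, v in scores.items() if v is None]


print(mean_of_present(axes), mean_with_zero_fill(axes), gaps(axes))
```

With zero-fill, two missing axes halve the average even though nothing was measured on them, which is why a blank axis is the safer representation.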

Registry facts

Family: Gemma · Variant: gemma-3-4b-it · Provider type: open release · Reasoning track: general

Debate this model

Open a pair page when you want a public-facing winner, strongest alternative, and decisive benchmark list instead of a generic profile.

Pairwise debate pages for the current public benchmark surface:

gemma-3-4b-it vs acm_rewrite_qwen2-72B-Chat

gemma-3-4b-it vs alpaca-13b

gemma-3-4b-it vs amazon-nova-experimental-chat-10-09