robitec97/gemma3.c

Gemma 3 pure inference in C

27 / 100 (Experimental)

Implements Gemma 3 4B inference in C, with native SentencePiece tokenization (262K-token vocabulary) and memory-mapped BF16 SafeTensors weights. It supports hybrid attention with grouped-query attention and 128K-token context windows. Acceleration options are Metal on Apple Silicon GPUs, optional OpenBLAS-backed BLAS routines, and multi-threaded CPU inference; both a CLI and a C library interface are provided. KV cache scaling brings runtime memory down to about 3 GB, and an interactive chat mode keeps multi-turn conversation history.
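Two of the ideas named above, BF16 weights and grouped-query attention, can be sketched in a few lines of C. This is an illustrative sketch only, not code taken from gemma3.c: the function names `bf16_to_f32` and `gqa_kv_head` are hypothetical, and the head counts passed to the second helper are the caller's assumption. The first function relies on the fact that a BF16 value is the upper 16 bits of an IEEE-754 float32; the second assumes query heads are grouped contiguously and that the query-head count is a multiple of the KV-head count.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from gemma3.c): widen one BF16 value to float32.
 * BF16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits of a
 * float32, so widening is a 16-bit left shift of the raw bit pattern. */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* bit copy avoids strict-aliasing issues */
    return f;
}

/* Hypothetical helper (not from gemma3.c): under grouped-query attention,
 * several query heads share one key/value head. Assumes n_q_heads is a
 * positive multiple of n_kv_heads and that groups are contiguous. */
static int gqa_kv_head(int q_head, int n_q_heads, int n_kv_heads) {
    int group_size = n_q_heads / n_kv_heads; /* query heads per KV head */
    return q_head / group_size;
}
```

With memory-mapped weights, a conversion like this would typically run on the fly inside the matrix-vector loop, so the file stays in BF16 on disk and in the page cache. Sharing KV heads across query heads is also one reason a KV cache can stay small relative to the model size.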

112 stars.

No License, No Package, No Dependents
Maintenance 10 / 25
Adoption 9 / 25
Maturity 1 / 25
Community 7 / 25


Stars: 112
Forks: 5
Language: C
License: none
Last pushed: Feb 04, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/robitec97/gemma3.c"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.