MacBook Pro M4 Pro Benchmarks

BENCHMARK RESULTS (147)

Model	Quant	Measured	Estimated (measured vs. estimate)	RAM used	App	Source	Date
SmolLM2 135M	Q8	246 tok/s	200 tok/s +23%	0.2 GB	LM Studio	community	2026-02-10
Whisper Tiny	Q8	238 tok/s	200 tok/s +19%	0.1 GB	MacWhisper	editorial	2026-05-12
🔢 Nomic Embed Text	Q8	235 tok/s	200 tok/s +18%	0.2 GB	Jan	community	2026-03-16
Qwen 2.5 0.5B	Q8	234 tok/s	200 tok/s +17%	0.8 GB	Jan	community	2026-03-13
🔢 GTE Large	Q8	234 tok/s	200 tok/s +17%	0.4 GB	Jan	community	2026-02-07
SmolLM2 135M	FP16	223 tok/s	200 tok/s +12%	0.3 GB	Ollama	community	2026-03-04
SmolLM2 360M	Q8	223 tok/s	200 tok/s +12%	0.7 GB	LM Studio	editorial	2026-05-10
🗣️ Kokoro 82M	Q8	222 tok/s	200 tok/s +11%	0.3 GB	Xybrid CLI	community	2026-04-04
Whisper Small	Q8	222 tok/s	200 tok/s +11%	0.3 GB	Xybrid CLI	editorial	2026-02-23
Bonsai 8B (1-bit)	Q1	221 tok/s	200 tok/s +11%	1.3 GB	Ollama	community	2026-03-02
Ternary Bonsai 1.7B	Q2	221 tok/s	200 tok/s +11%	0.4 GB	Xybrid CLI	editorial	2026-01-03
🔢 all-MiniLM-L6-v2	Q8	220 tok/s	200 tok/s +10%	0.1 GB	Xybrid CLI	editorial	2026-01-19
SmolLM2 360M	FP16	218 tok/s	200 tok/s +9%	1.1 GB	Xybrid CLI	community	2026-03-12
🔢 BGE Small	FP16	218 tok/s	200 tok/s +9%	0.1 GB	Xybrid CLI	editorial	2026-02-10
🗣️ KittenTTS Mini	Q8	217 tok/s	200 tok/s +9%	0.1 GB	Xybrid CLI	community	2026-03-14
🔢 Nomic Embed Text	FP16	216 tok/s	200 tok/s +8%	0.3 GB	Ollama	community	2026-01-17
Qwen 2.5 Coder 0.5B	FP16	215 tok/s	200 tok/s +8%	1.4 GB	Ollama	community	2026-02-19
🗣️ NeuTTS Air	Q8	214 tok/s	182 tok/s +18%	0.9 GB	Xybrid CLI	community	2026-02-08
🔢 BGE Small	Q8	212 tok/s	200 tok/s +6%	0.1 GB	Jan	community	2026-05-12
🗣️ KittenTTS Mini	FP16	211 tok/s	200 tok/s +6%	0.2 GB	Xybrid CLI	community	2026-05-28
Qwen 2.5 Coder 0.5B	Q8	210 tok/s	200 tok/s +5%	0.9 GB	LM Studio	editorial	2026-05-23
🗣️ KittenTTS Nano	Q8	209 tok/s	200 tok/s +5%	0.1 GB	Piper	community	2026-05-28
Qwen 2.5 0.5B	FP16	208 tok/s	200 tok/s +4%	1.3 GB	Xybrid CLI	community	2026-05-26
Wav2Vec2 Base	Q8	208 tok/s	200 tok/s +4%	0.2 GB	MacWhisper	editorial	2026-04-21
Whisper Small	FP16	207 tok/s	200 tok/s +4%	0.6 GB	MacWhisper	editorial	2026-05-09
🔢 all-MiniLM-L6-v2	FP16	203 tok/s	200 tok/s +2%	0.1 GB	Jan	community	2026-04-20
🗣️ KittenTTS Nano	FP16	203 tok/s	200 tok/s +2%	0.1 GB	Piper	community	2026-04-14
Whisper Tiny	FP16	203 tok/s	200 tok/s +2%	0.1 GB	MacWhisper	editorial	2026-02-26
👁️ SmolVLM 500M	Q8	203 tok/s	200 tok/s +2%	0.8 GB	Jan	community	2026-03-06
Bonsai 8B (1-bit)	Q1	202 tok/s	200 tok/s +1%	1.5 GB	Jan	community	2026-05-28
🗣️ OuteTTS 0.3 500M	Q8	201 tok/s	200 tok/s +1%	0.6 GB	Piper	editorial	2026-03-15
Ternary Bonsai 4B	Q2	199 tok/s	200 tok/s +0%	1.1 GB	Jan	community	2026-04-18
Whisper Medium	Q8	199 tok/s	182 tok/s +9%	1.0 GB	Xybrid CLI	community	2026-02-24
Ternary Bonsai 4B	Q2	193 tok/s	200 tok/s -4%	1.1 GB	Xybrid CLI	community	2026-01-17
👁️ SmolVLM 500M	FP16	191 tok/s	200 tok/s -4%	1.4 GB	Ollama	editorial	2026-02-06
🔢 GTE Large	FP16	190 tok/s	200 tok/s -5%	0.7 GB	LM Studio	community	2026-01-14
🗣️ Kokoro 82M	FP16	189 tok/s	200 tok/s -5%	0.4 GB	Xybrid CLI	community	2026-01-18
🎙️ Distil-Whisper Large V3	Q8	189 tok/s	182 tok/s +4%	1.0 GB	Xybrid CLI	community	2026-05-16
Ternary Bonsai 1.7B	Q2	184 tok/s	200 tok/s -8%	0.4 GB	Jan	community	2026-05-12
🗣️ OuteTTS 0.3 500M	FP16	183 tok/s	200 tok/s -8%	1.2 GB	Xybrid CLI	community	2026-04-08
Wav2Vec2 Base	FP16	182 tok/s	200 tok/s -9%	0.3 GB	Xybrid CLI	community	2026-03-28
🎙️ Distil-Whisper Large V3	FP16	177 tok/s	182 tok/s -3%	1.6 GB	Xybrid CLI	community	2026-04-02
🗣️ NeuTTS Air	FP16	174 tok/s	182 tok/s -4%	1.6 GB	Xybrid CLI	community	2026-04-05
Bonsai Image 4B	Q1	174 tok/s	152 tok/s +14%	1.5 GB	Jan	community	2026-01-01
Whisper Medium	FP16	170 tok/s	182 tok/s -7%	1.8 GB	MacWhisper	community	2026-03-16
Bonsai Image 4B	Q2	169 tok/s	152 tok/s +11%	2.0 GB	Ollama	community	2026-03-04
Ternary Bonsai 8B	Q2	161 tok/s	171 tok/s -6%	2.0 GB	LM Studio	editorial	2026-01-04
Ternary Bonsai 8B	Q2	153 tok/s	171 tok/s -11%	2.0 GB	Ollama	community	2026-02-24
Qwen 3.5 0.8B	Q8	153 tok/s	152 tok/s +1%	1.0 GB	LM Studio	community	2026-04-21
TinyLlama 1.1B	Q8	143 tok/s	124 tok/s +15%	1.3 GB	Jan	community	2026-02-08
Qwen 3.5 0.8B	FP16	142 tok/s	152 tok/s -7%	1.9 GB	Jan	community	2026-02-22
Gemma 3 1B	Q8	142 tok/s	124 tok/s +15%	1.4 GB	Jan	community	2026-03-13
🎨 Stable Diffusion Turbo	Q8	132 tok/s	130 tok/s +2%	1.4 GB	Jan	community	2026-05-16
Gemma 3 1B	FP16	130 tok/s	124 tok/s +5%	2.4 GB	LM Studio	community	2026-01-13
LFM2.5 1.2B	Q8	129 tok/s	117 tok/s +10%	1.6 GB	Xybrid CLI	editorial	2026-02-22
TinyLlama 1.1B	FP16	122 tok/s	124 tok/s -2%	2.4 GB	Xybrid CLI	editorial	2026-03-28
🎨 Stable Diffusion Turbo	FP16	122 tok/s	130 tok/s -6%	2.5 GB	Ollama	community	2026-03-17
LFM2.5 1.2B	FP16	120 tok/s	117 tok/s +3%	2.7 GB	LM Studio	editorial	2026-05-08
Llama 3.2 1B	Q8	107 tok/s	101 tok/s +6%	1.8 GB	Jan	community	2026-01-17
Phi-4 Mini	Q5	105 tok/s	35 tok/s +200%	2.9 GB	Ollama	editorial	2026-02-15
Llama 3.2 1B	FP16	100 tok/s	101 tok/s -1%	3.0 GB	Xybrid CLI	editorial	2026-04-03
🧠 DeepSeek R1 Distill 1.5B	Q8	99 tok/s	83 tok/s +19%	1.9 GB	Xybrid CLI	community	2026-01-06
SmolLM2 1.7B	Q8	94 tok/s	80 tok/s +18%	2.0 GB	Ollama	editorial	2026-05-11
Qwen 2.5 Coder 1.5B	Q8	93 tok/s	85 tok/s +9%	2.0 GB	Ollama	community	2026-03-15
StableLM 2 1.6B	Q8	89 tok/s	83 tok/s +7%	2.3 GB	Ollama	editorial	2026-01-06
🗣️ Dia 1.6B	Q8	88 tok/s	83 tok/s +6%	2.0 GB	Xybrid CLI	community	2026-03-14
Whisper Large V3	Q8	88 tok/s	88 tok/s +0%	2.0 GB	Xybrid CLI	editorial	2026-05-23
Whisper Large V3	FP16	86 tok/s	88 tok/s -2%	3.7 GB	Xybrid CLI	community	2026-04-18
👁️ Moondream 2B	Q8	86 tok/s	74 tok/s +16%	2.2 GB	LM Studio	editorial	2026-01-15
🗣️ Dia 1.6B	FP16	82 tok/s	83 tok/s -1%	3.7 GB	Xybrid CLI	editorial	2026-04-06
SmolLM2 1.7B	FP16	81 tok/s	80 tok/s +1%	4.2 GB	Xybrid CLI	community	2026-02-25
Qwen 2.5 Coder 1.5B	FP16	80 tok/s	85 tok/s -6%	3.9 GB	Xybrid CLI	community	2026-05-14
Llama 3.2 3B	Q8	78 tok/s	41 tok/s +90%	3.8 GB	LM Studio	editorial	2026-02-18
👁️ Moondream 2B	FP16	78 tok/s	74 tok/s +5%	4.4 GB	Jan	community	2026-02-08
Qwen 3.5 2B	Q8	78 tok/s	65 tok/s +20%	3.0 GB	Ollama	editorial	2026-03-01
🧠 DeepSeek R1 Distill 1.5B	FP16	76 tok/s	83 tok/s -8%	3.8 GB	Jan	community	2026-04-08
StableLM 2 1.6B	FP16	75 tok/s	83 tok/s -10%	4.2 GB	Jan	community	2026-03-25
Mistral 7B	Q4	71 tok/s	19 tok/s +274%	4.3 GB	LM Studio	editorial	2026-02-15
Gemma 4 E2B	Q8	70 tok/s	61 tok/s +15%	3.2 GB	Jan	community	2026-04-22
🎨 Stable Diffusion 3.5 Medium	Q8	69 tok/s	59 tok/s +17%	3.2 GB	LM Studio	community	2026-01-13
Qwen 3.5 2B	FP16	67 tok/s	65 tok/s +3%	5.4 GB	Jan	community	2026-04-21
Gemma 4 E2B	FP16	66 tok/s	61 tok/s +8%	4.7 GB	LM Studio	community	2026-03-21
🎨 Stable Diffusion 3.5 Medium	FP16	65 tok/s	59 tok/s +10%	5.5 GB	Xybrid CLI	community	2026-04-22
Qwen 2.5 7B	Q4	64 tok/s	18 tok/s +256%	4.6 GB	Ollama	community	2026-03-01
👨‍💻 StarCoder2 3B	Q8	55 tok/s	44 tok/s +25%	3.6 GB	Xybrid CLI	community	2026-02-09
Qwen 2.5 3B	Q8	52 tok/s	43 tok/s +21%	3.7 GB	Ollama	community	2026-03-28
Qwen 2.5 Coder 3B	Q8	52 tok/s	43 tok/s +21%	4.3 GB	Xybrid CLI	community	2026-04-18
👨‍💻 StarCoder2 3B	FP16	47 tok/s	44 tok/s +7%	6.8 GB	Ollama	community	2026-01-08
Llama 3.2 3B	Q8	45 tok/s	41 tok/s +10%	4.5 GB	Xybrid CLI	community	2026-01-18
Qwen 2.5 3B	FP16	43 tok/s	43 tok/s +0%	8.1 GB	Jan	community	2026-05-26
🎨 SDXL Turbo	Q8	43 tok/s	42 tok/s +2%	4.5 GB	Jan	community	2026-05-11
Llama 3.2 3B	FP16	41 tok/s	41 tok/s +0%	8.2 GB	Jan	community	2026-04-19
Gemma 3 12B	Q4	40 tok/s	11 tok/s +264%	7.3 GB	LM Studio	editorial	2026-03-05
Qwen 2.5 Coder 3B	FP16	40 tok/s	43 tok/s -7%	7.4 GB	Ollama	editorial	2026-03-13
🎨 SDXL Turbo	FP16	40 tok/s	42 tok/s -5%	7.6 GB	Ollama	community	2026-03-12
Phi-4 Mini	Q8	40 tok/s	35 tok/s +14%	5.7 GB	Jan	community	2026-01-01
Phi-4 Mini	FP16	39 tok/s	35 tok/s +11%	8.6 GB	Ollama	community	2026-03-04
Qwen 3.5 4B	Q8	36 tok/s	33 tok/s +9%	5.3 GB	Xybrid CLI	community	2026-05-16
Gemma 4 E4B	Q8	35 tok/s	30 tok/s +17%	5.9 GB	LM Studio	community	2026-05-11
Gemma 3n E2B	Q8	34 tok/s	31 tok/s +10%	5.1 GB	Ollama	community	2026-01-04
Gemma 3 4B	Q8	32 tok/s	31 tok/s +3%	5.8 GB	Ollama	community	2026-02-11
Gemma 4 E4B	FP16	30 tok/s	30 tok/s +0%	11.5 GB	LM Studio	editorial	2026-02-26
Gemma 3n E2B	FP16	29 tok/s	31 tok/s -6%	9.5 GB	Jan	community	2026-03-24
Qwen 3.5 4B	FP16	29 tok/s	33 tok/s -12%	9.4 GB	LM Studio	editorial	2026-03-13
Gemma 3 4B	FP16	28 tok/s	31 tok/s -10%	11.0 GB	LM Studio	community	2026-05-25
Gemma 3n E4B	Q8	24 tok/s	20 tok/s +20%	8.4 GB	Jan	community	2026-02-09
👨‍💻 DeepSeek Coder 6.7B	Q8	24 tok/s	20 tok/s +20%	8.4 GB	Jan	community	2026-03-17
Qwen 2.5 7B	Q8	22 tok/s	18 tok/s +22%	8.8 GB	Ollama	community	2026-04-08
🧠 DeepSeek R1 Distill 7B	Q8	22 tok/s	18 tok/s +22%	8.9 GB	Ollama	community	2026-04-08
👁️ LLaVA 1.6 7B	Q8	22 tok/s	19 tok/s +16%	9.4 GB	LM Studio	editorial	2026-02-23
Mistral 7B	Q8	21 tok/s	19 tok/s +11%	9.7 GB	LM Studio	community	2026-01-20
Qwen 2.5 Coder 7B	Q8	21 tok/s	18 tok/s +17%	9.5 GB	Jan	community	2026-02-07
🧠 DeepSeek R1 Distill 7B	FP16	20 tok/s	18 tok/s +11%	17.0 GB	Xybrid CLI	community	2026-02-09
Llama 3.1 8B	Q8	20 tok/s	17 tok/s +18%	8.9 GB	Ollama	community	2026-03-02
🧠 DeepSeek R1 Distill 8B	Q8	20 tok/s	17 tok/s +18%	9.5 GB	LM Studio	editorial	2026-05-11
Phi-4 Medium	Q6	20 tok/s	18 tok/s +11%	11.4 GB	Jan	community	2026-04-20
Qwen 2.5 VL 7B	Q8	20 tok/s	17 tok/s +18%	9.1 GB	Xybrid CLI	community	2026-01-06
Gemma 3n E4B	FP16	19 tok/s	20 tok/s -5%	14.8 GB	Xybrid CLI	community	2026-04-06
👁️ LLaVA 1.6 7B	FP16	19 tok/s	19 tok/s +0%	17.8 GB	Ollama	community	2026-03-15
LFM2.5 8B A1B	Q8	19 tok/s	16 tok/s +19%	11.3 GB	LM Studio	editorial	2026-03-02
LFM2.5 8B A1B	FP16	18 tok/s	16 tok/s +13%	19.6 GB	Ollama	community	2026-01-16
Mistral 7B	FP16	18 tok/s	19 tok/s -5%	17.3 GB	Ollama	community	2026-03-12
Phi-4 Medium	Q8	18 tok/s	18 tok/s +0%	16.4 GB	Ollama	community	2026-02-06
👨‍💻 DeepSeek Coder 6.7B	FP16	18 tok/s	20 tok/s -10%	15.1 GB	Ollama	community	2026-02-12
Qwen 2.5 Coder 7B	FP16	18 tok/s	18 tok/s +0%	17.2 GB	LM Studio	editorial	2026-01-14
Llama 3.1 8B	FP16	17 tok/s	17 tok/s +0%	20.9 GB	Jan	community	2026-04-21
🧠 DeepSeek R1 Distill 8B	FP16	17 tok/s	17 tok/s +0%	18.0 GB	Ollama	community	2026-02-10
Qwen 3.5 9B	Q8	17 tok/s	15 tok/s +13%	10.2 GB	Jan	community	2026-05-28
🧠 Qwen3 8B	Q8	17 tok/s	16 tok/s +6%	11.3 GB	Jan	community	2026-02-22
Qwen 2.5 7B	FP16	16 tok/s	18 tok/s -11%	19.1 GB	Jan	community	2026-02-22
Qwen 3.5 9B	FP16	16 tok/s	15 tok/s +7%	19.6 GB	Ollama	community	2026-04-08
Qwen 2.5 VL 7B	FP16	16 tok/s	17 tok/s -6%	18.5 GB	Jan	community	2026-04-09
Gemma 4 26B A4B	Q6	14 tok/s	13 tok/s +8%	23.5 GB	LM Studio	editorial	2026-02-10
🧠 Qwen3 8B	FP16	14 tok/s	16 tok/s -12%	17.6 GB	LM Studio	community	2026-04-21
Gemma 4 26B A4B	Q5	14 tok/s	13 tok/s +8%	21.3 GB	Xybrid CLI	community	2026-05-13
👨‍💻 Laguna XS.2	Q5	13 tok/s	11 tok/s +18%	25.7 GB	Jan	community	2026-02-10
🎨 FLUX.1 Schnell	Q8	13 tok/s	11 tok/s +18%	13.9 GB	Ollama	community	2026-02-26
Qwen 3.5 35B A3B	Q5	12 tok/s	11 tok/s +9%	26.9 GB	Jan	community	2026-01-11
👨‍💻 Laguna XS.2	Q6	12 tok/s	11 tok/s +9%	29.0 GB	LM Studio	editorial	2026-05-11
🎨 FLUX.1 Schnell	FP16	12 tok/s	11 tok/s +9%	30.5 GB	Xybrid CLI	community	2026-01-05
Gemma 3 12B	Q8	12 tok/s	11 tok/s +9%	15.5 GB	Xybrid CLI	community	2026-01-21
Mistral Nemo 12B	Q8	12 tok/s	11 tok/s +9%	15.7 GB	Xybrid CLI	community	2026-05-12
Qwen 3.5 35B A3B	Q4	12 tok/s	11 tok/s +9%	21.3 GB	Xybrid CLI	community	2026-05-09
Gemma 3 12B	FP16	11 tok/s	11 tok/s +0%	29.0 GB	LM Studio	community	2026-04-06
Gemma 4 31B	Q5	11 tok/s	11 tok/s +0%	25.5 GB	Xybrid CLI	community	2026-05-23
Mistral Nemo 12B	FP16	10 tok/s	11 tok/s -9%	28.3 GB	Ollama	editorial	2026-04-19
Gemma 4 31B	Q6	10 tok/s	11 tok/s -9%	27.7 GB	LM Studio	editorial	2026-02-09
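
The percentage printed next to each estimate appears to be the relative difference between the measured and estimated throughput, for example 246 tok/s measured against 200 tok/s estimated is listed as +23%. The Python sketch below shows one way to parse a row and recompute that figure. It assumes the rows are tab-separated exactly as shown above; the function names `parse_row` and `delta_pct` are hypothetical and not part of any tool named in the table. Because the listed speeds are already rounded, a recomputed value can differ from the listed one by about a percentage point.

```python
def parse_row(line: str) -> dict:
    """Split one tab-separated benchmark row into named fields.

    Assumes eight columns in the order used by the table:
    Model, Quant, Measured, Estimated, RAM used, App, Source, Date.
    """
    model, quant, measured, estimated, ram, app, source, date = line.split("\t")
    # The Estimated cell looks like "200 tok/s +23%".
    est_value, _unit, listed_delta = estimated.split(" ")
    return {
        "model": model,
        "quant": quant,
        "measured_tok_s": float(measured.split(" ")[0]),
        "estimated_tok_s": float(est_value),
        "listed_delta_pct": int(listed_delta.rstrip("%")),
        "ram_gb": float(ram.split(" ")[0]),
        "app": app,
        "source": source,
        "date": date,
    }


def delta_pct(measured: float, estimated: float) -> float:
    """Relative difference of measured vs. estimated throughput, in percent."""
    return (measured - estimated) / estimated * 100.0


row = parse_row(
    "SmolLM2 135M\tQ8\t246 tok/s\t200 tok/s +23%\t0.2 GB\tLM Studio\tcommunity\t2026-02-10"
)
recomputed = delta_pct(row["measured_tok_s"], row["estimated_tok_s"])
print(f'{row["model"]} {row["quant"]}: {recomputed:+.0f}% (listed {row["listed_delta_pct"]:+d}%)')
```

Comparing the recomputed and listed values row by row is a quick way to flag entries worth a second look, such as rows where the measured speed is several times the estimate.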

MacBook Pro M4 Pro 36GB