| Spec | Phi-4 Mini | Mistral 7B |
| --- | --- | --- |
| Type | Reasoning | Reasoning |
| Parameters | 3.8B | 7.24B |
| Context | 128K | 32K |
| Min tier | High | High |
| Runs on | 47 / 47 devices | 47 / 47 devices |
| Quant | Phi-4 Mini | Mistral 7B | Smaller |
| --- | --- | --- | --- |
| FP16 | 7.8 GB | 14.7 GB | Phi-4 Mini |
| Q8 | 4.4 GB | 8.2 GB | Phi-4 Mini |
| Q6 | 3.1 GB | 5.9 GB | Phi-4 Mini |
| Q5 | 2.7 GB | 4.9 GB | Phi-4 Mini |
| Q4 | 2.2 GB | 4.1 GB | Phi-4 Mini |
| Q3 | 1.7 GB | 3.1 GB | Phi-4 Mini |
| Q2 | 1.2 GB | 2.3 GB | Phi-4 Mini |
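The sizes above track a simple rule of thumb: parameter count × average bits per weight ÷ 8. A minimal sketch of that estimate — the bits-per-weight averages below are assumptions based on typical GGUF-style quants, not figures from this comparison:

```python
# Rough model-size estimate: params (billions) x avg bits per weight / 8 gives
# gigabytes, since 1B params at 1 byte each is ~1 GB. Quantized formats average
# slightly more bits than their name suggests because some layers stay at
# higher precision; the averages below are assumptions, not exact GGUF figures.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8": 8.5,
    "Q6": 6.6,
    "Q5": 5.7,
    "Q4": 4.6,
    "Q3": 3.4,
    "Q2": 2.6,
}

def approx_size_gb(params_billion: float, quant: str) -> float:
    """Approximate on-disk size in GB for a given quantization level."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8

for quant in BITS_PER_WEIGHT:
    print(f"{quant}: Phi-4 Mini ~{approx_size_gb(3.8, quant):.1f} GB, "
          f"Mistral 7B ~{approx_size_gb(7.24, quant):.1f} GB")
```

Plugging in the two parameter counts (3.8B and 7.24B) reproduces the table within a few hundred megabytes — e.g. FP16 gives ~7.6 GB vs the listed 7.8 GB for Phi-4 Mini, and Q4 gives ~4.2 GB vs the listed 4.1 GB for Mistral 7B.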
| Device | OS | Phi-4 Mini | Mistral 7B |
| --- | --- | --- | --- |
| 💻 MacBook Air M4 | macOS | Runs well · FP16 · ~15 tok/s | Runs well · Q8 · ~15 tok/s |
| 💻 MacBook Air M3 | macOS | Runs well · FP16 · ~13 tok/s | Runs well · Q8 · ~12 tok/s |
| 💻 MacBook Air M2 | macOS | Runs well · Q8 · ~23 tok/s | Tight fit · Q5 · ~20 tok/s |
| 💻 MacBook Pro M4 Pro | macOS | Runs great · FP16 · ~35 tok/s | Runs great · FP16 · ~19 tok/s |
| 💻 MacBook Air M1 | macOS | Runs well · Q8 · ~16 tok/s | Tight fit · Q5 · ~14 tok/s |
| 💻 MacBook Air M1 | macOS | Runs well · FP16 · ~9 tok/s | Runs well · Q8 · ~8 tok/s |
| 💻 MacBook Pro M1 | macOS | Runs well · FP16 · ~9 tok/s | Runs well · Q8 · ~8 tok/s |
| 💻 MacBook Pro M1 Pro | macOS | Runs well · FP16 · ~26 tok/s | Runs well · Q8 · ~24 tok/s |
| 💻 MacBook Pro M1 Pro | macOS | Runs great · FP16 · ~26 tok/s | Runs well · FP16 · ~14 tok/s |
| 💻 MacBook Pro M1 Max | macOS | Runs great · FP16 · ~51 tok/s | Runs well · FP16 · ~27 tok/s |
| 💻 MacBook Pro M1 Max | macOS | Runs great · FP16 · ~51 tok/s | Runs great · FP16 · ~27 tok/s |
| 💻 MacBook Pro M2 Pro | macOS | Runs well · FP16 · ~26 tok/s | Runs well · Q8 · ~24 tok/s |
| 💻 MacBook Pro M2 Pro | macOS | Runs great · FP16 · ~26 tok/s | Runs well · FP16 · ~14 tok/s |
| 💻 MacBook Pro M2 Max | macOS | Runs great · FP16 · ~51 tok/s | Runs well · FP16 · ~27 tok/s |
| 💻 MacBook Pro M2 Max | macOS | Runs great · FP16 · ~51 tok/s | Runs great · FP16 · ~27 tok/s |
| 💻 MacBook Pro M3 Pro | macOS | Runs well · FP16 · ~19 tok/s | Runs well · Q8 · ~18 tok/s |
| 💻 MacBook Pro M3 Pro | macOS | Runs great · FP16 · ~19 tok/s | Runs great · FP16 · ~10 tok/s |
| 💻 MacBook Pro M3 Max | macOS | Runs great · FP16 · ~51 tok/s | Runs great · FP16 · ~27 tok/s |
| 💻 MacBook Pro M3 Max | macOS | Runs great · FP16 · ~51 tok/s | Runs great · FP16 · ~27 tok/s |
| 📱 iPhone 16 Pro | iOS | Tight fit · Q6 · ~15 tok/s | Tight fit · Q3 · ~15 tok/s |
| 📱 iPhone 15 | iOS | Tight fit · Q5 · ~11 tok/s | Tight fit · Q2 · ~13 tok/s |
| 📱 Galaxy S25 Ultra | Android | Tight fit · Q8 · ~12 tok/s | Tight fit · Q5 · ~11 tok/s |
| 📱 Galaxy S24 | Android | Tight fit · Q6 · ~14 tok/s | Tight fit · Q3 · ~14 tok/s |
| 📱 Pixel 9 Pro | Android | Runs well · Q8 · ~11 tok/s | Tight fit · Q6 · ~8 tok/s |
| 🎮 Steam Deck OLED | Linux | Runs well · FP16 · ~11 tok/s | Runs well · Q8 · ~11 tok/s |
| 🖥️ Gaming PC (RTX 4070) | Windows | Runs well · FP16 · ~65 tok/s | Runs well · Q8 · ~61 tok/s |
| 🖥️ Gaming PC (RTX 3060) | Windows | Runs well · FP16 · ~46 tok/s | Runs well · Q8 · ~44 tok/s |
| 🖥️ Gaming PC (RTX 4080) | Windows | Runs great · FP16 · ~92 tok/s | Runs great · Q8 · ~87 tok/s |
| 🖥️ Gaming PC (RTX 4090) | Windows | Runs great · FP16 · ~129 tok/s | Runs well · FP16 · ~69 tok/s |
| 🤖 Atom 1 | Linux | Runs great · FP16 · ~26 tok/s | Runs well · FP16 · ~14 tok/s |
| 🤖 Atom 1 | Linux | Runs great · FP16 · ~35 tok/s | Runs great · FP16 · ~19 tok/s |
| 🤖 Atom 1 | Linux | Runs great · FP16 · ~35 tok/s | Runs great · FP16 · ~19 tok/s |
| 📱 iPad Pro M4 | iOS | Runs well · Q8 · ~19 tok/s | Tight fit · Q6 · ~14 tok/s |
| 🖥️ Mac Mini M1 | macOS | Runs well · Q8 · ~16 tok/s | Tight fit · Q5 · ~14 tok/s |
| 🖥️ Mac Mini M1 | macOS | Runs well · FP16 · ~9 tok/s | Runs well · Q8 · ~8 tok/s |
| 🖥️ Mac Mini M2 | macOS | Runs well · Q8 · ~23 tok/s | Tight fit · Q5 · ~20 tok/s |
| 🖥️ Mac Mini M2 Pro | macOS | Runs well · FP16 · ~26 tok/s | Runs well · Q8 · ~24 tok/s |
| 🖥️ Mac Mini M2 Pro | macOS | Runs great · FP16 · ~26 tok/s | Runs well · FP16 · ~14 tok/s |
| 🖥️ Mac Mini M4 | macOS | Runs well · FP16 · ~15 tok/s | Runs well · Q8 · ~15 tok/s |
| 🖥️ Mac Mini M4 | macOS | Runs great · FP16 · ~15 tok/s | Runs well · FP16 · ~8 tok/s |
| 🖥️ Mac Mini M4 Pro | macOS | Runs great · FP16 · ~35 tok/s | Tight fit · FP16 · ~19 tok/s |
| 🖥️ Mac Mini M4 Pro | macOS | Runs great · FP16 · ~35 tok/s | Runs great · FP16 · ~19 tok/s |
| 🖥️ Mac Studio M4 Max | macOS | Runs great · FP16 · ~70 tok/s | Runs great · FP16 · ~37 tok/s |
| 🖥️ Mac Pro M2 Ultra | macOS | Runs great · FP16 · ~103 tok/s | Runs great · FP16 · ~54 tok/s |
| 💻 Snapdragon X Elite Laptop | Windows | Runs well · FP16 · ~17 tok/s | Runs well · Q8 · ~17 tok/s |
| 📱 OnePlus 13 | Android | Runs well · Q8 · ~12 tok/s | Tight fit · Q6 · ~9 tok/s |
| 🍓 Raspberry Pi 5 | Linux | Runs well · Q8 · ~7 tok/s | Tight fit · Q6 · ~5 tok/s |

Both models run on all 47 tested devices. Phi-4 Mini has the much larger context window (128K tokens vs 32K). Mistral 7B, at nearly twice the parameter count (7.24B vs 3.8B), may produce higher-quality outputs, while Phi-4 Mini is lighter on resources: at its smallest quant it needs only 1.2 GB versus Mistral 7B's 2.3 GB, making it the safer choice for memory-constrained devices.