Which runs better on your device? Side-by-side comparison of specs, quantization sizes, and device compatibility.

| Quant | Llama 4 Scout | Qwen 2.5 7B | Smaller model |
|---|---|---|---|
| FP16 | 218.0 GB | 15.5 GB | Qwen 2.5 7B smaller |
| Q8 | 115.0 GB | 8.1 GB | Qwen 2.5 7B smaller |
| Q6 | 84.0 GB | 5.9 GB | Qwen 2.5 7B smaller |
| Q5 | 75.0 GB | 5.1 GB | Qwen 2.5 7B smaller |
| Q4 | 63.0 GB | 4.4 GB | Qwen 2.5 7B smaller |
| Q3 | 52.0 GB | 3.3 GB | Qwen 2.5 7B smaller |
| Q2 | 42.0 GB | 2.5 GB | Qwen 2.5 7B smaller |
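
To make the size table easier to apply, here is a minimal sketch that picks the highest-precision quantization fitting a given memory budget. The sizes are copied from the table above. The `HEADROOM` fraction and the `best_quant` helper are illustrative assumptions: the page does not state the rule it uses to decide whether a model fits, so results may differ from the device table.

```python
# Model sizes in GB, copied from the quantization table above.
# Each inner dict is ordered from highest to lowest precision.
SIZES_GB = {
    "Llama 4 Scout": {"FP16": 218.0, "Q8": 115.0, "Q6": 84.0, "Q5": 75.0,
                      "Q4": 63.0, "Q3": 52.0, "Q2": 42.0},
    "Qwen 2.5 7B": {"FP16": 15.5, "Q8": 8.1, "Q6": 5.9, "Q5": 5.1,
                    "Q4": 4.4, "Q3": 3.3, "Q2": 2.5},
}

# Assumption (not from the page): share of memory kept free for the OS,
# the KV cache, and other applications.
HEADROOM = 0.25


def best_quant(model, memory_gb, headroom=HEADROOM):
    """Return the highest-precision quant that fits, or None if none does."""
    budget_gb = memory_gb * (1.0 - headroom)
    for quant, size_gb in SIZES_GB[model].items():
        if size_gb <= budget_gb:
            return quant
    return None


if __name__ == "__main__":
    for memory_gb in (8, 16, 64):
        for model in SIZES_GB:
            print(f"{memory_gb} GB, {model}: {best_quant(model, memory_gb)}")
```

With a 25% headroom, a 16 GB device would get Q8 for Qwen 2.5 7B and nothing for Llama 4 Scout. Lowering `headroom` trades safety margin for a higher-precision quant.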

| Device | Llama 4 Scout | Qwen 2.5 7B |
|---|---|---|
| 💻 MacBook Air M4 macOS | Too heavy | Runs well Q8 · ~15 tok/s |
| 💻 MacBook Air M3 macOS | Too heavy | Runs well Q8 · ~12 tok/s |
| 💻 MacBook Air M2 macOS | Too heavy | Tight fit Q5 · ~20 tok/s |
| 💻 MacBook Pro M4 Pro macOS | Too heavy | Runs great FP16 · ~18 tok/s |
| 💻 MacBook Air M1 macOS | Too heavy | Tight fit Q5 · ~13 tok/s |
| 💻 MacBook Air M1 macOS | Too heavy | Runs well Q8 · ~8 tok/s |
| 💻 MacBook Pro M1 macOS | Too heavy | Runs well Q8 · ~8 tok/s |
| 💻 MacBook Pro M1 Pro macOS | Too heavy | Runs well Q8 · ~25 tok/s |
| 💻 MacBook Pro M1 Pro macOS | Too heavy | Runs well FP16 · ~13 tok/s |
| 💻 MacBook Pro M1 Max macOS | Too heavy | Runs well FP16 · ~26 tok/s |
| 💻 MacBook Pro M1 Max macOS | Tight fit Q2 · ~10 tok/s | Runs great FP16 · ~26 tok/s |
| 💻 MacBook Pro M2 Pro macOS | Too heavy | Runs well Q8 · ~25 tok/s |
| 💻 MacBook Pro M2 Pro macOS | Too heavy | Runs well FP16 · ~13 tok/s |
| 💻 MacBook Pro M2 Max macOS | Too heavy | Runs well FP16 · ~26 tok/s |
| 💻 MacBook Pro M2 Max macOS | Tight fit Q2 · ~10 tok/s | Runs great FP16 · ~26 tok/s |
| 💻 MacBook Pro M3 Pro macOS | Too heavy | Runs well Q8 · ~19 tok/s |
| 💻 MacBook Pro M3 Pro macOS | Too heavy | Runs great FP16 · ~10 tok/s |
| 💻 MacBook Pro M3 Max macOS | Too heavy | Runs great FP16 · ~26 tok/s |
| 💻 MacBook Pro M3 Max macOS | Tight fit Q4 · ~6 tok/s | Runs great FP16 · ~26 tok/s |
| 📱 iPhone 16 Pro iOS | Too heavy | Tight fit Q3 · ~14 tok/s |
| 📱 iPhone 15 iOS | Too heavy | Tight fit Q2 · ~12 tok/s |
| 📱 Galaxy S25 Ultra Android | Too heavy | Tight fit Q4 · ~12 tok/s |
| 📱 Galaxy S24 Android | Too heavy | Tight fit Q3 · ~13 tok/s |
| 📱 Pixel 9 Pro Android | Too heavy | Tight fit Q6 · ~8 tok/s |
| 🎮 Steam Deck OLED Linux | Too heavy | Runs well Q8 · ~11 tok/s |
| 🖥️ Gaming PC (RTX 4070) Windows | Too heavy | Runs well Q8 · ~62 tok/s |
| 🖥️ Gaming PC (RTX 3060) Windows | Too heavy | Runs well Q8 · ~44 tok/s |
| 🖥️ Gaming PC (RTX 4080) Windows | Too heavy | Runs great Q8 · ~89 tok/s |
| 🖥️ Gaming PC (RTX 4090) Windows | Tight fit Q2 · ~1 tok/s | Runs well FP16 · ~65 tok/s |
| 🤖 Atom 1 Linux | Too heavy | Runs well FP16 · ~13 tok/s |
| 🤖 Atom 1 Linux | Tight fit Q2 · ~7 tok/s | Runs great FP16 · ~18 tok/s |
| 🤖 Atom 1 Linux | Tight fit Q6 · ~3 tok/s | Runs great FP16 · ~18 tok/s |
| 📱 iPad Pro M4 iOS | Too heavy | Tight fit Q6 · ~14 tok/s |
| 🖥️ Mac Mini M1 macOS | Too heavy | Tight fit Q5 · ~13 tok/s |
| 🖥️ Mac Mini M1 macOS | Too heavy | Runs well Q8 · ~8 tok/s |
| 🖥️ Mac Mini M2 macOS | Too heavy | Tight fit Q5 · ~20 tok/s |
| 🖥️ Mac Mini M2 Pro macOS | Too heavy | Runs well Q8 · ~25 tok/s |
| 🖥️ Mac Mini M2 Pro macOS | Too heavy | Runs well FP16 · ~13 tok/s |
| 🖥️ Mac Mini M4 macOS | Too heavy | Runs well Q8 · ~15 tok/s |
| 🖥️ Mac Mini M4 macOS | Too heavy | Runs well FP16 · ~8 tok/s |
| 🖥️ Mac Mini M4 Pro macOS | Too heavy | Tight fit FP16 · ~18 tok/s |
| 🖥️ Mac Mini M4 Pro macOS | Too heavy | Runs great FP16 · ~18 tok/s |
| 🖥️ Mac Studio M4 Max macOS | Tight fit Q2 · ~13 tok/s | Runs great FP16 · ~35 tok/s |
| 🖥️ Mac Pro M2 Ultra macOS | Tight fit Q8 · ~7 tok/s | Runs great FP16 · ~52 tok/s |
| 💻 Snapdragon X Elite Laptop Windows | Too heavy | Runs well Q8 · ~17 tok/s |
| 📱 OnePlus 13 Android | Too heavy | Tight fit Q6 · ~9 tok/s |
| 🍓 Raspberry Pi 5 Linux | Too heavy | Tight fit Q6 · ~5 tok/s |
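
The tok/s figures in the device table are easier to judge when converted into waiting time. The sketch below does that arithmetic for a reply of a given length. The `generation_seconds` helper and the 500-token reply length are illustrative assumptions, and the estimate ignores prompt processing time, so real latency will be somewhat higher.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Estimate the seconds needed to generate num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    reply_tokens = 500  # assumed reply length, chosen for illustration
    # Rates taken from the Qwen 2.5 7B column of the device table.
    rates = {
        "Gaming PC (RTX 4070), Q8": 62,
        "MacBook Air M4, Q8": 15,
        "Raspberry Pi 5, Q6": 5,
    }
    for device, rate in rates.items():
        seconds = generation_seconds(reply_tokens, rate)
        print(f"{device}: about {seconds:.0f} s for {reply_tokens} tokens")
```

At these rates, the same 500-token reply takes roughly 8 seconds on the RTX 4070 and 100 seconds on the Raspberry Pi 5.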
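
Qwen 2.5 7B fits on all 47 listed device configurations, while Llama 4 Scout fits on only 8, and always as a tight fit. Llama 4 Scout is the much larger model and may produce higher-quality output, while Qwen 2.5 7B is far lighter on resources. On memory-constrained devices, Qwen 2.5 7B is the practical choice: its smallest quantization (Q2) is 2.5 GB, versus 42.0 GB for Llama 4 Scout.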