| | Qwen 2.5 3B | Llama 3.2 3B |
|---|---|---|
| Type | Chat | Chat |
| Parameters | 3.09B | 3.21B |
| Context | 32K | 128K |
| Min tier | Mid | Mid |
| Runs on | 47 / 47 devices | 47 / 47 devices |
| Quant | Qwen 2.5 3B | Llama 3.2 3B | Smaller |
|---|---|---|---|
| FP16 | 6.4 GB | 6.6 GB | Qwen 2.5 3B |
| Q8 | 3.5 GB | 3.6 GB | Qwen 2.5 3B |
| Q6 | 2.5 GB | 2.6 GB | Qwen 2.5 3B |
| Q5 | 2.2 GB | 2.3 GB | Qwen 2.5 3B |
| Q4 | 1.8 GB | 1.9 GB | Same |
| Q3 | 1.4 GB | 1.4 GB | Same |
| Q2 | 1.1 GB | 1.1 GB | Same |
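The footprints above roughly track parameter count times bits per weight. A back-of-envelope sketch (an approximation: real quantization formats carry some metadata and mixed-precision layers, so listed sizes run a bit higher than this raw estimate):

```python
def est_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough weight-file size in GB: parameters x bits per weight / 8."""
    return params_billions * bits_per_weight / 8

# Qwen 2.5 3B (3.09B params) at FP16: ~6.2 GB of raw weights,
# close to the 6.4 GB listed once format overhead is included.
print(round(est_size_gb(3.09, 16), 2))  # ~6.18

# Llama 3.2 3B (3.21B params) at Q8 (~8 bits/weight): ~3.2 GB raw
# vs. 3.6 GB listed.
print(round(est_size_gb(3.21, 8), 2))  # ~3.21
```

This is why the two models stay within ~0.1–0.2 GB of each other at every quant level: their parameter counts differ by only about 4%.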
| Device | OS | Qwen 2.5 3B | Llama 3.2 3B |
|---|---|---|---|
| 💻 MacBook Air M4 | macOS | Runs great · FP16 · ~19 tok/s | Runs great · FP16 · ~18 tok/s |
| 💻 MacBook Air M3 | macOS | Runs great · FP16 · ~16 tok/s | Runs great · FP16 · ~15 tok/s |
| 💻 MacBook Air M2 | macOS | Runs well · Q8 · ~29 tok/s | Runs well · Q8 · ~28 tok/s |
| 💻 MacBook Pro M4 Pro | macOS | Runs great · FP16 · ~43 tok/s | Runs great · FP16 · ~41 tok/s |
| 💻 MacBook Air M1 | macOS | Runs well · Q8 · ~20 tok/s | Runs well · Q8 · ~19 tok/s |
| 💻 MacBook Air M1 | macOS | Runs great · FP16 · ~11 tok/s | Runs great · FP16 · ~10 tok/s |
| 💻 MacBook Pro M1 | macOS | Runs great · FP16 · ~11 tok/s | Runs great · FP16 · ~10 tok/s |
| 💻 MacBook Pro M1 Pro | macOS | Runs great · FP16 · ~31 tok/s | Runs great · FP16 · ~30 tok/s |
| 💻 MacBook Pro M1 Pro | macOS | Runs great · FP16 · ~31 tok/s | Runs great · FP16 · ~30 tok/s |
| 💻 MacBook Pro M1 Max | macOS | Runs great · FP16 · ~63 tok/s | Runs great · FP16 · ~61 tok/s |
| 💻 MacBook Pro M1 Max | macOS | Runs great · FP16 · ~63 tok/s | Runs great · FP16 · ~61 tok/s |
| 💻 MacBook Pro M2 Pro | macOS | Runs great · FP16 · ~31 tok/s | Runs great · FP16 · ~30 tok/s |
| 💻 MacBook Pro M2 Pro | macOS | Runs great · FP16 · ~31 tok/s | Runs great · FP16 · ~30 tok/s |
| 💻 MacBook Pro M2 Max | macOS | Runs great · FP16 · ~63 tok/s | Runs great · FP16 · ~61 tok/s |
| 💻 MacBook Pro M2 Max | macOS | Runs great · FP16 · ~63 tok/s | Runs great · FP16 · ~61 tok/s |
| 💻 MacBook Pro M3 Pro | macOS | Runs great · FP16 · ~23 tok/s | Runs great · FP16 · ~23 tok/s |
| 💻 MacBook Pro M3 Pro | macOS | Runs great · FP16 · ~23 tok/s | Runs great · FP16 · ~23 tok/s |
| 💻 MacBook Pro M3 Max | macOS | Runs great · FP16 · ~63 tok/s | Runs great · FP16 · ~61 tok/s |
| 💻 MacBook Pro M3 Max | macOS | Runs great · FP16 · ~63 tok/s | Runs great · FP16 · ~61 tok/s |
| 📱 iPhone 16 Pro | iOS | Tight fit · Q8 · ~14 tok/s | Tight fit · Q8 · ~13 tok/s |
| 📱 iPhone 15 | iOS | Tight fit · Q6 · ~12 tok/s | Tight fit · Q6 · ~11 tok/s |
| 📱 Galaxy S25 Ultra | Android | Runs well · Q8 · ~15 tok/s | Runs well · Q8 · ~15 tok/s |
| 📱 Galaxy S24 | Android | Runs well · Q6 · ~17 tok/s | Runs well · Q6 · ~16 tok/s |
| 📱 Pixel 9 Pro | Android | Tight fit · FP16 · ~7 tok/s | Tight fit · FP16 · ~7 tok/s |
| 🎮 Steam Deck OLED | Linux | Runs great · FP16 · ~14 tok/s | Runs great · FP16 · ~13 tok/s |
| 🖥️ Gaming PC (RTX 4070) | Windows | Runs great · FP16 · ~79 tok/s | Runs well · FP16 · ~76 tok/s |
| 🖥️ Gaming PC (RTX 3060) | Windows | Runs great · FP16 · ~56 tok/s | Runs well · FP16 · ~55 tok/s |
| 🖥️ Gaming PC (RTX 4080) | Windows | Runs great · FP16 · ~112 tok/s | Runs great · FP16 · ~109 tok/s |
| 🖥️ Gaming PC (RTX 4090) | Windows | Runs great · FP16 · ~158 tok/s | Runs great · FP16 · ~153 tok/s |
| 🤖 Atom 1 | Linux | Runs great · FP16 · ~32 tok/s | Runs great · FP16 · ~31 tok/s |
| 🤖 Atom 1 | Linux | Runs great · FP16 · ~43 tok/s | Runs great · FP16 · ~41 tok/s |
| 🤖 Atom 1 | Linux | Runs great · FP16 · ~43 tok/s | Runs great · FP16 · ~41 tok/s |
| 📱 iPad Pro M4 | iOS | Tight fit · FP16 · ~13 tok/s | Tight fit · FP16 · ~13 tok/s |
| 🖥️ Mac Mini M1 | macOS | Runs well · Q8 · ~20 tok/s | Runs well · Q8 · ~19 tok/s |
| 🖥️ Mac Mini M1 | macOS | Runs great · FP16 · ~11 tok/s | Runs great · FP16 · ~10 tok/s |
| 🖥️ Mac Mini M2 | macOS | Runs well · Q8 · ~29 tok/s | Runs well · Q8 · ~28 tok/s |
| 🖥️ Mac Mini M2 Pro | macOS | Runs great · FP16 · ~31 tok/s | Runs great · FP16 · ~30 tok/s |
| 🖥️ Mac Mini M2 Pro | macOS | Runs great · FP16 · ~31 tok/s | Runs great · FP16 · ~30 tok/s |
| 🖥️ Mac Mini M4 | macOS | Runs great · FP16 · ~19 tok/s | Runs great · FP16 · ~18 tok/s |
| 🖥️ Mac Mini M4 | macOS | Runs great · FP16 · ~19 tok/s | Runs great · FP16 · ~18 tok/s |
| 🖥️ Mac Mini M4 Pro | macOS | Runs great · FP16 · ~43 tok/s | Runs great · FP16 · ~41 tok/s |
| 🖥️ Mac Mini M4 Pro | macOS | Runs great · FP16 · ~43 tok/s | Runs great · FP16 · ~41 tok/s |
| 🖥️ Mac Studio M4 Max | macOS | Runs great · FP16 · ~85 tok/s | Runs great · FP16 · ~83 tok/s |
| 🖥️ Mac Pro M2 Ultra | macOS | Runs great · FP16 · ~125 tok/s | Runs great · FP16 · ~121 tok/s |
| 💻 Snapdragon X Elite Laptop | Windows | Runs great · FP16 · ~21 tok/s | Runs well · FP16 · ~21 tok/s |
| 📱 OnePlus 13 | Android | Tight fit · FP16 · ~8 tok/s | Tight fit · FP16 · ~8 tok/s |
| 🍓 Raspberry Pi 5 | Linux | Runs great · Q8 · ~9 tok/s | Runs great · Q8 · ~9 tok/s |

Both models run on all 47 tested devices. Llama 3.2 3B has a much larger context window (128K vs. 32K tokens) and slightly more parameters (3.21B vs. 3.09B), so it may produce somewhat better outputs, while Qwen 2.5 3B is marginally lighter on memory at every quantization level.