Qwen 2.5 14B Instruct — Apple Silicon Benchmarks
Measured inference speed for Qwen 2.5 14B Instruct across 52 Apple Silicon chip configurations, reported in tokens per second per quantization level. Real runs, not estimates.
Quantization measured: Q4_K - Medium
Benchmark rows: 52
Chip tiers covered: 52
Fastest avg tok/s: 36.7 (M3 Ultra, 80-core GPU, 256 GB)
Minimum RAM observed: —
Benchmark results for Qwen 2.5 14B Instruct
Rows are sorted by average tok/s, descending. The source badge on each row links to the original measurement page.
Chips with published results for Qwen 2.5 14B Instruct
M3 Ultra (80-core GPU, 256 GB) · M2 Ultra (76-core GPU, 128 GB) · M3 Ultra (80-core GPU, 512 GB) · M3 Ultra (60-core GPU, 96 GB) · M5 Max (32-core GPU, 36 GB) · M2 Ultra (60-core GPU, 64 GB) · M1 Ultra (64-core GPU, 128 GB) · M4 Max (40-core GPU, 48 GB) · M4 Max (40-core GPU, 128 GB) · M1 Ultra (48-core GPU, 128 GB) · M4 Max (40-core GPU, 64 GB) · M3 Max (40-core GPU, 128 GB) · M2 Max (38-core GPU, 96 GB) · M4 Max (32-core GPU, 36 GB) · M2 Max (38-core GPU, 64 GB) · M3 Max (30-core GPU, 96 GB) · M2 Max (38-core GPU, 32 GB) · M1 Max (32-core GPU, 32 GB) · M3 Max (30-core GPU, 36 GB) · M1 Max (32-core GPU, 64 GB) · M4 Pro (20-core GPU, 64 GB) · M4 Pro (20-core GPU, 24 GB) · M4 Pro (20-core GPU, 48 GB) · M1 Max (24-core GPU, 32 GB) · M4 Pro (16-core GPU, 48 GB) · M4 Pro (16-core GPU, 64 GB) · M4 Pro (16-core GPU, 24 GB) · M1 Max (24-core GPU, 64 GB) · M2 Max (30-core GPU, 64 GB) · M2 Pro (19-core GPU, 32 GB) · M3 Max (40-core GPU, 64 GB) · M2 Pro (16-core GPU, 16 GB) · M3 Pro (14-core GPU, 36 GB) · M3 Pro (18-core GPU, 36 GB) · M1 Pro (16-core GPU, 16 GB) · M3 Pro (14-core GPU, 18 GB) · M3 Pro (18-core GPU, 18 GB) · M1 Pro (16-core GPU, 32 GB) · M5 (10-core GPU, 32 GB) · M1 Pro (14-core GPU, 16 GB) · M1 Pro (14-core GPU, 32 GB) · M4 (10-core GPU, 24 GB) · M4 (10-core GPU, 16 GB) · M4 (10-core GPU, 32 GB) · M2 (10-core GPU, 16 GB) · M1 Ultra (GPU count not published, 128 GB) · M2 (10-core GPU, 24 GB) · M4 (8-core GPU, 16 GB) · M2 (8-core GPU, 16 GB) · M3 (10-core GPU, 24 GB) · M1 (8-core GPU, 16 GB) · M1 (7-core GPU, 16 GB)
Data
benchmarks.json — full dataset · models.json — model summaries · benchmarks.csv — CSV export
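If you download `benchmarks.json`, ranking rows by average tok/s (as the table above does) is a one-liner. The sketch below is a minimal example under assumed field names — `chip`, `quant`, and `avg_tok_s` are illustrative guesses at the schema, not confirmed, and the inline sample values (other than the 36.7 tok/s figure quoted above) are placeholders rather than real measurements.

```python
import json

# Inline stand-in for benchmarks.json; field names are assumed, and the
# 14.2 / 4.1 values are placeholders, not real benchmark results.
sample = json.loads("""
[
  {"chip": "M4 Pro (20-core GPU, 64 GB)",    "quant": "Q4_K - Medium", "avg_tok_s": 14.2},
  {"chip": "M3 Ultra (80-core GPU, 256 GB)", "quant": "Q4_K - Medium", "avg_tok_s": 36.7},
  {"chip": "M1 (8-core GPU, 16 GB)",         "quant": "Q4_K - Medium", "avg_tok_s": 4.1}
]
""")

def rank_by_throughput(rows):
    """Sort benchmark rows by average tokens/sec, fastest first."""
    return sorted(rows, key=lambda r: r["avg_tok_s"], reverse=True)

for row in rank_by_throughput(sample):
    print(f'{row["avg_tok_s"]:>6.1f} tok/s  {row["chip"]}')
```

For the real file, replace the inline string with `json.load(open("benchmarks.json"))` and adjust the key names to whatever the actual schema uses.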