
Qwen 3 30B A3B — Apple Silicon Benchmarks

Measured inference speed for Qwen 3 30B A3B across 2 Apple Silicon chips, reported in tokens per second at multiple quantization levels. All figures come from real runs, not estimates.

Quantizations measured: Q4, Q5, Q6, Q4_K_M, Q8

Benchmark rows: 5
Chip tiers covered: 2
Fastest avg tok/s: 92.1 (M4 Max (40-core GPU, 64 GB))
Minimum RAM observed: 16.12 GB

Benchmark results for Qwen 3 30B A3B

Rows are sorted by avg tok/s, descending. Each row's source badge links to the original measurement page.

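| Chip | Quant | RAM req. | Context | Avg tok/s | Prompt tok/s | Runtime | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| M4 Max (40-core GPU, 64 GB) | Q4 | 16.1 GB | 2k | 92.1 tok/s | 822.6 tok/s | MLX | ref |
| M4 Max (40-core GPU, 64 GB) | Q5 | 18.1 GB | 2k | 84.9 tok/s | 819.8 tok/s | MLX | ref |
| M4 Max (40-core GPU, 64 GB) | Q6 | 21.9 GB | 2k | 76.7 tok/s | 817.6 tok/s | MLX | ref |
| M4 Max (128 GB) | Q4_K_M | | 10k | 70.2 tok/s | | LM Studio | ref |
| M4 Max (40-core GPU, 64 GB) | Q8 | 29.8 GB | 2k | 52.6 tok/s | 772.6 tok/s | MLX | ref |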
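
The summary figures at the top of the page can be re-derived from the table rows. The following is a minimal sketch in Python, not the site's own tooling: the `Row` type, the `summarize` helper, and the field names are hypothetical, and the values are copied by hand from the table. Cells the table leaves empty are stored as `None`. The table rounds RAM to one decimal, so the minimum here comes out as 16.1 rather than the 16.12 GB shown in the summary.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One benchmark row; field names are illustrative, not the site's schema."""
    chip: str
    quant: str
    ram_gb: Optional[float]      # None where the table gives no value
    context: str
    avg_tps: float               # average generation speed, tok/s
    prompt_tps: Optional[float]  # prompt processing speed, tok/s
    runtime: str


# Values transcribed from the table above.
ROWS = [
    Row("M4 Max (40-core GPU, 64 GB)", "Q4", 16.1, "2k", 92.1, 822.6, "MLX"),
    Row("M4 Max (40-core GPU, 64 GB)", "Q5", 18.1, "2k", 84.9, 819.8, "MLX"),
    Row("M4 Max (40-core GPU, 64 GB)", "Q6", 21.9, "2k", 76.7, 817.6, "MLX"),
    Row("M4 Max (128 GB)", "Q4_K_M", None, "10k", 70.2, None, "LM Studio"),
    Row("M4 Max (40-core GPU, 64 GB)", "Q8", 29.8, "2k", 52.6, 772.6, "MLX"),
]


def summarize(rows):
    """Sort by avg tok/s (descending) and compute the headline figures."""
    ranked = sorted(rows, key=lambda r: r.avg_tps, reverse=True)
    fastest = ranked[0]
    known_ram = [r.ram_gb for r in rows if r.ram_gb is not None]
    return {
        "rows": len(rows),
        "chips": len({r.chip for r in rows}),
        "fastest": (fastest.chip, fastest.quant, fastest.avg_tps),
        "min_ram_gb": min(known_ram),
        "order": [r.quant for r in ranked],
    }


summary = summarize(ROWS)
print(summary)
```

Rows with a missing RAM value are skipped when taking the minimum rather than treated as zero, so an incomplete row cannot distort the "minimum RAM observed" figure.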

benchmarks.json — full dataset  ·  models.json — model summaries  ·  benchmarks.csv — CSV export
