Apple Silicon vs IBM POWER8: A Tale of Two Architectures Running LLMs in 2026

Thu, 14 May 2026 00:00:00 +0000

Last week I published benchmarks of running Qwen 2.5 7B on a 2016 IBM POWER8. The results were surprisingly good: 6.81 tokens/s on CPU-only inference, with 80 threads hammering away.

But then came the inevitable question: How does it compare to modern hardware?

So I ran the same benchmarks on my daily driver: a Mac Studio with Apple M2 Max. Same model (Qwen 2.5 7B Q4_K_M), same quantization, different decade. Here’s what I found.

Benchmark on debene.dev
