🚀 New Releases

I Fired Up LLMs on Intel's NPU — Shocking Load Times and CPU Wins

Intel promises NPUs will turbocharge on-device AI, but in my ThinkPad test the NPU spent 96 seconds just loading the model while the CPU was ready in 5. Here's the raw truth.

[Image: Lenovo ThinkPad running an LLM on Intel's NPU, with a Task Manager graph showing NPU activity]

⚡ Key Takeaways

  • NPU model loading cripples usability: 96 s versus 5 s on the CPU.
  • llama.cpp outpaces the OpenVINO backends, hitting 22 tok/s.
  • NPU inference only works with specific export flags (--sym, group size 128).
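The export flags in the takeaways correspond to options in optimum-intel's OpenVINO exporter. A minimal sketch of such an export, assuming optimum-intel with OpenVINO support is installed; the model name and output directory are illustrative placeholders, and exact flag behavior depends on your optimum-intel version:

```shell
# Export a Hugging Face model to OpenVINO IR with NPU-friendly
# quantization settings:
#   --sym         use symmetric int4 quantization
#   --group-size  set the weight quantization group size to 128
# Model ID and output directory below are placeholders.
optimum-cli export openvino \
  --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --weight-format int4 \
  --sym \
  --group-size 128 \
  ov_model_npu
```

The resulting directory can then be loaded with OpenVINO GenAI or optimum-intel, selecting the NPU as the target device.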
Published by

theAIcatchup

Ship faster. Build smarter.


Originally reported by dev.to
