3866 Tokens/Second: Asthenosphere Unleashes AMD NPU's Full Fury

Picture this: an AMD Ryzen NPU churning out AI responses at 3866 effective tokens per second, no CPU or GPU in sight. Asthenosphere just turned your laptop into a speculative decoding beast.

DevTools Feed Apr 03, 2026 4 min read

Asthenosphere performance logs showing 3866 tok/s on AMD Phoenix NPU tiles

⚡ Key Takeaways

Asthenosphere achieves 3866 effective tok/s on AMD NPU with zero CPU/GPU usage.
Full 12-tile transformer pipeline enables speculative decoding at 91.8% acceptance.
Edge AI shift incoming: NPUs like Phoenix XDNA redefine on-device inference.
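To put the 91.8% acceptance figure in context, here is a minimal sketch of the usual back-of-the-envelope estimate for speculative decoding: the expected number of tokens emitted per verification pass. It assumes each drafted token is accepted independently with the same probability, which is a simplification. The function name and the draft lengths tried below are hypothetical, since the takeaways do not state Asthenosphere's draft length, and this does not attempt to reproduce the 3866 tok/s number.

```python
def expected_tokens_per_step(acceptance: float, draft_len: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes every drafted token is accepted independently with probability
    `acceptance`; the pass always yields at least one token. This is the
    geometric-series estimate (1 - a^(k+1)) / (1 - a), not a measurement.
    """
    if not 0.0 <= acceptance <= 1.0:
        raise ValueError("acceptance must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if acceptance == 1.0:
        # Every drafted token is kept, plus the one token from verification.
        return float(draft_len + 1)
    return (1.0 - acceptance ** (draft_len + 1)) / (1.0 - acceptance)


if __name__ == "__main__":
    ACCEPTANCE = 0.918  # acceptance rate quoted in the takeaways
    for k in (1, 2, 4, 8):  # hypothetical draft lengths, not from the article
        tokens = expected_tokens_per_step(ACCEPTANCE, k)
        print(f"draft_len={k}: ~{tokens:.2f} tokens per verification pass")
```

Under this model, a high acceptance rate means longer drafts keep paying off, which is why "effective" tokens per second can sit well above the raw rate of verification passes.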

Published by

DevTools Feed

#AI Inference #AMD NPU #Asthenosphere #Speculative Decoding

Originally reported by dev.to