🧠 FastFlowLM
NPU-only runtime for local LLMs
Fast, power-efficient, and 100% offline.
🧪 Test Drive (Remote Demo)
🚀 Don’t have a Ryzen™ AI PC? Instantly try FastFlowLM on a live AMD Ryzen™ AI 5 340 NPU with 32 GB memory (spec) — no setup needed.
✨ Gemma3:4b (the first NPU-only VLM!) is now supported here.
🌐 Launch Now: https://open-webui.testdrive-fastflowlm.com/
🔐 Login: guest@flm.npu
🔑 Password: 0000
📺 Watch this short video to see how to try the remote demo in just a few clicks.
Alternatively, sign up with your own credentials instead of using the shared guest account.
⚠️ Some universities or companies may block access to the test drive site. If it doesn’t load over Wi-Fi, try switching to a cellular network.
Real-time demo powered by FastFlowLM + Open WebUI — no downloads, no installs.
Try optimized LLM models: gemma3:4b, qwen3:4b, etc., all accelerated on the NPU.
⚠️ Please note:
- FastFlowLM is designed for single-user local use. This remote demo machine may experience short wait times when multiple users access it concurrently — please be patient.
- Switching models may take extra time while the new model replaces the previous one in memory.
- Large prompts (30k+ tokens) and VLM requests (gemma3:4b) may take longer, but they work! 🙂
📚 Sections
🚀 Installation
Quick 5‑minute setup guide for Windows.
🛠️ Instructions
Run FastFlowLM using the CLI (interactive mode) or local server mode.
📊 Benchmarks
Real-time performance comparisons vs AMD’s official stack and other tools.
🧩 Models
Supported models, quantization formats, and compatibility details.