🧠 FastFlowLM

NPU-only runtime for local LLMs
Fast, power-efficient, and 100% offline.


🧪 Test Drive (Remote Demo)

🚀 Don’t have a Ryzen™ AI PC? Instantly try FastFlowLM on a live AMD Ryzen™ AI 5 340 NPU with 32 GB memory (spec) — no setup needed.

✨ Gemma3:4b, the first NPU-only VLM, is now supported here.

🌐 Launch Now: https://open-webui.testdrive-fastflowlm.com/
🔐 Login: guest@flm.npu
🔑 Password: 0000

📺 Watch this short video to see how to try the remote demo in just a few clicks.

Alternatively, sign up with your own credentials instead of using the shared guest account.
⚠️ Some universities or companies may block access to the test drive site. If it doesn’t load over Wi-Fi, try switching to a cellular network.
Real-time demo powered by FastFlowLM + Open WebUI: no downloads, no installs.
Try optimized models such as gemma3:4b and qwen3:4b, all accelerated on the NPU.

⚠️ Please note:

  • FastFlowLM is designed for single-user local use. This remote demo machine may experience short wait times when multiple users access it concurrently — please be patient.
  • Switching models may take extra time while the model in memory is replaced.
  • Large prompts (30k+ tokens) and VLM requests (gemma3:4b) may take longer, but they do work! 🙂

📚 Sections

🚀 Installation

Quick 5‑minute setup guide for Windows.

🛠️ Instructions

Run FastFlowLM using the CLI (interactive mode) or local server mode.

📊 Benchmarks

Real-time performance comparisons vs AMD’s official stack and other tools.

🧩 Models

Supported models, quantization formats, and compatibility details.