🧠 FastFlowLM

NPU-only runtime for local LLMs
Fast, power-efficient, and 100% offline.

🧪 Test Drive (Remote Demo)

🚀 Skip the setup and try FastFlowLM instantly on a live AMD Ryzen™ AI 5 340 NPU with 32 GB of memory:

🌐 Launch Now: https://open-webui.testdrive-fastflowlm.com/
🔐 Login: guest@flm.npu
🔑 Password: 0000

Real-time demo powered by FastFlowLM + Open WebUI — no downloads, no installs.
Upload your own .txt files to test extended context prompts.
Try three optimized LLaMA models: llama3.2:1B, llama3.2:3B, and llama3.1:8B — all accelerated on NPU.

📝 Note: Large prompts (30k+ tokens) take longer on the 8B model, but they do work. Try 👉 Download a sample .txt file containing over 38k tokens.
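
To get a rough idea of whether one of your own .txt files falls into the 30k+ token range before uploading it, a minimal sketch like the one below may help. It assumes roughly 4 characters per token, a common rule of thumb for English text and not the actual tokenizer used by the models. The function names and the default threshold are hypothetical and are not part of FastFlowLM.

```python
from pathlib import Path


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate.

    Assumption: about 4 characters per token. This is a heuristic,
    not the real tokenizer, so treat the result as a ballpark figure.
    """
    return round(len(text) / chars_per_token)


def is_large_prompt(text: str, threshold: int = 30_000) -> bool:
    """Return True if the estimated token count reaches the threshold."""
    return estimate_tokens(text) >= threshold


def estimate_file(path: str) -> int:
    """Estimate the token count of a UTF-8 text file."""
    return estimate_tokens(Path(path).read_text(encoding="utf-8"))
```

For example, `estimate_file("my_notes.txt")` (a hypothetical file name) gives a ballpark figure. The real count depends on the model's tokenizer and on the language of the text.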

📚 Sections

🚀 Installation

Quick 5‑minute setup guide for Windows.

🛠️ Instructions

Run FastFlowLM using the CLI (interactive mode) or local server mode.

📊 Benchmarks

Real-time performance comparisons vs AMD’s official stack and other tools.

🧩 Models

Supported models, quantization formats, and compatibility details.