🧠 FastFlowLM

NPU-only runtime for local LLMs
Fast, power-efficient, and 100% offline.


🧪 Test Drive (Remote Demo)

🚀 Skip the setup — experience FastFlowLM instantly on a live AMD Ryzen™ AI 5 340 NPU with 32 GB of memory (full specs):

🌐 Launch Now: https://open-webui.testdrive-fastflowlm.com/
πŸ” Login: guest@flm.npu
πŸ”‘ Password: 0000

Real-time demo powered by FastFlowLM + Open WebUI — no downloads, no installs.
Upload your own .txt files to test extended context prompts.
Try three optimized LLaMA models: llama3.2:1B, llama3.2:3B, and llama3.1:8B — all accelerated on the NPU.

πŸ“ Note: Large prompts (30k+ tokens) may take longer on the 8B model β€” but it works. Try πŸ‘‰ Download a sample txt, containing over 38k token.


📚 Sections

🚀 Installation

Quick 5-minute setup guide for Windows.

🛠️ Instructions

Run FastFlowLM using the CLI (interactive mode) or local server mode.
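In server mode, FastFlowLM exposes a local HTTP endpoint that chat frontends such as Open WebUI can talk to. The sketch below builds an OpenAI-style chat-completion payload for such a server; the `localhost` URL, port `11434`, and the exact API shape are assumptions for illustration — see the Instructions section for the endpoint your install actually exposes.

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload.

    NOTE: the request shape and model tag are assumptions for
    illustration; check the FastFlowLM instructions for the real API.
    """
    return {
        "model": model,  # e.g. "llama3.2:1b"
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

# Hypothetical local endpoint a FastFlowLM server might expose.
url = "http://localhost:11434/v1/chat/completions"
payload = build_chat_request("llama3.2:1b", "Summarize this file.")
print(json.dumps(payload, indent=2))
```

To actually send the request, POST the JSON body to the server's chat endpoint with any HTTP client once the server is running.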

📊 Benchmarks

Real-time performance comparisons vs AMD's official stack and other tools.

🧩 Models

Supported models, quantization formats, and compatibility details.