⚙️ System Requirements
- 🧠 Memory: 32 GB RAM or higher recommended
- ⚡ CPU/NPU: AMD Ryzen AI laptop with XDNA2 NPU
- 🖥️ OS: Windows 11
While FastFlowLM can run with 16 GB RAM, complex models (e.g., 3B or 8B) may require up to 32 GB for optimal performance and longer context length (more kv cache).
🚨 CRITICAL: NPU Driver Requirpement
You must have AMD NPU driver version 32.0.203.258 or later installed for FastFlowLM to work correctly.
- Check via:
Task Manager → Performance → NPU
or
Device Manager → NPU
💾 Installation (Windows)
A packaged Windows installer is available here:
flm-setup.exe
If you see “Windows protected your PC”, click More info, then select Run anyway.
For version history and changelog, see the release notes.
🚀 NPU Power Mode
For optimal performance, set the NPU power mode to performance or turbo.
Open PowerShell and go to:
cd C:\Windows\System32\AMD\
Then, run
.\xrt-smi configure --pmode turbo
For more details about NPU power mode, refer to the AMD XRT SMI Documentation.
🧪 Quick Test (CLI Mode)
After installation, do a quick test to see if FastFlowLM is properly installed. Open PowerShell, and run a model in terminal (CLI or Interactive Mode):
flm run llama3.2:1B
Requires internet access to HuggingFace to pull (download) the optimized model kernel. The model will be automatically downloaded to the folder:
C:\Users\<USER>\Documents\flm\models\
. ⚠️ If HuggingFace is not directly accessible in your region, you can manually download the model (e.g., hf-mirror) and place it in the directory.