🧠 Using Microsoft AI Toolkit with FastFlowLM in VS Code

This guide explains how to run FastFlowLM locally on Windows and connect it to Microsoft AI Toolkit in Visual Studio Code.

✅ 1. Install Visual Studio Code (Windows)

Go to: https://code.visualstudio.com
Download the User Installer for Windows
Run the installer:
- ✅ Check “Add to PATH”
- ✅ (Optional) Create a desktop icon
- ✅ Accept the license agreement
Complete the installation

You’ll now see the AI Toolkit icon on the sidebar.

Download & install FastFlowLM: (../../install.md)

flm pull llama3.2:1b

flm list

You should see models like llama3.2:1b listed.

In VS Code, open the AI Toolkit panel
Navigate to Models → Catalog
Click ➕ Add Your Own Model
In the top bar, select Add Custom Model
Enter OpenAI compatible chat completion endpoint URI:
```
http://localhost:52625/v1/chat/completions
```
Click Enter
Enter the exact model name as in the API:
```
llama3.2:1b
```
Click Enter
Enter display model name:
```
flm-llama3.2:1b
```
Click Enter
Enter API key:
```
dummy
```
Click Enter
You will now see the model under My Models

Open powershell, enter

flm serve llama3.2:1b

What are the benefits of local inference?

To remove a previously added model from the My Models section in the AI Toolkit:

🧹 This action removes the model’s reference from the AI Toolkit interface but does not delete the model files from local disk.