đź§© Model Card: whisper-large-v3-turbo
- Type: Speach-to-Text (ASR: Automatic Speech Recognition)
- Think: No
- Base Model: openai/whisper-large-v3-turbo
- Max Context Length: NA
- Default Context Length: NA
▶️ Run with FastFlowLM in PowerShell:
ASR model requires to use with an LLM (load concurrently) for both CLI and Server Modes.
CLI Mode
Start with ASR enabled:
flm run gemma3:4b --asr 1
# Load the ASR model (whisper-v3:turbo) in the background, with concurrent LLM loading (gemma3:4b).
Then, type (replace filename.mp3
with your audio file path):
/input "path\to\audio_sample.mp3" summarize it
Server Mode
Start with ASR enabled:
flm serve gemma3:4b --asr 1
# Load the ASR model (whisper-v3:turbo) in the background, with concurrent LLM loading (gemma3:4b).
Send audio to POST /v1/audio/transcriptions
via any OpenAI Client or Open WebUI.
see more API details here → /v1/audio/
Example 1: OpenAI Client
# Import the official OpenAI Python SDK (FastFlowLM mirrors the OpenAI API schema)
from openai import OpenAI
# Initialize the client to point at your local FastFlowLM server
# - base_url: FastFlowLM's local OpenAI-compatible REST endpoint
# - api_key: Dummy token; FastFlowLM typically doesn't enforce auth, but the client requires a string
client = OpenAI(
base_url="http://localhost:52625/v1", # FastFlowLM local API endpoint
api_key="flm", # Placeholder key
)
# Open the audio file in binary mode and create a transcription request
# - model: name of the speech-to-text model exposed by FLM (e.g., "whisper-v3")
# - file: file-like object pointing to your audio
with open("audio.mp3", "rb") as f:
resp = client.audio.transcriptions.create(
model="whisper-v3",
file=f,
)
# Print the transcribed text returned by the server
print(resp.text)
Example 2: Open WebUI
- Follow Open WebUI setup guide.
- In the bottom-left corner, click User icon, then select Settings.
- In the bottom panel, open Admin Settings.
- In the left sidebar, navigate to Audio.
- Set Speech-to-Text Engine to OpenAI.
- Enter:
API Base URL:
http://host.docker.internal:52625/v1
API KEY: flm (any value works)
STT Model: whisper-large-v3-turbo (type in the model name; can be different) - Save the setting.
- You’re ready to upload audio files! (Choose an LLM to load and use concurrently)