🧩 Model Card: whisper-large-v3-turbo

  • Type: Speech-to-Text (ASR: Automatic Speech Recognition)
  • Think: No
  • Base Model: openai/whisper-large-v3-turbo
  • Max Context Length: N/A
  • Default Context Length: N/A

▶️ Run with FastFlowLM in PowerShell:

The ASR model must be used together with an LLM (loaded concurrently) in both CLI and Server modes.

CLI Mode

Start with ASR enabled:

flm run gemma3:4b --asr 1
# Loads the ASR model (whisper-v3:turbo) in the background alongside the LLM (gemma3:4b).

Then, type (replace path\to\audio_sample.mp3 with the path to your audio file):

/input "path\to\audio_sample.mp3" summarize it

Server Mode

Start with ASR enabled:

flm serve gemma3:4b --asr 1
# Loads the ASR model (whisper-v3:turbo) in the background alongside the LLM (gemma3:4b).

Send audio to POST /v1/audio/transcriptions via any OpenAI-compatible client or Open WebUI.

See more API details here → /v1/audio/
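
For a quick check without the SDK, a raw multipart POST also works. Below is a minimal sketch using the requests library; the port (52625), placeholder key, and model name are taken from the examples below and may differ on your setup.

import requests

# Raw OpenAI-style transcription request against the local FastFlowLM server.
# Assumptions: the server was started with `flm serve gemma3:4b --asr 1` and
# listens on port 52625; "whisper-v3" is a placeholder model name.
with open("audio.mp3", "rb") as f:
    resp = requests.post(
        "http://localhost:52625/v1/audio/transcriptions",
        headers={"Authorization": "Bearer flm"},        # placeholder key
        data={"model": "whisper-v3"},                   # STT model name
        files={"file": ("audio.mp3", f, "audio/mpeg")},
    )

resp.raise_for_status()
print(resp.json()["text"])  # the transcript is returned in the "text" field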

Example 1: OpenAI Client


# Import the official OpenAI Python SDK (FastFlowLM mirrors the OpenAI API schema)
from openai import OpenAI

# Initialize the client to point at your local FastFlowLM server
# - base_url: FastFlowLM's local OpenAI-compatible REST endpoint
# - api_key: Dummy token; FastFlowLM typically doesn't enforce auth, but the client requires a string
client = OpenAI(
    base_url="http://localhost:52625/v1",  # FastFlowLM local API endpoint
    api_key="flm",                         # Placeholder key
)

# Open the audio file in binary mode and create a transcription request
# - model: name of the speech-to-text model exposed by FLM (e.g., "whisper-v3")
# - file: file-like object pointing to your audio
with open("audio.mp3", "rb") as f:
    resp = client.audio.transcriptions.create(
        model="whisper-v3",
        file=f,
    )

# Print the transcribed text returned by the server
print(resp.text)
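
Because the LLM is loaded concurrently, the same client can forward the transcript to it for follow-up tasks, mirroring the CLI's /input ... summarize it flow. A minimal sketch continuing the example above, assuming the server was started with gemma3:4b:

# Forward the transcript to the concurrently loaded LLM for summarization.
# Assumption: the LLM is exposed under the model name "gemma3:4b" (matching
# the `flm serve gemma3:4b --asr 1` command above).
completion = client.chat.completions.create(
    model="gemma3:4b",
    messages=[
        {"role": "user", "content": f"Summarize this transcript:\n\n{resp.text}"},
    ],
)
print(completion.choices[0].message.content)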

Example 2: Open WebUI

  • Follow the Open WebUI setup guide.
  • In the bottom-left corner, click the user icon, then select Settings.
  • In the bottom panel, open Admin Settings.
  • In the left sidebar, navigate to Audio.
  • Set Speech-to-Text Engine to OpenAI.
  • Enter:

    API Base URL: http://host.docker.internal:52625/v1
    API KEY: flm (any value works)
    STT Model: whisper-large-v3-turbo (type in the model name; a different name also works)

  • Save the settings.
  • You’re ready to upload audio files! (Choose an LLM to load and use concurrently.)