[ transcribe:// ] experimental
cat: audio · model: @cf/openai/whisper-large-v3-turbo

Upload audio (MP3, WAV, FLAC, or M4A; up to 9 MB per chunk). Longer audio is chunked client-side. Whisper-large-v3-turbo transcribes it and outputs plain text, SRT, and VTT. The language is auto-detected from 99 supported languages.
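As a rough illustration of the 9 MB per-chunk limit, the sketch below estimates how a recording could be split into equal time windows that each stay under the cap. It is not the page's actual chunking logic: `plan_chunks` is a hypothetical helper, it assumes a constant bitrate, it reads "9 MB" as 9,000,000 bytes, and it ignores container overhead.

```python
import math

# Assumption: "9 MB" is read as 9,000,000 bytes; the page does not say which unit it means.
MAX_CHUNK_BYTES = 9 * 1000 * 1000


def plan_chunks(duration_s: float, bitrate_kbps: float,
                max_bytes: int = MAX_CHUNK_BYTES) -> list[tuple[float, float]]:
    """Return (start, end) windows in seconds, each estimated to fit in max_bytes.

    Hypothetical helper: assumes constant-bitrate audio and no container overhead.
    """
    if duration_s <= 0 or bitrate_kbps <= 0:
        raise ValueError("duration_s and bitrate_kbps must be positive")
    bytes_per_second = bitrate_kbps * 1000 / 8
    max_seconds = max_bytes / bytes_per_second
    count = math.ceil(duration_s / max_seconds)
    step = duration_s / count  # equal-length windows, none longer than max_seconds
    return [
        (i * step, duration_s if i == count - 1 else (i + 1) * step)
        for i in range(count)
    ]
```

For example, `plan_chunks(3600, 128)` splits a one-hour 128 kbps file into seven windows of roughly 514 seconds each. Variable-bitrate or lossless input (such as FLAC) would need a size check on the encoded chunk rather than this estimate.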

// system prompt
Audio transcription. Whisper handles language detection. Output JSON:
  { text, language?, segments?: [{ start, end, text }] }

The client UI renders the transcript with click-to-jump timestamps, plus SRT and VTT download buttons. Long-form audio (>1 min) is chunked client-side via ffmpeg.wasm before upload.
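When audio is chunked before upload, each chunk presumably comes back with timestamps relative to its own start, so they need shifting before the transcript is rendered. The sketch below shows one way to stitch the results together. `merge_chunks` is a hypothetical helper, and both the chunk-relative timestamps and the per-chunk offsets are assumptions, not something the page specifies.

```python
from typing import Any


def merge_chunks(chunks: list[tuple[float, dict[str, Any]]]) -> dict[str, Any]:
    """Merge per-chunk results into one { text, language?, segments? } object.

    Each item is (chunk_offset_seconds, result). Assumption: segment timestamps
    in each result are relative to the start of that chunk.
    """
    texts: list[str] = []
    segments: list[dict[str, Any]] = []
    language = None
    # Process chunks in playback order regardless of arrival order.
    for offset, result in sorted(chunks, key=lambda item: item[0]):
        texts.append(result["text"].strip())
        if language is None:
            language = result.get("language")  # keep the first detected language
        for seg in result.get("segments") or []:
            segments.append({
                "start": round(seg["start"] + offset, 3),
                "end": round(seg["end"] + offset, 3),
                "text": seg["text"],
            })
    merged: dict[str, Any] = {"text": " ".join(t for t in texts if t)}
    if language is not None:
        merged["language"] = language
    if segments:
        merged["segments"] = segments
    return merged
```

Optional fields are left out of the merged object when no chunk supplied them, which matches the `language?` and `segments?` markers in the output shape.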
⚡ Cloudflare Workers AI · quota deducted on success
// sample output
{
  "language": "en",
  "text": "So the question I keep coming back to is: are we building the right thing, or are we just shipping the easiest thing?",
  "segments": [
    { "start": 0.0,  "end": 2.4, "text": "So the question I keep coming back to is:" },
    { "start": 2.4,  "end": 4.8, "text": "are we building the right thing," },
    { "start": 4.8,  "end": 7.6, "text": "or are we just shipping the easiest thing?" }
  ]
}