// catalog · grep -e "Vision / Audio"
- 025 Vision / Audio audio-language:// Detects the spoken language from any audio sample. Returns the primary language with a confidence score, the second-most-likely language (helpful for code-switched audio), and a 5-10 second transcript sample as evidence. Whisper handles ~99 languages.
- 026 Vision / Audio audio-speakers:// Speaker diarisation: detects who spoke when. Returns a timeline of speaker turns plus a roll-up of total speaking time per speaker (in seconds and as a percentage). For meeting analytics, podcast editing, interview transcripts.
- 027 Vision / Audio podcast-clips:// Suggests 5 standalone clip windows (~60 sec each) from a long-form audio transcript. Each clip has: timestamps, a transcript excerpt, a one-line "why this works", and a suggested social-post caption. For podcast / interview / talk highlight reels.
- 028 Vision / Audio voicenote-todo:// Turns a rambling voice memo into a clean todo list. Catches natural phrasings ("remind me to call John", "don't forget the dentist Tuesday", "I need to file the report"). Each todo gets verb-first formatting, an inferred priority, and a date if one is mentioned.
- 029 Vision / Audio interview-notes:// Turns an interview transcript (job interview, customer research, journalism) into structured notes. Pulls out: questions asked, key responses (paraphrased + 1-2 verbatim quotes), themes across the conversation, follow-ups to chase. For post-interview synthesis.
- 030 Vision / Audio lecture-summary:// Turns a lecture or conference talk transcript into structured study notes: outline by section, key concepts with definitions, illustrative examples, and 5 exam-style questions for active recall. For students, conference attendees, or anyone who wants to actually retain what they just heard.
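
For entry 026, the per-speaker roll-up of speaking time can be sketched as below. This is a minimal illustration and not the tool's actual implementation: the function name `speaking_time_rollup` and the `(speaker, start_sec, end_sec)` turn format are assumptions made for the example, and turns are assumed not to overlap.

```python
from collections import defaultdict


def speaking_time_rollup(turns):
    """Sum speaking time per speaker and express it as a share of all speech.

    `turns` is a hypothetical timeline format: a list of
    (speaker, start_sec, end_sec) tuples, assumed non-overlapping.
    """
    totals = defaultdict(float)
    for speaker, start_sec, end_sec in turns:
        totals[speaker] += end_sec - start_sec

    grand_total = sum(totals.values())
    return {
        speaker: {
            "seconds": seconds,
            # Guard against an empty or zero-length timeline.
            "percent": round(100 * seconds / grand_total, 1) if grand_total else 0.0,
        }
        for speaker, seconds in totals.items()
    }


# Example timeline: speaker A talks twice, speaker B once.
turns = [
    ("A", 0.0, 30.0),
    ("B", 30.0, 45.0),
    ("A", 45.0, 60.0),
]
rollup = speaking_time_rollup(turns)
```

Percentages here are shares of total speech time, not of the recording length, so silence between turns is not counted. A real diariser may also emit overlapping turns, which this sketch does not handle.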