image-caption:// [experimental]

cat: image
model: @cf/meta/llama-3.2-11b-vision-instruct

Drop an image → get a caption in the voice you pick (formal / witty / poetic / deadpan / hype), plus the other four voices for comparison.

// system prompt

You write image captions in different voices. User provides image + voice. Output captions in the requested voice and the four others, for comparison:

  FORMAL (museum-label, 1-2 sentences):
  <caption>

  WITTY (one-line gag, under 15 words):
  <caption>

  POETIC (3 lines, lyrical):
  <caption>

  DEADPAN (literal + dry, under 15 words):
  <caption>

  HYPE (social-post, energetic, 1-2 lines + at most 2 emoji):
  <caption>

Rules:
- All captions describe the same image — only the voice changes.
- Never start with "this image shows" or "in this picture".
- Witty doesn't mean cruel. Deadpan doesn't mean cynical. Hype shouldn't be cringe.
- If voice is "hype" and the image is somber, soften the hype rather than fake enthusiasm.
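The mechanical parts of these rules (word limits, line counts, emoji caps, banned openers) can be checked after the model responds. Below is a minimal sketch, not part of the tool itself: `check_caption`, `BANNED_OPENERS`, and the emoji regex are hypothetical names, and the emoji pattern is a rough approximation covering only common code-point ranges. The tone rules (not cruel, not cynical, not cringe) need human or model judgment and are not covered.

```python
from __future__ import annotations

import re

# Openers the prompt forbids (compared case-insensitively).
BANNED_OPENERS = ("this image shows", "in this picture")

# Rough emoji matcher (assumption): common pictograph and symbol ranges only.
_EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def check_caption(voice: str, caption: str) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    text = caption.strip()
    voice = voice.lower()

    if text.lower().startswith(BANNED_OPENERS):
        problems.append("starts with a banned opener")

    # WITTY and DEADPAN must be under 15 words.
    if voice in ("witty", "deadpan") and len(text.split()) >= 15:
        problems.append("must be under 15 words")

    # POETIC must be exactly 3 non-empty lines.
    if voice == "poetic":
        lines = [ln for ln in text.splitlines() if ln.strip()]
        if len(lines) != 3:
            problems.append("must be exactly 3 lines")

    # HYPE allows at most 2 emoji.
    if voice == "hype" and len(_EMOJI.findall(text)) > 2:
        problems.append("must use at most 2 emoji")

    return problems
```

A caller might retry the request or flag the caption whenever the returned list is non-empty.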


⚡ powered by Cloudflare Workers AI · quota deducted on success

// sample output

FORMAL:
A solitary office chair in a sun-drenched conference room. The angled light suggests early morning, before occupancy.

WITTY:
The ergonomic chair has been promoted to "person in charge."

POETIC:
An empty chair waits.
The sun makes its quiet rounds.
Nothing has been decided.

DEADPAN:
One chair. Some sunlight. A conference room.

HYPE:
That MORNING LIGHT, that EMPTY CALENDAR, that 9AM ENERGY ☀️ This is your sign to take the meeting yourself.
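Because the output uses one labeled block per voice, it is straightforward to split into a dictionary for display or further checks. The following is a sketch under assumptions: `parse_captions` is a hypothetical helper, and it assumes each label sits alone on its own line, either bare (`FORMAL:`) or with the parenthetical from the prompt template (`FORMAL (museum-label, 1-2 sentences):`).

```python
from __future__ import annotations

import re

VOICES = ("FORMAL", "WITTY", "POETIC", "DEADPAN", "HYPE")

# A header line is a voice name, optional extra text, then a trailing colon.
_HEADER = re.compile(r"^\s*(%s)\b[^:\n]*:\s*$" % "|".join(VOICES))


def parse_captions(text: str) -> dict[str, str]:
    """Split labeled model output into {VOICE: caption}."""
    blocks: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = _HEADER.match(line)
        if match:
            # Start collecting lines for a new voice.
            current = match.group(1)
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    # Join multi-line captions (e.g. POETIC) and trim surrounding blank lines.
    return {voice: "\n".join(lines).strip() for voice, lines in blocks.items()}


if __name__ == "__main__":
    sample = "DEADPAN:\nOne chair. Some sunlight. A conference room.\n"
    print(parse_captions(sample))
```

Voices missing from the output are simply absent from the result, so a caller can compare the keys against `VOICES` to detect an incomplete response.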
