lucataco/video-caption 🖼️ → 📝

▶️ 1.4K runs 📅 Aug 2025 ⚙️ Cog 0.14.0
gpt-5 video-captioning visual-analysis

About

A GPT-5 wrapper that captions a video

Example Output

  • The scene takes place in a living room with a patterned gold sofa against a peach wall.
  • At first the couch is empty. A woman in a striped dress walks in from the right while a man enters from the left, both approaching the couch.
  • They pause in front of the couch. The woman stands centered, hands clasped in front of her, looking a bit tense. The man stands on her left, slightly turned toward the couch.
  • The woman sits down first, upright with knees together and hands folded on her lap, facing forward.
  • The man lowers himself to sit beside her. As he does, his headband and sunglasses become visible; he settles into a relaxed posture with hands on his knees.
  • Once both are seated, they remain mostly still, facing forward. The woman keeps a composed, slightly anxious expression, while the man sits casually, wearing dark glasses and a headband.
  • The final moments show them seated side by side without further action, suggesting a calm pause or anticipation of something off-screen.

Performance Metrics

Prediction Time: 15.75s
Total Time: 16.22s

Input Parameters

video (required)
Type: string
Input video file to analyze
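
A minimal sketch of calling this model from Python, assuming the official `replicate` client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names (`model_ref`, `build_input`, `caption_video`) and the example clip URL are hypothetical; only the model name, the version ID, and the `video` parameter come from this page.

```python
# Model name and version ID as listed on this page.
MODEL = "lucataco/video-caption"
VERSION = "b9e44c91690ad39597abd09cee647d48bbf580bfb86c81b81bad10014d0da9d7"


def model_ref(model: str = MODEL, version: str = VERSION) -> str:
    """Build the 'owner/name:version' reference that replicate.run accepts."""
    return f"{model}:{version}"


def build_input(video: str) -> dict:
    """Build the input payload; 'video' is the only documented parameter."""
    if not video:
        raise ValueError("video is required")
    return {"video": video}


def caption_video(video: str):
    """Run the model and return its output (a string, per the Output Schema)."""
    # Imported lazily so the helpers above work without the client installed.
    import replicate

    return replicate.run(model_ref(), input=build_input(video))


# Usage (needs network access and an API token; the URL is a placeholder):
# print(caption_video("https://example.com/clip.mp4"))
```

Pinning the version ID keeps results reproducible if the model owner publishes a newer version later.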
Output Schema

Output

Type: string
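
Because the output is a single string, downstream code may want the individual caption points. The sketch below assumes the string is a line-separated list whose items begin with a marker such as `•`, `-`, `*`, or `1.`, as the example output suggests; the actual formatting is not documented here and may vary between runs. `split_caption` is a hypothetical helper name.

```python
import re

# Leading list marker: a bullet character or "N." followed by whitespace.
_MARKER = re.compile(r"^\s*(?:[•\-*]|\d+\.)\s+")


def split_caption(caption: str) -> list[str]:
    """Split a bulleted caption string into a list of plain sentences."""
    items = []
    for line in caption.splitlines():
        text = _MARKER.sub("", line).strip()
        if text:  # skip blank lines between items
            items.append(text)
    return items
```

If the model returns a plain paragraph instead of a list, this yields a single-element list containing that paragraph.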

Version Details

Version ID: b9e44c91690ad39597abd09cee647d48bbf580bfb86c81b81bad10014d0da9d7
Version Created: August 21, 2025