cuuupid/qwen2-vl-2b 🖼️🔢📝 → 📝

674 runs | Aug 2024 | Cog 0.9.20

video-analysis video-auto-captioning video-qa video-to-text

Performance

Typical run time: 16.4s

Total runs: 674

About

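Qwen2-VL 2B is an open-source vision-language model for chatting with videos. It is described as state of the art and, as of this August 2024 release, the newest model in the Qwen family.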

Example Output

Prompt:

"Describe the video in great detail."

Output:

The video features a woman standing behind a podium, speaking to the camera. She is wearing a blue shirt and appears to be giving a presentation or lecture. The woman's facial expression suggests that she is engaged and passionate about the topic she is discussing. The background of the video is not visible, but it can be assumed that it is an indoor setting, possibly a conference room or lecture hall. The woman's speech is not audible, but her body language and gestures suggest that she is using hand movements to emphasize certain points. Overall, the video seems to be a formal presentation or lecture, with the woman as the main speaker.

Performance Metrics

Prediction Time: 16.38s

Total Time: 16.39s

All Input Parameters

{
  "video": "https://replicate.delivery/pbxt/LXVISWYD8Od0I7w6EW5VIO3sycOIcukn6H26wrkaOX95RK7E/dod_classification_training.mp4",
  "width": 128,
  "height": 128,
  "prompt": "Describe the video in great detail.",
  "max_tokens": 512,
  "temperature": 0.7,
  "max_duration": 180
}
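The example inputs above could be sent to the model roughly as follows. This is a minimal sketch, assuming the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `describe_video` are hypothetical, introduced only for illustration; the model reference combines the model name with the version ID listed on this page.

```python
import os

# Model reference assembled from this page: owner/name plus the listed version ID.
MODEL_REF = (
    "cuuupid/qwen2-vl-2b:"
    "b3e77005f19950db4f8564bbedd5670a3e1bd293f1fefd0088d10d8e4d083dd4"
)

EXAMPLE_VIDEO = (
    "https://replicate.delivery/pbxt/"
    "LXVISWYD8Od0I7w6EW5VIO3sycOIcukn6H26wrkaOX95RK7E/"
    "dod_classification_training.mp4"
)


def build_input(video_url, prompt="Describe the video in great detail."):
    """Hypothetical helper: return the payload used in the example run."""
    return {
        "video": video_url,
        "width": 128,
        "height": 128,
        "prompt": prompt,
        "max_tokens": 512,
        "temperature": 0.7,
        "max_duration": 180,
    }


def describe_video(video_url):
    """Hypothetical helper: run the model and return its text output."""
    # Imported lazily so build_input works without the client installed.
    import replicate

    output = replicate.run(MODEL_REF, input=build_input(video_url))
    # The output schema is a string; join defensively if chunks are returned.
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)


if __name__ == "__main__":
    # Only call the API when a token is available.
    if os.environ.get("REPLICATE_API_TOKEN"):
        print(describe_video(EXAMPLE_VIDEO))
```

Pinning the full version ID keeps results tied to the August 31, 2024 build rather than whatever version is latest.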

Input Parameters

video (required): string. Video to process.
width: integer, default 128, range 128 - 2048. Width for the video.
height: integer, default 128, range 128 - 2048. Height for the video.
prompt: string, default "Describe the video." Prompt to use for the video.
max_tokens: integer, default 128, range 1 - 8192. Maximum number of tokens to generate.
temperature: number, default 0.7, range 0.01 - 1. Temperature for the model (0.7 is a good default).
max_duration: number, default 60, range 1 - 768. Maximum duration of the video in seconds (above 360, may run out of VRAM).
repetition_penalty: number, default 1.1, range 0.01 - 1.5. Repetition penalty for the model (1.1 is a good default).
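The documented defaults and ranges can be checked client-side before submitting a run. The following is a sketch only: `PARAM_SPECS` and `validate_inputs` are hypothetical names, and it assumes the listed ranges are inclusive at both ends, which this page does not state.

```python
# Constraints transcribed from the parameter list: (is_integer, default, min, max).
# Assumption: range bounds are inclusive.
PARAM_SPECS = {
    "width": (True, 128, 128, 2048),
    "height": (True, 128, 128, 2048),
    "max_tokens": (True, 128, 1, 8192),
    "temperature": (False, 0.7, 0.01, 1.0),
    "max_duration": (False, 60, 1, 768),
    "repetition_penalty": (False, 1.1, 0.01, 1.5),
}

DEFAULT_PROMPT = "Describe the video."


def validate_inputs(inputs):
    """Hypothetical helper: fill in defaults and reject out-of-range values."""
    video = inputs.get("video")
    if not isinstance(video, str) or not video:
        raise ValueError("'video' is required and must be a non-empty string")

    checked = {"video": video, "prompt": inputs.get("prompt", DEFAULT_PROMPT)}
    if not isinstance(checked["prompt"], str):
        raise ValueError("'prompt' must be a string")

    for name, (is_integer, default, low, high) in PARAM_SPECS.items():
        value = inputs.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"'{name}' must be a number")
        if is_integer and not isinstance(value, int):
            raise ValueError(f"'{name}' must be an integer")
        if not low <= value <= high:
            raise ValueError(f"'{name}' must be between {low} and {high}")
        checked[name] = value

    unknown = set(inputs) - set(checked)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return checked
```

Note that the example run on this page sets `max_tokens` to 512 and `max_duration` to 180, both above their defaults but within the documented ranges.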

Output Schema

Output

Type: string

Version Details

Version ID: b3e77005f19950db4f8564bbedd5670a3e1bd293f1fefd0088d10d8e4d083dd4
Version Created: August 31, 2024
