uncensored-com/llava-next-video 🖼️📝🔢 → 📝

▶️ 3.6K runs 📅 Jul 2024 ⚙️ Cog 0.9.12
video-auto-captioning video-question-answering video-to-text

About

Input a video and ask anything about it.

Example Output

Prompt:

"Describe this video and the people in it in detail"

Output:

The video captures a group of four women enjoying a day at the beach. They are all wearing bikinis and are standing on a wooden pier that extends over the ocean. The women are holding hands, suggesting a sense of camaraderie and shared enjoyment. The water around them is calm, with gentle waves lapping at the pier's edge. The sky above them is clear and blue, indicating a sunny day. The women are facing the camera, their smiles radiating joy and relaxation. The pier they are standing on is weathered and wooden, with visible signs of age and use, adding a rustic charm to the scene. The ocean in the background is a deep blue, its surface rippling with small waves that add a sense of tranquility to the scene. The women's bikinis are colorful, adding a vibrant touch to the serene setting. The overall atmosphere is one of leisure and enjoyment, with the women seemingly taking a break from their daily routines to appreciate the beauty of nature and each other's company.

Performance Metrics

7.98s Prediction Time
67.37s Total Time
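The gap between the two figures is time spent outside inference. A quick calculation of that gap follows; attributing it to queueing and container startup is an assumption, since the page does not break the total down.

```python
# Figures copied from the metrics above, in seconds.
prediction_time = 7.98
total_time = 67.37

# Time not spent on the prediction itself
# (assumed here to be queueing and startup; the page does not say).
overhead = round(total_time - prediction_time, 2)
overhead_share = round(overhead / total_time * 100, 1)

print(f"overhead: {overhead}s ({overhead_share}% of total)")
```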
All Input Parameters
{
  "video": "https://www.shutterstock.com/shutterstock/videos/1104246027/preview/stock-footage-woman-friends-and-jumping-in-sea-holding-hands-for-fun-holiday-weekend-adventure-or-vacation-in.mp4",
  "prompt": "Describe this video and the people in it in detail",
  "num_frames": 10,
  "max_new_tokens": 500
}
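The same inputs could be sent from Python. The sketch below assumes the official `replicate` client package is installed and that a `REPLICATE_API_TOKEN` environment variable is set; `run_example` is a hypothetical helper name, and the model reference is assembled from the page title and the Version ID listed on this page.

```python
# Model reference: "<owner>/<name>:<version id>", taken from this page.
MODEL_REF = (
    "uncensored-com/llava-next-video:"
    "46eebc3684238da4c77ed67920e8f3cca934d6d544b8d7cc3e52604b3073740b"
)

# Inputs copied from the example run above.
EXAMPLE_INPUT = {
    "video": (
        "https://www.shutterstock.com/shutterstock/videos/1104246027/preview/"
        "stock-footage-woman-friends-and-jumping-in-sea-holding-hands-for-fun-"
        "holiday-weekend-adventure-or-vacation-in.mp4"
    ),
    "prompt": "Describe this video and the people in it in detail",
    "num_frames": 10,
    "max_new_tokens": 500,
}


def run_example():
    """Send EXAMPLE_INPUT to the model and return the generated text."""
    # Imported lazily so this module loads even without the client installed.
    import replicate

    output = replicate.run(MODEL_REF, input=EXAMPLE_INPUT)
    # The output schema is a single string; join defensively in case the
    # client yields the text in chunks.
    return output if isinstance(output, str) else "".join(output)
```

Calling `print(run_example())` would issue a billable prediction, so the sketch leaves that step to the reader.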
Input Parameters
video (required) Type: string
Input video file
prompt Type: string, Default: Why is this video funny?
Prompt for the model
num_frames Type: integer, Default: 8
Number of frames to sample from the video
max_new_tokens Type: integer, Default: 50
Maximum number of tokens to generate
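To make the defaults concrete, here is a minimal sketch that fills in omitted parameters before a request is sent. `build_input` is a hypothetical helper, not part of any API, and the positive-integer check is an assumption: the page only states that the two numeric parameters are integers.

```python
# Defaults copied from the parameter list above.
DEFAULTS = {
    "prompt": "Why is this video funny?",
    "num_frames": 8,
    "max_new_tokens": 50,
}


def build_input(video, **overrides):
    """Return a complete input dict, applying the documented defaults."""
    if not isinstance(video, str) or not video:
        raise ValueError("video is required and must be a non-empty string")

    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    params = {"video": video, **DEFAULTS, **overrides}

    for name in ("num_frames", "max_new_tokens"):
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        # Requiring value >= 1 is an assumption, not a documented limit.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")

    return params
```

The example run above raises max_new_tokens to 500; at the default of 50, a description of that length would likely be cut short.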
Output Schema

Output

Type: string

Example Execution Logs
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/transformers/feature_extraction_utils.py:142: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:274.)
return torch.tensor(value)
We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
Version Details
Version ID
46eebc3684238da4c77ed67920e8f3cca934d6d544b8d7cc3e52604b3073740b
Version Created
July 8, 2024