n1jl0091/video-llava-7b-hf_replicate_n1jl0091 🔢🖼️📝 → 📝

▶️ 112 runs 📅 Nov 2024 ⚙️ Cog 0.12.0 🔗 GitHub 📄 Paper ⚖️ License
video-auto-captioning video-question-answering video-to-text

About

Upload an image or video, and Video-LLaVa will give you a text description of what it "sees."

Example Output

Output

In this video, a woman is standing in a kitchen and preparing food.

Performance Metrics

4.07s Prediction Time
131.50s Total Time
All Input Parameters
{
  "top_p": 0.9,
  "videos": [
    "https://replicate.delivery/pbxt/Lzl3gqYd6ExXDlkvvpAwtQWhWzIOtCiYW1ztjoHvaVVFNEzt/3325978-hd_1920_1080_24fps.mp4"
  ],
  "prompts": [
    "What is happening in this video?"
  ],
  "num_frames": 10,
  "temperature": 0.1,
  "max_new_tokens": 500
}
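The input above could be submitted from Python roughly as follows. This is a minimal sketch, assuming the `replicate` client package is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `run_prediction` are hypothetical, and the model reference is simply the model name joined with the version ID listed on this page.

```python
# Model reference in "owner/name:version" form, assembled from this page.
MODEL_REF = (
    "n1jl0091/video-llava-7b-hf_replicate_n1jl0091:"
    "ff284eb7daa7ace568fe353efecc4728c1f1844771462d7ec3b4844741270ddf"
)


def build_input(videos, prompts, num_frames=10, temperature=0.1,
                top_p=0.9, max_new_tokens=500):
    """Build the input payload; keyword defaults mirror the listed defaults."""
    return {
        "top_p": top_p,
        "videos": list(videos),
        "prompts": list(prompts),
        "num_frames": num_frames,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
    }


def run_prediction(payload):
    """Send the payload to Replicate (hypothetical helper, needs network access)."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(MODEL_REF, input=payload)


# Same values as the example input shown above.
payload = build_input(
    videos=[
        "https://replicate.delivery/pbxt/Lzl3gqYd6ExXDlkvvpAwtQWhWzIOtCiYW1ztjoHvaVVFNEzt/3325978-hd_1920_1080_24fps.mp4"
    ],
    prompts=["What is happening in this video?"],
)
```

Calling `run_prediction(payload)` should then return the model output, which the output schema describes as an array of strings.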
Input Parameters
top_p Type: number, Default: 0.9
videos (required) Type: array
prompts (required) Type: array
num_frames Type: integer, Default: 10
temperature Type: number, Default: 0.1
max_new_tokens Type: integer, Default: 500
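A client could check a payload against this parameter list before sending it. The sketch below is one way to do that, not the model's own validation: `validate_input` is a hypothetical helper, and rejecting empty arrays and unknown keys is an assumption that goes beyond what the schema states.

```python
# Optional parameters and their listed defaults.
DEFAULTS = {"top_p": 0.9, "num_frames": 10, "temperature": 0.1, "max_new_tokens": 500}
REQUIRED = ("videos", "prompts")


def validate_input(payload: dict) -> dict:
    """Return the payload with defaults filled in, or raise on a schema mismatch."""
    for key in REQUIRED:
        value = payload.get(key)
        # Assumption: an empty array is treated as invalid.
        if not isinstance(value, list) or not value:
            raise ValueError(f"'{key}' is required and must be a non-empty array")

    unknown = set(payload) - set(DEFAULTS) - set(REQUIRED)
    if unknown:
        # Assumption: keys outside the listed parameters are rejected.
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    merged = {**DEFAULTS, **payload}

    for key in ("num_frames", "max_new_tokens"):
        if isinstance(merged[key], bool) or not isinstance(merged[key], int):
            raise TypeError(f"'{key}' must be an integer")
    for key in ("top_p", "temperature"):
        if isinstance(merged[key], bool) or not isinstance(merged[key], (int, float)):
            raise TypeError(f"'{key}' must be a number")

    return merged
```

For example, `validate_input({"videos": ["clip.mp4"], "prompts": ["Describe the clip."]})` fills in the four optional parameters with their defaults.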
Output Schema

Output

Type: array • Items Type: string
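Because the output is an array of strings, a caller will usually want to line each string up with the prompt that produced it. The sketch below assumes the model returns one string per prompt, in the same order, which this page does not confirm; `pair_outputs` is a hypothetical helper.

```python
def pair_outputs(prompts, outputs):
    """Pair each prompt with its output string.

    Assumption: outputs[i] answers prompts[i]. A length mismatch is treated
    as an error rather than silently truncated.
    """
    if len(prompts) != len(outputs):
        raise ValueError(
            f"expected {len(prompts)} outputs, got {len(outputs)}"
        )
    return list(zip(prompts, outputs))
```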

Example Execution Logs
Starting prediction for /tmp/tmpt59elgyh3325978-hd_1920_1080_24fps.mp4 at 18:03:50
Using 10 frames
Expanding inputs for image tokens in Video-LLaVa should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.47.
Expanding inputs for image tokens in Video-LLaVa should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.47.
Total prediction time for /tmp/tmpt59elgyh3325978-hd_1920_1080_24fps.mp4: 3.42s
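If logs like these need to be tracked programmatically, the frame count and the in-container prediction time can be pulled out with regular expressions. This is a minimal sketch that assumes the line formats stay exactly as shown above; `summarize_log` is a hypothetical helper, and the warning lines are omitted from the sample for brevity.

```python
import re

# Sample lines copied from the execution logs above (warnings omitted).
LOG = """Starting prediction for /tmp/tmpt59elgyh3325978-hd_1920_1080_24fps.mp4 at 18:03:50
Using 10 frames
Total prediction time for /tmp/tmpt59elgyh3325978-hd_1920_1080_24fps.mp4: 3.42s"""

FRAMES_RE = re.compile(r"^Using (\d+) frames$", re.MULTILINE)
TIME_RE = re.compile(r"^Total prediction time for (.+): ([\d.]+)s$", re.MULTILINE)


def summarize_log(log: str) -> dict:
    """Extract the frame count, input path, and prediction time from a log."""
    frames = FRAMES_RE.search(log)
    timing = TIME_RE.search(log)
    return {
        "num_frames": int(frames.group(1)) if frames else None,
        "path": timing.group(1) if timing else None,
        "seconds": float(timing.group(2)) if timing else None,
    }


summary = summarize_log(LOG)
```

Fields that cannot be found are returned as `None` rather than raising, so a log with a different layout degrades gracefully.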
Version Details
Version ID
ff284eb7daa7ace568fe353efecc4728c1f1844771462d7ec3b4844741270ddf
Version Created
November 18, 2024