
lucataco/apollo-7b
Answer questions about videos and generate detailed captions from a video input. Accepts a video and a natural-language...
Found 18 models
Answer questions about videos and generate detailed captions from a video input. Accepts a video and a natural-language...
Generate text descriptions and answers from a video input. Accepts a video and a natural-language prompt to perform vide...
Caption videos and answer open-ended questions about their content. Accepts one or more video inputs plus a list of natur...
Add TikTok-style captions to videos. Accepts a video and outputs a captioned video with subtitles burned in using Whispe...
Caption videos and answer questions about their content. Accepts a video and a natural-language prompt and outputs text...
Analyze videos and generate text descriptions, answers, and summaries from a prompt. Accepts a video and an instruction,...
Caption videos. Provide a video and an optional instruction prompt to produce a single text output for captioning, summa...
Answer questions and generate detailed descriptions from a video input. Provide a video and a text prompt to get caption...
Transcribe speech from online videos into text. Accepts a video input from supported sites and returns a JSON transcript...
Transcribe speech to text from audio files or HLS m3u8 streams with optional word-level timestamps. Accepts audio uploads...
Caption and answer questions about videos. Takes a video and a text prompt and returns text, enabling detailed descripti...
Add karaoke-style subtitles to a video. Takes a video as input, auto-transcribes speech with Whisper, and outputs a capt...
Generate text descriptions and answers from a video input. Accepts a video and an optional prompt to perform video capti...
Edit videos by editing their transcript. Input a video and a target transcription to automatically cut out segments not...
Generate timestamped subtitles from an audio or video file. Transcribe speech to text and return structured segments wit...
Create training-ready video datasets with automatic captions from YouTube links or uploaded video files. Extract and seg...
Transcribe or translate speech from video links or audio files into text with optional word- or chunk-level timestamps....
Transcribe audio or video into text with stabilized timestamps. Accepts an audio or video input and returns either an AS...