jd7h/edit-video-by-editing-text
Edit videos by editing the transcript. Input a video and either transcribe it to text, or supply a desired transcript to...
Generate text and multimodal analyses from text, image, and video inputs. Handle very long contexts (around 1M tokens) f...
Analyze images or video and generate text captions, answers, and summaries. Accepts single or multiple images or a video...
Extract structured data, answer visual questions, and summarize videos from images and videos. Accepts 1–4 images or a v...
Caption images and videos and answer visual questions. Accepts an optional image or video plus a text prompt and returns...
Transcribe speech from online videos into timestamped text. Accepts a video URL (YouTube and other supported sites) and...
Transcribe audio or video to text. Accepts an audio or video input and returns a JSON transcript or ASS subtitles, lever...
Transcribe or translate speech from audio files and videos to text. Accepts audio or video input and returns a transcript...
Hold multi-turn, multimodal conversations grounded in images, audio, video, and text, returning answers as text and opti...
Generate and reason over text from prompts, with optional image, audio, and video inputs. Produce answers, explanations,...