
bytedance/sa2va-26b-video
Segment objects in video from a natural-language instruction. Accepts a video and a text prompt describing the target ob...
Found 6 models (showing 1-6)
Segment objects in video from a natural-language instruction. Accepts a video and a text prompt describing the target ob...
Segment objects in video using natural-language instructions. Accepts a video and a text prompt (e.g., “the person weari...
Segment objects in a video from natural-language instructions. Takes a video and a text prompt (referring expression) an...
Segment and track objects in videos from point prompts. Provide a video plus click coordinates, foreground/background la...
Remove backgrounds from videos. Takes a video as input and outputs a virtual green-screen video, an alpha matte, or a fo...
Segment objects across a video from a first-frame mask or a SAM point, and optionally remove them via video inpainting w...