🤖 Model 🎥

bytedance/sa2va-26b-video
Segment objects in video from a natural-language instruction. Accepts a video and a text prompt describing the target ob...
Segment objects in a video from natural-language instructions. Takes a video and a text prompt (referring expression) an...
Segment objects and regions in images using natural-language prompts. Input an image and a text prompt; output a segment...