bytedance/sa2va-4b-video
Segment objects in videos from natural-language instructions. Accepts a video and a text instruction (referring expressi...
Segment objects and regions in images using natural language instructions. Accepts an image and a text instruction and r...