
bytedance/sa2va-26b-image
Segment objects and regions in images from natural-language instructions and answer visual questions. Takes an image and...
Segment objects and answer questions from an input image using natural-language instructions. Provide an image and an in...
Segment objects in images from a text instruction, returning a segmentation overlay image and a textual response. Suppor...
Segment objects in images from a bounding box prompt. Provide an image plus bounding box coordinates (x0, y0, x1, y1) an...
Encode images using the SAM (Segment Anything) ViT-H model, which is designed for image segmentation tasks.
Segment objects and regions in images using natural-language prompts. Input an image and a text prompt; output a segment...
Segment hair in photos. Takes an input image and returns a hair segmentation result as a mask or overlaid image, enablin...
Segment clothing and faces in images and return binary mask images. Accepts a single image input and outputs masks for a...
Segment clothing in images, isolating topwear or bottomwear. Accepts an image and a clothing category (topwear or bottom...
Segment clothing in images. Accepts an image and outputs a mask image isolating garment regions, with a selector to targ...
Segment clothing in outfit photos and generate clean masks for the selected garment piece (top or bottom). Accepts an im...
Segment the pelvis in hip X-ray images and return a per-pixel probability map and mask. Performs binary semantic segment...
Segment images into per-pixel ADE20K categories using Mask2Former, returning a color-coded segmentation map and a list o...
Segment islands and coastlines in images. Takes a single image as input and outputs a segmentation mask highlighting lit...
Refine satellite-derived shoreline masks from an image and a corresponding mask, then output a cleaned shoreline visuali...
Reconstruct part-aware 3D assets from a single image. Generate whole and exploded meshes, per-part bounding-box visualiz...
Segment roads in images. Accepts a single image and outputs an image with road regions highlighted as a segmentation mas...
Detect glare regions on card photos and return a pixel-level segmentation mask. Input: an image containing a card; Outpu...
Remove backgrounds from images. Accepts an image and uses ECCV 2022 Dichotomous Image Segmentation (DIS) to segment the...
Remove backgrounds from images. Accepts a single image input and returns a foreground cutout with a transparent backgrou...