Mask2Former: Universal Image Segmentation by Meta
Mask2Former by Meta AI Research is a universal segmentation model. It replaces the need for separate panoptic, instance, and semantic segmentation pipelines with one architecture that handles all three.
About
Mask2Former is Meta's unified architecture for image segmentation. It handles panoptic, instance, and semantic segmentation with a single model, instead of requiring separate specialized networks for each task.
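The model uses masked attention: in each Transformer decoder layer, the cross-attention for a query is restricted to the foreground region of the mask that query predicted in the previous layer, rather than spanning the whole image. The Mask2Former paper reports that this improves accuracy and speeds up training convergence compared to the earlier MaskFormer.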
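To make the idea of masked attention concrete, here is a minimal single-query, single-head sketch in plain Python. It is not the model's implementation: the function name `masked_attention`, the 0.5 threshold default, the dot-product scaling, and the fallback for an empty mask are assumptions chosen for illustration, and the residual connection and learned projections are omitted.

```python
import math

def masked_attention(query, keys, values, prev_mask_prob, threshold=0.5):
    """Attend from one query to per-pixel features, restricted to the
    foreground of the mask predicted by the previous decoder layer.

    query:          list of floats, length d
    keys:           list of per-pixel key vectors, each of length d
    values:         list of per-pixel value vectors
    prev_mask_prob: per-pixel foreground probability from the previous layer
    """
    # Binarize the previous mask prediction: True means "may be attended to".
    allowed = [p >= threshold for p in prev_mask_prob]
    # Assumption: if the previous mask is empty, attend everywhere so the
    # softmax stays well defined.
    if not any(allowed):
        allowed = [True] * len(keys)

    scale = 1.0 / math.sqrt(len(query))
    # Disallowed pixels get a score of -inf, i.e. zero weight after softmax.
    scores = [
        sum(q * k for q, k in zip(query, key)) * scale if ok else -math.inf
        for key, ok in zip(keys, allowed)
    ]

    # Numerically stable softmax; math.exp(-inf) evaluates to 0.0.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights
```

Pixels outside the previous mask receive zero attention weight, so each query only aggregates features from the region it is already predicting.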
What it can do
- Panoptic segmentation: assign every pixel a class label and, for "things" (countable objects), an instance ID; "stuff" (amorphous regions such as sky or road) gets a class label only
- Instance segmentation: detect individual objects and produce a separate mask for each, even when they overlap
- Semantic segmentation: classify every pixel by category without distinguishing instances
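At the time of its publication, the Mask2Former paper reported state-of-the-art results on all three tasks: 57.8 PQ for panoptic segmentation on COCO, 50.1 AP for instance segmentation on COCO, and 57.7 mIoU for semantic segmentation on ADE20K.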
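The relationship between the three task outputs can be illustrated with a toy example. The sketch below assumes a hypothetical panoptic result stored as a grid of segment IDs plus a dictionary describing each segment; this layout and the names `panoptic_to_semantic` and `panoptic_to_instances` are illustrative, not the model's actual output format.

```python
def panoptic_to_semantic(panoptic, segments):
    """Drop instance identity: map each pixel's segment ID to its category."""
    return [[segments[seg_id]["category"] for seg_id in row] for row in panoptic]

def panoptic_to_instances(panoptic, segments):
    """Keep only "thing" segments and return one binary mask per instance."""
    instances = {}
    for seg_id, info in segments.items():
        if info["is_thing"]:
            instances[seg_id] = [
                [1 if pixel == seg_id else 0 for pixel in row] for row in panoptic
            ]
    return instances

# Toy 2x3 panoptic map (hypothetical): segment 1 is sky ("stuff"),
# segments 2 and 3 are two different people ("things").
panoptic = [
    [1, 1, 2],
    [3, 3, 2],
]
segments = {
    1: {"category": "sky", "is_thing": False},
    2: {"category": "person", "is_thing": True},
    3: {"category": "person", "is_thing": True},
}

semantic = panoptic_to_semantic(panoptic, segments)
instances = panoptic_to_instances(panoptic, segments)
```

In the semantic view both people collapse into the single category `person`, while the instance view keeps two separate masks and ignores the sky.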
Performance Metrics
- Prediction Time: 2.90s
- Total Time: 111.19s
Input Parameters
- image: Input image for segmentation. The output is a single image that stacks the three results vertically: panoptic segmentation (top), instance segmentation (middle), and semantic segmentation (bottom).
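Because the three results come back stacked in one image, a caller will usually want to separate them again. The sketch below assumes the three panels have equal height, which the page does not state explicitly, and represents the image as a plain list of pixel rows; `split_stacked_output` is a hypothetical helper, not part of any API.

```python
def split_stacked_output(rows):
    """Split a vertically stacked result into its three panels.

    rows: the output image as a list of pixel rows, ordered top to bottom.
    Assumes (not confirmed by the source) that all panels have equal height.
    """
    height = len(rows)
    if height == 0 or height % 3 != 0:
        raise ValueError("expected a non-empty image whose height is divisible by 3")
    panel = height // 3
    return {
        "panoptic": rows[:panel],            # top third
        "instance": rows[panel:2 * panel],   # middle third
        "semantic": rows[2 * panel:],        # bottom third
    }
```

The same slicing applies along the height axis when the image is loaded into an array or imaging library instead of nested lists.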
Version Details
- Version ID: 97c0c2edeeb7c120c2859dca4fdee58d185131f79c857ba519e3a5cb7cdd7c66
- Version Created: February 20, 2022