cjwbw/cutie 🖼️🔢📝✓ → ❓

▶️ 347 runs 📅 Oct 2023 ⚙️ Cog 0.8.3
video-inpainting video-segmentation

About

Video object segmentation, combined with SAM for generating the first-frame mask and ProPainter for inpainting the masked objects

Example Output

Output

Performance Metrics

179.85s Prediction Time
181.11s Total Time
All Input Parameters
{
  "mask": null,
  "video": "https://replicate.delivery/pbxt/JkyEeO01eGR6lq7Rt15UQXHUbTLjqki5bWrPkEuRFw22mxH1/video.mp4",
  "max_frames": 300,
  "mask_with_SAM": "550 | 270",
  "show_overlay_video": false,
  "inpaint_with_propainter": true
}
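The parameters above can also be assembled and submitted from code. The following is a minimal sketch, assuming the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_cutie_input` and `run_cutie` are hypothetical, the rule that either a mask or a SAM point must be given is an assumption drawn from the parameter descriptions, and the version hash is the one listed under Version Details.

```python
# Version hash copied from the "Version Details" section of this page.
MODEL_VERSION = (
    "cjwbw/cutie:"
    "65ad7f1b2f6c71ea8cabf8b67ef249799dde651828af197a52cf15c10c4b560b"
)


def build_cutie_input(video_url, sam_point=None, mask_url=None,
                      max_frames=None, show_overlay_video=False,
                      inpaint_with_propainter=False):
    """Assemble the input dict, omitting parameters that are left blank."""
    # Assumption: the model needs a first-frame mask from one of the two sources.
    if mask_url is None and sam_point is None:
        raise ValueError("provide either a first-frame mask or a SAM point")
    payload = {
        "video": video_url,
        "show_overlay_video": show_overlay_video,
        "inpaint_with_propainter": inpaint_with_propainter,
    }
    if mask_url is not None:
        payload["mask"] = mask_url
    if sam_point is not None:
        x, y = sam_point
        payload["mask_with_SAM"] = f"{x} | {y}"  # documented "x | y" format
    if max_frames is not None:
        payload["max_frames"] = int(max_frames)
    return payload


def run_cutie(payload):
    """Submit the payload; requires network access and an API token."""
    import replicate  # imported lazily so the builder works without the client
    return replicate.run(MODEL_VERSION, input=payload)


# Usage (not executed here because it calls the hosted model):
# output = run_cutie(build_cutie_input(
#     "https://example.com/video.mp4", sam_point=(550, 270),
#     max_frames=300, inpaint_with_propainter=True))
```

Keeping the payload builder separate from the network call makes it possible to check the inputs locally before spending a prediction of roughly three minutes.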
Input Parameters
mask Type: string
Mask for the first frame. Leave this blank to generate the mask with SAM via mask_with_SAM below.
video (required) Type: string
Input video
max_frames Type: integer
Number of frames to process. Leave this blank to process the entire video.
mask_with_SAM Type: string
Use SAM to generate the mask; ignored if a mask is provided above. Provide the coordinates of the object of interest in the format `x | y`.
show_overlay_video Type: boolean Default: false
Output the video that overlays the mask on each frame.
inpaint_with_propainter Type: boolean Default: false
Remove the masked objects (inpaint) with ProPainter.
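The `x | y` format accepted by mask_with_SAM can be illustrated with a small parser. This is a sketch of one plausible way to read the string, not the model's actual code; the function name `parse_sam_point` is hypothetical. It returns floats because the example execution logs report the point as `[550.0, 270.0]`.

```python
def parse_sam_point(text):
    """Parse an "x | y" string into a [x, y] list of floats."""
    parts = [part.strip() for part in text.split("|")]
    if len(parts) != 2:
        raise ValueError(f"expected 'x | y', got {text!r}")
    # float() raises ValueError on its own if a part is not numeric.
    return [float(parts[0]), float(parts[1])]


print(parse_sam_point("550 | 270"))  # [550.0, 270.0]
```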
Output Schema

Output

Example Execution Logs
Will use SAM to segment object at [550.0, 270.0].
Masks detected: [False  True]
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (854, 480) to (864, 480) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
[swscaler @ 0x70b7c00] Warning: data is not aligned! This can lead to a speed loss
Pretrained flow completion model has loaded...
Pretrained ProPainter has loaded...
Network [InpaintGenerator] was created. Total number of parameters: 39.4 million. To see the architecture, do print(network).
Processing: frames_dir [300 frames]...
  0%|          | 0/60 [00:00<?, ?it/s]
100%|██████████| 60/60 [01:04<00:00,  1.08s/it]
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (854, 480) to (864, 480) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
[swscaler @ 0x65a2f40] Warning: data is not aligned! This can lead to a speed loss
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (854, 480) to (864, 480) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
[swscaler @ 0x6a65f40] Warning: data is not aligned! This can lead to a speed loss
All results are saved in inpaint_out_dir/frames_dir
Inpainting finished!
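The FFMPEG_WRITER warnings in the logs above report a resize from (854, 480) to (864, 480). That matches rounding each dimension up to the next multiple of macro_block_size=16. The arithmetic can be sketched as follows; the function name `pad_to_macro_block` is hypothetical and this is not imageio's own implementation.

```python
def pad_to_macro_block(size, macro_block_size=16):
    """Round (width, height) up to the nearest multiple of macro_block_size."""
    def round_up(value):
        # Ceiling division using integer arithmetic only.
        return -(-value // macro_block_size) * macro_block_size

    width, height = size
    return round_up(width), round_up(height)


print(pad_to_macro_block((854, 480)))  # (864, 480)
```

The width 854 is not divisible by 16 (854 = 53 × 16 + 6), so it is padded to 54 × 16 = 864, while 480 = 30 × 16 stays unchanged. Supplying a video whose dimensions are already multiples of 16 avoids the resize, as the warning itself suggests.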
Version Details
Version ID
65ad7f1b2f6c71ea8cabf8b67ef249799dde651828af197a52cf15c10c4b560b
Version Created
October 30, 2023