zsxkib/mimic-motion 🔢🖼️❓ → 🖼️

▶️ 2.9K runs 📅 Jul 2024 ⚙️ Cog 0.9.12
image-to-video motion-transfer video-consistent-character-generation video-to-video

About

MimicMotion: High-quality human motion video generation with pose-guided control

Performance Metrics

839.36s Prediction Time
1332.40s Total Time
All Input Parameters
{
  "use_fp16": true,
  "chunk_size": 16,
  "resolution": 576,
  "motion_video": "https://replicate.delivery/pbxt/LD5c2cJou7MsS6J7KMBDfywggKAFCfsc2GUAlo67w4Z8aN30/pose1_trimmed_fixed.mp4",
  "sample_stride": 2,
  "frames_overlap": 6,
  "guidance_scale": 2,
  "noise_strength": 0,
  "denoising_steps": 25,
  "appearance_image": "https://replicate.delivery/pbxt/LD5c2GQlXTIlL1i3ZbVcCybtLlmF4XoPoTnbpCmt38MqMQiS/demo1.jpg",
  "output_frames_per_second": 15
}
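The example inputs above can also be submitted programmatically. The following is a minimal sketch, assuming the `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_prediction` are illustrative and do not come from this page; the version hash is the one listed under Version Details.

```python
# Model reference: owner/name plus the version ID shown under "Version Details".
MODEL_REF = (
    "zsxkib/mimic-motion:"
    "b3edd455f68ec4ccf045da8732be7db837cb8832d1a2459ef057ddcd3ff87dea"
)


def build_input(motion_video: str, appearance_image: str, **overrides) -> dict:
    """Build an input payload from the example values shown on this page.

    `motion_video` and `appearance_image` are the two required inputs;
    any other parameter can be overridden by keyword.
    """
    payload = {
        "use_fp16": True,
        "chunk_size": 16,
        "resolution": 576,
        "motion_video": motion_video,
        "sample_stride": 2,
        "frames_overlap": 6,
        "guidance_scale": 2,
        "noise_strength": 0,
        "denoising_steps": 25,
        "appearance_image": appearance_image,
        "output_frames_per_second": 15,
    }
    payload.update(overrides)
    return payload


def run_prediction(payload: dict):
    """Submit the payload (hypothetical helper; needs network and an API token)."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(MODEL_REF, input=payload)
```

Omitting `seed` from the payload leaves the seed randomized, as described in the parameter list.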
Input Parameters
seed Type: integer
Random seed. Leave blank to randomize the seed.
chunk_size Type: integer, Default: 16, Range: 2 - ∞
Number of frames to generate in each processing chunk.
resolution Type: integer, Default: 576, Range: 64 - 1024
Height of the output video in pixels. Width is automatically calculated.
motion_video (required) Type: string
Reference video file containing the motion to be mimicked.
sample_stride Type: integer, Default: 2, Range: 1 - ∞
Interval for sampling frames from the reference video. Higher values skip more frames.
frames_overlap Type: integer, Default: 6, Range: 0 - ∞
Number of overlapping frames between chunks for smoother transitions.
guidance_scale Type: number, Default: 2, Range: 0.1 - 10
Strength of guidance towards the reference. Higher values adhere more closely to the reference but may reduce creativity.
noise_strength Type: number, Default: 0, Range: 0 - 1
Strength of noise augmentation. Higher values add more variation but may reduce coherence with the reference.
denoising_steps Type: integer, Default: 25, Range: 1 - 100
Number of denoising steps in the diffusion process. More steps can improve quality but increase processing time.
appearance_image (required) Type: string
Reference image file for the appearance of the generated video.
checkpoint_version Default: v1-1
Choose the checkpoint version to use.
output_frames_per_second Type: integer, Default: 15, Range: 1 - 60
Frames per second of the output video. Affects playback speed.
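How `sample_stride`, `chunk_size`, and `frames_overlap` might interact can be illustrated with a sliding-window sketch. This is an assumption about the general technique, not a copy of the model's code: frames are subsampled every `sample_stride` frames, then split into windows of `chunk_size` frames that advance by `chunk_size - frames_overlap`. Both function names are hypothetical.

```python
from typing import List, Tuple


def sample_frame_indices(total_frames: int, sample_stride: int = 2) -> List[int]:
    """Indices of the reference-video frames kept after stride sampling."""
    if sample_stride < 1:
        raise ValueError("sample_stride must be >= 1")
    return list(range(0, total_frames, sample_stride))


def chunk_windows(
    num_frames: int, chunk_size: int = 16, frames_overlap: int = 6
) -> List[Tuple[int, int]]:
    """Half-open (start, end) windows covering num_frames with overlap.

    Assumed scheme: each window starts (chunk_size - frames_overlap)
    frames after the previous one; the last window may be shorter.
    """
    if chunk_size < 2:
        raise ValueError("chunk_size must be >= 2")
    if not 0 <= frames_overlap < chunk_size:
        raise ValueError("frames_overlap must be in [0, chunk_size)")
    if num_frames <= 0:
        return []
    step = chunk_size - frames_overlap
    windows = []
    start = 0
    while True:
        end = min(start + chunk_size, num_frames)
        windows.append((start, end))
        if end >= num_frames:
            break
        start += step
    return windows


# Example: a 72-frame clip sampled with stride 2 leaves 36 frames,
# which the default chunk settings cover with three windows.
kept = sample_frame_indices(72, sample_stride=2)
windows = chunk_windows(len(kept), chunk_size=16, frames_overlap=6)
```

Under this assumed scheme, each window after the first contributes 10 new frames at the default settings, so raising `frames_overlap` increases the number of windows needed for the same clip.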
Output Schema

Output

Type: string, Format: uri
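Because the output is a URI string, a client would typically download the file it points to. The following is a minimal sketch using only the standard library. The function names and the fallback filename `output.mp4` are assumptions for illustration, and the page does not state the output file extension.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(uri: str, default: str = "output.mp4") -> str:
    """Take the last path segment of the URI, ignoring any query string."""
    name = posixpath.basename(urlparse(uri).path)
    return name or default  # fall back when the path ends in "/"


def download_output(uri: str, dest_dir: str = ".") -> str:
    """Download the file at `uri` into dest_dir and return the local path."""
    dest = os.path.join(dest_dir, filename_from_uri(uri))
    urllib.request.urlretrieve(uri, dest)  # requires network access
    return dest
```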

Example Execution Logs
Using seed: 57946
[!] (<class 'cog.types.Path'>) ref_video=/tmp/tmpss56yw9_pose1_trimmed_fixed.mp4, [!] (<class 'cog.types.Path'>) ref_image=/tmp/tmp2r0k6ce_demo1.jpg, [!] (<class 'int'>) resolution=576, [!] (<class 'int'>) num_frames=16, [!] (<class 'int'>) frames_overlap=6, [!] (<class 'int'>) num_inference_steps=25, [!] (<class 'float'>) noise_aug_strength=0.0, [!] (<class 'float'>) guidance_scale=2.0, [!] (<class 'int'>) sample_stride=2, [!] (<class 'int'>) fps=15, [!] (<class 'int'>) seed=57946, [!] (<class 'bool'>) use_fp16=True
  0%|          | 0/200 [00:00<?, ?it/s]
  0%|          | 1/200 [00:01<05:38,  1.70s/it]
 25%|██▌       | 50/200 [01:23<04:13,  1.69s/it]
 50%|█████     | 100/200 [02:48<02:49,  1.70s/it]
 75%|███████▌  | 150/200 [04:13<01:24,  1.70s/it]
100%|██████████| 200/200 [05:38<00:00,  1.70s/it]
100%|██████████| 200/200 [05:38<00:00,  1.69s/it]
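The final progress line can be sanity-checked with a little arithmetic: 200 iterations at roughly 1.69 s each is about 338 s, which matches the reported elapsed time of 05:38. The sketch below parses a tqdm-style line with a regular expression. The pattern is written for the format seen in this log and is an assumption for other tqdm configurations; `parse_tqdm_line` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches e.g. "200/200 [05:38<00:00,  1.69s/it]" or "0/200 [00:00<?, ?it/s]".
TQDM_RE = re.compile(
    r"(?P<done>\d+)/(?P<total>\d+) "
    r"\[(?P<elapsed>\d+:\d+)<(?P<remaining>[\d:?]+),\s+"
    r"(?P<rate>[\d.?]+)(?P<unit>s/it|it/s)\]"
)


def parse_tqdm_line(line: str) -> Optional[dict]:
    """Extract progress, elapsed seconds, and seconds per iteration."""
    m = TQDM_RE.search(line)
    if m is None:
        return None
    minutes, seconds = m.group("elapsed").split(":")
    rate = m.group("rate")
    if rate == "?":
        sec_per_it = None  # rate not yet known on the first update
    elif m.group("unit") == "s/it":
        sec_per_it = float(rate)
    else:  # "it/s": convert to seconds per iteration
        sec_per_it = 1.0 / float(rate)
    return {
        "done": int(m.group("done")),
        "total": int(m.group("total")),
        "elapsed_s": int(minutes) * 60 + int(seconds),
        "sec_per_it": sec_per_it,
    }


final = parse_tqdm_line("100%|##########| 200/200 [05:38<00:00,  1.69s/it]")
estimated_s = final["total"] * final["sec_per_it"]  # 200 * 1.69
```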
Version Details
Version ID: b3edd455f68ec4ccf045da8732be7db837cb8832d1a2459ef057ddcd3ff87dea
Version Created: July 16, 2024