zsxkib/framepack 🔢📝🖼️✓ → 🖼️

▶️ 2.4K runs 📅 May 2025 ⚙️ Cog 0.14.9 🔗 GitHub 📄 Paper ⚖️ License
image-animation image-to-video

About

🕹️ FramePack: video diffusion that feels like image diffusion 🎥

Example Output

Prompt:

"A woman pets her cat"

Output

Performance Metrics

92.48s Prediction Time
92.49s Total Time
All Input Parameters
{
  "steps": 25,
  "prompt": "A woman pets her cat",
  "mp4_crf": 23,
  "cfg_scale": 1,
  "cfg_rescale": 0,
  "input_image": "https://replicate.delivery/pbxt/N0bNhGBlCkyqqTpaGqHXyY9ZXnugIqhzdCO0N6H9VjvPGxrg/image.png",
  "use_teacache": true,
  "negative_prompt": "",
  "latent_window_size": 9,
  "distilled_cfg_scale": 10,
  "total_video_length_seconds": 3
}
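The input above can also be submitted from code. The following is a minimal sketch, assuming the `replicate` Python client is installed and a `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_framepack` are illustrative and not part of the model; the version hash is the one listed under Version Details.

```python
# Model reference in "owner/name:version" form, using the version ID from this page.
MODEL_REF = (
    "zsxkib/framepack:"
    "b840152c70e887773e95b24d1f1e8fd2aea448fcf093de801d3627f0a197409f"
)

# Values copied from the example prediction above.
EXAMPLE_INPUT = {
    "steps": 25,
    "prompt": "A woman pets her cat",
    "mp4_crf": 23,
    "cfg_scale": 1,
    "cfg_rescale": 0,
    "input_image": "https://replicate.delivery/pbxt/N0bNhGBlCkyqqTpaGqHXyY9ZXnugIqhzdCO0N6H9VjvPGxrg/image.png",
    "use_teacache": True,
    "negative_prompt": "",
    "latent_window_size": 9,
    "distilled_cfg_scale": 10,
    "total_video_length_seconds": 3,
}


def build_input(**overrides):
    """Return a copy of the example input with selected fields overridden.

    `seed` is accepted as well, since it is a documented optional parameter.
    """
    unknown = set(overrides) - set(EXAMPLE_INPUT) - {"seed"}
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**EXAMPLE_INPUT, **overrides}


def run_framepack(model_input):
    """Submit a prediction. Needs network access and an API token."""
    import replicate  # third-party client, imported lazily: pip install replicate

    return replicate.run(MODEL_REF, input=model_input)
```

For example, `run_framepack(build_input(seed=42, steps=30))` would request the same clip with a fixed seed and a few more sampling steps.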
Input Parameters
seed Type: integer
Set for consistent results or leave empty for random
steps Type: integer Default: 25 Range: 1 - 50
More steps give higher quality but take longer
prompt (required) Type: string
Text description of what you want in the video
mp4_crf Type: integer Default: 23 Range: 0 - 51
Controls video compression - lower values give better quality but larger files
cfg_scale Type: number Default: 1 Range: 1 - 32
Higher values follow the prompt more closely, lower values are more creative
cfg_rescale Type: number Default: 0 Range: 0 - 1
Reduces oversaturation at high CFG values
input_image (required) Type: string
Initial image to start the video from
use_teacache Type: boolean Default: true
Makes generation faster with minimal quality impact
negative_prompt Type: string Default: "" (empty string)
Things you want to avoid in the video
latent_window_size Type: integer Default: 9 Range: 1 - 16
Controls how video chunks are processed - smaller is faster but may reduce quality
distilled_cfg_scale Type: number Default: 10 Range: 1 - 32
Controls prompt adherence for distilled model components
total_video_length_seconds Type: number Default: 3 Range: 1 - 60
How long the video should be in seconds
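The ranges listed above can be checked on the client before a prediction is submitted. This is a minimal sketch: the table is transcribed from the parameter list, and the function name `validate_input` is hypothetical. The model's own validation may behave differently.

```python
# Inclusive (min, max) ranges transcribed from the parameter list above.
NUMERIC_RANGES = {
    "steps": (1, 50),
    "mp4_crf": (0, 51),
    "cfg_scale": (1, 32),
    "cfg_rescale": (0, 1),
    "latent_window_size": (1, 16),
    "distilled_cfg_scale": (1, 32),
    "total_video_length_seconds": (1, 60),
}

# Parameters marked "(required)" in the list above.
REQUIRED = ("prompt", "input_image")


def validate_input(params):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name in REQUIRED:
        if not params.get(name):
            errors.append(f"{name} is required")
    for name, (low, high) in NUMERIC_RANGES.items():
        # Optional parameters are only checked when the caller supplies them.
        if name in params and not low <= params[name] <= high:
            errors.append(f"{name}={params[name]} is outside {low}-{high}")
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every out-of-range value at once.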
Output Schema

Output

Type: string • Format: uri
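Because the output is a URI string, the resulting video still has to be fetched. Below is a minimal sketch using only the Python standard library; `save_output` is an illustrative name. It covers the plain-string case described by the schema, and some client libraries may wrap the URI in a helper object instead.

```python
import shutil
import urllib.request
from pathlib import Path


def save_output(uri, dest):
    """Stream the resource at `uri` into the file `dest` and return its path."""
    dest = Path(dest)
    # copyfileobj streams in chunks, so the whole video is never held in memory.
    with urllib.request.urlopen(uri) as response, dest.open("wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```

A call such as `save_output(output_uri, "framepack.mp4")` would write the clip next to the script, where `output_uri` is the string returned by the prediction.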

Example Execution Logs
Using random seed: 2718993960
Encoding text prompts...
Processing input image...
Encoding initial image with VAE...
Encoding image with CLIP Vision model...
Starting video generation loop...
Processing section 1/2 (padding: 1), is_last_section_logic: False
Sampling step 1/25 for section 1
  0%|          | 0/25 [00:00<?, ?it/s]
Sampling step 2/25 for section 1
  4%|▍         | 1/25 [00:02<00:49,  2.08s/it]
Sampling step 3/25 for section 1
  8%|▊         | 2/25 [00:05<01:00,  2.64s/it]
Sampling step 4/25 for section 1
 12%|█▏        | 3/25 [00:08<01:02,  2.82s/it]
Sampling step 5/25 for section 1
 16%|█▌        | 4/25 [00:09<00:43,  2.08s/it]
Sampling step 6/25 for section 1
 20%|██        | 5/25 [00:11<00:41,  2.09s/it]
Sampling step 7/25 for section 1
Sampling step 8/25 for section 1
 24%|██▍       | 6/25 [00:12<00:32,  1.70s/it]
Sampling step 9/25 for section 1
 32%|███▏      | 8/25 [00:14<00:23,  1.38s/it]
Sampling step 10/25 for section 1
Sampling step 11/25 for section 1
 36%|███▌      | 9/25 [00:15<00:20,  1.27s/it]
Sampling step 12/25 for section 1
 44%|████▍     | 11/25 [00:17<00:16,  1.18s/it]
Sampling step 13/25 for section 1
Sampling step 14/25 for section 1
 48%|████▊     | 12/25 [00:18<00:14,  1.13s/it]
Sampling step 15/25 for section 1
 56%|█████▌    | 14/25 [00:20<00:12,  1.10s/it]
Sampling step 16/25 for section 1
Sampling step 17/25 for section 1
 60%|██████    | 15/25 [00:21<00:10,  1.07s/it]
Sampling step 18/25 for section 1
 68%|██████▊   | 17/25 [00:23<00:08,  1.06s/it]
Sampling step 19/25 for section 1
 72%|███████▏  | 18/25 [00:24<00:07,  1.04s/it]
Sampling step 20/25 for section 1
 76%|███████▌  | 19/25 [00:26<00:07,  1.29s/it]
Sampling step 21/25 for section 1
 80%|████████  | 20/25 [00:27<00:06,  1.21s/it]
Sampling step 22/25 for section 1
 84%|████████▍ | 21/25 [00:29<00:05,  1.44s/it]
Sampling step 23/25 for section 1
 88%|████████▊ | 22/25 [00:32<00:05,  1.88s/it]
Sampling step 24/25 for section 1
 92%|█████████▏| 23/25 [00:35<00:04,  2.21s/it]
Sampling step 25/25 for section 1
 96%|█████████▌| 24/25 [00:36<00:01,  1.85s/it]
100%|██████████| 25/25 [00:38<00:00,  1.92s/it]
100%|██████████| 25/25 [00:38<00:00,  1.55s/it]
Updated video saved: /tmp/tmp06zetr8l/250514_190911_051_8874_video.mp4, total pixel frames: 33
Processing section 2/2 (padding: 0), is_last_section_logic: True
Sampling step 1/25 for section 2
  0%|          | 0/25 [00:00<?, ?it/s]
Sampling step 2/25 for section 2
  4%|▍         | 1/25 [00:02<00:50,  2.10s/it]
Sampling step 3/25 for section 2
  8%|▊         | 2/25 [00:05<01:01,  2.67s/it]
Sampling step 4/25 for section 2
 12%|█▏        | 3/25 [00:08<01:02,  2.84s/it]
Sampling step 5/25 for section 2
 16%|█▌        | 4/25 [00:09<00:44,  2.10s/it]
Sampling step 6/25 for section 2
 20%|██        | 5/25 [00:11<00:42,  2.10s/it]
Sampling step 7/25 for section 2
Sampling step 8/25 for section 2
 24%|██▍       | 6/25 [00:12<00:32,  1.71s/it]
Sampling step 9/25 for section 2
 32%|███▏      | 8/25 [00:14<00:23,  1.39s/it]
Sampling step 10/25 for section 2
Sampling step 11/25 for section 2
 36%|███▌      | 9/25 [00:15<00:20,  1.28s/it]
Sampling step 12/25 for section 2
 44%|████▍     | 11/25 [00:17<00:16,  1.18s/it]
Sampling step 13/25 for section 2
Sampling step 14/25 for section 2
 48%|████▊     | 12/25 [00:18<00:14,  1.13s/it]
Sampling step 15/25 for section 2
 56%|█████▌    | 14/25 [00:20<00:12,  1.10s/it]
Sampling step 16/25 for section 2
Sampling step 17/25 for section 2
 60%|██████    | 15/25 [00:21<00:10,  1.07s/it]
Sampling step 18/25 for section 2
 68%|██████▊   | 17/25 [00:23<00:08,  1.06s/it]
Sampling step 19/25 for section 2
 72%|███████▏  | 18/25 [00:24<00:07,  1.04s/it]
Sampling step 20/25 for section 2
 76%|███████▌  | 19/25 [00:26<00:07,  1.29s/it]
Sampling step 21/25 for section 2
 80%|████████  | 20/25 [00:27<00:06,  1.21s/it]
Sampling step 22/25 for section 2
 84%|████████▍ | 21/25 [00:29<00:05,  1.45s/it]
Sampling step 23/25 for section 2
 88%|████████▊ | 22/25 [00:32<00:05,  1.89s/it]
Sampling step 24/25 for section 2
 92%|█████████▏| 23/25 [00:35<00:04,  2.21s/it]
Sampling step 25/25 for section 2
 96%|█████████▌| 24/25 [00:36<00:01,  1.85s/it]
100%|██████████| 25/25 [00:38<00:00,  1.93s/it]
100%|██████████| 25/25 [00:38<00:00,  1.55s/it]
Updated video saved: /tmp/tmp06zetr8l/250514_190911_051_8874_video.mp4, total pixel frames: 73
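The section count, padding values, and frame totals in the log above (2 sections, padding 1 then 0, 33 then 73 pixel frames) can be reproduced with a little arithmetic. The sketch below assumes this deployment follows the arithmetic of the upstream FramePack demo script: a 30 fps target, 4 pixel frames per latent frame, and one extra latent for the start image in the final section. These constants and all function names are assumptions that this page does not confirm, and the padding schedule for longer videos may differ.

```python
FPS = 30               # assumed output frame rate
FRAMES_PER_LATENT = 4  # assumed temporal compression of the video VAE


def section_count(total_seconds, latent_window_size):
    """Number of generation sections for the requested duration (at least 1)."""
    raw = (total_seconds * FPS) / (latent_window_size * FRAMES_PER_LATENT)
    # Python rounds halves to the nearest even integer, so 2.5 becomes 2.
    return int(max(round(raw), 1))


def padding_schedule(sections):
    """Padding per section, counting down to 0 for the last section."""
    return list(reversed(range(sections)))


def pixel_frames(latent_frames):
    """Decoded frame count for a given number of latent frames."""
    return FRAMES_PER_LATENT * latent_frames - 3


def cumulative_pixel_frames(sections, latent_window_size):
    """Running 'total pixel frames' value after each section."""
    totals = []
    latents = 0
    for padding in padding_schedule(sections):
        latents += latent_window_size
        if padding == 0:
            latents += 1  # assumed: the last section also carries the start-image latent
        totals.append(pixel_frames(latents))
    return totals
```

With the example input, `section_count(3, 9)` gives 2 and `cumulative_pixel_frames(2, 9)` gives `[33, 73]`, matching the two "total pixel frames" lines. Under the assumed 30 fps, 73 frames is roughly 2.4 seconds, so the finished clip can come out slightly shorter than the requested 3 seconds.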
Version Details
Version ID
b840152c70e887773e95b24d1f1e8fd2aea448fcf093de801d3627f0a197409f
Version Created
May 14, 2025
Run on Replicate →