zsxkib/wan-cakeify 🔢🖼️❓📝 → 🖼️

▶️ 129 runs 📅 Mar 2025 ⚙️ Cog 0.13.7

cake cakeify image-to-video lora lora-style text-to-video

Performance

226.0s Typical run time

129 Total runs

About

Example Output

Prompt:

"A cute golden retriever puppy sits peacefully in CAKEIFY style, in grass surrounded by colorful flowers. A hand wearing a black glove enters the frame holding a sharp knife. The knife slowly cuts into the puppy, revealing that it's actually a hyper-realistic prop. The knife continues slicing, exposing layers of moist sponge cake and frosting inside. The cut piece tilts slightly, showing the detailed cake interior while maintaining the puppy's realistic exterior appearance. The final shot shows the partially sliced puppy cake sitting in the grass with flowers, with one perfect slice removed revealing its delicious interior"

Output

Performance Metrics

225.99s Prediction Time

226.73s Total Time
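
These two figures can be read alongside the example execution logs on this page, which report the ComfyUI workflow at 201.50 seconds and two downloads (21.29 s for the diffusion model, 2.855 s for the LoRA archive). The sketch below splits the run into rough phases. It assumes the phases run one after another without overlap, which the page does not state, and the phase names are informal labels rather than Replicate terminology.

```python
# Timings copied from this page; the dictionary keys are informal labels.
PREDICTION_TIME = 225.99  # "Prediction Time"
TOTAL_TIME = 226.73       # "Total Time"
WORKFLOW_TIME = 201.50    # "Prompt executed in 201.50 seconds" (logs)
MODEL_DOWNLOAD = 21.29    # wan2.1_i2v_480p_14B_bf16.safetensors download (logs)
LORA_DOWNLOAD = 2.855     # trained_model.tar total_elapsed (logs)


def breakdown() -> dict:
    """Split the run into rough phases, assuming they do not overlap."""
    outside_prediction = TOTAL_TIME - PREDICTION_TIME
    outside_workflow = PREDICTION_TIME - WORKFLOW_TIME
    downloads = MODEL_DOWNLOAD + LORA_DOWNLOAD
    return {
        "outside_prediction_s": round(outside_prediction, 3),
        "outside_workflow_s": round(outside_workflow, 3),
        "downloads_s": round(downloads, 3),
        "other_setup_s": round(outside_workflow - downloads, 3),
    }


if __name__ == "__main__":
    for phase, seconds in breakdown().items():
        print(f"{phase}: {seconds}")
```

Under that assumption, nearly all of the roughly 24.5 seconds spent outside the workflow is weight download, so a run that finds the weights already cached would likely finish closer to the 201.50 second workflow time.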

All Input Parameters

{
  "image": "https://replicate.delivery/pbxt/Mh40Tagw3xWduL1PyW2FsBhvN1Dn1sbw1r47t7nnHGH0zImv/Screenshot%202025-03-19%20at%2011.24.14.png",
  "frames": 81,
  "prompt": "A cute golden retriever puppy sits peacefully in CAKEIFY style, in grass surrounded by colorful flowers. A hand wearing a black glove enters the frame holding a sharp knife. The knife slowly cuts into the puppy, revealing that it's actually a hyper-realistic prop. The knife continues slicing, exposing layers of moist sponge cake and frosting inside. The cut piece tilts slightly, showing the detailed cake interior while maintaining the puppy's realistic exterior appearance. The final shot shows the partially sliced puppy cake sitting in the grass with flowers, with one perfect slice removed revealing its delicious interior",
  "fast_mode": "Balanced",
  "resolution": "480p",
  "aspect_ratio": "16:9",
  "sample_shift": 8,
  "sample_steps": 30,
  "negative_prompt": "",
  "lora_strength_clip": 1,
  "sample_guide_scale": 5,
  "lora_strength_model": 1
}

Input Parameters

seed Type: integer: Set a seed for reproducibility. Random by default.
image Type: string: Image to use as a starting frame for image to video generation.
frames Default: 81: The number of frames to generate (1 to 5 seconds)
prompt (required) Type: string: Text prompt for video generation
fast_mode Default: Balanced: Speed up generation with different levels of acceleration. Faster modes may degrade quality somewhat. The speedup is dependent on the content, so different videos may see different speedups.
resolution Default: 480p: The resolution of the video. 720p is not supported for 1.3b.
aspect_ratio Default: 16:9: The aspect ratio of the video. 16:9, 9:16, 1:1, etc.
sample_shift Type: number, Default: 8, Range: 0 - 10: Sample shift factor
sample_steps Type: integer, Default: 30, Range: 1 - 60: Number of generation steps. Fewer steps means faster generation, at the expense of output quality. 30 steps is sufficient for most prompts.
negative_prompt Type: string, Default: (empty): Things you do not want to see in your video
replicate_weights Type: string: Replicate LoRA weights to use. Leave blank to use the default weights.
lora_strength_clip Type: number, Default: 1: Strength of the LoRA applied to the CLIP model. 0.0 is no LoRA.
sample_guide_scale Type: number, Default: 5, Range: 0 - 10: Higher guide scale makes prompt adherence better, but can reduce variation
lora_strength_model Type: number, Default: 1: Strength of the LoRA applied to the model. 0.0 is no LoRA.
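
The ranges listed above can be checked on the client before a prediction is submitted. The following is a minimal sketch, not the model's own validation: the `validate_inputs` helper is hypothetical, it covers only the parameters that document a range or a required flag, and it assumes the values passed in are already numeric.

```python
# Ranges as documented in the parameter list above.
RANGES = {
    "sample_shift": (0, 10),
    "sample_steps": (1, 60),
    "sample_guide_scale": (0, 10),
}


def validate_inputs(inputs: dict) -> list[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    if not inputs.get("prompt"):
        problems.append("prompt is required")
    for name, (low, high) in RANGES.items():
        if name in inputs and not (low <= inputs[name] <= high):
            problems.append(f"{name}={inputs[name]} is outside {low} - {high}")
    steps = inputs.get("sample_steps")
    if steps is not None and not isinstance(steps, int):
        problems.append("sample_steps must be an integer")
    return problems


if __name__ == "__main__":
    print(validate_inputs({"prompt": "a cake", "sample_steps": 30}))
    print(validate_inputs({"sample_steps": 61, "sample_shift": 12}))
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.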

Output Schema

Output

Type: array • Items Type: string • Items Format: uri
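
Because the output is an array of URI strings, a client usually needs to loop over it. The sketch below uses only the standard library to check that each item is an absolute URI and to derive a local filename from it. The `output_filenames` helper and the example URL are hypothetical. Items are passed through `str()` on the assumption that some client versions return file-like wrapper objects instead of plain strings.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def output_filenames(output) -> list[str]:
    """Map each output URI to the last segment of its path."""
    names = []
    for item in output:
        uri = str(item)
        parsed = urlparse(uri)
        if not parsed.scheme or not parsed.netloc:
            raise ValueError(f"not an absolute URI: {uri!r}")
        names.append(unquote(PurePosixPath(parsed.path).name))
    return names


if __name__ == "__main__":
    # Hypothetical URL; the filename matches the one in the example logs.
    print(output_filenames(["https://example.com/out/R8_Wan_00001.mp4"]))
```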

Example Execution Logs

Random seed set to: 3395008289
2025-03-20T18:05:29Z | INFO  | [ Initiating ] chunk_size=150M dest=/tmp/tmp3ec62_04/weights url=https://replicate.delivery/xezq/HZMSOwxPxXYbCBGeQhCsMD6omrKQ9jfPxA8Xj4QUSPKC5e0oA/trained_model.tar
2025-03-20T18:05:32Z | INFO  | [ Complete ] dest=/tmp/tmp3ec62_04/weights size="359 MB" total_elapsed=2.855s url=https://replicate.delivery/xezq/HZMSOwxPxXYbCBGeQhCsMD6omrKQ9jfPxA8Xj4QUSPKC5e0oA/trained_model.tar
Checking inputs
✅ /tmp/inputs/image.png
====================================
Checking weights
✅ wan_2.1_vae.safetensors exists in ComfyUI/models/vae
✅ umt5_xxl_fp16.safetensors exists in ComfyUI/models/text_encoders
⏳ Downloading wan2.1_i2v_480p_14B_bf16.safetensors to ComfyUI/models/diffusion_models
✅ wan2.1_i2v_480p_14B_bf16.safetensors downloaded to ComfyUI/models/diffusion_models in 21.29s, size: 31270.88MB
✅ 14b_6c163d784671748856bc8aef2c1122df.safetensors exists in loras directory
✅ clip_vision_h.safetensors exists in ComfyUI/models/clip_vision
====================================
Running workflow
[ComfyUI] got prompt
Executing node 55, title: Load Image, class type: LoadImage
Executing node 56, title: Width and height for scaling image to ideal resolution 🪴, class type: Width and height for scaling image to ideal resolution 🪴
Executing node 57, title: 🔧 Image Resize, class type: ImageResize+
Executing node 60, title: Load CLIP Vision, class type: CLIPVisionLoader
[ComfyUI] Requested to load CLIPVisionModelProjection
Executing node 59, title: CLIP Vision Encode, class type: CLIPVisionEncode
[ComfyUI] loaded completely 129691.98263816834 1208.09814453125 True
Executing node 37, title: Load Diffusion Model, class type: UNETLoader
[ComfyUI] model weight dtype torch.float16, manual cast: None
[ComfyUI] model_type FLOW
Executing node 54, title: WanVideo Tea Cache (native), class type: WanVideoTeaCacheKJ
Executing node 49, title: Load LoRA, class type: LoraLoader
[ComfyUI] Requested to load WanTEModel
Executing node 6, title: CLIP Text Encode (Positive Prompt), class type: CLIPTextEncode
[ComfyUI] loaded completely 139319.35861854552 10835.4765625 True
Executing node 58, title: WanImageToVideo, class type: WanImageToVideo
Executing node 48, title: ModelSamplingSD3, class type: ModelSamplingSD3
Executing node 53, title: WanVideo Enhance A Video (native), class type: WanVideoEnhanceAVideoKJ
Executing node 3, title: KSampler, class type: KSampler
[ComfyUI] Requested to load WAN21
[ComfyUI] loaded completely 122348.68833397522 31269.802368164062 True
[ComfyUI]
[ComfyUI] 0%|          | 0/30 [00:00<?, ?it/s]
[ComfyUI] 3%|▎         | 1/30 [00:08<03:59,  8.27s/it]
[ComfyUI] 7%|▋         | 2/30 [00:18<04:26,  9.53s/it]
[ComfyUI] 10%|█         | 3/30 [00:29<04:28,  9.94s/it]
[ComfyUI] TeaCache: Initialized
[ComfyUI]
[ComfyUI] 13%|█▎        | 4/30 [00:41<04:48, 11.09s/it]
[ComfyUI] 20%|██        | 6/30 [00:52<03:13,  8.08s/it]
[ComfyUI] 23%|██▎       | 7/30 [01:03<03:22,  8.81s/it]
[ComfyUI] 30%|███       | 9/30 [01:14<02:35,  7.38s/it]
[ComfyUI] 33%|███▎      | 10/30 [01:14<01:52,  5.65s/it]
[ComfyUI] 37%|███▋      | 11/30 [01:25<02:12,  6.95s/it]
[ComfyUI] 43%|████▎     | 13/30 [01:36<01:47,  6.30s/it]
[ComfyUI] 50%|█████     | 15/30 [01:47<01:29,  5.97s/it]
[ComfyUI] 57%|█████▋    | 17/30 [01:58<01:15,  5.77s/it]
[ComfyUI] 63%|██████▎   | 19/30 [02:08<01:02,  5.65s/it]
[ComfyUI] 70%|███████   | 21/30 [02:19<00:50,  5.58s/it]
[ComfyUI] 77%|███████▋  | 23/30 [02:19<00:26,  3.84s/it]
[ComfyUI] 80%|████████  | 24/30 [02:30<00:30,  5.11s/it]
[ComfyUI] 87%|████████▋ | 26/30 [02:30<00:13,  3.38s/it]
[ComfyUI] 93%|█████████▎| 28/30 [02:41<00:08,  4.05s/it]
[ComfyUI] 97%|█████████▋| 29/30 [02:52<00:05,  5.32s/it]
[ComfyUI] 100%|██████████| 30/30 [03:03<00:00,  6.48s/it]
Executing node 8, title: VAE Decode, class type: VAEDecode
Executing node 50, title: Video Combine 🎥🅥🅗🅢, class type: VHS_VideoCombine
[ComfyUI] 100%|██████████| 30/30 [03:03<00:00,  6.10s/it]
[ComfyUI] Prompt executed in 201.50 seconds
outputs:  {'50': {'gifs': [{'filename': 'R8_Wan_00001.mp4', 'subfolder': '', 'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0, 'workflow': 'R8_Wan_00001.png', 'fullpath': '/tmp/outputs/R8_Wan_00001.mp4'}]}}
====================================
R8_Wan_00001.png
R8_Wan_00001.mp4
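
Two numbers in these logs are worth working through: the `frame_rate` of 16.0 on the outputs line, and the sampler's final progress line. The sketch below converts a frame count into a clip length and parses a tqdm-style progress line into an average time per step. The helper names are hypothetical, and the regular expression is written against the line format shown above rather than any documented log format.

```python
import re

FRAME_RATE = 16.0  # 'frame_rate' reported on the outputs line of the logs


def clip_seconds(frames: int, frame_rate: float = FRAME_RATE) -> float:
    """Length of the generated clip in seconds."""
    if frames < 1:
        raise ValueError("frames must be at least 1")
    return frames / frame_rate


def parse_progress(line: str):
    """Return (done, total, elapsed_seconds) from a tqdm-style line, or None."""
    match = re.search(r"(\d+)/(\d+) \[(\d+):(\d+)<", line)
    if match is None:
        return None
    done, total, minutes, seconds = (int(g) for g in match.groups())
    return done, total, minutes * 60 + seconds


if __name__ == "__main__":
    print(clip_seconds(81))  # the default of 81 frames
    last = "[ComfyUI] 100%|██████████| 30/30 [03:03<00:00,  6.10s/it]"
    done, total, elapsed = parse_progress(last)
    print(elapsed / done)  # average seconds per sampling step
```

With the default of 81 frames this gives 5.0625 seconds, consistent with the "1 to 5 seconds" note on the `frames` parameter, and 183 seconds over 30 steps matches the 6.10s/it on the final progress line.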

Version Details

Version ID: 995799f72c6f3ee1d96df4f7c6d0e5006d417aa025cbb5ad90d80ddffc68c0be
Version Created: March 20, 2025
