zsxkib/wan-squish-1000steps
About
Example Output
Prompt:
"SQUISH-IT Cute golden retriever puppy sitting in grass with flowers. Human hands enter the frame and gently begin to squish and mold the puppy like soft dough. The puppy's fluffy fur and form gradually transform into a malleable clay-like substance as the hands shape it. The final shot shows the reshaped puppy-dough creation sitting on the grass surrounded by flowers."
Output
Performance Metrics
226.15s
Prediction Time
319.32s
Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/MgdQczBaRzH9j7EPPCr8LR56ctlvkmdXIf1GBxGPeErPGjHA/Screenshot%202025-03-19%20at%2011.24.14.png",
  "frames": 81,
  "prompt": "SQUISH-IT Cute golden retriever puppy sitting in grass with flowers. Human hands enter the frame and gently begin to squish and mold the puppy like soft dough. The puppy's fluffy fur and form gradually transform into a malleable clay-like substance as the hands shape it. The final shot shows the reshaped puppy-dough creation sitting on the grass surrounded by flowers.",
  "fast_mode": "Balanced",
  "resolution": "480p",
  "aspect_ratio": "16:9",
  "sample_shift": 8,
  "sample_steps": 30,
  "negative_prompt": "",
  "lora_strength_clip": 1,
  "sample_guide_scale": 5,
  "lora_strength_model": 0.8
}
Input Parameters
- seed
- Set a seed for reproducibility. Random by default.
- image
- Image to use as the starting frame for image-to-video generation.
- frames
- The number of frames to generate (at the 16 fps output frame rate, roughly 1 to 5 seconds of video)
- prompt (required)
- Text prompt for video generation
- fast_mode
- Speed up generation with different levels of acceleration. Faster modes may degrade quality somewhat. The speedup depends on the content, so different videos may see different gains.
- resolution
- The resolution of the video. 720p is not supported for 1.3b.
- aspect_ratio
- The aspect ratio of the video. 16:9, 9:16, 1:1, etc.
- sample_shift
- Sample shift factor
- sample_steps
- Number of generation steps. Fewer steps means faster generation, at the expense of output quality. 30 steps is sufficient for most prompts.
- negative_prompt
- Things you do not want to see in your video
- replicate_weights
- Replicate LoRA weights to use. Leave blank to use the default weights.
- lora_strength_clip
- Strength of the LoRA applied to the CLIP model. 0.0 is no LoRA.
- sample_guide_scale
- Higher guidance scale improves prompt adherence, but can reduce variation
- lora_strength_model
- Strength of the LoRA applied to the model. 0.0 is no LoRA.
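The parameters above map directly onto a Replicate API call. A minimal sketch using the `replicate` Python client; the helper function, its validation bound, and the example URL are assumptions, and pinning the version ID listed under "Version Details" is recommended for reproducible runs:

```python
def build_input(image_url: str, prompt: str, frames: int = 81, **overrides) -> dict:
    """Assemble the input payload documented above.

    81 frames at the model's 16 fps output is roughly 5 seconds of video.
    The 1..81 bound is an assumption based on the "1 to 5 seconds" note.
    """
    if not 1 <= frames <= 81:
        raise ValueError("frames must be between 1 and 81")
    payload = {
        "image": image_url,
        "prompt": prompt,
        "frames": frames,
        "fast_mode": "Balanced",
        "resolution": "480p",
        "aspect_ratio": "16:9",
        "sample_steps": 30,
        "sample_shift": 8,
        "sample_guide_scale": 5,
        "lora_strength_model": 0.8,
        "lora_strength_clip": 1,
        "negative_prompt": "",
    }
    payload.update(overrides)  # e.g. seed=..., replicate_weights=...
    return payload


def generate(image_url: str, prompt: str, **overrides):
    # Network call: needs `pip install replicate` and REPLICATE_API_TOKEN set.
    import replicate

    return replicate.run(
        "zsxkib/wan-squish-1000steps",
        input=build_input(image_url, prompt, **overrides),
    )
```

Example usage: `generate("https://example.com/puppy.png", "SQUISH-IT ...", frames=49)` returns the URL(s) of the generated video.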
Output Schema
Output
Example Execution Logs
Random seed set to: 1598448377
2025-03-19T13:07:58Z | INFO | [ Initiating ] chunk_size=150M dest=/tmp/tmp0j3_5xxe/weights url=https://replicate.delivery/xezq/9DjUprMiHXZkFRQQqy3VbZBxAzV963P08m0RZZwkw1Dt7eMKA/trained_model.tar
2025-03-19T13:08:02Z | INFO | [ Complete ] dest=/tmp/tmp0j3_5xxe/weights size="359 MB" total_elapsed=3.376s url=https://replicate.delivery/xezq/9DjUprMiHXZkFRQQqy3VbZBxAzV963P08m0RZZwkw1Dt7eMKA/trained_model.tar
Checking inputs
✅ /tmp/inputs/image.png
====================================
Checking weights
✅ umt5_xxl_fp16.safetensors exists in ComfyUI/models/text_encoders
⏳ Downloading wan2.1_i2v_480p_14B_bf16.safetensors to ComfyUI/models/diffusion_models
✅ wan2.1_i2v_480p_14B_bf16.safetensors downloaded to ComfyUI/models/diffusion_models in 18.11s, size: 31270.88MB
✅ wan_2.1_vae.safetensors exists in ComfyUI/models/vae
✅ clip_vision_h.safetensors exists in ComfyUI/models/clip_vision
✅ 14b_02c3ebfa71932e569775343580ab386c.safetensors exists in loras directory
====================================
Running workflow
[ComfyUI] got prompt
Executing node 39, title: Load VAE, class type: VAELoader
[ComfyUI] Using pytorch attention in VAE
[ComfyUI] Using pytorch attention in VAE
[ComfyUI] VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
Executing node 55, title: Load Image, class type: LoadImage
Executing node 56, title: Width and height for scaling image to ideal resolution, class type: Width and height for scaling image to ideal resolution
Executing node 57, title: 🔧 Image Resize, class type: ImageResize+
Executing node 60, title: Load CLIP Vision, class type: CLIPVisionLoader
[ComfyUI] Requested to load CLIPVisionModelProjection
Executing node 59, title: CLIP Vision Encode, class type: CLIPVisionEncode
[ComfyUI] loaded completely 141327.4875 1208.09814453125 True
Executing node 38, title: Load CLIP, class type: CLIPLoader
[ComfyUI] CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
[ComfyUI] Requested to load WanTEModel
Executing node 7, title: CLIP Text Encode (Negative Prompt), class type: CLIPTextEncode
[ComfyUI] loaded completely 139855.3869140625 10835.4765625 True
Executing node 37, title: Load Diffusion Model, class type: UNETLoader
[ComfyUI] model weight dtype torch.float16, manual cast: None
[ComfyUI] model_type FLOW
Executing node 54, title: WanVideo Tea Cache (native), class type: WanVideoTeaCacheKJ
Executing node 49, title: Load LoRA, class type: LoraLoader
[ComfyUI] Requested to load WanTEModel
Executing node 6, title: CLIP Text Encode (Positive Prompt), class type: CLIPTextEncode
[ComfyUI] loaded completely 139853.3869140625 10835.4765625 True
Executing node 58, title: WanImageToVideo, class type: WanImageToVideo
[ComfyUI] Requested to load WanVAE
[ComfyUI] loaded completely 125099.8056602478 242.02829551696777 True
Executing node 48, title: ModelSamplingSD3, class type: ModelSamplingSD3
Executing node 53, title: WanVideo Enhance A Video (native), class type: WanVideoEnhanceAVideoKJ
Executing node 3, title: KSampler, class type: KSampler
[ComfyUI] Requested to load WAN21
[ComfyUI] loaded completely 122640.68833397522 31269.802368164062 True
[ComfyUI]   0%|          | 0/30 [00:00<?, ?it/s]
[ComfyUI]   3%|▎         | 1/30 [00:08<04:03, 8.41s/it]
[ComfyUI]   7%|▋         | 2/30 [00:18<04:26, 9.53s/it]
[ComfyUI]  10%|█         | 3/30 [00:29<04:27, 9.90s/it]
[ComfyUI] TeaCache: Initialized
[ComfyUI]  13%|█▎        | 4/30 [00:41<04:46, 11.04s/it]
[ComfyUI]  20%|██        | 6/30 [00:52<03:12, 8.04s/it]
[ComfyUI]  23%|██▎       | 7/30 [01:03<03:21, 8.77s/it]
[ComfyUI]  30%|███       | 9/30 [01:14<02:32, 7.28s/it]
[ComfyUI]  37%|███▋      | 11/30 [01:24<02:04, 6.55s/it]
[ComfyUI]  43%|████▎     | 13/30 [01:35<01:44, 6.14s/it]
[ComfyUI]  50%|█████     | 15/30 [01:46<01:28, 5.88s/it]
[ComfyUI]  57%|█████▋    | 17/30 [01:57<01:14, 5.71s/it]
[ComfyUI]  63%|██████▎   | 19/30 [02:07<01:01, 5.61s/it]
[ComfyUI]  70%|███████   | 21/30 [02:18<00:49, 5.54s/it]
[ComfyUI]  77%|███████▋  | 23/30 [02:18<00:26, 3.85s/it]
[ComfyUI]  80%|████████  | 24/30 [02:29<00:30, 5.09s/it]
[ComfyUI]  87%|████████▋ | 26/30 [02:29<00:13, 3.38s/it]
[ComfyUI]  87%|████████▋ | 26/30 [02:40<00:13, 3.38s/it]
[ComfyUI]  90%|█████████ | 27/30 [02:40<00:14, 4.81s/it]
[ComfyUI]  97%|█████████▋| 29/30 [02:51<00:05, 5.01s/it]
[ComfyUI] 100%|██████████| 30/30 [03:01<00:00, 6.16s/it]
Executing node 8, title: VAE Decode, class type: VAEDecode
Executing node 50, title: Video Combine 🎥🅥🅗🅢, class type: VHS_VideoCombine
[ComfyUI] 100%|██████████| 30/30 [03:01<00:00, 6.06s/it]
[ComfyUI] Prompt executed in 204.27 seconds
outputs: {'50': {'gifs': [{'filename': 'R8_Wan_00001.mp4', 'subfolder': '', 'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0, 'workflow': 'R8_Wan_00001.png', 'fullpath': '/tmp/outputs/R8_Wan_00001.mp4'}]}}
====================================
R8_Wan_00001.png R8_Wan_00001.mp4
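As a rough sanity check on logs like the above, the tqdm-style sampler lines can be parsed to recover step counts and elapsed sampling time. A small sketch; the regex is an assumption about the progress-line format shown here, not an official log schema:

```python
import re

# Matches tqdm-style progress fragments such as
# "30/30 [03:01<00:00, 6.16s/it]"
STEP_RE = re.compile(r"(\d+)/(\d+) \[(\d+):(\d+)<[^,]+, ([\d.]+)s/it\]")


def sampling_summary(log_text: str):
    """Return (steps_done, total_steps, elapsed_seconds) from the last
    progress line in log_text, or None if no progress line is found."""
    matches = STEP_RE.findall(log_text)
    if not matches:
        return None
    done, total, mins, secs, _rate = matches[-1]
    return int(done), int(total), int(mins) * 60 + int(secs)


line = "[ComfyUI] 100%|##########| 30/30 [03:01<00:00, 6.16s/it]"
print(sampling_summary(line))  # (30, 30, 181): 30 steps in 3 min 1 s
```

Applied to the run above, this recovers 181 s of KSampler time out of the 204.27 s total prompt execution, with the remainder spent on model loading, VAE decode, and video encoding.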
Version Details
- Version ID
eedd0c093a39ec030ceb9a31ddcc6bd705515088fe2d6da867214ac1adba07e5
- Version Created
- March 19, 2025