fofr/wan2.1-with-lora

38.9K runs · Mar 2025 · Cog 0.14.1 · GitHub · License
image-to-video lora text-to-video

About

Run Wan2.1 14B or 1.3B with a LoRA

Example Output

Prompt:

"flat color 2d animation of a portrait of woman with white hair and green eyes, dynamic scene"

Output

Performance Metrics

392.32s Prediction Time
443.47s Total Time
All Input Parameters
{
  "model": "14b",
  "frames": 81,
  "prompt": "flat color 2d animation of a portrait of woman with white hair and green eyes, dynamic scene",
  "lora_url": "https://huggingface.co/motimalu/wan-flat-color-v2/resolve/main/wan_flat_color_v2.safetensors",
  "aspect_ratio": "16:9",
  "sample_shift": 8,
  "sample_steps": 30,
  "lora_strength_clip": 1,
  "sample_guide_scale": 5,
  "lora_strength_model": 1
}
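The input payload above can be assembled and submitted from Python. The sketch below builds it with this page's example values; the `build_input` helper is ours, and the Replicate client call is shown in a comment because it needs network access and a `REPLICATE_API_TOKEN`.

```python
# Sketch: invoking this model from Python. The version hash is the one
# listed under "Version Details"; build_input is an illustrative helper.

MODEL = (
    "fofr/wan2.1-with-lora:"
    "c48fa8ec65b13143cb552ab98ea17984eab9d70e9fe99479117de40a2a7f9ed0"
)

def build_input(prompt: str, lora_url: str, **overrides) -> dict:
    """Assemble an input payload using this page's example defaults."""
    payload = {
        "model": "14b",
        "frames": 81,
        "prompt": prompt,
        "lora_url": lora_url,
        "aspect_ratio": "16:9",
        "sample_shift": 8,
        "sample_steps": 30,
        "lora_strength_clip": 1,
        "sample_guide_scale": 5,
        "lora_strength_model": 1,
    }
    payload.update(overrides)
    return payload

# With the `replicate` package installed and an API token set:
# import replicate
# output = replicate.run(MODEL, input=build_input(
#     "flat color 2d animation of a portrait of woman with white hair "
#     "and green eyes, dynamic scene",
#     "https://huggingface.co/motimalu/wan-flat-color-v2/resolve/main/"
#     "wan_flat_color_v2.safetensors",
# ))
# `output` is a list of video URIs, per the Output Schema.
```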
Input Parameters
seed Type: integer
Set a seed for reproducibility. Random by default.
image Type: string
Image to use as a starting frame for image to video generation.
model Default: 14b
The model to use. 1.3b is faster, but 14b gives better quality. A LoRA works with either 1.3b or 14b, depending on which model version it was trained for.
frames Default: 81
The number of frames to generate (1 to 5 seconds of video at 16 frames per second)
prompt (required) Type: string
Text prompt for video generation
lora_url Type: string
Optional: the URL of a LoRA to use
fast_mode Default: Balanced
Speed up generation with different levels of acceleration. Faster modes may degrade quality somewhat. The speedup depends on the content, so different videos may see different speedups.
resolution Default: 480p
The resolution of the video. 720p is not supported for 1.3b.
aspect_ratio Default: 16:9
The aspect ratio of the video. 16:9, 9:16, 1:1, etc.
sample_shift Type: number Default: 8 Range: 0 - 10
Sample shift factor
sample_steps Type: integer Default: 30 Range: 1 - 60
Number of generation steps. Fewer steps means faster generation, at the expense of output quality. 30 steps is sufficient for most prompts.
negative_prompt Type: string Default: (empty)
Things you do not want to see in your video
lora_strength_clip Type: number Default: 1
Strength of the LoRA applied to the CLIP model. 0.0 means no LoRA.
sample_guide_scale Type: number Default: 5 Range: 0 - 10
A higher guide scale improves prompt adherence, but can reduce variation.
lora_strength_model Type: number Default: 1
Strength of the LoRA applied to the model. 0.0 means no LoRA.
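The documented ranges can be checked client-side before submitting a prediction, which avoids a round trip on an invalid request. A minimal sketch; the range values come from the parameter list above, while `check_ranges` itself and its return convention are illustrative.

```python
# Validate inputs against the ranges documented on this page before
# submitting. The helper name and structure are illustrative.

RANGES = {
    "sample_shift": (0, 10),        # Range: 0 - 10
    "sample_steps": (1, 60),        # Range: 1 - 60
    "sample_guide_scale": (0, 10),  # Range: 0 - 10
}

def check_ranges(inputs: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, (lo, hi) in RANGES.items():
        if name in inputs and not (lo <= inputs[name] <= hi):
            problems.append(f"{name}={inputs[name]} outside [{lo}, {hi}]")
    # 720p is documented above as unsupported for the 1.3b model
    if inputs.get("model") == "1.3b" and inputs.get("resolution") == "720p":
        problems.append("720p is not supported for the 1.3b model")
    return problems
```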
Output Schema

Output

Type: array · Items type: string · Items format: uri

Example Execution Logs
Random seed set to: 2381169083
Checking inputs
====================================
Checking weights
Converting LoraLoader node 49 to LoraLoaderFromURL
✅ umt5_xxl_fp16.safetensors exists in ComfyUI/models/text_encoders
✅ wan_2.1_vae.safetensors exists in ComfyUI/models/vae
✅ wan2.1_t2v_14B_bf16.safetensors exists in ComfyUI/models/diffusion_models
====================================
Running workflow
[ComfyUI] got prompt
Executing node 37, title: Load Diffusion Model, class type: UNETLoader
[ComfyUI] model weight dtype torch.bfloat16, manual cast: None
[ComfyUI] model_type FLOW
Executing node 49, title: Load LoRA, class type: LoraLoaderFromURL
[ComfyUI] Requested to load WanTEModel
Executing node 6, title: CLIP Text Encode (Positive Prompt), class type: CLIPTextEncode
[ComfyUI] loaded completely 140527.45920448302 10835.4765625 True
Executing node 48, title: ModelSamplingSD3, class type: ModelSamplingSD3
Executing node 3, title: KSampler, class type: KSampler
[ComfyUI] Requested to load WAN21
[ComfyUI] loaded completely 123801.93451991271 27251.406372070312 True
[ComfyUI]
[ComfyUI] 0%|          | 0/30 [00:00<?, ?it/s]
[ComfyUI] 3%|▎         | 1/30 [00:06<03:00,  6.21s/it]
[ComfyUI] 7%|▋         | 2/30 [00:15<03:37,  7.76s/it]
[ComfyUI] 10%|█         | 3/30 [00:23<03:42,  8.26s/it]
[ComfyUI] 13%|█▎        | 4/30 [00:32<03:40,  8.50s/it]
[ComfyUI] 17%|█▋        | 5/30 [00:41<03:36,  8.64s/it]
[ComfyUI] 20%|██        | 6/30 [00:50<03:29,  8.73s/it]
[ComfyUI] 23%|██▎       | 7/30 [00:59<03:22,  8.78s/it]
[ComfyUI] 27%|██▋       | 8/30 [01:08<03:14,  8.82s/it]
[ComfyUI] 30%|███       | 9/30 [01:17<03:05,  8.85s/it]
[ComfyUI] 33%|███▎      | 10/30 [01:26<02:57,  8.86s/it]
[ComfyUI] 37%|███▋      | 11/30 [01:35<02:48,  8.88s/it]
[ComfyUI] 40%|████      | 12/30 [01:43<02:40,  8.89s/it]
[ComfyUI] 43%|████▎     | 13/30 [01:52<02:31,  8.90s/it]
[ComfyUI] 47%|████▋     | 14/30 [02:01<02:22,  8.90s/it]
[ComfyUI] 50%|█████     | 15/30 [02:10<02:13,  8.90s/it]
[ComfyUI] 53%|█████▎    | 16/30 [02:19<02:04,  8.90s/it]
[ComfyUI] 57%|█████▋    | 17/30 [02:28<01:55,  8.91s/it]
[ComfyUI] 60%|██████    | 18/30 [02:37<01:46,  8.90s/it]
[ComfyUI] 63%|██████▎   | 19/30 [02:46<01:37,  8.91s/it]
[ComfyUI] 67%|██████▋   | 20/30 [02:55<01:29,  8.91s/it]
[ComfyUI] 70%|███████   | 21/30 [03:04<01:20,  8.91s/it]
[ComfyUI] 73%|███████▎  | 22/30 [03:13<01:11,  8.91s/it]
[ComfyUI] 77%|███████▋  | 23/30 [03:21<01:02,  8.91s/it]
[ComfyUI] 80%|████████  | 24/30 [03:30<00:53,  8.91s/it]
[ComfyUI] 83%|████████▎ | 25/30 [03:39<00:44,  8.91s/it]
[ComfyUI] 87%|████████▋ | 26/30 [03:48<00:35,  8.91s/it]
[ComfyUI] 90%|█████████ | 27/30 [03:57<00:26,  8.91s/it]
[ComfyUI] 93%|█████████▎| 28/30 [04:06<00:17,  8.91s/it]
[ComfyUI] 97%|█████████▋| 29/30 [04:15<00:08,  8.91s/it]
[ComfyUI] 100%|██████████| 30/30 [04:27<00:00,  9.71s/it]
Executing node 8, title: VAE Decode, class type: VAEDecode
Executing node 50, title: Video Combine 🎥🅥🅗🅢, class type: VHS_VideoCombine
[ComfyUI] 100%|██████████| 30/30 [04:27<00:00,  8.90s/it]
[ComfyUI] Prompt executed in 392.13 seconds
outputs:  {'50': {'gifs': [{'filename': 'R8_Wan_00001.mp4', 'subfolder': '', 'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0, 'workflow': 'R8_Wan_00001.png', 'fullpath': '/tmp/outputs/R8_Wan_00001.mp4'}]}}
====================================
R8_Wan_00001.png
R8_Wan_00001.mp4
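The `outputs` dictionary in the log above maps ComfyUI node IDs to the files they produced; the video path can be extracted programmatically. A sketch based on that log line's structure; `extract_video_paths` is our own helper, not part of the model's API.

```python
# Pull video file paths out of a ComfyUI outputs dict, whose structure is
# taken from the "outputs:" log line above. Helper name is illustrative.

def extract_video_paths(outputs: dict) -> list:
    """Collect fullpath entries for video files across all output nodes."""
    paths = []
    for node_outputs in outputs.values():
        # VHS_VideoCombine reports its files under the "gifs" key
        for item in node_outputs.get("gifs", []):
            if item.get("format", "").startswith("video/"):
                paths.append(item["fullpath"])
    return paths

# The outputs dict from the example run above:
outputs = {'50': {'gifs': [{'filename': 'R8_Wan_00001.mp4', 'subfolder': '',
    'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0,
    'workflow': 'R8_Wan_00001.png',
    'fullpath': '/tmp/outputs/R8_Wan_00001.mp4'}]}}
```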
Version Details
Version ID
c48fa8ec65b13143cb552ab98ea17984eab9d70e9fe99479117de40a2a7f9ed0
Version Created
March 17, 2025