fofr/wan2.1-with-lora

38.9K runs · Mar 2025 · Cog 0.14.1 · GitHub · License
image-to-video lora text-to-video

About

Run Wan2.1 14B or 1.3B with a LoRA

Example Output

Prompt:

"flat color 2d animation of a portrait of woman with white hair and green eyes, dynamic scene"

Output

Performance Metrics

392.32s Prediction Time
443.47s Total Time
All Input Parameters
{
  "model": "14b",
  "frames": 81,
  "prompt": "flat color 2d animation of a portrait of woman with white hair and green eyes, dynamic scene",
  "lora_url": "https://huggingface.co/motimalu/wan-flat-color-v2/resolve/main/wan_flat_color_v2.safetensors",
  "aspect_ratio": "16:9",
  "sample_shift": 8,
  "sample_steps": 30,
  "lora_strength_clip": 1,
  "sample_guide_scale": 5,
  "lora_strength_model": 1
}
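The input payload above can be assembled and submitted from Python. The sketch below builds it with this page's example values; the `build_input` helper is ours, and the Replicate client call is shown in a comment because it needs network access and a `REPLICATE_API_TOKEN`.

```python
# Sketch: invoking this model from Python. The version hash is the one
# listed under "Version Details"; build_input is an illustrative helper.

MODEL = (
    "fofr/wan2.1-with-lora:"
    "c48fa8ec65b13143cb552ab98ea17984eab9d70e9fe99479117de40a2a7f9ed0"
)

def build_input(prompt: str, lora_url: str, **overrides) -> dict:
    """Assemble an input payload using this page's example defaults."""
    payload = {
        "model": "14b",
        "frames": 81,
        "prompt": prompt,
        "lora_url": lora_url,
        "aspect_ratio": "16:9",
        "sample_shift": 8,
        "sample_steps": 30,
        "lora_strength_clip": 1,
        "sample_guide_scale": 5,
        "lora_strength_model": 1,
    }
    payload.update(overrides)
    return payload

# With the `replicate` package installed and an API token set:
# import replicate
# output = replicate.run(MODEL, input=build_input(
#     "flat color 2d animation of a portrait of woman with white hair "
#     "and green eyes, dynamic scene",
#     "https://huggingface.co/motimalu/wan-flat-color-v2/resolve/main/"
#     "wan_flat_color_v2.safetensors",
# ))
# `output` is a list of video URIs, per the Output Schema.
```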
Input Parameters
seed Type: integer
Set a seed for reproducibility. Random by default.
image Type: string
Image to use as a starting frame for image to video generation.
model Default: 14b
The model to use. 1.3b is faster, but 14b gives better quality. A LoRA works with either 1.3b or 14b, depending on which model version it was trained for.
frames Default: 81
The number of frames to generate (1 to 5 seconds of video at 16 frames per second)
prompt (required) Type: string
Text prompt for video generation
lora_url Type: string
Optional: the URL of a LoRA to use
fast_mode Default: Balanced
Speed up generation with different levels of acceleration. Faster modes may degrade quality somewhat. The speedup depends on the content, so different videos may see different speedups.
resolution Default: 480p
The resolution of the video. 720p is not supported for 1.3b.
aspect_ratio Default: 16:9
The aspect ratio of the video. 16:9, 9:16, 1:1, etc.
sample_shift Type: number Default: 8 Range: 0 - 10
Sample shift factor
sample_steps Type: integer Default: 30 Range: 1 - 60
Number of generation steps. Fewer steps means faster generation, at the expense of output quality. 30 steps is sufficient for most prompts.
negative_prompt Type: string Default: (empty)
Things you do not want to see in your video
lora_strength_clip Type: number Default: 1
Strength of the LoRA applied to the CLIP model. 0.0 means no LoRA.
sample_guide_scale Type: number Default: 5 Range: 0 - 10
A higher guide scale improves prompt adherence, but can reduce variation.
lora_strength_model Type: number Default: 1
Strength of the LoRA applied to the model. 0.0 means no LoRA.
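The documented ranges can be checked client-side before submitting a prediction, which avoids a round trip on an invalid request. A minimal sketch; the range values come from the parameter list above, while `check_ranges` itself and its return convention are illustrative.

```python
# Validate inputs against the ranges documented on this page before
# submitting. The helper name and structure are illustrative.

RANGES = {
    "sample_shift": (0, 10),        # Range: 0 - 10
    "sample_steps": (1, 60),        # Range: 1 - 60
    "sample_guide_scale": (0, 10),  # Range: 0 - 10
}

def check_ranges(inputs: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, (lo, hi) in RANGES.items():
        if name in inputs and not (lo <= inputs[name] <= hi):
            problems.append(f"{name}={inputs[name]} outside [{lo}, {hi}]")
    # 720p is documented above as unsupported for the 1.3b model
    if inputs.get("model") == "1.3b" and inputs.get("resolution") == "720p":
        problems.append("720p is not supported for the 1.3b model")
    return problems
```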
Output Schema

Output

Type: array · Items type: string · Items format: uri

Example Execution Logs
Random seed set to: 2381169083
Checking inputs
====================================
Checking weights
Converting LoraLoader node 49 to LoraLoaderFromURL
✅ umt5_xxl_fp16.safetensors exists in ComfyUI/models/text_encoders
✅ wan_2.1_vae.safetensors exists in ComfyUI/models/vae
✅ wan2.1_t2v_14B_bf16.safetensors exists in ComfyUI/models/diffusion_models
====================================
Running workflow
[ComfyUI] got prompt
Executing node 37, title: Load Diffusion Model, class type: UNETLoader
[ComfyUI] model weight dtype torch.bfloat16, manual cast: None
[ComfyUI] model_type FLOW
Executing node 49, title: Load LoRA, class type: LoraLoaderFromURL
[ComfyUI] Requested to load WanTEModel
Executing node 6, title: CLIP Text Encode (Positive Prompt), class type: CLIPTextEncode
[ComfyUI] loaded completely 140527.45920448302 10835.4765625 True
Executing node 48, title: ModelSamplingSD3, class type: ModelSamplingSD3
Executing node 3, title: KSampler, class type: KSampler
[ComfyUI] Requested to load WAN21
[ComfyUI] loaded completely 123801.93451991271 27251.406372070312 True
[ComfyUI]
[ComfyUI] 0%|          | 0/30 [00:00<?, ?it/s]
[ComfyUI] 3%|▎         | 1/30 [00:06<03:00,  6.21s/it]
[ComfyUI] 7%|▋         | 2/30 [00:15<03:37,  7.76s/it]
[ComfyUI] 10%|█         | 3/30 [00:23<03:42,  8.26s/it]
[ComfyUI] 13%|█▎        | 4/30 [00:32<03:40,  8.50s/it]
[ComfyUI] 17%|█▋        | 5/30 [00:41<03:36,  8.64s/it]
[ComfyUI] 20%|██        | 6/30 [00:50<03:29,  8.73s/it]
[ComfyUI] 23%|██▎       | 7/30 [00:59<03:22,  8.78s/it]
[ComfyUI] 27%|██▋       | 8/30 [01:08<03:14,  8.82s/it]
[ComfyUI] 30%|███       | 9/30 [01:17<03:05,  8.85s/it]
[ComfyUI] 33%|███▎      | 10/30 [01:26<02:57,  8.86s/it]
[ComfyUI] 37%|███▋      | 11/30 [01:35<02:48,  8.88s/it]
[ComfyUI] 40%|████      | 12/30 [01:43<02:40,  8.89s/it]
[ComfyUI] 43%|████▎     | 13/30 [01:52<02:31,  8.90s/it]
[ComfyUI] 47%|████▋     | 14/30 [02:01<02:22,  8.90s/it]
[ComfyUI] 50%|█████     | 15/30 [02:10<02:13,  8.90s/it]
[ComfyUI] 53%|█████▎    | 16/30 [02:19<02:04,  8.90s/it]
[ComfyUI] 57%|█████▋    | 17/30 [02:28<01:55,  8.91s/it]
[ComfyUI] 60%|██████    | 18/30 [02:37<01:46,  8.90s/it]
[ComfyUI] 63%|██████▎   | 19/30 [02:46<01:37,  8.91s/it]
[ComfyUI] 67%|██████▋   | 20/30 [02:55<01:29,  8.91s/it]
[ComfyUI] 70%|███████   | 21/30 [03:04<01:20,  8.91s/it]
[ComfyUI] 73%|███████▎  | 22/30 [03:13<01:11,  8.91s/it]
[ComfyUI] 77%|███████▋  | 23/30 [03:21<01:02,  8.91s/it]
[ComfyUI] 80%|████████  | 24/30 [03:30<00:53,  8.91s/it]
[ComfyUI] 83%|████████▎ | 25/30 [03:39<00:44,  8.91s/it]
[ComfyUI] 87%|████████▋ | 26/30 [03:48<00:35,  8.91s/it]
[ComfyUI] 90%|█████████ | 27/30 [03:57<00:26,  8.91s/it]
[ComfyUI] 93%|█████████▎| 28/30 [04:06<00:17,  8.91s/it]
[ComfyUI] 97%|█████████▋| 29/30 [04:15<00:08,  8.91s/it]
[ComfyUI] 100%|██████████| 30/30 [04:27<00:00,  9.71s/it]
Executing node 8, title: VAE Decode, class type: VAEDecode
Executing node 50, title: Video Combine 🎥🅥🅗🅢, class type: VHS_VideoCombine
[ComfyUI] 100%|██████████| 30/30 [04:27<00:00,  8.90s/it]
[ComfyUI] Prompt executed in 392.13 seconds
outputs:  {'50': {'gifs': [{'filename': 'R8_Wan_00001.mp4', 'subfolder': '', 'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0, 'workflow': 'R8_Wan_00001.png', 'fullpath': '/tmp/outputs/R8_Wan_00001.mp4'}]}}
====================================
R8_Wan_00001.png
R8_Wan_00001.mp4
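The `outputs` dictionary in the log above maps ComfyUI node IDs to the files they produced; the video path can be extracted programmatically. A sketch based on that log line's structure; `extract_video_paths` is our own helper, not part of the model's API.

```python
# Pull video file paths out of a ComfyUI outputs dict, whose structure is
# taken from the "outputs:" log line above. Helper name is illustrative.

def extract_video_paths(outputs: dict) -> list:
    """Collect fullpath entries for video files across all output nodes."""
    paths = []
    for node_outputs in outputs.values():
        # VHS_VideoCombine reports its files under the "gifs" key
        for item in node_outputs.get("gifs", []):
            if item.get("format", "").startswith("video/"):
                paths.append(item["fullpath"])
    return paths

# The outputs dict from the example run above:
outputs = {'50': {'gifs': [{'filename': 'R8_Wan_00001.mp4', 'subfolder': '',
    'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0,
    'workflow': 'R8_Wan_00001.png',
    'fullpath': '/tmp/outputs/R8_Wan_00001.mp4'}]}}
```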
Version Details
Version ID
c48fa8ec65b13143cb552ab98ea17984eab9d70e9fe99479117de40a2a7f9ed0
Version Created
March 17, 2025