lucataco/hunyuan-steamboat-willie 🔢📝❓✓🖼️ → 🖼️

▶️ 117 runs 📅 Jan 2025 ⚙️ Cog 0.13.6

About

HunyuanVideo fine-tune of Walt Disney's 1928 Steamboat Willie

Example Output

Prompt:

"In the style of SWR. A black and white animated scene featuring a mouse, dressed in a sailor outfit. The mouse is standing at the helm of a ship, holding the steering wheel with both hands."


Performance Metrics

306.64s Prediction Time

306.65s Total Time

All Input Parameters

{
  "crf": 19,
  "steps": 50,
  "width": 960,
  "height": 544,
  "prompt": "In the style of SWR. A black and white animated scene featuring a mouse, dressed in a sailor outfit. The mouse is standing at the helm of a ship, holding the steering wheel with both hands.",
  "lora_url": "",
  "flow_shift": 9,
  "frame_rate": 24,
  "num_frames": 49,
  "force_offload": true,
  "lora_strength": 0.9,
  "guidance_scale": 6,
  "denoise_strength": 1
}
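
The parameters above could be sent to the model through Replicate's Python client. The following is a minimal sketch, assuming the third-party `replicate` package is installed and a `REPLICATE_API_TOKEN` is set in the environment. The version hash is the one listed under Version Details; the helper names `build_input` and `run_prediction` are illustrative only and are not part of the model.

```python
# Model reference in "owner/name:version" form; the hash is taken from Version Details.
MODEL = (
    "lucataco/hunyuan-steamboat-willie:"
    "6edfb0911970aed873d8783c8e118cbb6c8d06fb944d99331c6063ffed6d5831"
)


def build_input(prompt: str) -> dict:
    """Return the example settings from this page with the given prompt (hypothetical helper)."""
    return {
        "crf": 19,
        "steps": 50,
        "width": 960,
        "height": 544,
        "prompt": prompt,
        "lora_url": "",
        "flow_shift": 9,
        "frame_rate": 24,
        "num_frames": 49,
        "force_offload": True,
        "lora_strength": 0.9,
        "guidance_scale": 6,
        "denoise_strength": 1,
    }


def run_prediction(prompt: str):
    """Run one prediction; needs network access and an API token (hypothetical helper)."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(MODEL, input=build_input(prompt))


# Uncomment to run a real prediction (the example run on this page took about five minutes):
# print(run_prediction("In the style of SWR. A black and white animated scene featuring a mouse."))
```

The example prompt on this page begins with "In the style of SWR.", so the sketch keeps that prefix in its sample call.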

Input Parameters

crf (integer, default: 19, range: 0 - 51): CRF (quality) for H264 encoding. Lower values = higher quality.
seed (integer): Set a seed for reproducibility. Random by default.
steps (integer, default: 50, range: 1 - 150): Number of diffusion steps.
width (integer, default: 640, range: 64 - 1536): Width for the generated video.
height (integer, default: 360, range: 64 - 1024): Height for the generated video.
prompt (string, default: empty): The text prompt describing your video scene.
lora_url (string, default: empty): A URL pointing to your LoRA .safetensors file or a Hugging Face repo (e.g. 'user/repo' - uses the first .safetensors file).
scheduler (default: DPMSolverMultistepScheduler): Algorithm used to generate the video frames.
flow_shift (integer, default: 9, range: 0 - 20): Video continuity factor (flow).
frame_rate (integer, default: 16, range: 1 - 60): Video frame rate.
num_frames (integer, default: 33, range: 1 - 1440): How many frames (duration) in the resulting video.
enhance_end (number, default: 1, range: 0 - 1): When to end enhancement in the video. Must be greater than enhance_start.
enhance_start (number, default: 0, range: 0 - 1): When to start enhancement in the video. Must be less than enhance_end.
force_offload (boolean, default: true): Whether to force model layers offloaded to CPU.
lora_strength (number, default: 1, range: -10 - 10): Scale/strength for your LoRA.
enhance_double (boolean, default: true): Apply enhancement across frame pairs.
enhance_single (boolean, default: true): Apply enhancement to individual frames.
enhance_weight (number, default: 0.3, range: 0 - 2): Strength of the video enhancement effect.
guidance_scale (number, default: 6, range: 0 - 30): Overall influence of text vs. model.
denoise_strength (number, default: 1, range: 0 - 2): Controls how strongly noise is applied each step.
replicate_weights (string): A .tar file containing LoRA weights from replicate.
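
The ranges listed above can be checked on the client side before a request is sent. This is a small sketch rather than the model's own validation logic: the `RANGES` table is copied from the parameter list, and `validate_input` is a hypothetical helper name.

```python
# (min, max) bounds copied from the parameter list above.
RANGES = {
    "crf": (0, 51),
    "steps": (1, 150),
    "width": (64, 1536),
    "height": (64, 1024),
    "flow_shift": (0, 20),
    "frame_rate": (1, 60),
    "num_frames": (1, 1440),
    "enhance_end": (0, 1),
    "enhance_start": (0, 1),
    "lora_strength": (-10, 10),
    "enhance_weight": (0, 2),
    "guidance_scale": (0, 30),
    "denoise_strength": (0, 2),
}


def validate_input(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            errors.append(f"{name}={params[name]} is outside {low} - {high}")
    # The listing requires enhance_start < enhance_end; fall back to the documented defaults.
    start = params.get("enhance_start", 0)
    end = params.get("enhance_end", 1)
    if not start < end:
        errors.append("enhance_start must be less than enhance_end")
    return errors
```

Calling `validate_input({"crf": 52})` would report one problem, since 52 falls outside the listed 0 - 51 range.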

Output Schema

Output

Type: string • Format: uri
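
Because the output is a URI string, the resulting video can be fetched with standard-library tools. The sketch below assumes the URI is directly downloadable without extra authentication; `save_output` is a hypothetical helper, and the fallback file name `output.mp4` is an arbitrary choice.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def save_output(uri: str, dest_dir: str = ".") -> Path:
    """Download the file behind `uri` into `dest_dir` and return the local path."""
    # Use the last path segment of the URI as the file name, if there is one.
    name = Path(urlparse(uri).path).name or "output.mp4"
    dest = Path(dest_dir) / name
    with urllib.request.urlopen(uri) as response, open(dest, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return dest
```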

Example Execution Logs

Random seed set to: 332011597
Checking inputs
====================================
Checking weights
✅ hunyuan_video_720_fp8_e4m3fn.safetensors exists in ComfyUI/models/diffusion_models
✅ hunyuan_video_vae_bf16.safetensors exists in ComfyUI/models/vae
====================================
Running workflow
[ComfyUI] got prompt
Executing node 30, title: HunyuanVideo TextEncode, class type: HyVideoTextEncode
[ComfyUI] llm prompt attention_mask shape: torch.Size([1, 161]), masked tokens: 43
[ComfyUI] clipL prompt attention_mask shape: torch.Size([1, 77]), masked tokens: 44
[ComfyUI] Input (height, width, video_length) = (544, 960, 49)
Executing node 3, title: HunyuanVideo Sampler, class type: HyVideoSampler
[ComfyUI] Sampling 49 frames in 13 latents at 960x544 with 50 inference steps
[ComfyUI]
[ComfyUI] 0%|          | 0/50 [00:00<?, ?it/s]
[ComfyUI] 2%|▏         | 1/50 [00:04<03:16,  4.00s/it]
[ComfyUI] 4%|▍         | 2/50 [00:09<03:55,  4.90s/it]
[ComfyUI] 6%|▌         | 3/50 [00:15<04:03,  5.19s/it]
[ComfyUI] 8%|▊         | 4/50 [00:20<04:04,  5.32s/it]
[ComfyUI] 10%|█         | 5/50 [00:26<04:02,  5.40s/it]
[ComfyUI] 12%|█▏        | 6/50 [00:31<03:59,  5.45s/it]
[ComfyUI] 14%|█▍        | 7/50 [00:37<03:55,  5.48s/it]
[ComfyUI] 16%|█▌        | 8/50 [00:42<03:50,  5.49s/it]
[ComfyUI] 18%|█▊        | 9/50 [00:48<03:45,  5.50s/it]
[ComfyUI] 20%|██        | 10/50 [00:53<03:40,  5.51s/it]
[ComfyUI] 22%|██▏       | 11/50 [00:59<03:35,  5.52s/it]
[ComfyUI] 24%|██▍       | 12/50 [01:04<03:29,  5.52s/it]
[ComfyUI] 26%|██▌       | 13/50 [01:10<03:24,  5.52s/it]
[ComfyUI] 28%|██▊       | 14/50 [01:15<03:18,  5.52s/it]
[ComfyUI] 30%|███       | 15/50 [01:21<03:13,  5.52s/it]
[ComfyUI] 32%|███▏      | 16/50 [01:26<03:07,  5.53s/it]
[ComfyUI] 34%|███▍      | 17/50 [01:32<03:02,  5.53s/it]
[ComfyUI] 36%|███▌      | 18/50 [01:38<02:56,  5.53s/it]
[ComfyUI] 38%|███▊      | 19/50 [01:43<02:51,  5.53s/it]
[ComfyUI] 40%|████      | 20/50 [01:49<02:45,  5.53s/it]
[ComfyUI] 42%|████▏     | 21/50 [01:54<02:40,  5.53s/it]
[ComfyUI] 44%|████▍     | 22/50 [02:00<02:34,  5.53s/it]
[ComfyUI] 46%|████▌     | 23/50 [02:05<02:29,  5.53s/it]
[ComfyUI] 48%|████▊     | 24/50 [02:11<02:23,  5.53s/it]
[ComfyUI] 50%|█████     | 25/50 [02:16<02:18,  5.53s/it]
[ComfyUI] 52%|█████▏    | 26/50 [02:22<02:12,  5.53s/it]
[ComfyUI] 54%|█████▍    | 27/50 [02:27<02:07,  5.53s/it]
[ComfyUI] 56%|█████▌    | 28/50 [02:33<02:01,  5.53s/it]
[ComfyUI] 58%|█████▊    | 29/50 [02:38<01:56,  5.53s/it]
[ComfyUI] 60%|██████    | 30/50 [02:44<01:50,  5.53s/it]
[ComfyUI] 62%|██████▏   | 31/50 [02:49<01:45,  5.53s/it]
[ComfyUI] 64%|██████▍   | 32/50 [02:55<01:39,  5.53s/it]
[ComfyUI] 66%|██████▌   | 33/50 [03:00<01:34,  5.53s/it]
[ComfyUI] 68%|██████▊   | 34/50 [03:06<01:28,  5.53s/it]
[ComfyUI] 70%|███████   | 35/50 [03:12<01:22,  5.53s/it]
[ComfyUI] 72%|███████▏  | 36/50 [03:17<01:17,  5.53s/it]
[ComfyUI] 74%|███████▍  | 37/50 [03:23<01:11,  5.53s/it]
[ComfyUI] 76%|███████▌  | 38/50 [03:28<01:06,  5.53s/it]
[ComfyUI] 78%|███████▊  | 39/50 [03:34<01:00,  5.53s/it]
[ComfyUI] 80%|████████  | 40/50 [03:39<00:55,  5.53s/it]
[ComfyUI] 82%|████████▏ | 41/50 [03:45<00:49,  5.53s/it]
[ComfyUI] 84%|████████▍ | 42/50 [03:50<00:44,  5.53s/it]
[ComfyUI] 86%|████████▌ | 43/50 [03:56<00:38,  5.53s/it]
[ComfyUI] 88%|████████▊ | 44/50 [04:01<00:33,  5.53s/it]
[ComfyUI] 90%|█████████ | 45/50 [04:07<00:27,  5.53s/it]
[ComfyUI] 92%|█████████▏| 46/50 [04:12<00:22,  5.53s/it]
[ComfyUI] 94%|█████████▍| 47/50 [04:18<00:16,  5.53s/it]
[ComfyUI] 96%|█████████▌| 48/50 [04:23<00:11,  5.53s/it]
[ComfyUI] 98%|█████████▊| 49/50 [04:29<00:05,  5.53s/it]
[ComfyUI] 100%|██████████| 50/50 [04:34<00:00,  5.53s/it]
[ComfyUI] 100%|██████████| 50/50 [04:34<00:00,  5.50s/it]
[ComfyUI] Allocated memory: memory=12.301 GB
[ComfyUI] Max allocated memory: max_memory=18.380 GB
[ComfyUI] Max reserved memory: max_reserved=20.250 GB
Executing node 5, title: HunyuanVideo Decode, class type: HyVideoDecode
[ComfyUI]
[ComfyUI] Decoding rows:   0%|          | 0/3 [00:00<?, ?it/s]
[ComfyUI] Decoding rows:  33%|███▎      | 1/3 [00:01<00:03,  1.50s/it]
[ComfyUI] Decoding rows:  67%|██████▋   | 2/3 [00:03<00:01,  1.63s/it]
[ComfyUI] Decoding rows: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]
[ComfyUI] Decoding rows: 100%|██████████| 3/3 [00:04<00:00,  1.46s/it]
[ComfyUI]
[ComfyUI] Blending tiles:   0%|          | 0/3 [00:00<?, ?it/s]
[ComfyUI] Blending tiles:  33%|███▎      | 1/3 [00:00<00:00,  7.33it/s]
Executing node 34, title: Video Combine 🎥🅥🅗🅢, class type: VHS_VideoCombine
[ComfyUI] Blending tiles: 100%|██████████| 3/3 [00:00<00:00, 17.12it/s]
[ComfyUI] Prompt executed in 296.05 seconds
outputs:  {'34': {'gifs': [{'filename': 'HunyuanVideo_00001.mp4', 'subfolder': '', 'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 24.0, 'workflow': 'HunyuanVideo_00001.png', 'fullpath': '/tmp/outputs/HunyuanVideo_00001.mp4'}]}}
====================================
HunyuanVideo_00001.png
HunyuanVideo_00001.mp4
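
A few numbers in the logs above can be reproduced with simple arithmetic. The sketch below assumes a 4x temporal compression in the video VAE, which is consistent with the log line "Sampling 49 frames in 13 latents" but is not stated on this page; the function names are illustrative.

```python
def latent_frames(num_frames: int, temporal_stride: int = 4) -> int:
    """Latent frame count, assuming the first frame is kept and the rest are grouped by stride."""
    return (num_frames - 1) // temporal_stride + 1


def clip_seconds(num_frames: int, frame_rate: float) -> float:
    """Playback duration of the generated clip in seconds."""
    return num_frames / frame_rate


def sampling_seconds(steps: int, seconds_per_step: float) -> float:
    """Rough sampling time estimated from the average step time in the progress bar."""
    return steps * seconds_per_step


print(latent_frames(49))                # 13, matching "49 frames in 13 latents"
print(round(clip_seconds(49, 24), 2))   # 2.04 seconds of video at 24 fps
print(sampling_seconds(50, 5.50))       # 275.0, close to the 04:34 shown in the log
```

Sampling therefore accounts for most of the 296.05 seconds reported for the whole workflow, with text encoding, decoding, and video encoding making up the rest.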

Version Details

Version ID: 6edfb0911970aed873d8783c8e118cbb6c8d06fb944d99331c6063ffed6d5831
Version Created: January 11, 2025
