deepfates/hunyuan-joker

▶️ 88 runs 📅 Jan 2025 ⚙️ Cog 0.13.6
joker-2019 joker-style text-to-video video-lora-training

About

HunyuanVideo model fine-tuned on Joker (2019). The trigger word is "JKR"; use "A video in the style of JKR, JKR" at the beginning of your prompt for best results.
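
For reference, a minimal sketch of calling this model from the Replicate Python client. The prompt text below is illustrative; the model/version string is the one listed under Version Details at the bottom of this page, and the parameter values mirror the documented defaults.

# Minimal sketch: run this model via the Replicate Python client.
# Assumes `pip install replicate` and REPLICATE_API_TOKEN set in the environment.
import replicate

trigger = "A video in the style of JKR, JKR"
prompt = f"{trigger} A man in a rumpled red suit walks down a long concrete stairway at dusk."  # illustrative prompt

output = replicate.run(
    "deepfates/hunyuan-joker:7a1025aea09dce5abeeca5bf3555e3031eff118154b1e2e59f71b644bda98757",
    input={
        "prompt": prompt,
        "width": 640,
        "height": 360,
        "num_frames": 33,
        "frame_rate": 16,
        "steps": 50,
        "guidance_scale": 6,
    },
)
print(output)  # URI of the generated MP4 (or a file-like object, depending on client version)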

Example Output

Prompt:

"A video in the style of JKR, JKR The video clip depicts a beach scene with several people enjoying their time. In the foreground, a man with curly hair and a mustache is wearing glasses and a red and white checkered shirt. He is sitting on a beach chair and appears to be laughing or smiling, looking off to the side. In the background, there are other people sitting on the beach, some under umbrellas, and others lying on towels. The beach is populated with various beachgoers, and the atmosphere seems relaxed and leisurely. The overall scene conveys a sense of a typical day at the beach with people engaging in typical beach activities."

Output

Performance Metrics

147.97s Prediction Time
354.24s Total Time
All Input Parameters
{
  "seed": 12345,
  "steps": 50,
  "width": 640,
  "height": 360,
  "prompt": "A video in the style of JKR, JKR The video clip depicts a beach scene with several people enjoying their time. In the foreground, a man with curly hair and a mustache is wearing glasses and a red and white checkered shirt. He is sitting on a beach chair and appears to be laughing or smiling, looking off to the side. In the background, there are other people sitting on the beach, some under umbrellas, and others lying on towels. The beach is populated with various beachgoers, and the atmosphere seems relaxed and leisurely. The overall scene conveys a sense of a typical day at the beach with people engaging in typical beach activities.",
  "frame_rate": 16,
  "num_frames": 66,
  "lora_strength": 1,
  "guidance_scale": 6
}
Input Parameters
crf | Type: integer | Default: 19 | Range: 0 - 51
CRF (quality) for H.264 encoding. Lower values = higher quality.
seed | Type: integer
Set a seed for reproducibility. Random by default.
steps | Type: integer | Default: 50 | Range: 1 - 150
Number of diffusion steps.
width | Type: integer | Default: 640 | Range: 64 - 1536
Width of the generated video.
height | Type: integer | Default: 360 | Range: 64 - 1024
Height of the generated video.
prompt | Type: string | Default: (empty)
The text prompt describing your video scene.
lora_url | Type: string | Default: (empty)
A URL pointing to your LoRA .safetensors file, or a Hugging Face repo (e.g. 'user/repo'; the first .safetensors file in the repo is used). See the sketch after this list.
scheduler | Default: DPMSolverMultistepScheduler
Algorithm used to generate the video frames.
flow_shift | Type: integer | Default: 9 | Range: 0 - 20
Video continuity factor (flow).
frame_rate | Type: integer | Default: 16 | Range: 1 - 60
Video frame rate.
num_frames | Type: integer | Default: 33 | Range: 1 - 1440
How many frames (i.e. how long) the resulting video is.
enhance_end | Type: number | Default: 1 | Range: 0 - 1
When to end enhancement in the video. Must be greater than enhance_start.
enhance_start | Type: number | Default: 0 | Range: 0 - 1
When to start enhancement in the video. Must be less than enhance_end.
force_offload | Type: boolean | Default: true
Whether to force offloading of model layers to the CPU.
lora_strength | Type: number | Default: 1 | Range: -10 - 10
Scale/strength for your LoRA.
enhance_double | Type: boolean | Default: true
Apply enhancement across frame pairs.
enhance_single | Type: boolean | Default: true
Apply enhancement to individual frames.
enhance_weight | Type: number | Default: 0.3 | Range: 0 - 2
Strength of the video enhancement effect.
guidance_scale | Type: number | Default: 6 | Range: 0 - 30
Overall influence of the text prompt vs. the model.
denoise_strength | Type: number | Default: 1 | Range: 0 - 2
Controls how strongly noise is applied at each step.
replicate_weights | Type: string
A .tar file containing LoRA weights from Replicate.
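
As a rough illustration of how the LoRA and enhancement parameters fit together, here is a hedged sketch of an input payload; the Hugging Face repo name and the enhancement window values are placeholders, not recommendations.

# Hypothetical input payload combining a custom LoRA with an enhancement window.
# "some-user/some-hunyuan-lora" is a placeholder repo name, not a real model.
input_params = {
    "prompt": "A video in the style of JKR, JKR ...",  # trigger phrase first
    "lora_url": "some-user/some-hunyuan-lora",         # HF repo; the first .safetensors file is used
    "lora_strength": 1.0,
    "enhance_start": 0.2,                               # must be less than enhance_end
    "enhance_end": 0.8,                                 # must be greater than enhance_start
    "enhance_weight": 0.3,
}

assert input_params["enhance_start"] < input_params["enhance_end"]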
Output Schema

Output

Type: string | Format: uri
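
Because the output is a URI pointing to the rendered MP4, a minimal sketch for saving it locally (assuming the client returns the URI as a plain string; the URL below is a placeholder):

import urllib.request

output_uri = "https://replicate.delivery/.../HunyuanVideo_00001.mp4"  # placeholder; use the URI returned by the model
urllib.request.urlretrieve(output_uri, "joker_video.mp4")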

Example Execution Logs
Seed set to: 12345
⚠️  Adjusted dimensions from 640x360 to 640x368 to satisfy model requirements
⚠️  Adjusted frame count from 66 to 65 to satisfy model requirements
USING REPLICATE WEIGHTS (preferred method)
🎯 USING REPLICATE WEIGHTS TAR FILE 🎯
----------------------------------------
📦 Processing replicate weights tar file...
🔄 Will rename LoRA to: replicate_25190143-30c9-42ef-9c30-b6f04950d8c6.safetensors
📂 Extracting tar contents...
✅ Found lora_comfyui.safetensors in tar
✨ Successfully copied LoRA to: ComfyUI/models/loras/replicate_25190143-30c9-42ef-9c30-b6f04950d8c6.safetensors
----------------------------------------
Checking inputs
====================================
Checking weights
✅ hunyuan_video_vae_bf16.safetensors exists in ComfyUI/models/vae
✅ hunyuan_video_720_fp8_e4m3fn.safetensors exists in ComfyUI/models/diffusion_models
====================================
Running workflow
[ComfyUI] got prompt
Executing node 30, title: HunyuanVideo TextEncode, class type: HyVideoTextEncode
[ComfyUI] llm prompt attention_mask shape: torch.Size([1, 161]), masked tokens: 137
[ComfyUI] clipL prompt attention_mask shape: torch.Size([1, 77]), masked tokens: 77
Executing node 41, title: HunyuanVideo Lora Select, class type: HyVideoLoraSelect
Executing node 1, title: HunyuanVideo Model Loader, class type: HyVideoModelLoader
[ComfyUI] model_type FLOW
[ComfyUI] The config attributes {'use_flow_sigmas': True, 'prediction_type': 'flow_prediction'} were passed to FlowMatchDiscreteScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.
[ComfyUI] Using accelerate to load and assign model weights to device...
[ComfyUI] Loading LoRA: replicate_25190143-30c9-42ef-9c30-b6f04950d8c6 with strength: 1.0
[ComfyUI] Requested to load HyVideoModel
[ComfyUI] loaded completely 9.5367431640625e+25 12555.953247070312 True
[ComfyUI] Input (height, width, video_length) = (368, 640, 65)
Executing node 3, title: HunyuanVideo Sampler, class type: HyVideoSampler
[ComfyUI] The config attributes {'reverse': True, 'solver': 'euler'} were passed to DPMSolverMultistepScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.
[ComfyUI] Sampling 65 frames in 17 latents at 640x368 with 50 inference steps
[ComfyUI] Scheduler config: FrozenDict([('num_train_timesteps', 1000), ('flow_shift', 9.0), ('reverse', True), ('solver', 'euler'), ('n_tokens', None), ('_use_default_values', ['num_train_timesteps', 'n_tokens'])])
[ComfyUI]
[ComfyUI] 0%|          | 0/50 [00:00<?, ?it/s]
[ComfyUI] 2%|▏         | 1/50 [00:02<01:52,  2.30s/it]
[ComfyUI] 4%|▍         | 2/50 [00:04<01:36,  2.02s/it]
[ComfyUI] 6%|▌         | 3/50 [00:06<01:40,  2.15s/it]
[ComfyUI] 8%|▊         | 4/50 [00:08<01:41,  2.21s/it]
[ComfyUI] 10%|█         | 5/50 [00:11<01:40,  2.24s/it]
[ComfyUI] 12%|█▏        | 6/50 [00:13<01:39,  2.26s/it]
[ComfyUI] 14%|█▍        | 7/50 [00:15<01:37,  2.28s/it]
[ComfyUI] 16%|█▌        | 8/50 [00:17<01:35,  2.28s/it]
[ComfyUI] 18%|█▊        | 9/50 [00:20<01:33,  2.29s/it]
[ComfyUI] 20%|██        | 10/50 [00:22<01:31,  2.29s/it]
[ComfyUI] 22%|██▏       | 11/50 [00:24<01:29,  2.29s/it]
[ComfyUI] 24%|██▍       | 12/50 [00:27<01:27,  2.29s/it]
[ComfyUI] 26%|██▌       | 13/50 [00:29<01:24,  2.29s/it]
[ComfyUI] 28%|██▊       | 14/50 [00:31<01:22,  2.29s/it]
[ComfyUI] 30%|███       | 15/50 [00:33<01:20,  2.29s/it]
[ComfyUI] 32%|███▏      | 16/50 [00:36<01:18,  2.29s/it]
[ComfyUI] 34%|███▍      | 17/50 [00:38<01:15,  2.30s/it]
[ComfyUI] 36%|███▌      | 18/50 [00:40<01:13,  2.30s/it]
[ComfyUI] 38%|███▊      | 19/50 [00:43<01:11,  2.30s/it]
[ComfyUI] 40%|████      | 20/50 [00:45<01:08,  2.30s/it]
[ComfyUI] 42%|████▏     | 21/50 [00:47<01:06,  2.30s/it]
[ComfyUI] 44%|████▍     | 22/50 [00:50<01:04,  2.30s/it]
[ComfyUI] 46%|████▌     | 23/50 [00:52<01:01,  2.30s/it]
[ComfyUI] 48%|████▊     | 24/50 [00:54<00:59,  2.30s/it]
[ComfyUI] 50%|█████     | 25/50 [00:56<00:57,  2.30s/it]
[ComfyUI] 52%|█████▏    | 26/50 [00:59<00:55,  2.30s/it]
[ComfyUI] 54%|█████▍    | 27/50 [01:01<00:52,  2.30s/it]
[ComfyUI] 56%|█████▌    | 28/50 [01:03<00:50,  2.30s/it]
[ComfyUI] 58%|█████▊    | 29/50 [01:06<00:48,  2.30s/it]
[ComfyUI] 60%|██████    | 30/50 [01:08<00:45,  2.30s/it]
[ComfyUI] 62%|██████▏   | 31/50 [01:10<00:43,  2.30s/it]
[ComfyUI] 64%|██████▍   | 32/50 [01:13<00:41,  2.30s/it]
[ComfyUI] 66%|██████▌   | 33/50 [01:15<00:39,  2.30s/it]
[ComfyUI] 68%|██████▊   | 34/50 [01:17<00:36,  2.30s/it]
[ComfyUI] 70%|███████   | 35/50 [01:19<00:34,  2.30s/it]
[ComfyUI] 72%|███████▏  | 36/50 [01:22<00:32,  2.30s/it]
[ComfyUI] 74%|███████▍  | 37/50 [01:24<00:29,  2.30s/it]
[ComfyUI] 76%|███████▌  | 38/50 [01:26<00:27,  2.30s/it]
[ComfyUI] 78%|███████▊  | 39/50 [01:29<00:25,  2.30s/it]
[ComfyUI] 80%|████████  | 40/50 [01:31<00:22,  2.30s/it]
[ComfyUI] 82%|████████▏ | 41/50 [01:33<00:20,  2.30s/it]
[ComfyUI] 84%|████████▍ | 42/50 [01:35<00:18,  2.30s/it]
[ComfyUI] 86%|████████▌ | 43/50 [01:38<00:16,  2.30s/it]
[ComfyUI] 88%|████████▊ | 44/50 [01:40<00:13,  2.30s/it]
[ComfyUI] 90%|█████████ | 45/50 [01:42<00:11,  2.30s/it]
[ComfyUI] 92%|█████████▏| 46/50 [01:45<00:09,  2.30s/it]
[ComfyUI] 94%|█████████▍| 47/50 [01:47<00:06,  2.30s/it]
[ComfyUI] 96%|█████████▌| 48/50 [01:49<00:04,  2.30s/it]
[ComfyUI] 98%|█████████▊| 49/50 [01:52<00:02,  2.30s/it]
[ComfyUI] 100%|██████████| 50/50 [01:54<00:00,  2.30s/it]
[ComfyUI] 100%|██████████| 50/50 [01:54<00:00,  2.29s/it]
[ComfyUI] Allocated memory: memory=12.300 GB
[ComfyUI] Max allocated memory: max_memory=15.099 GB
[ComfyUI] Max reserved memory: max_reserved=16.281 GB
Executing node 5, title: HunyuanVideo Decode, class type: HyVideoDecode
[ComfyUI]
[ComfyUI] Decoding rows:   0%|          | 0/2 [00:00<?, ?it/s]
[ComfyUI] Decoding rows:  50%|█████     | 1/2 [00:01<00:01,  1.46s/it]
[ComfyUI] Decoding rows: 100%|██████████| 2/2 [00:02<00:00,  1.24s/it]
[ComfyUI] Decoding rows: 100%|██████████| 2/2 [00:02<00:00,  1.27s/it]
[ComfyUI]
[ComfyUI] Blending tiles:   0%|          | 0/2 [00:00<?, ?it/s]
[ComfyUI] Blending tiles: 100%|██████████| 2/2 [00:00<00:00, 26.04it/s]
[ComfyUI]
[ComfyUI] Decoding rows:   0%|          | 0/2 [00:00<?, ?it/s]
[ComfyUI] Decoding rows:  50%|█████     | 1/2 [00:00<00:00,  2.56it/s]
[ComfyUI] Decoding rows: 100%|██████████| 2/2 [00:00<00:00,  3.04it/s]
[ComfyUI] Decoding rows: 100%|██████████| 2/2 [00:00<00:00,  2.95it/s]
[ComfyUI]
[ComfyUI] Blending tiles:   0%|          | 0/2 [00:00<?, ?it/s]
Executing node 34, title: Video Combine 🎥🅥🅗🅢, class type: VHS_VideoCombine
[ComfyUI] Blending tiles: 100%|██████████| 2/2 [00:00<00:00, 64.97it/s]
[ComfyUI] Prompt executed in 141.79 seconds
outputs:  {'34': {'gifs': [{'filename': 'HunyuanVideo_00001.mp4', 'subfolder': '', 'type': 'output', 'format': 'video/h264-mp4', 'frame_rate': 16.0, 'workflow': 'HunyuanVideo_00001.png', 'fullpath': '/tmp/outputs/HunyuanVideo_00001.mp4'}]}}
====================================
HunyuanVideo_00001.png
HunyuanVideo_00001.mp4
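
Note the two adjustments at the top of the log: 640x360 became 640x368, and 66 frames became 65 (sampled as 17 latents). Judging from those lines alone, the model appears to require width and height divisible by 16 and a frame count of the form 4n + 1. A hedged sketch of that adjustment, inferred from the log rather than taken from the model's source:

# Assumed constraints, inferred from the example log above (not from the model source):
# - width and height rounded up to multiples of 16 (360 -> 368)
# - frame count rounded down to the nearest 4n + 1 value (66 -> 65, i.e. 17 latents)
def adjust_inputs(width: int, height: int, num_frames: int) -> tuple[int, int, int]:
    adj_w = ((width + 15) // 16) * 16
    adj_h = ((height + 15) // 16) * 16
    adj_frames = ((num_frames - 1) // 4) * 4 + 1
    return adj_w, adj_h, adj_frames

print(adjust_inputs(640, 360, 66))  # (640, 368, 65), matching the log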
Version Details
Version ID
7a1025aea09dce5abeeca5bf3555e3031eff118154b1e2e59f71b644bda98757
Version Created
January 23, 2025
Run on Replicate →