tencent/hunyuanvideo-foley 🖼️📝✓🔢 → 🖼️
About
(Research & Non-commercial use only) Text-Video-to-Audio Synthesis: Generate realistic audio from video and text descriptions
Example Output
Prompt: "splash of water and loud thud as person hits the surface"
Performance Metrics
- Prediction Time: 15.46s
- Total Time: 15.47s
All Input Parameters
{
  "video": "https://replicate.delivery/pbxt/Ng6XsciCNoSaPKYexAwkVvruQl3uGkMqgXoB5sLcwKNd9Vqs/8_video.mp4",
  "prompt": "splash of water and loud thud as person hits the surface",
  "return_audio": false,
  "guidance_scale": 4.5,
  "num_inference_steps": 50
}
Input Parameters
- video (required): Input video file (e.g., .mp4, .mov)
- prompt: Optional text prompt describing the scene
- neg_prompt: Negative prompt to avoid unwanted sounds
- return_audio: Return audio only
- guidance_scale: Guidance strength
- num_inference_steps: Denoising steps
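The parameters listed above can be assembled into a request payload in code. What follows is a minimal sketch, assuming the `replicate` Python client is installed and a `REPLICATE_API_TOKEN` environment variable is set. `build_foley_input` and `run_foley` are hypothetical helper names, not part of the model or the client. The default values for `guidance_scale` and `num_inference_steps` are copied from the example run on this page and may not match the model's own defaults.

```python
from typing import Any, Optional

MODEL = "tencent/hunyuanvideo-foley"
# Version ID as listed under "Version Details" on this page.
VERSION = "88045928bb97971cffefabfc05a4e55e5bb1c96d475ad4ecc3d229d9169758ae"


def build_foley_input(
    video: str,
    prompt: Optional[str] = None,
    neg_prompt: Optional[str] = None,
    return_audio: bool = False,
    guidance_scale: float = 4.5,      # value from the example run (assumed default)
    num_inference_steps: int = 50,    # value from the example run (assumed default)
) -> dict[str, Any]:
    """Build the input dict; only `video` is required (hypothetical helper)."""
    if not video:
        raise ValueError("video is required")
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")

    payload: dict[str, Any] = {
        "video": video,
        "return_audio": return_audio,
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
    }
    # Optional text fields are sent only when provided.
    if prompt is not None:
        payload["prompt"] = prompt
    if neg_prompt is not None:
        payload["neg_prompt"] = neg_prompt
    return payload


def run_foley(payload: dict[str, Any]) -> Any:
    """Submit the payload to Replicate, pinned to the version above."""
    import replicate  # imported lazily so the builder works without the client

    return replicate.run(f"{MODEL}:{VERSION}", input=payload)
```

Calling `run_foley(build_foley_input(video_url, prompt="splash of water and loud thud as person hits the surface"))` would reproduce the example request on this page, with `video_url` set to the example's video URL. Pinning the version ID keeps results tied to the build documented here.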
Output Schema
Output
Example Execution Logs
Denoising steps: 0%| | 0/50 [00:00<?, ?it/s]
Denoising steps: 2%|▏ | 1/50 [00:00<00:07, 6.57it/s]
Denoising steps: 4%|▍ | 2/50 [00:00<00:07, 6.58it/s]
Denoising steps: 6%|▌ | 3/50 [00:00<00:07, 6.57it/s]
Denoising steps: 8%|▊ | 4/50 [00:00<00:06, 6.58it/s]
Denoising steps: 10%|█ | 5/50 [00:00<00:06, 6.58it/s]
Denoising steps: 12%|█▏ | 6/50 [00:00<00:06, 6.57it/s]
Denoising steps: 14%|█▍ | 7/50 [00:01<00:06, 6.58it/s]
Denoising steps: 16%|█▌ | 8/50 [00:01<00:06, 6.56it/s]
Denoising steps: 18%|█▊ | 9/50 [00:01<00:06, 6.56it/s]
Denoising steps: 20%|██ | 10/50 [00:01<00:06, 6.56it/s]
Denoising steps: 22%|██▏ | 11/50 [00:01<00:05, 6.56it/s]
Denoising steps: 24%|██▍ | 12/50 [00:01<00:05, 6.56it/s]
Denoising steps: 26%|██▌ | 13/50 [00:01<00:05, 6.56it/s]
Denoising steps: 28%|██▊ | 14/50 [00:02<00:05, 6.55it/s]
Denoising steps: 30%|███ | 15/50 [00:02<00:05, 6.56it/s]
Denoising steps: 32%|███▏ | 16/50 [00:02<00:05, 6.56it/s]
Denoising steps: 34%|███▍ | 17/50 [00:02<00:05, 6.56it/s]
Denoising steps: 36%|███▌ | 18/50 [00:02<00:04, 6.57it/s]
Denoising steps: 38%|███▊ | 19/50 [00:02<00:04, 6.55it/s]
Denoising steps: 40%|████ | 20/50 [00:03<00:04, 6.55it/s]
Denoising steps: 42%|████▏ | 21/50 [00:03<00:04, 6.54it/s]
Denoising steps: 44%|████▍ | 22/50 [00:03<00:04, 6.55it/s]
Denoising steps: 46%|████▌ | 23/50 [00:03<00:04, 6.54it/s]
Denoising steps: 48%|████▊ | 24/50 [00:03<00:03, 6.53it/s]
Denoising steps: 50%|█████ | 25/50 [00:03<00:03, 6.54it/s]
Denoising steps: 52%|█████▏ | 26/50 [00:03<00:03, 6.55it/s]
Denoising steps: 54%|█████▍ | 27/50 [00:04<00:03, 6.56it/s]
Denoising steps: 56%|█████▌ | 28/50 [00:04<00:03, 6.57it/s]
Denoising steps: 58%|█████▊ | 29/50 [00:04<00:03, 6.57it/s]
Denoising steps: 60%|██████ | 30/50 [00:04<00:03, 6.58it/s]
Denoising steps: 62%|██████▏ | 31/50 [00:04<00:02, 6.59it/s]
Denoising steps: 64%|██████▍ | 32/50 [00:04<00:02, 6.59it/s]
Denoising steps: 66%|██████▌ | 33/50 [00:05<00:02, 6.58it/s]
Denoising steps: 68%|██████▊ | 34/50 [00:05<00:02, 6.55it/s]
Denoising steps: 70%|███████ | 35/50 [00:05<00:02, 6.56it/s]
Denoising steps: 72%|███████▏ | 36/50 [00:05<00:02, 6.58it/s]
Denoising steps: 74%|███████▍ | 37/50 [00:05<00:01, 6.58it/s]
Denoising steps: 76%|███████▌ | 38/50 [00:05<00:01, 6.58it/s]
Denoising steps: 78%|███████▊ | 39/50 [00:05<00:01, 6.58it/s]
Denoising steps: 80%|████████ | 40/50 [00:06<00:01, 6.58it/s]
Denoising steps: 82%|████████▏ | 41/50 [00:06<00:01, 6.57it/s]
Denoising steps: 84%|████████▍ | 42/50 [00:06<00:01, 6.57it/s]
Denoising steps: 86%|████████▌ | 43/50 [00:06<00:01, 6.57it/s]
Denoising steps: 88%|████████▊ | 44/50 [00:06<00:00, 6.58it/s]
Denoising steps: 90%|█████████ | 45/50 [00:06<00:00, 6.55it/s]
Denoising steps: 92%|█████████▏| 46/50 [00:07<00:00, 6.55it/s]
Denoising steps: 94%|█████████▍| 47/50 [00:07<00:00, 6.55it/s]
Denoising steps: 96%|█████████▌| 48/50 [00:07<00:00, 6.55it/s]
Denoising steps: 98%|█████████▊| 49/50 [00:07<00:00, 6.56it/s]
Denoising steps: 100%|██████████| 50/50 [00:07<00:00, 6.56it/s]
2025-09-08 17:18:55.608 | INFO | hunyuanvideo_foley.utils.media_utils:merge_audio_video:77 - Merging audio '/tmp/output.wav' with video '/tmp/tmp07d6a94i8_video.mp4'
2025-09-08 17:18:55.801 | INFO | hunyuanvideo_foley.utils.media_utils:merge_audio_video:91 - Successfully merged video saved to: /tmp/output.mp4
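The progress log reports a steady rate of about 6.56 it/s over 50 denoising steps, which works out to roughly 50 / 6.56 ≈ 7.6 s of denoising, or about half of the 15.46 s prediction time. The logs do not say where the remaining time goes. The sketch below shows one way to extract that estimate from a log string; `denoising_summary` is a hypothetical helper, and the regular expression assumes the tqdm-style format seen in this log.

```python
import re

# Matches completed tqdm entries such as "50/50 [00:07<00:00, 6.56it/s]".
# The initial "0/50 [00:00<?, ?it/s]" entry has no numeric rate and is skipped.
TQDM_RE = re.compile(r"(\d+)/(\d+) \[\d+:\d+<[^,]*, *([\d.]+)it/s\]")


def denoising_summary(log: str) -> dict:
    """Estimate denoising time from the last progress entry in a log."""
    matches = TQDM_RE.findall(log)
    if not matches:
        raise ValueError("no tqdm progress entries with a numeric rate found")
    _done, total, rate = matches[-1]
    steps = int(total)
    its_per_second = float(rate)
    return {
        "steps": steps,
        "rate": its_per_second,
        # steps divided by throughput approximates the denoising wall time
        "estimated_seconds": steps / its_per_second,
    }
```

Dividing `estimated_seconds` by the reported prediction time gives the share of the run spent in the denoising loop. The estimate uses the final reported rate, so it is only as accurate as tqdm's two-decimal rounding.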
Version Details
- Version ID: 88045928bb97971cffefabfc05a4e55e5bb1c96d475ad4ecc3d229d9169758ae
- Version Created: September 8, 2025