zsxkib/mmaudio-t4
About
Cost-optimized MMAudio V2 running on a T4 GPU. This version adds sound to video on T4 hardware for lower cost, synthesizing high-quality audio from the video content.

Example Output
Prompt: "waves, storm"
Performance Metrics
- Prediction Time: 37.18s
- Total Time: 340.33s
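The gap between the two figures above can be worked out directly. The short sketch below only does the arithmetic; the page does not say what the remaining time consists of, so reading it as queueing or startup overhead is an assumption, and the variable names are illustrative.

```python
# Figures copied from the Performance Metrics above.
prediction_time = 37.18   # seconds spent on the prediction itself
total_time = 340.33       # seconds from start to finish

# Time not accounted for by the prediction itself.
# Assumption: this is queueing/startup overhead; the page does not say.
overhead = round(total_time - prediction_time, 2)

# Fraction of the total that was actual prediction work (roughly 11%).
prediction_share = prediction_time / total_time

print(f"overhead: {overhead}s, prediction share: {prediction_share:.1%}")
```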
All Input Parameters
{
  "video": "https://huggingface.co/hkchengrex/MMAudio/resolve/main/examples/sora_kraken.mp4",
  "prompt": "waves, storm",
  "duration": 10,
  "num_steps": 25,
  "cfg_strength": 4.5,
  "negative_prompt": "music"
}
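As a rough illustration, the example input above could be submitted with the Replicate Python client along these lines. This is a sketch, not an official snippet: `build_input` and `run_prediction` are hypothetical helper names, the keyword defaults simply mirror the example values (they are not documented defaults), and the call assumes the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment.

```python
# Model reference built from the model name and the Version ID listed on this page.
MODEL_REF = (
    "zsxkib/mmaudio-t4:"
    "330393ae234d739f3261ae389a5506a73a1bae8c77dc6c6faebc5bde78b6e972"
)


def build_input(video_url, prompt, duration=10, num_steps=25,
                cfg_strength=4.5, negative_prompt="music"):
    """Assemble the input payload (hypothetical helper).

    The defaults mirror the example parameters on this page; they are
    not necessarily the model's own defaults.
    """
    return {
        "video": video_url,
        "prompt": prompt,
        "duration": duration,
        "num_steps": num_steps,
        "cfg_strength": cfg_strength,
        "negative_prompt": negative_prompt,
    }


def run_prediction(payload):
    """Submit the payload via the Replicate client (needs network and an API token)."""
    import replicate  # third-party: pip install replicate
    return replicate.run(MODEL_REF, input=payload)


payload = build_input(
    "https://huggingface.co/hkchengrex/MMAudio/resolve/main/examples/sora_kraken.mp4",
    "waves, storm",
)
# output = run_prediction(payload)  # uncomment to actually run the model
```

The type of the returned value depends on the client version, so inspect it before assuming it is a URL string or a file-like object.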
Input Parameters
- seed: Random seed. Use -1 or leave blank to randomize the seed.
- image: Optional image file for image-to-audio generation (experimental).
- video: Optional video file for video-to-audio generation.
- prompt: Text prompt for the generated audio.
- duration: Duration of the output in seconds.
- num_steps: Number of inference steps.
- cfg_strength: Guidance strength (CFG).
- negative_prompt: Negative prompt to avoid certain sounds.
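The seed convention described above (-1 or blank means "pick a random seed") can be sketched as follows. This is an assumption about how such a rule is typically handled, not the model's actual code; `resolve_seed` and the `max_seed` upper bound are hypothetical.

```python
import random


def resolve_seed(seed=None, max_seed=2**32 - 1):
    """Return a concrete seed (hypothetical helper).

    A seed of -1 or None (blank) is replaced by a random one; any other
    value is passed through unchanged. The max_seed bound is an assumption.
    """
    if seed is None or seed == -1:
        return random.randint(0, max_seed)
    return seed


print(resolve_seed(30956))  # fixed seed is kept as given
print(resolve_seed(-1))     # random seed, differs between runs
```

Passing a fixed seed is what makes a run repeatable; the seed the model actually used is printed at the top of the execution logs.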
Output Schema
Output
Example Execution Logs
Using seed: 30956
Processing video: /tmp/tmpke0qe0x5sora_kraken.mp4
[WARNING]: Clip video is too short: 5.00 < 10.00
[WARNING]: Truncating to 5.00 sec
[WARNING]: Sync video is too short: 4.96 < 5.00
[WARNING]: Truncating to 4.96 sec
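The warnings in this log suggest that the requested duration is capped by the length of the input video: the 10-second request is first cut to the 5.00-second clip, then to the 4.96 seconds available for sync. A minimal sketch of that behaviour, inferred from the log alone, is below; the function and parameter names are hypothetical and this is not the model's own implementation.

```python
def effective_duration(requested, clip_seconds, sync_seconds):
    """Cap the requested duration by the available video length (hypothetical helper).

    Mirrors the two truncation steps seen in the log: first against the
    clip video length, then against the sync video length.
    """
    after_clip = min(requested, clip_seconds)
    return min(after_clip, sync_seconds)


# Values taken from the log: duration 10 requested, clip 5.00 s, sync 4.96 s.
print(effective_duration(10.0, 5.00, 4.96))
```

If this reading is right, setting `duration` longer than the input video does not produce longer audio.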
Version Details
- Version ID: 330393ae234d739f3261ae389a5506a73a1bae8c77dc6c6faebc5bde78b6e972
- Version Created: April 2, 2025