zsxkib/audio-flamingo-3
About
Advanced audio understanding with step-by-step reasoning

Example Output
Prompt:
"Answer"
Output
Some famous actors who started their careers on Broadway include Meryl Streep, Tom Hanks, and Julia St. Louis-Dreyfus.
Performance Metrics
- Prediction Time: 2.51s
- Total Time: 77.06s
All Input Parameters
{ "audio": "https://replicate.delivery/pbxt/NNbVqHojUTdMWRv0siI8RhfccA3lbRlk4Lu1oSWmSHJ8t6xW/voice_0.mp3", "prompt": "Answer", "max_length": 0, "temperature": 0, "system_prompt": "", "enable_thinking": true }
Input Parameters
- audio (required): Audio file to analyze. Supports speech, music, and sound effects. Maximum duration: 10 minutes.
- prompt: Question or instruction about the audio.
- end_time: End time in seconds for audio segment analysis (optional). Must be greater than start_time.
- max_length: Maximum length of the response in tokens. Use 0 for the model default, or specify 50-2048 for a custom length.
- start_time: Start time in seconds for audio segment analysis (optional). Useful for long audio files.
- temperature: Controls response creativity and randomness. Use 0.0 for deterministic output (default), 0.1-0.3 for factual analysis, 0.7-0.9 for creative interpretation.
- system_prompt: System instructions to customize the model's behavior, output format, or analysis style. Leave empty for default behavior.
- enable_thinking: Enable detailed chain-of-thought reasoning for complex analysis. False for faster responses, True for deeper insights.
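The constraints in the parameter list can be checked before a request is sent. The following is a minimal sketch, not part of the model's own code: `build_input` is a hypothetical helper, its defaults simply mirror the example request shown earlier, and omitting unset `start_time`/`end_time` keys is an assumption about how optional fields should be passed.

```python
from typing import Any, Dict, Optional


def build_input(
    audio: str,
    prompt: str = "",
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
    max_length: int = 0,
    temperature: float = 0.0,
    system_prompt: str = "",
    enable_thinking: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input dict and enforce the documented limits."""
    if not audio:
        raise ValueError("audio is required")

    # Documented rule: 0 selects the model default, otherwise 50-2048 tokens.
    if max_length != 0 and not 50 <= max_length <= 2048:
        raise ValueError("max_length must be 0 or between 50 and 2048")

    # Documented rule: end_time must be greater than start_time.
    # The comparison is only made when both values are supplied.
    if start_time is not None and end_time is not None and end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")

    payload: Dict[str, Any] = {
        "audio": audio,
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
        "system_prompt": system_prompt,
        "enable_thinking": enable_thinking,
    }
    # Assumption: optional segment bounds are left out entirely when unset.
    if start_time is not None:
        payload["start_time"] = start_time
    if end_time is not None:
        payload["end_time"] = end_time
    return payload
```

A dict built this way could then be supplied as the `input` argument to a client call such as `replicate.run` from the Replicate Python package, with the model name and the version ID listed under Version Details.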
Output Schema
Output
Example Execution Logs
{'from': 'human', 'value': [<llava.media.Sound object at 0x77675cddea70>, '<sound>\nAnswer']}
2025-07-18 14:26:16.848 | WARNING | llava.utils.media:extract_media:262 - Media token '<sound>' found in text: '<sound> Answer'. Removed.
/src/llava/mm_utils.py:592: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  masks = torch.tensor(masks[0])
Version Details
- Version ID: 419bdd5ed04ba4e4609e66cc5082f6564e9d2c0836f9a286abe74bc20a357b84
- Version Created: July 18, 2025