avocado/podcast-generator-wan

6 runs · Oct 2025 · Cog 0.14.0
podcast-generation text-generation text-to-video-with-audio

About

Forked from avocado/podcast-generator

Example Output

Output

{"audio":"https://replicate.delivery/xezq/vWaeZAj304TWAiwft6HxBB80xgDFUN8YvQeeBd1NVUr7vFLWB/tmp6kbjz4_9.mp3","image":"https://replicate.delivery/xezq/Og0Ei5m5epSZGCYfe9hoYjuV4bDXCLIDhXfQyOLgjP45vFLWB/out-0.webp","video":"https://replicate.delivery/xezq/Y1uSOyThPyJNMZuYSQVsmN48ZPNWgfpxps93KTtZ1RAfbxiVA/output.mp4","script":"So yesterday I was at the playground, right? And I learned something super important that I think all grown-ups need to know. I was on the big twisty slide, the really tall one that makes your tummy feel funny, when this kid Tyler started crying because he was scared to go down.

And you know what I did? I told him the secret. You just gotta close your eyes, hold your breath, and think about cookies the whole way down. Works every time! Tyler tried it and guess what happened? He went flying down that slide laughing his head off.

Now here's the thing - I think this works for everything scary that grown-ups do too. Like when my mom has to talk to mean people on the phone, or when dad has to fix the scary noise the car makes. Just close your eyes, hold your breath, and think about cookies.

I mean, it got Tyler down the slide, and last week it helped me eat my vegetables, so basically I'm pretty sure I've figured out life. You're welcome, everybody."}

Performance Metrics

662.00s Prediction Time
662.38s Total Time
All Input Parameters
{
  "voice_id": "Casual_Guy",
  "story_prompt": "a toddler giving questionable advice based on an experience at a playground",
  "podcast_style": "a boy toddler professional studio podcast with a single host speaking into a microphone, modern setup, warm lighting"
}
Input Parameters
voice_id Type: string Default: Friendly_Person
Voice ID for the podcast host. Options: Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl
story_prompt (required) Type: string
Story prompt to generate a podcast about
podcast_style Type: string Default: professional studio podcast with a single host speaking into a microphone, modern setup, warm lighting
Description of the podcast setting/style for image generation
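
The parameters above can be assembled into an input payload before a run is submitted. The following is a minimal sketch, not the model's own code: `build_input`, `VOICE_IDS`, and `DEFAULT_STYLE` are hypothetical names, and the checks only mirror the option list and defaults documented above.

```python
# Hypothetical helper (not part of the model): builds the input payload
# from the three documented parameters and rejects undocumented voices.
VOICE_IDS = {
    "Wise_Woman", "Friendly_Person", "Inspirational_girl", "Deep_Voice_Man",
    "Calm_Woman", "Casual_Guy", "Lively_Girl", "Patient_Man", "Young_Knight",
    "Determined_Man", "Lovely_Girl", "Decent_Boy", "Imposing_Manner",
    "Elegant_Man", "Abbess", "Sweet_Girl_2", "Exuberant_Girl",
}

# Default podcast_style as listed in the parameter reference.
DEFAULT_STYLE = (
    "professional studio podcast with a single host speaking into a "
    "microphone, modern setup, warm lighting"
)


def build_input(story_prompt, voice_id="Friendly_Person",
                podcast_style=DEFAULT_STYLE):
    """Return an input dict for the model, validating documented constraints."""
    if not story_prompt or not story_prompt.strip():
        raise ValueError("story_prompt is required")
    if voice_id not in VOICE_IDS:
        raise ValueError(f"unknown voice_id: {voice_id!r}")
    return {
        "voice_id": voice_id,
        "story_prompt": story_prompt,
        "podcast_style": podcast_style,
    }


payload = build_input(
    "a toddler giving questionable advice based on an experience at a playground",
    voice_id="Casual_Guy",
)
print(payload)
```

A dict built this way could then be passed as the `input` argument to the Replicate Python client's `replicate.run`, along with the model name and version ID; that call needs network access and an API token, so it is left out of the sketch.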
Output Schema
audio Type: string Format: uri
Audio
image Type: string Format: uri
Image
video Type: string Format: uri
Video
script Type: string
Script
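
A client consuming this schema may want to check the result before downloading anything. The sketch below is one possible way to do that, assuming the output has already been decoded into a Python dict; `PodcastOutput` and `parse_output` are hypothetical names, and the URLs in the usage lines are placeholders rather than real outputs.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class PodcastOutput:
    """Typed view of the four output fields listed in the schema."""
    audio: str
    image: str
    video: str
    script: str


def parse_output(raw):
    """Validate a decoded output dict and return a PodcastOutput."""
    for key in ("audio", "image", "video"):
        parsed = urlparse(raw[key])
        # The schema marks these fields as URIs, so require scheme and host.
        if not (parsed.scheme and parsed.netloc):
            raise ValueError(f"{key} is not an absolute URI: {raw[key]!r}")
    if not isinstance(raw["script"], str):
        raise TypeError("script must be a string")
    return PodcastOutput(raw["audio"], raw["image"], raw["video"], raw["script"])


# Placeholder values for illustration only.
result = parse_output({
    "audio": "https://example.com/podcast.mp3",
    "image": "https://example.com/host.webp",
    "video": "https://example.com/output.mp4",
    "script": "So yesterday I was at the playground, right?",
})
print(result.video)
```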
Example Execution Logs
Generating podcast script...
Script generated (963 characters)
Generating speech audio...
Audio generated: /tmp/cog-runner-tmp-3494772783/2636c3b8baac5a52/tmp6kbjz4_9.mp3
Generating podcast host image...
Image generated: /tmp/cog-runner-tmp-3494772783/bf48b9294ee1f048/out-0.webp
Getting audio duration...
Audio duration: 52.67 seconds
Splitting audio into 4 chunks...
ffmpeg version 7.1.1 Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 13.2.1 (Alpine 13.2.1_git20240309) 20240309
  configuration: --pkg-config-flags=--static --extra-cflags=-fopenmp --extra-ldflags='-fopenmp -Wl,--allow-multiple-definition -Wl,-z,stack-size=2097152' --toolchain=hardened --disable-debug --disable-shared --disable-ffplay --enable-static --enable-gpl --enable-version3 --enable-fontconfig --enable-gray --enable-iconv --enable-lcms2 --enable-libaom --enable-libaribb24 --enable-libass --enable-libbluray --enable-libdav1d --enable-libdavs2 --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libharfbuzz --enable-libjxl --enable-libkvazaar --enable-libmodplug --enable-libmp3lame --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librabbitmq --enable-librav1e --enable-librsvg --enable-librtmp --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libuavs3d --enable-libvidstab --enable-libvmaf --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpl --enable-libvpx --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxevd --enable-libxeve --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-openssl
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.101 / 61. 19.101
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
[mp3 @ 0x796df42abac0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '/tmp/cog-runner-tmp-3494772783/2636c3b8baac5a52/tmp6kbjz4_9.mp3':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:52.67, start: 0.000000, bitrate: 128 kb/s
  Stream #0:0: Audio: mp3 (mp3float), 32000 Hz, mono, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Output #0, mp3, to '/tmp/audio_chunk_0.mp3':
  Metadata:
    TSSE            : Lavf61.7.100
  Stream #0:0: Audio: mp3, 32000 Hz, mono, fltp, 128 kb/s
Press [q] to stop, [?] for help
[out#0/mp3 @ 0x796df42a8580] video:0KiB audio:235KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.258127%
size=     235KiB time=00:00:15.01 bitrate= 128.3kbits/s speed=2.24e+03x    
Created chunk 1/4: /tmp/audio_chunk_0.mp3
[mp3 @ 0x7d9c15d3aac0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '/tmp/cog-runner-tmp-3494772783/2636c3b8baac5a52/tmp6kbjz4_9.mp3':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:52.67, start: 0.000000, bitrate: 128 kb/s
  Stream #0:0: Audio: mp3 (mp3float), 32000 Hz, mono, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Output #0, mp3, to '/tmp/audio_chunk_1.mp3':
  Metadata:
    TSSE            : Lavf61.7.100
  Stream #0:0: Audio: mp3, 32000 Hz, mono, fltp, 128 kb/s
Press [q] to stop, [?] for help
[out#0/mp3 @ 0x7d9c15d37580] video:0KiB audio:235KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.257509%
size=     236KiB time=00:00:15.02 bitrate= 128.5kbits/s speed=2.24e+03x    
Created chunk 2/4: /tmp/audio_chunk_1.mp3
[mp3 @ 0x710708f4cac0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '/tmp/cog-runner-tmp-3494772783/2636c3b8baac5a52/tmp6kbjz4_9.mp3':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:52.67, start: 0.000000, bitrate: 128 kb/s
  Stream #0:0: Audio: mp3 (mp3float), 32000 Hz, mono, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Output #0, mp3, to '/tmp/audio_chunk_2.mp3':
  Metadata:
    TSSE            : Lavf61.7.100
  Stream #0:0: Audio: mp3, 32000 Hz, mono, fltp, 128 kb/s
Press [q] to stop, [?] for help
[out#0/mp3 @ 0x710708f49580] video:0KiB audio:235KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.258127%
size=     235KiB time=00:00:15.00 bitrate= 128.4kbits/s speed=1.92e+03x    
Created chunk 3/4: /tmp/audio_chunk_2.mp3
[mp3 @ 0x7cdc5efebac0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '/tmp/cog-runner-tmp-3494772783/2636c3b8baac5a52/tmp6kbjz4_9.mp3':
  Metadata:
    encoder         : Lavf58.29.100
  Duration: 00:00:52.67, start: 0.000000, bitrate: 128 kb/s
  Stream #0:0: Audio: mp3 (mp3float), 32000 Hz, mono, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Output #0, mp3, to '/tmp/audio_chunk_3.mp3':
  Metadata:
    TSSE            : Lavf61.7.100
  Stream #0:0: Audio: mp3, 32000 Hz, mono, fltp, 128 kb/s
Press [q] to stop, [?] for help
[out#0/mp3 @ 0x7cdc5efe8580] video:0KiB audio:120KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.505347%
size=     120KiB time=00:00:07.66 bitrate= 128.6kbits/s speed=3.14e+03x    
Created chunk 4/4: /tmp/audio_chunk_3.mp3
Starting 4 video generation jobs in parallel...
Starting video generation for chunk 1/4...
Chunk 1/4 job started
Starting video generation for chunk 2/4...
Chunk 2/4 job started
Starting video generation for chunk 3/4...
Chunk 3/4 job started
Starting video generation for chunk 4/4...
Chunk 4/4 job started
Waiting for all 4 videos to complete...
Waiting for chunk 1/4...
Chunk 1/4 completed: /tmp/cog-runner-tmp-3494772783/1a1a665addce1a35/output.mp4
Waiting for chunk 2/4...
Chunk 2/4 completed: /tmp/cog-runner-tmp-3494772783/fd4891c5b91e728d/output.mp4
Waiting for chunk 3/4...
Chunk 3/4 completed: /tmp/cog-runner-tmp-3494772783/24ece0ea582576c5/output.mp4
Waiting for chunk 4/4...
Chunk 4/4 completed: /tmp/cog-runner-tmp-3494772783/6bd8919b98f40a43/output.mp4
Merging video chunks together...
Final video merged: /tmp/cog-runner-tmp-3494772783/6f484fc5965778a1/output.mp4
Adding captions to video...
Captions added: /tmp/cog-runner-tmp-3494772783/f2a7f2a728db9acc/output.mp4
Pipeline complete!
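
The logs above report a 52.67-second audio file split into 4 chunks of roughly 15.01, 15.02, 15.00, and 7.66 seconds, each written by stream copy. That pattern is consistent with fixed 15-second windows, although the model's actual splitting code is not shown. The sketch below illustrates the arithmetic under that assumption; `plan_chunks` and `ffmpeg_split_command` are hypothetical helpers, the 15-second window is inferred from the logs, and the ffmpeg arguments are a plausible choice rather than the ones the pipeline necessarily uses.

```python
import math


def plan_chunks(duration, chunk_seconds=15.0):
    """Return (start, length) pairs, in seconds, that cover `duration`."""
    count = math.ceil(duration / chunk_seconds)
    chunks = []
    for i in range(count):
        start = i * chunk_seconds
        # The last chunk is shorter when duration is not a multiple of the window.
        chunks.append((start, min(chunk_seconds, duration - start)))
    return chunks


def ffmpeg_split_command(src, start, length, dst):
    """Build an ffmpeg argument list that copies one chunk without re-encoding."""
    return ["ffmpeg", "-y", "-ss", f"{start:.2f}", "-t", f"{length:.2f}",
            "-i", src, "-c", "copy", dst]


# Assumed inputs: the duration from the logs and a placeholder source path.
chunks = plan_chunks(52.67)
for index, (start, length) in enumerate(chunks):
    command = ffmpeg_split_command(
        "input.mp3", start, length, f"/tmp/audio_chunk_{index}.mp3"
    )
    print(" ".join(command))
```

Each argument list could be handed to `subprocess.run` to produce the chunk files. Because the logs show the video jobs starting in parallel and then being awaited in chunk order, a caller would also need to keep results indexed by chunk number so the final merge preserves the original sequence.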
Version Details
Version ID
7e8ed72bb89f952c84f0ea66aa0711fb698fce2b0821526db1cde91719bfe05a
Version Created
October 25, 2025