How Audio Production Works
When you submit your script, Chirpy Studio doesn't just generate voice—it creates a complete production:
Voice Generation
Your narration is converted to natural-sounding speech with precise word timing.
Sound Design
Music and sound effects are selected based on your script's cues and mood.
Asset Gathering
The system finds the right audio files—music tracks, ambient sounds, effects.
Mixing & Mastering
Everything is combined professionally—music ducks under speech, levels are balanced, and the final mix is mastered.
Music and Sound Effect Cues
Your script can include cues that tell the system what audio to add. The AI adds these automatically, but you can edit them:
Music Cues
Describe the music you want at specific moments:
Sound Effect Cues
Add specific sounds to enhance your storytelling:
The more descriptive your cues, the better the system can match your vision. Instead of "sad music," try "melancholic cello, slow and intimate."
Sound Design Presets
Each series has a sound design preset that defines its overall audio style. Choose the one that best fits your content:
Documentary
Music at key moments only, moderate sound effects. Clear, informative delivery.
Best for: News, educational content, explainers, journalism
Fiction
Continuous music beds, rich sound effects. Immersive, cinematic feel.
Best for: Stories, drama, narrative podcasts, audio fiction
Meditation
Sparse, quiet music. Minimal effects with breathing sounds. Extra soft and calming.
Best for: Guided meditation, breathing exercises, sleep content, wellness
Interview
Music at key moments, light effects. Focus on dialogue clarity.
Best for: Conversations, Q&A content, discussion-based shows
Educational
Moderate music and effects. Engaging but not distracting.
Best for: Learning content, tutorials, how-to guides, courses
Customizing Sound Design
Presets are a starting point. You can customize individual settings for more control:
Music Policy
- Continuous or at key moments
- Allow music under dialogue
- Density (sparse to rich)
- Allow vocal music or not
Volume Levels
- Music volume
- Ambient/atmosphere volume
- Sound effects volume
- How much music ducks for speech
Sound Palette
- Preferred instruments
- Preferred SFX types
- Styles to avoid
Special Features
- Include stingers/hits
- Breathing SFX (meditation)
- Additional cues beyond script
Most users start with a preset and only customize if something doesn't sound right.
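If it helps to picture how a preset and your customizations fit together, here is a minimal Python sketch. It is an illustration only: the `SoundDesign` class, its field names, and the `customize` helper are hypothetical and are not Chirpy Studio's actual settings format. The preset values mirror the descriptions above, and the `allow_vocal_music` default is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SoundDesign:
    """Hypothetical model of a few settings; not the product's real schema."""
    music_mode: str                  # "continuous" or "key_moments"
    sfx_level: str                   # "light", "moderate", or "rich"
    allow_vocal_music: bool = True   # assumed default, for illustration only
    styles_to_avoid: tuple = ()      # e.g. ("heavy metal",)


# Values taken from the preset descriptions in this guide.
PRESETS = {
    "documentary": SoundDesign(music_mode="key_moments", sfx_level="moderate"),
    "fiction": SoundDesign(music_mode="continuous", sfx_level="rich"),
    "interview": SoundDesign(music_mode="key_moments", sfx_level="light"),
}


def customize(preset_name, **overrides):
    """Start from a preset and change only the settings named in overrides."""
    return replace(PRESETS[preset_name], **overrides)


# Keep the Fiction style, but turn off vocal music and avoid one style.
my_series = customize("fiction", allow_vocal_music=False,
                      styles_to_avoid=("heavy metal",))
print(my_series)
```

The point of the sketch is that a customization changes only the settings you name. Everything else keeps the preset's value, and the preset itself is left unchanged.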
Understanding Ducking
"Ducking" is when background audio automatically gets quieter when someone speaks, then returns to normal during pauses. This ensures your voice is always clear.
Example: Music plays at full volume during your intro. When narration starts, the music smoothly dips. When you pause, it rises again, then dips once more when narration resumes. This happens automatically throughout your episode.
Each preset has ducking settings optimized for that style. Meditation content ducks more aggressively; fiction keeps music more present.
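For readers curious about the mechanics, the behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not Chirpy Studio's actual mixer: the `SpeechSpan` type, the `music_gain_db` function, the 12 dB duck depth, and the 0.3-second fade are all assumed values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SpeechSpan:
    start: float  # seconds
    end: float    # seconds


def music_gain_db(t, spans, duck_db=-12.0, fade=0.3):
    """Return the music gain in dB at time t (0.0 means full volume)."""
    depth = 0.0  # 0.0 = not ducked, 1.0 = fully ducked
    for span in spans:
        if span.start <= t <= span.end:
            return duck_db  # someone is speaking: fully ducked
        # Distance from t to the nearest edge of this speech span.
        gap = span.start - t if t < span.start else t - span.end
        if gap < fade:
            # Linear ramp: fully ducked at the span edge, none at `fade` away.
            depth = max(depth, 1.0 - gap / fade)
    return duck_db * depth if depth > 0 else 0.0


# Narration from 2.0s to 5.0s, a short pause, then narration from 5.3s to 8.0s.
narration = [SpeechSpan(2.0, 5.0), SpeechSpan(5.3, 8.0)]
for t in (0.0, 3.0, 5.15, 9.0):
    print(f"{t:>5.2f}s  {music_gain_db(t, narration):6.1f} dB")
```

In this sketch the music is at full volume during the intro and after the narration ends, and fully ducked while someone is speaking. In the middle of the short pause it recovers only partway, because the fade out of one sentence overlaps the fade into the next.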
Audio Quality
Your final episode is professionally mastered with industry-standard settings:
Loudness target (podcast standard)
Sample rate
MP3 bitrate
These settings meet requirements for all major podcast platforms including Apple Podcasts, Spotify, and Google Podcasts.
The Production Process
When you click "Submit to TTS," here's what happens:
Generate Voice
Your script becomes spoken audio with precise word-by-word timing data.
Generate Sound Design
The system analyzes your script and creates a detailed plan for music and SFX placement.
Gather Assets
Music tracks and sound effects are selected and downloaded based on your cues and preset.
Mix & Master
Everything is combined, levels are balanced, ducking is applied, and the final mix is mastered to broadcast standards.
Tips for Great Audio
Be specific with cues
"Mysterious ambient music" gets better results than just "music." Describe the mood, instruments, and energy you want.
Less is often more
Don't overload your script with cues. Well-placed music at key moments is more effective than constant audio.
Match preset to content
The right preset makes a big difference: use Documentary for news, Fiction for stories, and Meditation for wellness.
Listen before publishing
Always preview your episode. If something sounds off, edit your script cues and regenerate.
Next Steps
Need help?
If you get stuck or have questions, reach out at support@chirpy.studio