🎙️ Podcast AI

Generate professional, studio-quality podcasts from text scripts using high-performance AI voice models.

📖 Table of Contents

🤖 About The Project
✨ Key Features
🎧 Example Output
🚀 Getting Started
- Prerequisites
- Installation & Setup
▶️ Usage
- Basic Usage
- Advanced Usage
📂 Project Structure
🤝 Contributing
📜 License
📧 Contact

🤖 About The Project

Podcast AI is an open-source tool that automates audio content creation. It uses Piper TTS, a fast neural text-to-speech system that runs ONNX voice models, to turn plain text scripts into ready-to-publish podcasts.

Whether you're a content creator looking to streamline your workflow or a developer interested in AI-powered audio generation, this project provides a solid foundation.

✨ Key Features

✅ High-Fidelity Speech Synthesis: Generates natural-sounding speech from text using Piper's neural voice models.
✅ Fast & Efficient: Built on Piper TTS and ONNX Runtime for rapid, local inference without relying on cloud APIs.
✅ Customizable Voices: Easily switch between different voice models by providing the corresponding .onnx and .json files.
✅ Audio Assembly: Seamlessly combines generated speech segments, intro/outro music, and sound effects.
✅ Cross-Platform: Runs on Windows, macOS, and Linux with Python 3.8 or newer.
✅ Fully Open-Source: Free to use, modify, and distribute under the MIT License.

🎧 Example Output

Listen to a sample podcast generated with this tool:

➡️ Listen to final_podcast.mp3

The audio was generated from a simple script like this:

(intro_music)
Welcome to AI Spotlight, the show where we explore the latest breakthroughs in artificial intelligence.
(transition_sound)
Today, we're discussing generative models. These models can create brand new content, from text and images to music and even code. It's a revolution in creativity.
(outro_music)
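
The script format above mixes sound cues and spoken lines. As a rough illustration of how such a file could be split into segments, here is a minimal sketch. It assumes a cue is any line that consists only of a name in parentheses; the function name parse_script and the segment labels are hypothetical and are not taken from assemble_podcast.py.

```python
import re
from typing import List, Tuple

# Assumption: a cue is a line made up only of a name in parentheses,
# e.g. "(intro_music)". Every other non-blank line is spoken text.
CUE_PATTERN = re.compile(r"^\((\w+)\)$")


def parse_script(text: str) -> List[Tuple[str, str]]:
    """Split a script into ("cue", name) and ("speech", text) segments."""
    segments: List[Tuple[str, str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines carry no content
        match = CUE_PATTERN.match(line)
        if match:
            segments.append(("cue", match.group(1)))
        else:
            segments.append(("speech", line))
    return segments


if __name__ == "__main__":
    sample = "(intro_music)\nWelcome to AI Spotlight.\n(outro_music)\n"
    for kind, value in parse_script(sample):
        print(kind, value)
```

Keeping cues and speech in one ordered list preserves the playback order, so a later step can map each cue name to a sound file and each speech line to a synthesized clip.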

🚀 Getting Started

Follow these steps to get the project running on your local machine.

Prerequisites

- Python 3.8+
- pip and venv (usually included with Python)
- git for cloning the repository

Installation & Setup

Clone the Repository

git clone https://github.com/santhoshsharuk/podcast-ai.git
cd podcast-ai

Download Core Dependencies & Voice Models

Large files like the Piper engine and voice models are hosted on GitHub Releases to keep the repository lightweight.
- Go to the Releases Page.
- Download the latest piper.zip and the desired voice model (e.g., voice-en-us-amy-medium.zip).
- Extract them into the project directory.

Organize Your Files

Create a voices directory and place your model files inside. Your project structure should look like this:

.
├── assemble_podcast.py
├── script.txt
├── voices/
│   └── en_US-amy-medium/
│       ├── en_US-amy-medium.onnx
│       └── en_US-amy-medium.onnx.json
└── piper/
    ├── piper
    └── ... (other piper files)
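
A quick way to confirm the layout above is to check that each voice directory holds both model files. The sketch below is one possible check, assuming the naming pattern shown in the tree (a directory named after the voice, containing <name>.onnx and <name>.onnx.json). The helper find_voice_dirs is hypothetical and is not part of the project.

```python
from pathlib import Path
from typing import List


def find_voice_dirs(voices_root: Path) -> List[Path]:
    """Return voice directories that contain <name>.onnx and <name>.onnx.json."""
    if not voices_root.is_dir():
        return []
    usable: List[Path] = []
    # Sort so the result order is stable across platforms.
    for voice_dir in sorted(p for p in voices_root.iterdir() if p.is_dir()):
        model = voice_dir / f"{voice_dir.name}.onnx"
        config = voice_dir / f"{voice_dir.name}.onnx.json"
        if model.is_file() and config.is_file():
            usable.append(voice_dir)
    return usable


if __name__ == "__main__":
    for voice in find_voice_dirs(Path("voices")):
        print("usable voice:", voice.name)
```

If the list comes back empty, the most likely causes are a missing .onnx.json file or an archive that was extracted into an extra nested folder.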

Create a Virtual Environment & Install Requirements

It's best practice to use a virtual environment to manage dependencies.

# Create and activate the virtual environment
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate

# Install the required Python packages
pip install -r requirements.txt

You are now ready to generate your first podcast!

▶️ Usage

The main script assemble_podcast.py reads your script.txt, synthesizes audio for each line of text, and combines the resulting segments into a single output file.
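
To make the per-line synthesis step concrete, here is a minimal sketch of calling the Piper executable from Python. It relies on Piper's documented command-line usage (text on standard input, with the --model and --output_file options). The function names and the paths in the usage comment are illustrative assumptions, and assemble_podcast.py may invoke Piper differently.

```python
import subprocess
from pathlib import Path
from typing import List


def build_piper_command(piper_bin: Path, model: Path, output_wav: Path) -> List[str]:
    """Build the argument list for a single Piper invocation."""
    return [str(piper_bin), "--model", str(model), "--output_file", str(output_wav)]


def synthesize_line(text: str, piper_bin: Path, model: Path, output_wav: Path) -> None:
    """Send one line of text to Piper on stdin and write the result as a WAV file."""
    command = build_piper_command(piper_bin, model, output_wav)
    # check=True raises CalledProcessError if Piper exits with a non-zero status.
    subprocess.run(command, input=text.encode("utf-8"), check=True)


# Example call (assumes the directory layout from the setup section):
# synthesize_line(
#     "Welcome to AI Spotlight.",
#     Path("piper/piper"),
#     Path("voices/en_US-amy-medium/en_US-amy-medium.onnx"),
#     Path("segment_000.wav"),
# )
```

Passing the command as a list rather than a shell string avoids quoting problems when a script line contains apostrophes or other shell metacharacters.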

Basic Usage

Simply run the script with Python:

python assemble_podcast.py

This will:

- Read script.txt.
- Use the default voice model found in the voices/ directory.
- Output the final audio to final_podcast.mp3.
- Play a success sound upon completion.
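
The step that joins segments into one file can be sketched with Python's standard wave module. This is a simplified illustration, not the project's actual assembly code: it assumes every segment is a WAV file with the same channel count, sample width, and sample rate, and it writes WAV only. Producing final_podcast.mp3 would need a separate encoding step that is not shown here. The function name concatenate_wavs is hypothetical.

```python
import wave
from pathlib import Path
from typing import Sequence


def concatenate_wavs(segment_paths: Sequence[Path], output_path: Path) -> None:
    """Join WAV files end to end; all inputs must share one audio format."""
    if not segment_paths:
        raise ValueError("no segments to concatenate")
    with wave.open(str(segment_paths[0]), "rb") as first:
        # (channels, sample width in bytes, frame rate) of the first segment
        reference = first.getparams()[:3]
    with wave.open(str(output_path), "wb") as out:
        out.setnchannels(reference[0])
        out.setsampwidth(reference[1])
        out.setframerate(reference[2])
        for path in segment_paths:
            with wave.open(str(path), "rb") as segment:
                if segment.getparams()[:3] != reference:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(segment.readframes(segment.getnframes()))
```

The format check matters because intro music and synthesized speech often differ in sample rate; mixing them without resampling would change the pitch and speed of one of them.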

Advanced Usage

You can customize the behavior using command-line arguments for greater flexibility.

python assemble_podcast.py \
  --script my_new_episode.txt \
  --voice voices/en_GB-alan-medium \
  --output episode_01.wav \
  --no-sound

Available Arguments:

| Argument | Shorthand | Description | Default |
| --- | --- | --- | --- |
| `--script` | `-s` | Path to the input script file. | `script.txt` |
| `--voice` | `-v` | Path to the voice model directory. | First directory in `voices/` |
| `--output` | `-o` | Path for the final output audio file. | `final_podcast.mp3` |
| `--no-sound` | | Disable the success sound upon completion. | N/A (flag) |
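
For readers who want to extend the command-line interface, the arguments listed above could be declared with argparse roughly as follows. This is a sketch that mirrors the documented names and defaults; the real parser in assemble_podcast.py may be organized differently. Here a voice value of None stands in for "use the first directory in voices/", which is an assumption about how the default is resolved.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line options described in the arguments list."""
    parser = argparse.ArgumentParser(
        description="Generate a podcast from a text script."
    )
    parser.add_argument("-s", "--script", default="script.txt",
                        help="Path to the input script file.")
    # None means: fall back to the first directory found in voices/ (assumed).
    parser.add_argument("-v", "--voice", default=None,
                        help="Path to the voice model directory.")
    parser.add_argument("-o", "--output", default="final_podcast.mp3",
                        help="Path for the final output audio file.")
    parser.add_argument("--no-sound", action="store_true",
                        help="Disable the success sound upon completion.")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the demo does not depend on sys.argv.
    demo = build_parser().parse_args(["-s", "my_new_episode.txt", "--no-sound"])
    print(demo.script, demo.voice, demo.output, demo.no_sound)
```

argparse converts --no-sound into the attribute no_sound, so the calling code would test args.no_sound before playing the completion sound.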

📂 Project Structure

.
├── .gitignore              # Files to ignore for Git
├── assemble_podcast.py     # Main script to generate the podcast
├── LICENSE                 # Project license file
├── README.md               # You are here!
├── requirements.txt        # Python package dependencies
├── script.txt              # Default input text script
├── final_podcast.mp3       # Example output file
├── assets/                 # (Optional) For sounds like intros, outros
│   └── success.wav
├── piper/                  # Piper TTS engine (from releases)
└── voices/                 # Directory for voice models
    └── en_US-amy-medium/
        ├── en_US-amy-medium.onnx
        └── en_US-amy-medium.onnx.json

🤝 Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

- Fork the Project (🍴)
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request (🚀)

Please open an issue first to discuss any major changes you would like to make.

📜 License

This project is licensed under the MIT License. See the LICENSE file for more details.

📧 Contact

Santhosh Sharuk

Generated by Podcast AI - Where words find their voice.
