santhoshsharuk/-Podcast-AI-Model


🎙️ Podcast AI

Generate professional, studio-quality podcasts from text scripts using high-performance AI voice models.




🤖 About The Project

Podcast AI is an open-source tool that automates audio content creation. It uses Piper TTS, a fast neural text-to-speech system, together with ONNX voice models to turn plain-text scripts into ready-to-publish podcast audio.

Whether you're a content creator looking to streamline your workflow or a developer interested in AI-powered audio generation, this project provides a solid foundation.


✨ Key Features

  • ✅ High-Fidelity Speech Synthesis: Generates natural-sounding speech from text using state-of-the-art models.
  • ✅ Fast & Efficient: Built on Piper TTS and ONNX Runtime for rapid, local inference without relying on cloud APIs.
  • ✅ Customizable Voices: Easily switch between different voice models by providing the corresponding .onnx and .json files.
  • ✅ Audio Assembly: Seamlessly combines generated speech segments, intro/outro music, and sound effects.
  • ✅ Cross-Platform: Runs on any system with Python, including Windows, macOS, and Linux.
  • ✅ Fully Open-Source: Free to use, modify, and distribute under the MIT License.

🎧 Example Output

Listen to a sample podcast generated with this tool:

➑️ Listen to final_podcast.mp3

The audio was generated from a simple script like this:

(intro_music)
Welcome to AI Spotlight, the show where we explore the latest breakthroughs in artificial intelligence.
(transition_sound)
Today, we're discussing generative models. These models can create brand new content, from text and images to music and even code. It's a revolution in creativity.
(outro_music)
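
A script in this format mixes sound cues, written in parentheses on their own line, with lines to be spoken. The sketch below shows one way such a file could be split into segments. The function name `parse_script` and the `(kind, value)` tuple layout are assumptions made for illustration; they are not taken from assemble_podcast.py.

```python
import re
from typing import List, Tuple

# Assumed convention: a line consisting only of "(name)" is a sound cue.
CUE_PATTERN = re.compile(r"^\((\w+)\)$")


def parse_script(text: str) -> List[Tuple[str, str]]:
    """Split script text into ("cue", name) and ("speech", line) segments."""
    segments: List[Tuple[str, str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines between segments
        match = CUE_PATTERN.match(line)
        if match:
            segments.append(("cue", match.group(1)))
        else:
            segments.append(("speech", line))
    return segments


sample = "(intro_music)\nWelcome to AI Spotlight.\n(outro_music)"
for kind, value in parse_script(sample):
    print(kind, value)
```

Keeping cues and speech as separate segment kinds lets a later step decide independently which lines go to the speech model and which map to audio files.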

🚀 Getting Started

Follow these steps to get the project running on your local machine.

Prerequisites

  • Python 3.8+
  • pip and venv (usually included with Python)
  • git for cloning the repository

Installation & Setup

  1. Clone the Repository

    git clone https://github.com/santhoshsharuk/podcast-ai.git
    cd podcast-ai
  2. Download Core Dependencies & Voice Models

    Large files like the Piper engine and voice models are hosted on GitHub Releases to keep the repository lightweight.

    • Go to the Releases Page.
    • Download the latest piper.zip and the desired voice model (e.g., voice-en-us-amy-medium.zip).
    • Extract them into the project directory.
  3. Organize Your Files

    Create a voices directory and place your model files inside. Your project structure should look like this:

    .
    ├── assemble_podcast.py
    ├── script.txt
    ├── voices/
    │   └── en_US-amy-medium/
    │       ├── en_US-amy-medium.onnx
    │       └── en_US-amy-medium.onnx.json
    └── piper/
        ├── piper
        └── ... (other piper files)
    
  4. Create a Virtual Environment & Install Requirements

    It's best practice to use a virtual environment to manage dependencies.

    # Create and activate the virtual environment
    python -m venv venv
    source venv/bin/activate  # On Windows, use: venv\Scripts\activate
    
    # Install the required Python packages
    pip install -r requirements.txt

You are now ready to generate your first podcast!
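
Before the first run, it can help to confirm that the layout from step 3 is in place. The following is a minimal sketch of such a check; the helper name `missing_items` and the default voice name are assumptions chosen to match the example tree, and the check is not part of the project itself.

```python
from pathlib import Path
from typing import List


def missing_items(project_root: str, voice_name: str = "en_US-amy-medium") -> List[str]:
    """Return the expected paths (relative to project_root) that do not exist."""
    root = Path(project_root)
    voice_dir = Path("voices") / voice_name
    expected = [
        Path("assemble_podcast.py"),
        Path("script.txt"),
        Path("piper"),
        voice_dir / f"{voice_name}.onnx",
        voice_dir / f"{voice_name}.onnx.json",
    ]
    return [p.as_posix() for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    problems = missing_items(".")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Project layout looks complete.")
```

Run it from the project directory; an empty result means every file and folder shown in the example tree was found.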


▢️ Usage

The main script assemble_podcast.py reads your script.txt, generates audio for each line, and combines them into a final file.
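
To illustrate the final "combine" step, here is a minimal sketch that joins WAV segments end to end using only Python's standard `wave` module. The function name `concatenate_wavs` is hypothetical, and the way assemble_podcast.py actually merges audio is not documented here. The sketch assumes every segment shares the same channel count, sample width, and sample rate, and it writes WAV only; producing MP3 would need an additional encoder.

```python
import wave
from typing import Sequence


def concatenate_wavs(input_paths: Sequence[str], output_path: str) -> None:
    """Join WAV files end to end; all inputs must share the same audio format."""
    if not input_paths:
        raise ValueError("no input files given")

    # Use the first segment as the reference format for the output file.
    with wave.open(input_paths[0], "rb") as first:
        params = first.getparams()
    reference = (params.nchannels, params.sampwidth, params.framerate)

    with wave.open(output_path, "wb") as out:
        out.setparams(params)  # the frame count is corrected automatically on write
        for path in input_paths:
            with wave.open(path, "rb") as segment:
                seg = segment.getparams()
                if (seg.nchannels, seg.sampwidth, seg.framerate) != reference:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(segment.readframes(segment.getnframes()))
```

Rejecting mismatched formats up front avoids silently producing audio that plays at the wrong speed or pitch.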

Basic Usage

Simply run the script with Python:

python assemble_podcast.py

This will:

  • Read script.txt.
  • Use the default voice model found in the voices/ directory.
  • Output the final audio to final_podcast.mp3.
  • Play a success sound upon completion.
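
The default-voice lookup and the per-line synthesis step could look roughly like the sketch below. It assumes the Piper executable reads text from standard input and accepts the `--model` and `--output_file` options, as shown in Piper's own README. The helper names, the alphabetical ordering of voice directories, and the skipping of directories without a model are assumptions, not the project's confirmed behavior.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def find_default_voice(voices_dir: str = "voices") -> Optional[Path]:
    """Return the .onnx model from the first voice directory (sorted by name)."""
    root = Path(voices_dir)
    if not root.is_dir():
        return None
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        models = sorted(folder.glob("*.onnx"))  # does not match *.onnx.json
        if models:
            return models[0]
    return None


def build_piper_command(piper_exe: str, model: Path, wav_path: str) -> List[str]:
    """Build the argument list for one Piper invocation."""
    return [piper_exe, "--model", str(model), "--output_file", wav_path]


def synthesize(text: str, piper_exe: str, model: Path, wav_path: str) -> None:
    """Send one line of text to Piper on stdin and write the speech to wav_path."""
    command = build_piper_command(piper_exe, model, wav_path)
    subprocess.run(command, input=text.encode("utf-8"), check=True)
```

A call such as `synthesize("Welcome to AI Spotlight.", "piper/piper", find_default_voice(), "line_001.wav")` would then produce one WAV file per spoken line, provided a voice model was found.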

Advanced Usage

You can customize the behavior using command-line arguments for greater flexibility.

python assemble_podcast.py \
  --script my_new_episode.txt \
  --voice voices/en_GB-alan-medium \
  --output episode_01.wav \
  --no-sound

Available Arguments:

| Argument | Shorthand | Description | Default |
| --- | --- | --- | --- |
| --script | -s | Path to the input script file. | script.txt |
| --voice | -v | Path to the voice model directory. | First directory in voices/ |
| --output | -o | Path for the final output audio file. | final_podcast.mp3 |
| --no-sound | | Disable the success sound upon completion. | N/A (flag) |
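
For readers who want to see how these options map onto code, the arguments above can be declared with Python's standard `argparse` module as in the sketch below. This illustrates the documented interface and is not a copy of the parser in assemble_podcast.py; the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line options listed in the arguments table."""
    parser = argparse.ArgumentParser(description="Assemble a podcast from a text script.")
    parser.add_argument("-s", "--script", default="script.txt",
                        help="Path to the input script file.")
    parser.add_argument("-v", "--voice", default=None,
                        help="Path to the voice model directory "
                             "(default: first directory in voices/).")
    parser.add_argument("-o", "--output", default="final_podcast.mp3",
                        help="Path for the final output audio file.")
    parser.add_argument("--no-sound", action="store_true",
                        help="Disable the success sound upon completion.")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--script", "my_new_episode.txt", "--no-sound"])
print(args.script, args.voice, args.output, args.no_sound)
```

Leaving `--voice` as `None` by default lets the program resolve the first directory in voices/ at run time rather than hard-coding a model name.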

📂 Project Structure

.
├── .gitignore              # Files to ignore for Git
├── assemble_podcast.py     # Main script to generate the podcast
├── LICENSE                 # Project license file
├── README.md               # You are here!
├── requirements.txt        # Python package dependencies
├── script.txt              # Default input text script
├── final_podcast.mp3       # Example output file
├── assets/                 # (Optional) For sounds like intros, outros
│   └── success.wav
├── piper/                  # Piper TTS engine (from releases)
└── voices/                 # Directory for voice models
    └── en_US-amy-medium/
        ├── en_US-amy-medium.onnx
        └── en_US-amy-medium.onnx.json

🀝 Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project (🍴)
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request (🚀)

Please open an issue first to discuss any major changes you would like to make.


📜 License

This project is licensed under the MIT License. See the LICENSE file for more details.


📧 Contact

Santhosh Sharuk



Generated by Podcast AI - Where words find their voice.

About

This project allows anyone to generate, process, and publish podcasts automatically with the help of AI models (speech synthesis, transcription, and enhancement).
