Quickstart

Prerequisites

Before you begin, ensure you have the following installed:

Python 3.8+ with pip
Node.js 16+ with npm
FFmpeg (for video processing)
Git (to clone the repository)
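
The tool checks above can be scripted. The following is a minimal sketch, not part of the project: it assumes the tools are exposed under the command names node, npm, ffmpeg, and git, and it only checks the Python interpreter that runs the script.

```python
import shutil
import sys

# Command names are assumptions; adjust them if your system uses other names.
REQUIRED_TOOLS = ["node", "npm", "ffmpeg", "git"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python 3.8+:", "ok" if python_is_supported() else "too old")
    for name in missing_tools():
        print(f"Not found on PATH: {name}")
```

The script does not verify the Node.js version; running node --version remains the simplest way to confirm it is 16 or newer.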

You’ll also need API keys from:

Google AI Studio (for Gemini API)
Sarvam AI (for text-to-speech)
Unsplash (for images)

Get Started in 5 Minutes

Clone the repository

git clone https://github.com/Kamal-Nayan-Kumar/AI-Video-Gen
cd AI-Video-Gen

Set up the backend

Navigate to the backend directory and create a virtual environment:

cd backend
python -m venv venv

Activate the virtual environment. On Windows:

venv\Scripts\activate

On macOS or Linux:

source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Install Manim (animation library):

pip install manim

Configure environment variables

Create a .env file in the backend/ directory by copying the example file (in Windows Command Prompt, use copy instead of cp):

cp .env.example .env

Edit .env and add your API keys:

GEMINI_API_KEY=your_gemini_api_key_here
SARVAM_API_KEY=your_sarvam_api_key_here
UNSPLASH_ACCESS_KEY=your_unsplash_access_key_here

The backend uses Sarvam AI for text-to-speech, so narration depends on a valid SARVAM_API_KEY. Get the key from Sarvam AI before you start the server.
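
Before starting the server, it can help to confirm that no key is missing or still set to its placeholder. The sketch below is an illustration only: parse_env is a simplified KEY=VALUE parser written for this example, not the loader the backend uses, and it treats any value starting with your_ as a placeholder.

```python
from pathlib import Path

REQUIRED_KEYS = ["GEMINI_API_KEY", "SARVAM_API_KEY", "UNSPLASH_ACCESS_KEY"]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_keys(values, required=REQUIRED_KEYS):
    """Return required keys that are missing, empty, or still placeholders."""
    return [
        key for key in required
        if not values.get(key) or values[key].startswith("your_")
    ]


if __name__ == "__main__":
    env_path = Path("backend/.env")  # relative to the repository root
    if not env_path.exists():
        print(f"{env_path} not found")
    else:
        problems = unset_keys(parse_env(env_path.read_text()))
        print("All keys set" if not problems else f"Check these keys: {problems}")
```

This only checks that the keys are filled in; it cannot tell whether a key is accepted by the provider.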

Start the backend server

With your virtual environment still activated, run:

python app.py

The FastAPI backend will start on http://localhost:8000. You should see:

INFO:     Started server process
INFO:     Uvicorn running on http://0.0.0.0:8000
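
If you would rather confirm the server from a script than from the log output, a small reachability check is enough. This sketch assumes nothing about the backend's routes: it treats any HTTP response, including an error status such as 404, as proof that the server is up. The function name backend_is_reachable is made up for this example.

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the target is a local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def backend_is_reachable(url="http://localhost:8000", timeout=3.0):
    """Return True if anything answers HTTP at `url`, even with an error status."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, for example with 404 for an undefined route.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("backend reachable:", backend_is_reachable())
```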

Set up the frontend

Open a new terminal and, from the repository root, navigate to the frontend directory:

cd frontend

Install dependencies:

npm install

Start the development server:

npm run dev

The React frontend will start on http://localhost:5173.

Generate your first presentation

Open your browser and go to http://localhost:5173
Enter a topic (e.g., “Introduction to Machine Learning”)
Set the number of slides (e.g., 5)
Choose a language (English, Hindi, Kannada, or Telugu)
Select a tone (Formal, Casual, or Educational)
Click Generate
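
The form fields above map naturally onto a small validated data structure. The class below is a hypothetical illustration, not the project's request schema: the name PresentationRequest, the field names, and the default values are all assumptions. Only the language and tone options come from the steps above.

```python
from dataclasses import dataclass

# Options as listed in the steps above.
LANGUAGES = ("English", "Hindi", "Kannada", "Telugu")
TONES = ("Formal", "Casual", "Educational")


@dataclass(frozen=True)
class PresentationRequest:
    """Hypothetical container for the form fields; not the project's schema."""
    topic: str
    num_slides: int
    language: str = "English"   # default is an assumption
    tone: str = "Educational"   # default is an assumption

    def __post_init__(self):
        # Reject values the form would not offer.
        if not self.topic.strip():
            raise ValueError("topic must not be empty")
        if self.num_slides < 1:
            raise ValueError("num_slides must be at least 1")
        if self.language not in LANGUAGES:
            raise ValueError(f"language must be one of {LANGUAGES}")
        if self.tone not in TONES:
            raise ValueError(f"tone must be one of {TONES}")


if __name__ == "__main__":
    print(PresentationRequest("Introduction to Machine Learning", 5, "Hindi", "Formal"))
```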

The system will:

Generate presentation content using Gemini AI
Create narration scripts for each slide
Generate voice audio using Sarvam AI
Fetch relevant images from Unsplash or create animations with Manim
Compose everything into a final MP4 video

The generation process takes 2-5 minutes depending on the number of slides and complexity.

What’s Next?

Installation Guide: Detailed installation instructions for all platforms
Configuration: Learn about advanced configuration options
API Reference: Explore the FastAPI endpoints
Creating Presentations: Learn how to create and customize presentations

Troubleshooting

Backend server won't start

Make sure:

Your virtual environment is activated
All dependencies are installed: pip install -r requirements.txt
FFmpeg is installed and accessible in your PATH
Port 8000 is not already in use
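
To check the last item without hunting for the owning process, you can try to bind the port yourself. This is a rough sketch: port_is_free is a helper invented for this example, and it tests the loopback address only, which is an assumption about how the port is being held.

```python
import socket


def port_is_free(port=8000, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use".
            return False
        return True


if __name__ == "__main__":
    print("port 8000 free:", port_is_free(8000))
```

Run it while the backend is stopped; if it reports that the port is not free, another process is holding port 8000.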

Frontend can't connect to backend

Verify the backend is running on http://localhost:8000
Check the browser console for CORS errors
Ensure your .env file has the correct API keys

Video generation fails

Common issues:

Invalid API keys in .env
FFmpeg not installed or not in PATH
Insufficient disk space in backend/outputs/
API rate limits exceeded
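
Disk space is the one item in this list that is easy to check from a script. The sketch below reports free space on the filesystem that holds backend/outputs/. The helper name free_space_gib and the 1 GiB warning threshold are arbitrary choices for illustration, not project requirements.

```python
import shutil
from pathlib import Path


def free_space_gib(path):
    """Return free space in GiB on the filesystem holding `path`.

    Walks up to the nearest existing parent so the check also works
    before the output directory has been created.
    """
    target = Path(path).resolve()
    while not target.exists():
        target = target.parent
    return shutil.disk_usage(target).free / 1024**3


if __name__ == "__main__":
    # Run from the repository root; the 1 GiB threshold is arbitrary.
    free = free_space_gib("backend/outputs")
    print(f"{free:.1f} GiB free", "(low)" if free < 1 else "")
```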
