DecipherIt integrates multiple AI services for different capabilities: OpenAI for embeddings, Google Gemini for content generation, and LemonFox for text-to-speech.

Overview

DecipherIt uses a multi-model approach:
  • OpenAI - Text embeddings for semantic search (via Qdrant)
  • Google Gemini - Large language model for content generation (via OpenRouter)
  • LemonFox AI - High-quality text-to-speech synthesis

Environment Variables

Backend Configuration

Add these to your backend .env file:
backend/.env
# OpenAI API key for embeddings
OPENAI_API_KEY=your_openai_api_key

# OpenRouter API key for Gemini access
OPENROUTER_API_KEY=your_openrouter_api_key

# LemonFox AI API key for TTS
LEMONFOX_API_KEY=your_lemonfox_api_key

OpenAI Integration

Purpose

OpenAI is used exclusively for generating text embeddings that power semantic search through Qdrant vector database.

Configuration

1. Get Your API Key
  • Sign up at platform.openai.com
  • Navigate to API Keys section
  • Create a new API key
  • Add it to your .env file as OPENAI_API_KEY
2. Embedding Model
    DecipherIt uses text-embedding-3-small for optimal cost and performance:
    from typing import Optional

    from openai import AsyncOpenAI

    class QdrantSourceStore:
        def __init__(
            self,
            embedding_model: str = "text-embedding-3-small",
            openai_api_key: Optional[str] = None,
        ):
            # Initialize the async OpenAI client used for embeddings
            self.openai_client = AsyncOpenAI(api_key=openai_api_key)
            self.embedding_model = embedding_model
    
3. Generate Embeddings
    Embeddings are created automatically when content is added:
    async def _get_embedding(self, text: str) -> List[float]:
        """Get embedding for text using OpenAI."""
        response = await self.openai_client.embeddings.create(
            input=text,
            model=self.embedding_model,
        )
        return response.data[0].embedding
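    For intuition: semantic search ranks chunks by how close their embedding vectors are. Qdrant performs this comparison internally, but the core operation is just cosine similarity. A minimal illustration (not DecipherIt code):

    ```python
    import math
    from typing import List

    def cosine_similarity(a: List[float], b: List[float]) -> float:
        """Cosine of the angle between two embedding vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    # Identical directions score 1.0; orthogonal vectors score 0.0
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
    ```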
    
    Embeddings enable intelligent Q&A capabilities:
    backend/agents/chat_agent.py
    async def get_relevant_sources(notebook_id: str, query: str):
        # Query is automatically embedded and searched
        results = await qdrant_service.search(query, notebook_id)
        
        output = ""
        for result in results:
            source_info = "Source: Provided Text"
            if result.get('url'):
                page_title = result.get('page_title', '')
                source_info = f"Source: {page_title} ({result['url']})"
            output += f"Content: {result['content_chunk']}\n{source_info}\n---\n"
        
        return output
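    The formatting logic above can be exercised in isolation with mock search results. The dict keys (`content_chunk`, `url`, `page_title`) follow the snippet above; the mock data and the standalone `format_sources` helper are illustrative:

    ```python
    from typing import Any, Dict, List

    def format_sources(results: List[Dict[str, Any]]) -> str:
        """Render search results in the Content/Source layout used by the chat agent."""
        output = ""
        for result in results:
            source_info = "Source: Provided Text"
            if result.get("url"):
                page_title = result.get("page_title", "")
                source_info = f"Source: {page_title} ({result['url']})"
            output += f"Content: {result['content_chunk']}\n{source_info}\n---\n"
        return output

    mock_results = [
        {"content_chunk": "Chunk A", "url": "https://example.com", "page_title": "Example"},
        {"content_chunk": "Chunk B"},  # pasted text has no URL
    ]
    print(format_sources(mock_results))
    ```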
    

    Text Chunking Strategy

    Content is chunked for optimal retrieval:
    backend/services/qdrant_service.py
    class QdrantSourceStore:
        def __init__(
            self,
            chunk_size: int = 512,
            chunk_overlap: int = 50,
        ):
            self.chunk_size = chunk_size
            self.chunk_overlap = chunk_overlap
        
        def _chunk_text(self, text: str) -> List[str]:
            """Split text into overlapping chunks of whitespace-separated tokens."""
            tokens = text.split()
            step = self.chunk_size - self.chunk_overlap
            # Keep the trailing partial chunk so short documents
            # and tail tokens are not silently dropped
            chunks = [
                " ".join(tokens[i:i + self.chunk_size])
                for i in range(0, len(tokens), step)
            ]
            return chunks
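    To make the overlap concrete, here is a standalone version of the same token-window logic with toy sizes (the function name `chunk_text` and the small numbers are illustrative; the trailing partial window is kept so no tokens are lost):

    ```python
    from typing import List

    def chunk_text(text: str, chunk_size: int = 8, chunk_overlap: int = 2) -> List[str]:
        """Overlapping windows of whitespace tokens, stepping by size - overlap."""
        tokens = text.split()
        step = chunk_size - chunk_overlap
        return [" ".join(tokens[i:i + chunk_size]) for i in range(0, len(tokens), step)]

    words = " ".join(str(n) for n in range(20))  # "0 1 2 ... 19"
    chunks = chunk_text(words)
    # Each chunk starts 6 tokens after the previous one,
    # so the last 2 tokens of one chunk reopen the next.
    print(len(chunks))  # 4
    print(chunks[0])    # "0 1 2 3 4 5 6 7"
    print(chunks[1])    # "6 7 8 9 10 11 12 13"
    ```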
    

    Google Gemini Integration

    Purpose

    Google Gemini powers all content generation through CrewAI agents for research analysis, summaries, FAQs, and mindmaps.

    Configuration via OpenRouter

1. Get OpenRouter API Key
  • Sign up at openrouter.ai
  • Navigate to API Keys section
  • Create a new API key
  • Add it to your .env file as OPENROUTER_API_KEY
2. LLM Configuration
    DecipherIt uses Gemini 2.0 Flash through OpenRouter:
    from crewai import LLM
    import os
    from dotenv import load_dotenv
    
    load_dotenv()
    
    llm = LLM(
        model="openrouter/google/gemini-2.0-flash-001",
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
        temperature=0.01
    )
    
    Model Selection: Gemini 2.0 Flash provides the best balance of speed, quality, and cost for research tasks.
    Temperature: Set to 0.01 for consistent, factual outputs.

    Usage in AI Agents

    All CrewAI agents use the configured LLM:
    backend/agents/topic_research_agent.py
    from config import llm
    
    # Research Analyst Agent
    researcher = Agent(
        role="Senior Research Analyst & Knowledge Synthesizer",
        goal="Conduct exhaustive analysis of multi-source data",
        backstory="""You are an elite research analyst...""",
        verbose=True,
        llm=llm,  # Uses Gemini 2.0 Flash
    )
    
    # Content Writer Agent
    content_writer = Agent(
        role="Senior Content Strategist & Research Synthesizer",
        goal="Transform extensive research findings into compelling content",
        backstory="""You are an elite content strategist...""",
        verbose=True,
        llm=llm,  # Uses Gemini 2.0 Flash
    )
    

    CrewAI Agents Powered by Gemini

    # Research task using Gemini
    research_task = Task(
        description="""Synthesize research findings into comprehensive document.
        
        Review ALL scraped content, identify key themes, and create
        structured analysis with proper citations.
        """,
        expected_output="A markdown document with comprehensive research analysis",
        agent=researcher,
        max_retries=5,
    )
    

    LemonFox AI Integration

    Purpose

    LemonFox AI provides high-quality text-to-speech synthesis for podcast-style audio overviews.

    Configuration

1. Get Your API Key
  • Sign up at lemonfox.ai
  • Navigate to API Keys section
  • Create a new API key
  • Add it to your .env file as LEMONFOX_API_KEY
2. TTS Service Setup
    The TTS service is configured with connection pooling:
    import os
    import httpx
    from loguru import logger
    
    class TTSService:
        def __init__(self):
            self.api_key = os.environ.get("LEMONFOX_API_KEY")
            if not self.api_key:
                raise ValueError("LEMONFOX_API_KEY environment variable is required")
            
            self.base_url = "https://api.lemonfox.ai/v1/audio/speech"
            self.response_format = "mp3"
            
            # Voice mapping for different speakers
            self.speaker_voices = {
                "Michael": "liam",    # Host voice
                "Sarah": "jessica"  # Guest voice
            }
    
3. Voice Configuration
    Two distinct voices create podcast-style conversations:
    # Voice mapping for transcript speakers
    speaker_voices = {
        "Michael": "liam",    # Host voice
        "Sarah": "jessica"    # Guest voice
    }
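    Putting the mapping together with the transcript format, a hedged sketch of how segments resolve to (voice, text) pairs (the helper `plan_segments` is illustrative, not the actual service method):

    ```python
    from typing import Any, Dict, List, Tuple

    speaker_voices = {"Michael": "liam", "Sarah": "jessica"}

    def plan_segments(transcript: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
        """Resolve each transcript segment to a (voice, text) pair, with defaults."""
        return [
            (speaker_voices.get(seg.get("name", "Michael"), "jessica"),
             seg.get("transcript", ""))
            for seg in transcript
        ]

    transcript = [
        {"name": "Michael", "transcript": "Welcome to the show."},
        {"name": "Sarah", "transcript": "Thanks for having me."},
    ]
    print(plan_segments(transcript))
    ```

    Unknown speaker names fall back to the guest voice, mirroring the `.get(speaker, "jessica")` default used in the workflow below.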
    

    Audio Generation Workflow

1. Generate Transcript
    AI agent creates podcast-style script:
    # Step 1: Generate transcript using audio overview agent
    transcript = await run_audio_overview_agent(notebook_id)
    
    # Transcript format:
    # [
    #   {"name": "Michael", "transcript": "Welcome to..."},
    #   {"name": "Sarah", "transcript": "Thanks for having me..."}
    # ]
    
2. Convert to Speech
    Concurrent TTS generation with rate limiting:
    async def generate_audio_from_transcript(
        self,
        transcript: List[Dict[str, Any]],
        notebook_id: str
    ) -> bytes:
        # Process segments concurrently (max 5 at a time)
        tasks = []
        for segment in transcript:
            speaker = segment.get("name", "Michael")
            text = segment.get("transcript", "")
            voice = self.speaker_voices.get(speaker, "jessica")
            
            task = self._generate_audio_with_semaphore(text, voice)
            tasks.append(task)
        
        # Execute all TTS requests concurrently
        audio_bytes_list = await asyncio.gather(*tasks)
        
        # Combine audio segments with pauses
        return await self._combine_audio_segments(audio_bytes_list)
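    The semaphore-bounded fan-out can be sketched independently of the HTTP layer. Here `generate_segment` is a stand-in for the real LemonFox call; the gather pattern and the concurrency cap mirror the method above:

    ```python
    import asyncio
    from typing import List, Tuple

    MAX_CONCURRENT = 5
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def generate_segment(text: str, voice: str) -> bytes:
        """Stand-in for the TTS HTTP call, gated by the shared semaphore."""
        async with semaphore:
            await asyncio.sleep(0)  # simulate network latency
            return f"{voice}:{text}".encode()

    async def generate_all(segments: List[Tuple[str, str]]) -> List[bytes]:
        # asyncio.gather preserves input order, so segments stay in sequence
        return await asyncio.gather(*(generate_segment(t, v) for t, v in segments))

    audio = asyncio.run(generate_all([("Welcome!", "liam"), ("Glad to be here.", "jessica")]))
    print(audio)
    ```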
    
3. Upload to Storage
    Audio is uploaded to Cloudflare R2:
    # Step 3: Upload audio file to R2
    audio_url = await r2_service.upload_audio_file(
        audio_content, notebook_id
    )
    
    # Step 4: Update database with audio URL
    await notebook_repository.update_audio_overview_url(notebook_id, audio_url)
    

    Performance Optimization

    The TTS service uses several optimizations:
    backend/services/tts_service.py
    class TTSService:
        def __init__(self):
            # Concurrency settings
            self.max_concurrent_requests = 5
            self.semaphore = asyncio.Semaphore(self.max_concurrent_requests)
            
            # Connection pool for reusing HTTP connections
            limits = httpx.Limits(
                max_keepalive_connections=10,
                max_connections=20,
                keepalive_expiry=30.0
            )
            self._client = httpx.AsyncClient(
                timeout=httpx.Timeout(300.0),
                limits=limits,
                http2=True  # Enable HTTP/2 for better performance
            )
    
    • Connection Pooling: Reuses HTTP connections for better performance
    • Rate Limiting: Maximum of 5 concurrent requests to avoid overwhelming the API
    • HTTP/2: Enabled for improved request multiplexing

    Cost Optimization

    OpenAI Embeddings

    • Uses text-embedding-3-small (most cost-effective)
    • Only generates embeddings once per content chunk
    • Chunks limited to 512 tokens for efficiency
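    One simple way to enforce the "embed once per chunk" guarantee is content-hash memoization. This is a hedged sketch of the idea, not the actual DecipherIt implementation:

    ```python
    import hashlib
    from typing import Callable, Dict, List

    _embedding_cache: Dict[str, List[float]] = {}

    def embed_once(chunk: str, embed_fn: Callable[[str], List[float]]) -> List[float]:
        """Only call the (paid) embedding function for chunks we have not seen."""
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in _embedding_cache:
            _embedding_cache[key] = embed_fn(chunk)
        return _embedding_cache[key]

    calls = 0
    def fake_embed(text: str) -> List[float]:
        global calls
        calls += 1
        return [float(len(text))]

    embed_once("hello world", fake_embed)
    embed_once("hello world", fake_embed)  # cache hit: no second API call
    print(calls)  # 1
    ```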

    Gemini via OpenRouter

    • Uses Gemini 2.0 Flash (fastest, most cost-effective)
    • Temperature set to 0.01 for deterministic outputs (fewer retries)
    • Rate limiting prevents excessive API calls: max_rpm=20

    LemonFox TTS

    • Only generates audio on-demand when user requests
    • Concurrent processing reduces total time
    • Audio cached in R2 storage (no regeneration needed)

    Troubleshooting

    OpenAI Errors

    If embeddings fail:
    # Verify API key is valid
    curl https://api.openai.com/v1/models \
      -H "Authorization: Bearer $OPENAI_API_KEY"
    

    Gemini/OpenRouter Errors

    If content generation fails:
    1. Check OpenRouter API key is valid
    2. Verify you have credits: openrouter.ai/credits
    3. Check model availability: openrouter.ai/models

    LemonFox TTS Errors

    If audio generation fails:
    1. Verify LEMONFOX_API_KEY is correct
    2. Check API quotas and limits
    3. Review logs: logs/audio_overview_*.log
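    Before digging into per-service errors, a quick preflight check for all three keys can rule out configuration problems (the helper name `missing_keys` is illustrative):

    ```python
    import os
    from typing import List, Mapping

    REQUIRED_KEYS = ["OPENAI_API_KEY", "OPENROUTER_API_KEY", "LEMONFOX_API_KEY"]

    def missing_keys(env: Mapping[str, str]) -> List[str]:
        """Return the required keys that are unset or empty."""
        return [key for key in REQUIRED_KEYS if not env.get(key)]

    # Check the live environment:
    print(missing_keys(os.environ))

    # Example with a partially configured environment:
    print(missing_keys({"OPENAI_API_KEY": "sk-..."}))  # ['OPENROUTER_API_KEY', 'LEMONFOX_API_KEY']
    ```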
