YBH Pulse Content uses multiple AI providers for different content generation tasks. Google Gemini is the primary provider, with Anthropic Claude as a fallback.
Supported providers
Google Gemini (primary)
Model: gemini-2.0-flash-exp
Used for:
- PRF (Podcast Repurposing Framework) generation
- Viral hooks extraction
- Episode metadata generation
- LinkedIn and Instagram post creation
- Visual suggestions (infographics, quote cards)
- Fact-checking and content validation
- Spell checking
Capabilities:
- Agentic tool calling
- Structured output (JSON mode)
- Long context window (over 1M tokens)
- Fast response times
- Cost-effective for high-volume generation
Get your API key: https://aistudio.google.com/apikey
Anthropic Claude (fallback)
Model: claude-3-5-sonnet-20241022
Used for:
- All generation tasks when the Google API key is not available
- Same capabilities as Google Gemini
Format: sk-ant-...
Get your API key: https://console.anthropic.com/
OpenRouter (embeddings)
Model: text-embedding-3-small via OpenRouter
Used for:
- Vector embeddings for Pinecone RAG
- Semantic search across brand guidelines
- Infographic design database retrieval
Format: sk-or-...
Get your API key: https://openrouter.ai/keys
Configuration
Add API keys to .dev.vars
# Primary AI provider
GOOGLE_GENERATIVE_AI_API_KEY=your-google-api-key-here
# Fallback AI provider
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Embeddings provider
OPENROUTER_API_KEY=sk-or-your-key-here
Set production secrets
wrangler secret put GOOGLE_GENERATIVE_AI_API_KEY
wrangler secret put ANTHROPIC_API_KEY
wrangler secret put OPENROUTER_API_KEY
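Before testing, it can help to confirm all three keys are actually visible to the worker. A minimal illustrative helper (the key names match the .dev.vars entries above; the function itself is not part of the codebase):

```typescript
// Returns the names of any expected API keys missing from the environment.
// Key names match the .dev.vars entries documented above.
function missingKeys(env: Record<string, string | undefined>): string[] {
  const required = [
    'GOOGLE_GENERATIVE_AI_API_KEY',
    'ANTHROPIC_API_KEY',
    'OPENROUTER_API_KEY',
  ]
  return required.filter((key) => !env[key])
}
```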
Test connection
Generate a PRF from any episode to verify AI providers are working.
AI generation endpoints
PRF generation
Endpoint: POST /api/ai/prf
Input:
{
"transcript": "Episode transcript text...",
"model": "gemini-2.0-flash-exp",
"systemPrompt": "Generate a PRF document..."
}
Output:
{
"content": "PRF HTML content..."
}
Usage:
import { generatePRF } from '@/services/ai'
const prf = await generatePRF(transcript)
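Where the service helper is not available, the endpoint can also be called directly over HTTP. A minimal sketch, mirroring the Input/Output shapes above (error handling is illustrative):

```typescript
// Illustrative raw HTTP call to POST /api/ai/prf, matching the request
// and response shapes documented above.
async function requestPRF(transcript: string): Promise<string> {
  const res = await fetch('/api/ai/prf', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ transcript, model: 'gemini-2.0-flash-exp' }),
  })
  if (!res.ok) throw new Error(`PRF request failed: ${res.status}`)
  const { content } = (await res.json()) as { content: string }
  return content
}
```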
Viral hooks generation
Endpoint: POST /api/ai/hooks
Requires: Transcript for fact verification
Input:
{
"prf": "PRF content with transcript...",
"model": "gemini-2.0-flash-exp",
"systemPrompt": "Generate viral hooks..."
}
Output:
{
"content": "Hooks HTML content..."
}
Usage:
import { generateHooks } from '@/services/ai'
const hooks = await generateHooks(
transcript,
prf,
episodeNumber,
guestName
)
Episode metadata generation
Endpoint: POST /api/agent/metadata
Uses: Server-sent events (SSE) for progress updates
Input:
{
"episodeId": "episode-123",
"episodeNumber": "385",
"guestName": "Chris Pacifico",
"transcript": "Full transcript...",
"prf": "Optional PRF content..."
}
Output (streamed):
// Status event
{
"type": "status",
"step": "Analyzing transcript",
"detail": "Extracting key themes...",
"progress": 30
}
// Result event
{
"type": "result",
"content": {
"title": "Episode 385: Chris Pacifico on Tech Debt Management",
"shortDescription": "Chris shares...",
"longDescription": "On this episode...",
"keyTakeaways": ["Takeaway 1", "Takeaway 2", "Takeaway 3"],
"showNotes": [
{ "timestamp": "00:00", "description": "Introduction" },
{ "timestamp": "05:30", "description": "Tech debt discussion" }
]
}
}
Usage:
import { generateEpisodeMetadata } from '@/services/ai'
const metadata = await generateEpisodeMetadata(
episodeId,
episodeNumber,
guestName,
transcript,
prf,
(step, detail, progress) => {
console.log(`${progress}%: ${step} - ${detail}`)
}
)
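The generateEpisodeMetadata helper wraps this stream; when consuming the endpoint directly, events can be dispatched on their type field. A minimal sketch based on the streamed payload shapes above (the handler names are illustrative, not part of the service API):

```typescript
// Illustrative dispatcher for the streamed events shown above: each SSE
// frame carries a JSON payload with a "type" discriminator.
type StatusEvent = { type: 'status'; step: string; detail: string; progress: number }
type ResultEvent = { type: 'result'; content: unknown }
type AgentEvent = StatusEvent | ResultEvent

function handleSseData(
  data: string,
  onStatus: (e: StatusEvent) => void,
  onResult: (e: ResultEvent) => void,
): void {
  const event = JSON.parse(data) as AgentEvent
  if (event.type === 'status') onStatus(event)
  else onResult(event)
}
```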
Visual suggestions (parallel generation)
Endpoint: POST /api/agent/suggestions
Uses: Server-sent events (SSE) for progress updates
Generates: 4 Data Viz + 4 Cinematic + 2 Quote Cards = 10 total
Input:
{
"episodeId": "episode-123",
"episodeNumber": "385",
"guestName": "Chris Pacifico",
"transcript": "Full transcript...",
"prf": "PRF content...",
"hooks": "Viral hooks...",
"history": ["previous-style-1", "previous-style-2"],
"datavizPrompt": "Custom prompt for data viz...",
"cinematicPrompt": "Custom prompt for cinematic...",
"quoteCardsPrompt": "Custom prompt for quote cards..."
}
Output:
{
"suggestions": [
{
"visualType": "dataviz",
"type": "infographic",
"sourceText": "Selected text from PRF...",
"sourceSection": "PRF Section 2",
"spec": {
"layout": "Card Grid",
"template": "Strategic Framework",
"title": "Tech Debt Management Framework",
"colorSystem": "Corporate Professional",
"iconStyle": "isometric",
"aspectRatio": "16:9",
"prompt": "Full Kie.ai prompt..."
}
}
],
"counts": {
"dataviz": 4,
"cinematic": 4,
"quoteCard": 2
},
"errors": {
"dataviz": null,
"cinematic": null,
"quoteCard": null
}
}
Fact-checking
Endpoint: POST /api/agent/fact-check
Purpose: Verify generated content against transcript to prevent hallucinations
Input:
{
"transcript": "Full transcript...",
"prf": "PRF content...",
"guestName": "Chris Pacifico",
"coHostName": "Doug",
"items": [
{
"type": "quote",
"content": "Tech debt is like credit card debt",
"source": "Chris Pacifico"
},
{
"type": "statistic",
"content": "70% of IT budgets go to maintenance",
"source": "Chris Pacifico"
}
]
}
Output:
{
"overallScore": 85,
"passedValidation": true,
"results": [
{
"item": { "type": "quote", "content": "..." },
"status": "verified",
"confidence": 95,
"transcriptEvidence": "Exact quote from transcript..."
}
],
"summary": "2 items verified, 0 issues found",
"criticalIssues": []
}
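The scoring itself happens server-side, but the relationship between the response fields can be illustrated. This sketch assumes overallScore is the mean item confidence and that validation passes at a threshold of 80 with no critical issues; both rules are assumptions for illustration, not the service's documented logic:

```typescript
// Hypothetical aggregation consistent with the response shape above.
// The mean-confidence rule and the threshold of 80 are assumptions.
interface FactCheckItemResult {
  status: string
  confidence: number
}

function summarizeFactCheck(results: FactCheckItemResult[], criticalIssues: string[]) {
  const overallScore = Math.round(
    results.reduce((sum, r) => sum + r.confidence, 0) / Math.max(results.length, 1),
  )
  return {
    overallScore,
    passedValidation: overallScore >= 80 && criticalIssues.length === 0,
  }
}
```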
Model selection
The system automatically selects the best available model:
function getModel() {
if (env.GOOGLE_GENERATIVE_AI_API_KEY) {
return 'gemini-2.0-flash-exp'
}
if (env.ANTHROPIC_API_KEY) {
return 'claude-3-5-sonnet-20241022'
}
throw new Error('No AI provider configured')
}
Cost optimization
Token usage
Google Gemini pricing (as of 2024):
- Input: Low cost per 1M tokens
- Output: Low cost per 1M tokens
- Flash models optimized for speed and cost
Anthropic Claude pricing:
- Input: Higher cost per 1M tokens
- Output: Higher cost per 1M tokens
- Higher quality for complex reasoning
Reduce costs
Truncate long transcripts
For visual suggestions, only send the first 3,000 characters of the transcript:
const truncated = transcript.slice(0, 3000)
const context = truncated.length < transcript.length
  ? `${truncated}\n[...truncated for context window...]`
  : truncated
Cache RAG results
Store Pinecone query results to avoid repeated embedding calls:
const cacheKey = `rag:${query}`
const cached = await cache.get(cacheKey)
if (cached) return cached
const results = await queryKnowledgeBase(query)
await cache.set(cacheKey, results, 3600) // 1 hour
return results
Batch generations
Generate multiple visual suggestions in parallel to amortize prompt costs:
// Generates 10 suggestions in 3 parallel calls
await generateVisualSuggestionsV2(...)
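The fan-out pattern can be sketched as follows (the function shape is illustrative, not the actual generateVisualSuggestionsV2 signature). Promise.allSettled lets one category fail without losing the others, which is what produces the per-category counts and errors objects in the suggestions response:

```typescript
// Illustrative parallel fan-out across the three suggestion categories.
type Category = 'dataviz' | 'cinematic' | 'quoteCard'

async function fanOut<T>(generators: Record<Category, () => Promise<T[]>>) {
  const keys: Category[] = ['dataviz', 'cinematic', 'quoteCard']
  // All three calls run concurrently; a rejection in one does not cancel the others.
  const settled = await Promise.allSettled(keys.map((k) => generators[k]()))
  const suggestions: T[] = []
  const counts: Record<Category, number> = { dataviz: 0, cinematic: 0, quoteCard: 0 }
  const errors: Record<Category, string | null> = { dataviz: null, cinematic: null, quoteCard: null }
  keys.forEach((k, i) => {
    const result = settled[i]
    if (result.status === 'fulfilled') {
      suggestions.push(...result.value)
      counts[k] = result.value.length
    } else {
      errors[k] = String(result.reason)
    }
  })
  return { suggestions, counts, errors }
}
```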
Use Google Gemini as primary
Google Gemini Flash models are significantly cheaper than Claude for equivalent quality on most tasks.
Prompt management
Prompts are stored in the Settings page and can be customized per agent:
Agent types
- PRF - Podcast Repurposing Framework generation
- Hooks - Viral hooks extraction
- LinkedIn - LinkedIn post creation
- Instagram - Instagram post creation
- Infographic - Infographic spec generation
- Thumbnail - YouTube thumbnail spec generation
- Timeline - Career timeline spec generation
- Suggestions - Visual suggestions orchestration
Preset fields
Prompts support variable interpolation:
{{episodeNumber}} - Episode number
{{guestName}} - Guest name
{{recentStyles}} - Recent visual styles for variety
{{brandVoice}} - YBH brand voice guidelines
{{targetAudience}} - IT leaders (CIOs, CTOs, IT Directors)
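The interpolation step can be sketched minimally as follows (the actual implementation in the codebase may differ; leaving unknown placeholders intact is a design choice here so a typo in a prompt stays visible rather than silently disappearing):

```typescript
// Minimal {{variable}} interpolation sketch. Unknown placeholders are
// left untouched so prompt typos remain visible in the output.
function interpolate(prompt: string, vars: Record<string, string>): string {
  return prompt.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    name in vars ? vars[name] : match,
  )
}
```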
Edit prompts
- Navigate to Settings > AI Configuration
- Select agent type
- Edit system prompt
- Save changes
- Test with new episode
Changes to prompts affect all future generations. Test thoroughly before deploying to production.
Error handling
Retry logic
The system automatically retries failed generations:
async function generatePRFWithRetry(transcript) {
  for (let attempt = 1; attempt <= 3; attempt++) {
    try {
      return await generatePRF(transcript)
    } catch (error) {
      // Only retry on rate limits, and give up after 3 attempts
      if (attempt < 3 && error.message.includes('rate limit')) {
        await sleep(5000 * attempt) // back off before retrying
        continue
      }
      throw error
    }
  }
}
Fallback behavior
- Try Google Gemini (if key available)
- Fall back to Anthropic Claude (if key available)
- Return error if both fail
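The chain above can be sketched as a generic helper (the provider entries are placeholders; in the real system, availability corresponds to the key checks in getModel):

```typescript
// Illustrative provider fallback chain: try each available provider in
// order, surfacing the last error only if every attempt fails.
interface Provider<T> {
  name: string
  available: boolean
  run: () => Promise<T>
}

async function generateWithFallback<T>(providers: Provider<T>[]): Promise<T> {
  let lastError: unknown = new Error('No AI provider configured')
  for (const provider of providers.filter((p) => p.available)) {
    try {
      return await provider.run()
    } catch (error) {
      lastError = error // remember, then try the next provider
    }
  }
  throw lastError
}
```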
Common errors
Rate limit exceeded:
Error: Rate limit exceeded. Please try again in 60 seconds.
Solution: Wait and retry, or upgrade API plan.
Invalid API key:
Error: API key is invalid or expired.
Solution: Regenerate API key from provider dashboard.
Context too long:
Error: Input exceeds maximum context length.
Solution: Truncate transcript or split into chunks.
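For surfacing these in a UI, the error messages can be mapped to their remedies. A sketch under the assumption that the message strings resemble those above (exact wording comes from the provider SDKs and may vary):

```typescript
// Illustrative mapping from the common error messages above to the
// documented remedies. Pattern matching is an assumption about message
// wording, not a guarantee from the provider SDKs.
function suggestRemedy(message: string): string {
  if (/rate limit/i.test(message)) return 'Wait and retry, or upgrade the API plan.'
  if (/invalid|expired/i.test(message)) return 'Regenerate the API key from the provider dashboard.'
  if (/context|maximum.*length/i.test(message)) return 'Truncate the transcript or split it into chunks.'
  return 'Unknown error; check the application logs.'
}
```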
Monitoring
Track AI usage in application logs:
import { logger } from '@/utils/logger'
logger.debug('[AI] Generating PRF', { episodeId, model })
const prf = await generatePRF(transcript)
logger.info('[AI] PRF generated', { episodeId, length: prf.length })
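The same pattern can be folded into a reusable wrapper that also captures duration. A sketch (the helper name is hypothetical, and console.info stands in here for the project's logger so the example is self-contained):

```typescript
// Hypothetical timing wrapper for any generation call; logs duration and
// output size after the call completes. Uses console.info in place of the
// project's '@/utils/logger' so this sketch runs standalone.
async function withAiLogging<T extends { length: number }>(
  label: string,
  fn: () => Promise<T>,
): Promise<T> {
  const start = Date.now()
  const result = await fn()
  console.info(`[AI] ${label} generated`, {
    durationMs: Date.now() - start,
    length: result.length,
  })
  return result
}
```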
Best practices
- Always verify facts: Use fact-checking endpoint before approving content
- Include transcript: Pass transcript to all generation endpoints for accuracy
- Monitor token usage: Track costs across Google, Anthropic, and OpenRouter
- Test prompts: Validate prompt changes on multiple episodes before deployment
- Cache aggressively: Store expensive generation results in Sanity
- Use streaming: Implement SSE for long-running operations to show progress