Content Pipeline Architecture
The content generation pipeline transforms raw transcripts into production-ready assets through a series of AI-powered stages, each building on the outputs of previous stages.
Stage 1: PRF Generation
PRF (Podcast Repurposing Framework) is the foundation of all downstream content. It’s a structured analysis of the episode that identifies key themes, insights, and quotable moments.
Inputs
- Episode transcript - Full conversation text
- Episode metadata - Number, guest name, title
- Brand guidelines - YBH voice and positioning
- Agent configuration - Model selection and system prompt
AI Orchestration
PRF generation uses an agentic workflow with:
- RAG (Retrieval Augmented Generation)
- Status Updates
- Model Selection
Before generating, the AI retrieves relevant context:
- Previous PRF examples (for style consistency)
- YBH brand voice guidelines
- IT leadership content patterns
- Episode-specific terminology
Retrieval is intended to support:
- Consistent formatting across episodes
- Adherence to brand voice
- Industry-appropriate language
- Contextual understanding
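As a rough illustration of the retrieval step, the sketch below assembles retrieved context and the transcript into a single prompt. The names `RetrievedContext` and `build_prf_prompt` and the section labels are hypothetical, and the retrieval itself (for example a vector search) is not shown; this is a sketch under those assumptions, not the pipeline's actual prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievedContext:
    """Context gathered before generation (field names are illustrative)."""
    prf_examples: list[str] = field(default_factory=list)
    brand_voice: str = ""
    content_patterns: list[str] = field(default_factory=list)
    terminology: list[str] = field(default_factory=list)


def build_prf_prompt(transcript: str, ctx: RetrievedContext) -> str:
    """Concatenate whatever context was retrieved, then the transcript."""
    sections = []
    if ctx.brand_voice:
        sections.append("BRAND VOICE:\n" + ctx.brand_voice)
    if ctx.prf_examples:
        sections.append("PREVIOUS PRF EXAMPLES:\n" + "\n---\n".join(ctx.prf_examples))
    if ctx.content_patterns:
        patterns = "\n".join(f"- {p}" for p in ctx.content_patterns)
        sections.append("CONTENT PATTERNS:\n" + patterns)
    if ctx.terminology:
        sections.append("TERMINOLOGY:\n" + ", ".join(ctx.terminology))
    # The transcript always comes last so the context frames it.
    sections.append("TRANSCRIPT:\n" + transcript)
    return "\n\n".join(sections)
```

Empty context fields are skipped, so a missing retrieval result shortens the prompt instead of leaving a blank section.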
Output Structure
PRF typically includes:
PRF is stored as HTML in Sanity, allowing rich formatting. The TipTap editor preserves headings, lists, bold, italics, and other styling.
Approval & Editing
Before approving PRF:
- Review for accuracy - Verify quotes and facts against transcript
- Check brand voice - Ensure “anti-spin” positioning
- Edit for clarity - Simplify jargon, add context
- Format for readability - Use headings, lists, bold
On approval:
- prfApproved: true flag set
- Timestamp recorded (prfApprovedAt)
- Hooks and social posts generation enabled
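The approval gate can be pictured with a small sketch. The field names `prfApproved` and `prfApprovedAt` come from the text above; the functions `approve_prf` and `can_generate_downstream`, and the use of a plain dict for the episode document, are assumptions for illustration only.

```python
from datetime import datetime, timezone
from typing import Optional


def approve_prf(episode: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the episode with the approval flag and timestamp set."""
    approved_at = (now or datetime.now(timezone.utc)).isoformat()
    return {**episode, "prfApproved": True, "prfApprovedAt": approved_at}


def can_generate_downstream(episode: dict) -> bool:
    """Hooks and social posts stay disabled until the PRF is approved."""
    return bool(episode.get("prfApproved"))
```

Returning a copy keeps the unapproved document untouched, which makes it easy to compare before and after states.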
Stage 2: Hooks Generation
Viral Hooks are short, attention-grabbing statements designed for social media engagement. They extract the most quotable, shareable moments from the episode.
Inputs
- PRF document - Themes and quotes
- Episode transcript - For fact verification
- Episode metadata - Guest name, episode number
- Previous hooks - Avoid repetition across episodes
Generation Strategy
Hooks are generated with specific engagement patterns:
Contrarian Statement
Challenge conventional wisdom:
“Most CIOs think uptime is success. The best ones know it’s just the baseline.”
Unexpected Insight
Reveal surprising truth:
“After 380 interviews, the pattern is clear: IT leaders who demand respect before crisis get better results.”
Direct Quote
Quotable soundbite:
“The challenge isn’t finding a vendor. It’s finding the one who sucks the least.”
Specific Number
Data-driven hook:
“73% of IT leaders say vendor relationships are transactional. Here’s why that’s a problem.”
Fact Verification
Every hook is verified against the transcript:
Output Format
Hooks are stored as HTML with formatting:
Stage 3: Social Posts Generation
Platform-specific posts tailored for LinkedIn and Instagram, each with unique formatting, tone, and CTAs.
LinkedIn Posts
Generate two posts per episode:
- Release Day Post
- Follow-Up Post
The Release Day Post is published when the episode goes live.
Structure:
Characteristics:
- Announcement tone
- Episode link in comments
- 1-3 hashtags
- ~500-800 characters
Verified Facts Bank
LinkedIn posts include a structured facts bank:
The facts bank is not visible in the UI but is stored for AI reference. It ensures social posts only use verified content from the transcript.
Instagram Captions
Generate 2-3 captions with visual-first formatting:
- Emojis for visual breaks
- Shorter paragraphs (mobile reading)
- 3-5 hashtags per caption
- Clear CTA
- 150-300 characters
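The length and hashtag targets listed for LinkedIn and Instagram lend themselves to an automated check. The sketch below treats the stated ranges as hard bounds, which is an assumption (the LinkedIn length is given only as approximate); `PostLimits`, `LIMITS`, and `check_post` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PostLimits:
    min_chars: int
    max_chars: int
    min_hashtags: int
    max_hashtags: int


# Values copied from the characteristics listed for each platform.
LIMITS = {
    "linkedin": PostLimits(500, 800, 1, 3),
    "instagram": PostLimits(150, 300, 3, 5),
}

HASHTAG = re.compile(r"#\w+")


def check_post(platform: str, text: str) -> list[str]:
    """Return human-readable problems; an empty list means the post fits."""
    limits = LIMITS[platform]
    problems = []
    if not limits.min_chars <= len(text) <= limits.max_chars:
        problems.append(
            f"length {len(text)} outside {limits.min_chars}-{limits.max_chars}"
        )
    tags = len(HASHTAG.findall(text))
    if not limits.min_hashtags <= tags <= limits.max_hashtags:
        problems.append(
            f"{tags} hashtags, expected {limits.min_hashtags}-{limits.max_hashtags}"
        )
    return problems
```

A check like this could run after generation and trigger a regeneration or a manual edit when the list is not empty.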
Stage 4: Visual Suggestions
Visual asset generation creates infographics, quote cards, and data visualizations tailored to episode content.
Parallel Generation
Visual suggestions are created in parallel rather than sequentially, which provides:
- Faster overall generation (30-60 seconds vs 2-3 minutes)
- Independent failure handling (one failure doesn’t block others)
- Progress tracking per stream
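One common way to get these properties in Python is `asyncio.gather` with `return_exceptions=True`, so a failed stream is reported without cancelling the others. The sketch below uses a stand-in coroutine instead of a real model call; `generate_stream`, `generate_all`, and the stream names are illustrative assumptions, not the pipeline's actual code.

```python
import asyncio


async def generate_stream(kind: str) -> str:
    """Stand-in for one AI call; a real call would hit a model API."""
    await asyncio.sleep(0.01)
    if kind == "quote_cards":
        raise RuntimeError("simulated model error")
    return f"{kind}: spec ready"


async def generate_all(kinds: list[str]) -> dict[str, str]:
    """Run every stream concurrently and record per-stream success or failure."""
    results = await asyncio.gather(
        *(generate_stream(k) for k in kinds), return_exceptions=True
    )
    return {
        kind: f"failed: {res}" if isinstance(res, Exception) else res
        for kind, res in zip(kinds, results)
    }


outcome = asyncio.run(
    generate_all(["data_visualizations", "cinematic_infographics", "quote_cards"])
)
```

Here the simulated failure in `quote_cards` shows up as a failed entry in `outcome` while the other two streams still return their specs.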
Suggestion Types
- Data Visualizations
- Cinematic Infographics
- Quote Cards
Specs for charts, frameworks, and statistics.
Example layouts:
- Doom Loop (cyclical problem)
- Quadrant Matrix (2×2 comparison)
- Pyramid (hierarchy)
- Pipeline/Funnel (process flow)
- Card Grid (modular concepts)
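To keep the layouts above from repeating across episodes, one simple option is to pick whichever layout was used least recently. The sketch assumes the generation history is available as a plain list of layout names, oldest first; the real history format is not described here, and `pick_layout` is a hypothetical helper.

```python
LAYOUTS = ["doom_loop", "quadrant_matrix", "pyramid", "pipeline_funnel", "card_grid"]


def pick_layout(history: list[str], layouts: list[str] = LAYOUTS) -> str:
    """Pick the layout used least recently; never-used layouts win first."""

    def last_used(layout: str) -> int:
        # Index of the most recent use, or -1 if the layout never appeared.
        for i in range(len(history) - 1, -1, -1):
            if history[i] == layout:
                return i
        return -1

    # min() keeps the first candidate on ties, so list order breaks ties.
    return min(layouts, key=last_used)
```

An AI-driven variant could pass the same history into the prompt instead; this deterministic version is only meant to show the idea.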
Variety Tracking
AI avoids repetition by checking generation history:
Stage 5: Image Generation
Once specs are created, images are rendered using Kie.ai Nano Banana Pro.
Generation Flow
Resolution & Aspect Ratio
16:9 Landscape
Best for:
- LinkedIn posts
- Blog headers
- YouTube thumbnails
Resolutions:
- 1K: 1920×1080
- 2K: 2560×1440
- 4K: 3840×2160
1:1 Square
Best for:
- Instagram posts
- Quote cards
- Profile images
Resolutions:
- 1K: 1080×1080
- 2K: 2048×2048
- 4K: 4096×4096
9:16 Portrait
Best for:
- Instagram Stories
- TikTok
- Reels
Resolutions:
- 1K: 1080×1920
- 2K: 1440×2560
- 4K: 2160×3840
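The aspect ratio and resolution options above map naturally onto a lookup table. The pixel values below are copied from the lists; the `RESOLUTIONS` structure and the `dimensions` helper are hypothetical and say nothing about how the rendering API itself expects sizes to be passed.

```python
# Pixel dimensions copied from the lists above, keyed by aspect ratio and tier.
RESOLUTIONS = {
    "16:9": {"1K": (1920, 1080), "2K": (2560, 1440), "4K": (3840, 2160)},
    "1:1": {"1K": (1080, 1080), "2K": (2048, 2048), "4K": (4096, 4096)},
    "9:16": {"1K": (1080, 1920), "2K": (1440, 2560), "4K": (2160, 3840)},
}


def dimensions(aspect_ratio: str, tier: str = "1K") -> tuple[int, int]:
    """Return (width, height) in pixels, raising ValueError for unknown keys."""
    try:
        return RESOLUTIONS[aspect_ratio][tier]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {aspect_ratio} at {tier}"
        ) from None
```

For example, `dimensions("9:16", "2K")` looks up the portrait size for Stories and Reels at the middle tier.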
Stage 6: Video Clips
Short-form video suggestions identify the most engaging moments for TikTok, Reels, and YouTube Shorts.
Clip Structure
Each suggestion includes:
Example Output
Fact-Checking System
All generated content is validated against the transcript to prevent hallucinations.
Fact-Check Agent
Runs automatically during visual suggestions generation:
Validation Process
Extract fact-checkable items
From all suggestions:
- Statistics and numbers
- Direct quotes
- Claims and assertions
- Lists and frameworks
Search transcript for evidence
AI searches for supporting evidence:
- Exact quote matches
- Paraphrased statements
- Statistical sources
- Attribution verification
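The evidence search is described as AI-driven. The sketch below swaps in plain string matching as a simplified stand-in, to show the difference between an exact quote match and a looser paraphrase match. It uses the standard library's `difflib.SequenceMatcher`; the 0.8 threshold and the names `normalize` and `find_evidence` are assumptions for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, unify curly apostrophes, and drop other punctuation."""
    text = text.lower().replace("\u2019", "'")
    words = re.sub(r"[^a-z0-9%']+", " ", text).split()
    return " ".join(words)


def find_evidence(claim: str, transcript: str, threshold: float = 0.8) -> dict:
    """Classify a claim as an exact match, a likely paraphrase, or unsupported."""
    claim_n = normalize(claim)
    if claim_n in normalize(transcript):
        return {"status": "exact", "score": 1.0}
    # Compare against each transcript sentence and keep the best similarity.
    sentences = re.split(r"(?<=[.!?])\s+", transcript)
    best = max(
        (SequenceMatcher(None, claim_n, normalize(s)).ratio() for s in sentences),
        default=0.0,
    )
    status = "paraphrase" if best >= threshold else "unsupported"
    return {"status": status, "score": round(best, 2)}
```

Character-level similarity misses paraphrases that reuse few words, which is one reason a model-based search is a better fit for that case; the string check is still useful as a cheap first pass for direct quotes and statistics.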
Pipeline Performance
Generation Times
| Stage | Average Time | Range |
|---|---|---|
| PRF | 30-45s | 20-60s |
| Hooks | 20-30s | 15-40s |
| LinkedIn Posts | 25-35s | 20-50s |
| Instagram Captions | 15-25s | 10-30s |
| Visual Suggestions (10) | 45-60s | 30-90s |
| Image Rendering (per image) | 60-120s | 30-180s |
| Video Clips | 20-30s | 15-40s |
| Fact-Check | 15-25s | 10-40s |
Times vary based on transcript length, model selection, and API response times. Claude models are generally faster than GPT-4.
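For rough planning, the Range column can be summed into best-case and worst-case bounds. The sketch assumes every stage and every image render runs back to back, which overstates wall-clock time wherever the pipeline overlaps work; `STAGE_RANGES`, `IMAGE_RANGE`, and `sequential_bounds` are hypothetical names.

```python
# (min_seconds, max_seconds) from the Range column of the table above.
STAGE_RANGES = {
    "PRF": (20, 60),
    "Hooks": (15, 40),
    "LinkedIn Posts": (20, 50),
    "Instagram Captions": (10, 30),
    "Visual Suggestions": (30, 90),
    "Video Clips": (15, 40),
    "Fact-Check": (10, 40),
}
IMAGE_RANGE = (30, 180)  # per rendered image


def sequential_bounds(images: int = 0) -> tuple[int, int]:
    """Best and worst case in seconds if nothing ran in parallel."""
    low = sum(lo for lo, _ in STAGE_RANGES.values()) + images * IMAGE_RANGE[0]
    high = sum(hi for _, hi in STAGE_RANGES.values()) + images * IMAGE_RANGE[1]
    return low, high
```

With no images this gives 120 to 350 seconds; rendering three images sequentially raises the bounds to 210 and 890 seconds, which shows that image rendering dominates the worst case.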
Cost Optimization
- Model selection: Claude 3.5 Sonnet offers best balance of quality and cost
- Parallel generation: Reduces wall-clock time without increasing token usage
- Prompt caching: Reuses transcript analysis across multiple stages (future feature)
- Selective regeneration: Only regenerate specific content, not entire pipeline