
Overview

The AI Highlight Selection feature uses OpenAI’s GPT-4o-mini model to automatically analyze full video transcripts and identify the most engaging 2-minute segments. This eliminates the need for manual video review and ensures highlights are selected based on content quality rather than arbitrary timestamps.

How It Works

The GetHighlight function in Components/LanguageTasks.py:52 orchestrates the entire selection process:
1. Transcript Analysis: The complete timestamped transcription is sent to GPT-4o-mini along with the selection criteria.
2. LLM Evaluation: The model analyzes the content and selects a 2-minute segment that is interesting, useful, surprising, controversial, or thought-provoking.
3. Structured Output: The model returns a JSON object with start, content, and end fields, using function calling for reliable parsing.
4. Validation: The response is validated to ensure valid timestamps and a proper segment duration.

Function Signature

Components/LanguageTasks.py
def GetHighlight(Transcription):
    """
    Select the most engaging ~2-minute segment from a transcript.

    Args:
        Transcription: Timestamped transcript string from transcribeAudio
    
    Returns:
        Tuple[int, int]: (Start, End) timestamps in seconds, or (None, None) on error
    """
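A minimal caller-side sketch of how the returned tuple is consumed. The stub below stands in for the real function, which requires an OpenAI API key and a full transcript:

```python
def GetHighlight(Transcription):
    # Stand-in for the real Components/LanguageTasks.py function,
    # which calls GPT-4o-mini; here we return fixed timestamps.
    return 68, 187

start, end = GetHighlight("[0.0 - 4.2] Welcome back to the channel...")
if start is None or end is None:
    # The real function signals failure with (None, None)
    raise SystemExit("Highlight selection failed; check the logs above.")
duration = end - start  # segment length in seconds
```

Because failures come back as `(None, None)` rather than an exception, callers must check both values before slicing the video.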

Selection Criteria

The system prompt defines what makes a good highlight:
Components/LanguageTasks.py:27-43
system = """
The input contains a timestamped transcription of a video.
Select a 2-minute segment from the transcription that contains something 
interesting, useful, surprising, controversial, or thought-provoking.
The selected text should contain only complete sentences.
Do not cut the sentences in the middle.
The selected text should form a complete thought.
"""
Complete Sentences: The AI is specifically instructed to avoid cutting sentences mid-way, ensuring the selected clip has proper narrative flow.

LLM Configuration

The model is configured with specific parameters for reliable highlight selection:
Components/LanguageTasks.py:56-60
llm = ChatOpenAI(
    model="gpt-4o-mini",  # Cost-effective model
    temperature=1.0,
    api_key=api_key
)
  • model (string, default "gpt-4o-mini"): The OpenAI model used for analysis. GPT-4o-mini provides a good balance of cost and quality.
  • temperature (float, default 1.0): Controls creativity in selection. Higher values (e.g., 1.0) allow more diverse segment choices.
  • api_key (string, required): OpenAI API key from the .env file (OPENAI_API variable).
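The key lookup can be wrapped in a small helper that fails loudly when the variable is missing. This is an illustrative sketch, not the project's code; the `default` parameter exists only so the helper can be exercised without a real key:

```python
import os

def load_api_key(env_var="OPENAI_API", default=None):
    """Read the OpenAI key from the environment.

    `default` is for testing only; production code should let the
    RuntimeError surface so a missing .env entry is caught early.
    """
    key = os.getenv(env_var)
    if key:
        return key
    if default is not None:
        return default
    raise RuntimeError(f"{env_var} is not set; add it to your .env file")
```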

Structured Output Schema

The response follows a strict Pydantic schema for reliable parsing:
Components/LanguageTasks.py:12-25
class JSONResponse(BaseModel):
    start: float = Field(description="Start time of the clip")
    content: str = Field(description="Highlight Text")
    end: float = Field(description="End time for the highlighted clip")
The LangChain chain enforces this schema using function calling:
Components/LanguageTasks.py:69
chain = prompt | llm.with_structured_output(JSONResponse, method="function_calling")
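The schema can be exercised locally without an API call; Pydantic validates and coerces the fields the same way it does when parsing the model's function-call output (a sketch, assuming pydantic is installed):

```python
from pydantic import BaseModel, Field

class JSONResponse(BaseModel):
    start: float = Field(description="Start time of the clip")
    content: str = Field(description="Highlight Text")
    end: float = Field(description="End time for the highlighted clip")

# Simulate a parsed LLM response; ints are coerced to float by the schema
resp = JSONResponse(start=68, content="discusses error handling...", end=187)
```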

Response Validation

The function performs comprehensive validation to ensure usable output:
Components/LanguageTasks.py:74-100
# 1. Check for empty response
if not response:
    print("ERROR: LLM returned empty response")
    return None, None

# 2. Validate response structure
if not hasattr(response, 'start') or not hasattr(response, 'end'):
    print(f"ERROR: Invalid response structure: {response}")
    return None, None

# 3. Parse timestamps
try:
    Start = int(response.start)
    End = int(response.end)
except (ValueError, TypeError) as e:
    print(f"ERROR: Could not parse start/end times from response")
    return None, None

# 4. Validate time values
if Start < 0 or End < 0:
    print(f"ERROR: Negative time values - Start: {Start}s, End: {End}s")
    return None, None

if End <= Start:
    print(f"ERROR: Invalid time range - Start: {Start}s, End: {End}s")
    return None, None
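The four checks above can be collapsed into one reusable predicate. This is an illustrative refactor, not the project's code, but it implements the same rules:

```python
def parse_segment(start, end):
    """Return (start, end) as non-negative ints with end > start, else None."""
    try:
        s, e = int(start), int(end)  # truncates fractional seconds, as in the source
    except (ValueError, TypeError):
        return None
    if s < 0 or e < 0 or e <= s:
        return None
    return s, e
```

Note that `int()` truncates toward zero, so a model response of `68.9` becomes `68`; the source code behaves the same way.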

Regeneration Workflow

If the AI selects identical start and end times (invalid segment), users are prompted to regenerate:
Components/LanguageTasks.py:109-113
if Start == End:
    Ask = input("Error - Get Highlights again (y/n) -> ").lower()
    if Ask == "y":
        Start, End = GetHighlight(Transcription)
    return Start, End
The regeneration feature allows multiple attempts without restarting the entire pipeline. Simply answer ‘y’ to try again with a different selection.
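For unattended runs, the interactive prompt can be replaced with a bounded retry loop (a sketch; the real code re-prompts the user and recurses instead):

```python
def get_highlight_with_retry(transcription, get_highlight, max_attempts=3):
    """Retry until a usable segment (no Nones, start != end) or attempts run out."""
    for _ in range(max_attempts):
        start, end = get_highlight(transcription)
        if start is not None and end is not None and start != end:
            return start, end
    return None, None

# Demo with a fake selector that fails once, then succeeds
_answers = iter([(50, 50), (68, 187)])
result = get_highlight_with_retry("transcript...", lambda t: next(_answers))
```

A hard attempt cap avoids the unbounded recursion the interactive version permits if the user keeps answering 'y'.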

Error Handling

Comprehensive exception handling ensures graceful failures:
Components/LanguageTasks.py:116-127
except Exception as e:
    print(f"{'='*60}")
    print(f"ERROR IN GetHighlight FUNCTION:")
    print(f"{'='*60}")
    print(f"Exception type: {type(e).__name__}")
    print(f"Exception message: {str(e)}")
    print(f"\nTranscription length: {len(Transcription)} characters")
    print(f"First 200 chars: {Transcription[:200]}...")
    print(f"{'='*60}\n")
    import traceback
    traceback.print_exc()
    return None, None

Customizing Selection Criteria

To modify what the AI considers “interesting”:
1. Edit the System Prompt: Modify the system variable in Components/LanguageTasks.py:27 to change the selection criteria.
2. Adjust Temperature: Change temperature in the ChatOpenAI initialization (line 58). Lower values (e.g., 0.5) make selections more conservative; higher values (e.g., 1.5) make them more creative.
3. Switch Models: Replace gpt-4o-mini with gpt-4o or gpt-4-turbo for potentially better quality at higher cost.
Changing the model may affect cost and latency. GPT-4o-mini is optimized for this use case with ~$0.001 per video analysis.
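One way to keep these knobs in a single place is a small preset table that feeds the ChatOpenAI constructor. The presets below are hypothetical; only "default" matches the source configuration:

```python
# Hypothetical tuning presets; pass the chosen dict to ChatOpenAI(**kwargs, api_key=...)
PRESETS = {
    "conservative": {"model": "gpt-4o-mini", "temperature": 0.5},
    "default":      {"model": "gpt-4o-mini", "temperature": 1.0},
    "creative":     {"model": "gpt-4o",      "temperature": 1.5},
}

def llm_kwargs(preset="default"):
    return PRESETS[preset]
```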

Output Format

After selection, detailed information is logged:
Components/LanguageTasks.py:102-107
print(f"\n{'='*60}")
print(f"SELECTED SEGMENT DETAILS:")
print(f"Time: {Start}s - {End}s ({End-Start}s duration)")
print(f"Content: {response.content}")
print(f"{'='*60}\n")
Example output:
============================================================
SELECTED SEGMENT DETAILS:
Time: 68s - 187s (119s duration)
Content: discusses the importance of proper error handling in production systems...
============================================================

Performance

  • Latency: ~2-5 seconds per video (depends on transcript length)
  • Cost: ~$0.001 per analysis with GPT-4o-mini
  • Accuracy: Selects complete sentences with proper narrative flow
  • Reliability: Structured output with function calling ensures 99%+ valid responses
