
Overview

The Transcripts API provides access to complete earnings call transcripts. Transcript metadata is stored in PostgreSQL for fast querying, and the full transcript text is retrieved from S3.

Get Transcript

Retrieve a complete earnings call transcript.
Endpoint: GET /transcript/{ticker}/{year}/{quarter}
Authentication: Optional (works without authentication)
cURL
curl https://api.financeagent.com/transcript/TSLA/2024/4

Path Parameters

  • ticker (string, required): Company ticker symbol (e.g., “TSLA”, “AAPL”)
  • year (integer, required): Fiscal year (e.g., 2024)
  • quarter (integer, required): Fiscal quarter (1, 2, 3, or 4)
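
The path parameters can be validated on the client before a request is sent. The following is a minimal sketch; the helper name `transcript_url` and the client-side checks are illustrative assumptions and not part of the API.

```python
from urllib.parse import quote

BASE_URL = "https://api.financeagent.com"


def transcript_url(ticker: str, year: int, quarter: int) -> str:
    """Build the GET /transcript/{ticker}/{year}/{quarter} URL.

    Hypothetical helper: the validation below mirrors the documented
    parameter types, but the server may apply different rules.
    """
    ticker = ticker.strip()
    if not ticker:
        raise ValueError("ticker must be a non-empty string")
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    # Percent-encode the ticker so unusual symbols stay inside one path segment.
    return f"{BASE_URL}/transcript/{quote(ticker, safe='')}/{int(year)}/{quarter}"


print(transcript_url("TSLA", 2024, 4))
```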

Response

  • success (boolean): Whether the transcript was found
  • ticker (string): Company ticker symbol
  • company_name (string): Company name
  • year (integer): Fiscal year
  • quarter (integer): Fiscal quarter
  • date (string): Earnings call date
  • transcript_text (string): Complete transcript text
  • transcript_length (integer): Length of transcript in characters
  • source (string): Data source (always “bucket” for S3)
  • metadata (object): Additional metadata about the transcript

Example Response

{
  "success": true,
  "ticker": "TSLA",
  "company_name": "Tesla Inc.",
  "year": 2024,
  "quarter": 4,
  "date": "2024-01-24",
  "transcript_text": "Operator: Good day, and thank you for standing by...\n\nElon Musk: Thanks for joining us today...",
  "transcript_length": 45232,
  "source": "bucket",
  "metadata": {
    "duration": "60 minutes",
    "participants": ["Elon Musk", "Zachary Kirkhorn"]
  }
}
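
A client may prefer to map this JSON onto a typed object. The sketch below is one way to do that, assuming the field names shown in the response; the `Transcript` dataclass and `parse_transcript` function are hypothetical names and not part of the API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Transcript:
    """Client-side view of the documented response fields (hypothetical type)."""
    ticker: str
    company_name: str
    year: int
    quarter: int
    date: str
    text: str
    length: int
    source: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_transcript(payload: Dict[str, Any]) -> Transcript:
    """Convert a decoded JSON response into a Transcript."""
    if not payload.get("success"):
        raise ValueError("response did not report success")
    return Transcript(
        ticker=payload["ticker"],
        company_name=payload["company_name"],
        year=payload["year"],
        quarter=payload["quarter"],
        date=payload["date"],
        text=payload["transcript_text"],
        length=payload["transcript_length"],
        source=payload["source"],
        # Treat a missing or null metadata object as empty.
        metadata=payload.get("metadata") or {},
    )
```

With `requests`, the payload would typically come from `response.json()`.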

Get Transcript with Highlights

Retrieve a transcript with specific sections highlighted based on relevant chunks.
Endpoint: POST /transcript/with-highlights
Authentication: Optional
curl -X POST https://api.financeagent.com/transcript/with-highlights \
  -H "Content-Type: application/json" \
  -d '{
    "ticker": "AAPL",
    "year": 2024,
    "quarter": 3,
    "relevant_chunks": [
      {
        "chunk_id": "chunk-42",
        "char_offset": 15200,
        "chunk_length": 800,
        "chunk_text": "Our Services revenue grew 14% year over year..."
      }
    ]
  }'

Request Body

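  • ticker (string, required): Company ticker symbol
  • year (integer, required): Fiscal year
  • quarter (integer, required): Fiscal quarter (1-4)
  • relevant_chunks (array, required): Array of relevant chunks to highlight. Each chunk may carry chunk_id, char_offset, chunk_length, and chunk_text, as in the request above.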

Response

  • success (boolean): Request status
  • transcript_text (string): Original transcript text
  • highlighted_transcript (string): Transcript with HTML <mark> tags around highlighted sections
  • transcript_length (integer): Length of transcript in characters
  • metadata (object): Transcript metadata

Highlighted Output Example

Operator: Good day, and thank you for standing by...

<mark class="highlighted-chunk" data-chunk-id="chunk-42">
Our Services revenue grew 14% year over year to reach a new all-time high of $22.3 billion. This growth was driven by strong performance across our services portfolio.
</mark>

We also saw continued momentum in our Products segment...

Highlighting Logic

The API locates each chunk with a primary method and falls back to a second method only when needed:
  1. Primary Method: Uses char_offset and chunk_length for precise positioning
  2. Fallback Method: Performs text search on chunk_text if offset is unavailable
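
The primary and fallback methods above can be sketched as follows. This is an illustration of the described logic and not the server's actual code: the function names, the handling of overlapping chunks, and the choice to leave the transcript text unescaped are all assumptions.

```python
import html
from typing import Dict, List, Optional, Tuple


def locate_chunk(transcript: str, chunk: Dict) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span for a chunk, or None if it cannot be found."""
    offset = chunk.get("char_offset")
    length = chunk.get("chunk_length")
    # Primary method: use the offset when both values are present and in range.
    if isinstance(offset, int) and isinstance(length, int) and length > 0:
        if 0 <= offset < len(transcript):
            return offset, min(offset + length, len(transcript))
    # Fallback method: plain text search on chunk_text.
    text = chunk.get("chunk_text") or ""
    if text:
        start = transcript.find(text)
        if start != -1:
            return start, start + len(text)
    return None


def highlight(transcript: str, chunks: List[Dict]) -> str:
    """Wrap each located chunk in a <mark> tag like the documented output."""
    located = []
    for chunk in chunks:
        span = locate_chunk(transcript, chunk)
        if span is not None:
            located.append((span[0], span[1], str(chunk.get("chunk_id", ""))))
    parts, cursor = [], 0
    for start, end, chunk_id in sorted(located):
        if start < cursor:
            continue  # assumption: skip chunks that overlap an earlier highlight
        parts.append(transcript[cursor:start])
        safe_id = html.escape(chunk_id, quote=True)
        parts.append(f'<mark class="highlighted-chunk" data-chunk-id="{safe_id}">')
        parts.append(transcript[start:end])
        parts.append("</mark>")
        cursor = end
    parts.append(transcript[cursor:])
    return "".join(parts)
```

The offset path is a constant-time slice, while the fallback scans the transcript, which is consistent with the recommendation to send char_offset whenever it is known.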

Minimum Highlight Length

Highlighted regions are expanded to a minimum of 1,200 characters with a 400-character look-ahead to provide context.
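
The exact expansion rule is not spelled out here, so the sketch below shows only one possible reading: extend the span to at least 1,200 characters, then scan up to 400 further characters for a sentence end so the highlight does not stop mid-sentence. The function name, the constants' names, and the sentence-boundary interpretation of "look-ahead" are assumptions.

```python
from typing import Tuple

MIN_HIGHLIGHT_CHARS = 1200  # documented minimum highlight length
LOOKAHEAD_CHARS = 400       # documented look-ahead window


def expand_span(transcript: str, start: int, end: int) -> Tuple[int, int]:
    """Grow (start, end) to the minimum length, then look ahead for a sentence end."""
    # Extend the end so the span covers at least the minimum, capped at the text length.
    end = min(max(end, start + MIN_HIGHLIGHT_CHARS), len(transcript))
    # Assumed look-ahead behavior: stop after the first sentence terminator in the window.
    window = transcript[end:end + LOOKAHEAD_CHARS]
    for i, ch in enumerate(window):
        if ch in ".!?":
            return start, end + i + 1
    return start, end
```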

Error Responses

Transcript Not Found

{
  "detail": "Full earnings transcript not yet available for TSLA 2025 Q1"
}
Status Code: 404 Not Found

Storage Error

{
  "detail": "Could not load transcript from storage"
}
Status Code: 503 Service Unavailable
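
A client can translate these two documented errors into distinct exceptions, which makes it easy to treat a missing transcript differently from a temporary storage failure. The exception classes and `check_response` below are hypothetical names used for illustration.

```python
from typing import Any, Dict


class TranscriptNotFound(Exception):
    """Raised for the documented 404 response."""


class TranscriptStorageError(Exception):
    """Raised for the documented 503 response."""


def check_response(status_code: int, body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the body on success, or raise based on the status code."""
    detail = body.get("detail", "")
    if status_code == 404:
        raise TranscriptNotFound(detail)
    if status_code == 503:
        raise TranscriptStorageError(detail)
    if status_code >= 400:
        # Other error codes are not documented; surface them generically.
        raise RuntimeError(f"unexpected status {status_code}: {detail}")
    return body


# Typical use with requests (not executed here):
#   response = requests.get(url, timeout=30)
#   data = check_response(response.status_code, response.json())
```

A 503 may be worth retrying after a short delay, whereas a 404 indicates the transcript is not yet available.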

Use Cases

1. Display Full Transcript

import requests

# Fetch and display a complete transcript
response = requests.get(
    "https://api.financeagent.com/transcript/MSFT/2024/2",
    timeout=30,
)
response.raise_for_status()  # raises on 404 (not found) or 503 (storage error)
transcript = response.json()
print(f"Transcript for {transcript['company_name']} Q{transcript['quarter']} {transcript['year']}")
print(transcript['transcript_text'])

2. Highlight Search Results

import requests

# After getting search results from chat API, highlight matching sections
relevant_chunks = [
    {
        "chunk_id": "chunk-15",
        "char_offset": 8500,
        "chunk_length": 650,
        "chunk_text": "Cloud revenue increased by 23%..."
    }
]

response = requests.post(
    "https://api.financeagent.com/transcript/with-highlights",
    json={
        "ticker": "MSFT",
        "year": 2024,
        "quarter": 2,
        "relevant_chunks": relevant_chunks
    },
    timeout=30,
)
response.raise_for_status()  # stop here on a 404 or 503 error response

html_content = response.json()['highlighted_transcript']
# Render html_content in your application

3. Compare Multiple Transcripts

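import requests

# Fetch transcripts for trend analysis
quarters = [1, 2, 3, 4]
transcripts = []

for q in quarters:
    response = requests.get(
        f"https://api.financeagent.com/transcript/NVDA/2024/{q}",
        timeout=30,
    )
    if response.status_code == 404:
        continue  # transcript not yet available for this quarter
    response.raise_for_status()
    transcripts.append(response.json())

# Analyze or display side-by-side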

Caching

Transcript text is cached in-memory (max 50 transcripts) to reduce S3 API calls and improve response times. Metadata queries always hit PostgreSQL for accuracy.
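
A bounded in-memory cache of this kind can be sketched as below. The eviction policy is not stated here, so least-recently-used eviction is an assumption, as are the `TranscriptCache` class and its `loader` callback (which stands in for the S3 fetch).

```python
from collections import OrderedDict
from typing import Callable, Hashable


class TranscriptCache:
    """Bounded cache that evicts the least recently used entry (assumed policy)."""

    def __init__(self, loader: Callable[[Hashable], str], max_size: int = 50):
        self._loader = loader      # called on a cache miss, e.g. to read from S3
        self._max_size = max_size  # 50 matches the documented limit
        self._items: "OrderedDict[Hashable, str]" = OrderedDict()

    def get(self, key: Hashable) -> str:
        if key in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: load the text, store it, and evict the oldest entry if full.
        text = self._loader(key)
        self._items[key] = text
        if len(self._items) > self._max_size:
            self._items.popitem(last=False)
        return text
```

A key such as `("TSLA", 2024, 4)` would identify one transcript.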

Performance

  • Metadata Query: ~50-100ms (PostgreSQL)
  • Transcript Fetch: ~200-500ms (S3, first request)
  • Cached Transcript: ~10-20ms (in-memory)
  • Highlighting: ~50-100ms (text processing)

For best performance, use char_offset in highlight requests rather than relying on text search fallback.
