Overview
The Transcripts API provides access to complete earnings call transcripts stored in S3. Metadata is stored in PostgreSQL for fast querying, while the full transcript text is retrieved from cloud storage.
Get Transcript
Retrieve a complete earnings call transcript.
Endpoint: GET /transcript/{ticker}/{year}/{quarter}
Authentication: Optional (works without authentication)
Path Parameters
- ticker: Company ticker symbol (e.g., "TSLA", "AAPL")
- year: Fiscal year (e.g., 2024)
- quarter: Fiscal quarter (1, 2, 3, or 4)
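The path parameters above map directly onto the endpoint URL. The following is a minimal sketch of a client helper, not an official client: the host in `BASE_URL` and the function names `transcript_url` and `fetch_transcript` are hypothetical, and the response is assumed to be JSON. No authentication header is sent, since authentication is optional for this endpoint.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical host; substitute the real base URL of your deployment.
BASE_URL = "https://api.example.com"


def transcript_url(ticker: str, year: int, quarter: int, base_url: str = BASE_URL) -> str:
    """Build the GET /transcript/{ticker}/{year}/{quarter} URL."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    # Percent-encode the ticker so symbols with unusual characters stay in one path segment.
    return f"{base_url}/transcript/{quote(ticker, safe='')}/{int(year)}/{quarter}"


def fetch_transcript(ticker: str, year: int, quarter: int) -> dict:
    """Fetch a transcript and decode the body, assuming the API returns JSON."""
    with urlopen(transcript_url(ticker, year, quarter), timeout=30) as resp:
        return json.load(resp)


print(transcript_url("TSLA", 2024, 3))
```

Validating `quarter` on the client avoids a round trip for a request that cannot match any transcript.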
Response
Whether the transcript was found
Company ticker symbol
Company name
Fiscal year
Fiscal quarter
Earnings call date
Complete transcript text
Length of transcript in characters
Data source (always “bucket” for S3)
Additional metadata about the transcript
Example Response
Get Transcript with Highlights
Retrieve a transcript with specific sections highlighted based on relevant chunks.
Endpoint: POST /transcript/with-highlights
Authentication: Optional
Request Body
Company ticker symbol
Fiscal year
Fiscal quarter (1-4)
Array of relevant chunks to highlight
Response
Request status
Original transcript text
Transcript with HTML <mark> tags around highlighted sections
Length of transcript in characters
Transcript metadata
Highlighted Output Example
Highlighting Logic
The API uses a two-step approach to highlight relevant sections:
- Primary Method: Uses char_offset and chunk_length for precise positioning
- Fallback Method: Performs text search on chunk_text if offset is unavailable
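The two-step lookup can be illustrated with a short sketch. This is not the service's implementation: the helper names `locate_chunk` and `highlight` are hypothetical, chunks are assumed to be dictionaries keyed by `char_offset`, `chunk_length`, and `chunk_text`, and merging overlapping spans is an assumption made here to keep the markup well-formed. The minimum-length expansion described under Minimum Highlight Length is left out.

```python
from typing import List, Optional, Tuple


def locate_chunk(text: str, chunk: dict) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span for a chunk, or None if it cannot be placed."""
    offset = chunk.get("char_offset")
    length = chunk.get("chunk_length")
    # Primary method: position the highlight from char_offset and chunk_length.
    if offset is not None and length:
        start = max(0, offset)
        end = min(len(text), start + length)
        if start < end:
            return start, end
    # Fallback method: search for chunk_text when no usable offset is given.
    chunk_text = chunk.get("chunk_text")
    if chunk_text:
        start = text.find(chunk_text)
        if start != -1:
            return start, start + len(chunk_text)
    return None


def highlight(text: str, chunks: List[dict]) -> str:
    """Wrap every located chunk in <mark> tags."""
    spans = sorted(s for s in (locate_chunk(text, c) for c in chunks) if s)
    merged: List[Tuple[int, int]] = []
    for start, end in spans:
        # Merge overlapping or touching spans so <mark> tags never nest (an assumption).
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    parts, cursor = [], 0
    for start, end in merged:
        parts.append(text[cursor:start])
        parts.append("<mark>" + text[start:end] + "</mark>")
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)


sample = "Revenue grew 20%. Margins improved."
print(highlight(sample, [{"char_offset": 0, "chunk_length": 7}, {"chunk_text": "Margins"}]))
```

The offset path is a constant-time slice, while the fallback scans the transcript, which is why supplying offsets is the cheaper option.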
Minimum Highlight Length
Highlighted regions are expanded to a minimum of 1,200 characters with a 400-character look-ahead to provide context.
Error Responses
Transcript Not Found
404 Not Found
Storage Error
503 Service Unavailable
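A client can treat the two errors differently: a 404 means the transcript does not exist, while a 503 points to a storage problem that may clear up. The sketch below shows one way to encode that distinction. The function names `classify_status` and `backoff_delays`, the retry count, and the delay values are all illustrative assumptions, not behavior documented by the API.

```python
from typing import List


def classify_status(status_code: int) -> str:
    """Map an HTTP status code to a client-side action."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 404:
        # Transcript Not Found: retrying will not help.
        return "not_found"
    if status_code == 503:
        # Storage Error: treated here as transient and worth retrying (an assumption).
        return "retry"
    return "error"


def backoff_delays(max_attempts: int = 3, base_seconds: float = 0.5) -> List[float]:
    """Exponential backoff schedule, in seconds, for retrying 503 responses."""
    return [base_seconds * (2 ** attempt) for attempt in range(max_attempts)]


print(classify_status(404), classify_status(503), backoff_delays())
```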
Use Cases
1. Display Full Transcript
2. Highlight Search Results
3. Compare Multiple Transcripts
Caching
Transcript text is cached in memory (max 50 transcripts) to reduce S3 API calls and improve response times. Metadata queries always hit PostgreSQL for accuracy.
Performance
- Metadata Query: ~50-100ms (PostgreSQL)
- Transcript Fetch: ~200-500ms (S3, first request)
- Cached Transcript: ~10-20ms (in-memory)
- Highlighting: ~50-100ms (text processing)
For best performance, use char_offset in highlight requests rather than relying on text search fallback.
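The gap between the first-request and cached timings comes from the bounded in-memory cache described under Caching. The sketch below shows how a cache capped at 50 transcripts might behave. The eviction policy is not documented, so least-recently-used eviction is an assumption here, as are the class name `TranscriptCache` and the `(ticker, year, quarter)` key.

```python
from collections import OrderedDict
from typing import Optional, Tuple

CacheKey = Tuple[str, int, int]  # (ticker, year, quarter); an assumed key shape


class TranscriptCache:
    """Bounded in-memory cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, max_entries: int = 50) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[CacheKey, str]" = OrderedDict()

    def get(self, key: CacheKey) -> Optional[str]:
        if key not in self._entries:
            return None  # Cache miss: the caller would fetch from S3.
        self._entries.move_to_end(key)  # Mark as most recently used.
        return self._entries[key]

    def put(self, key: CacheKey, text: str) -> None:
        self._entries[key] = text
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # Evict the least recently used entry.

    def __len__(self) -> int:
        return len(self._entries)


cache = TranscriptCache(max_entries=2)
cache.put(("TSLA", 2024, 1), "Q1 transcript text")
cache.put(("TSLA", 2024, 2), "Q2 transcript text")
cache.get(("TSLA", 2024, 1))                        # Q1 becomes most recently used.
cache.put(("TSLA", 2024, 3), "Q3 transcript text")  # Evicts Q2.
print(cache.get(("TSLA", 2024, 2)))
```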