OpenCouncil automatically transcribes council meeting videos into searchable text with advanced speaker recognition capabilities. The system processes YouTube videos, identifies speakers using voice biometrics, and creates timestamped transcripts with utterance-level precision.

How it works

The transcription system uses a multi-stage pipeline to process meeting videos:
1. Video submission: Provide a YouTube URL for the council meeting video. The system validates that the meeting exists and checks for existing transcripts.
2. Voice preparation: The system collects voiceprints from known council members and builds a custom vocabulary with city names, person names, and party names to improve accuracy.
3. Task processing: A background task downloads the video, extracts audio, and sends it to the transcription service with custom prompts and vocabulary.
4. Speaker identification: The system matches anonymous speaker segments to known council members using voice biometric matching with confidence scores.
5. Data storage: Speaker segments and utterances are stored in the database with timestamps, creating a fully searchable transcript linked to specific people.
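The five stages above can be sketched as one orchestration function. Everything here is a hypothetical stand-in for the real OpenCouncil internals: the function name, the stubbed voiceprint map, and the hard-coded utterance are illustrative only.

```typescript
// Hypothetical orchestration of the five pipeline stages; every helper
// below is a stub standing in for the real implementation.
interface PipelineResult {
  segments: { speakerId: string | null; text: string }[];
}

async function transcribeMeeting(youtubeUrl: string): Promise<PipelineResult> {
  // 1. Video submission: validate the URL before doing any work
  if (!/^https:\/\/(www\.)?youtube\.com\//.test(youtubeUrl)) {
    throw new Error("invalid YouTube URL");
  }
  // 2. Voice preparation: known voiceprints for expected speakers (stubbed)
  const knownSpeakers = new Map<number, string>([[0, "person-123"]]);
  // 3. Task processing: download, extract audio, transcribe (stubbed)
  const utterances = [{ speaker: 0, text: "Καλησπέρα σας." }];
  // 4. Speaker identification: map anonymous speaker tags to people,
  //    leaving unmatched speakers as null (shown as "Unknown Speaker" in the UI)
  const segments = utterances.map(u => ({
    speakerId: knownSpeakers.get(u.speaker) ?? null,
    text: u.text,
  }));
  // 5. Data storage: the real system persists segments to the database
  return { segments };
}
```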

Speaker recognition

OpenCouncil uses voice biometrics to automatically identify speakers in meeting recordings:
The system compares speaker audio segments against stored voiceprints for each council member. When a match is found with sufficient confidence, the speaker is automatically tagged with the person’s identity.
// Collect one stored voiceprint embedding per person who has one
const voiceprints: Voiceprint[] = people
  .filter(person => person.voicePrints && person.voicePrints.length > 0)
  .map(person => ({
    personId: person.id,
    voiceprint: person.voicePrints![0].embedding
  }));
Only voiceprints for people relevant to the meeting’s administrative body are used for matching; limiting the search space to expected speakers improves accuracy. The system retrieves these people based on the meeting’s administrativeBodyId (see src/lib/tasks/transcribe.ts:92).
When a speaker cannot be identified, they are labeled as “Unknown Speaker 1”, “Unknown Speaker 2”, etc. These can be manually corrected later through the UI.
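As a rough sketch of how confidence-based matching might work, the following compares an anonymous speaker’s embedding against each stored voiceprint and accepts the best match above a threshold. The cosine-similarity metric, the 0.8 threshold, and the identifySpeaker function are illustrative assumptions, not the actual matching algorithm.

```typescript
interface Voiceprint {
  personId: string;
  voiceprint: number[]; // stored embedding for a known council member
}

// Cosine similarity between two embeddings of equal length
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the best-matching person above the confidence threshold,
// or null so the caller can fall back to "Unknown Speaker N".
function identifySpeaker(
  embedding: number[],
  voiceprints: Voiceprint[],
  threshold = 0.8,
): { personId: string; confidence: number } | null {
  let best: { personId: string; confidence: number } | null = null;
  for (const vp of voiceprints) {
    const confidence = cosineSimilarity(embedding, vp.voiceprint);
    if (confidence >= threshold && (!best || confidence > best.confidence)) {
      best = { personId: vp.personId, confidence };
    }
  }
  return best;
}
```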

Custom vocabulary and prompts

To improve transcription accuracy for Greek council meetings, the system uses:

Custom vocabulary

A list of city-specific terms including:
  • Municipality name
  • Council member names
  • Political party names
  • Local place names
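Assembling these terms into one list might look like the sketch below. The CityData shape and its field names (people, parties, places) are assumptions for illustration, not the real schema.

```typescript
// Hypothetical city record; field names are illustrative assumptions.
interface CityData {
  name: string;
  people: { name: string }[];
  parties: { name: string }[];
  places: string[];
}

// Flatten municipality, person, party, and place names into one
// vocabulary list, de-duplicating while preserving order.
function buildVocabulary(city: CityData): string[] {
  const terms = [
    city.name,
    ...city.people.map(p => p.name),
    ...city.parties.map(p => p.name),
    ...city.places,
  ];
  return [...new Set(terms)];
}
```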

Custom prompts

Context-aware instructions in Greek that describe the meeting:
“Αυτή είναι η απομαγνητοφώνηση της συνεδρίασης του δήμου της [City] που έγινε στις [Date].”
(“This is the transcription of the meeting of the municipality of [City], held on [Date].”)
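Interpolating the city and date into this template could be as simple as the following; the buildPrompt function name and its parameters are illustrative, not the actual API.

```typescript
// Illustrative: fill the [City] and [Date] placeholders of the Greek
// context prompt shown above.
function buildPrompt(cityName: string, date: string): string {
  return (
    `Αυτή είναι η απομαγνητοφώνηση της συνεδρίασης ` +
    `του δήμου της ${cityName} που έγινε στις ${date}.`
  );
}
```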

Data structure

Transcripts are organized hierarchically: a transcript is made up of speaker segments, and each segment contains the individual utterances spoken within it.
Each speaker segment represents continuous speech by one person. Segments are split when the speaker changes or when there is more than 5 seconds of silence.

Speaker segments

Speaker segments group consecutive utterances from the same speaker:
// Close the current segment on a speaker change or a silence gap
// of more than 5 seconds, then open a new one
if (currentSegment &&
    (currentSpeaker !== utterance.speaker ||
     utterance.start - currentSegment.endTimestamp! > 5)) {
  speakerSegments.push(currentSegment as SpeakerSegment);
  currentSegment = {
    startTimestamp: utterance.start,
    endTimestamp: utterance.end,
    speakerTagId: utterance.speaker.toString()
  };
}
From src/lib/tasks/transcribe.ts:318-350
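A self-contained version of this grouping logic might look like the sketch below. The Utt and Seg shapes are simplified stand-ins for the real utterance and SpeakerSegment types.

```typescript
// Simplified shapes for illustration; the real types carry more fields.
interface Utt { speaker: number; start: number; end: number }
interface Seg { speakerTagId: string; startTimestamp: number; endTimestamp: number }

// Group consecutive utterances by speaker, splitting on a speaker
// change or a silence gap longer than maxGapSeconds.
function groupIntoSegments(utterances: Utt[], maxGapSeconds = 5): Seg[] {
  const segments: Seg[] = [];
  let current: Seg | null = null;
  for (const u of utterances) {
    const sameSpeaker =
      current !== null && current.speakerTagId === u.speaker.toString();
    const withinGap =
      current !== null && u.start - current.endTimestamp <= maxGapSeconds;
    if (current && sameSpeaker && withinGap) {
      current.endTimestamp = u.end; // extend the open segment
    } else {
      if (current) segments.push(current); // close the previous segment
      current = {
        speakerTagId: u.speaker.toString(),
        startTimestamp: u.start,
        endTimestamp: u.end,
      };
    }
  }
  if (current) segments.push(current); // flush the final segment
  return segments;
}
```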

Utterances

Each utterance represents a complete sentence or phrase with:
  • startTimestamp - Start time in seconds
  • endTimestamp - End time in seconds
  • text - The transcribed text
  • drift - Audio sync correction value
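In TypeScript terms, an utterance with these fields might be modeled as below; the exact database model may differ, and the example values are invented.

```typescript
// Sketch of the utterance shape described above; the real model
// likely carries additional fields (ids, relations, etc.).
interface Utterance {
  startTimestamp: number; // start time in seconds
  endTimestamp: number;   // end time in seconds
  text: string;           // the transcribed text
  drift: number;          // audio sync correction value, in seconds
}

const example: Utterance = {
  startTimestamp: 12.4,
  endTimestamp: 15.9,
  text: "Καλησπέρα σας, ξεκινάμε τη συνεδρίαση.",
  drift: 0.2,
};
```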

Configuration

Set up transcription in your .env file:
# Task API for background processing
TASK_API_URL=http://localhost:3005
TASK_API_KEY=your_task_api_key

# Optional: Custom transcription settings
CUSTOM_VOCABULARY=true
VOICE_RECOGNITION=true

Performance optimization

The system uses several optimizations for large meetings:
Speaker segments are created in parallel batches of 50 to reduce database round-trips:
const BATCH_SIZE = 50;
for (let i = 0; i < speakerSegmentsData.length; i += BATCH_SIZE) {
  const batch = speakerSegmentsData.slice(i, i + BATCH_SIZE);
  await Promise.all(batch.map(async (segment) => {
    // Create segment with nested utterances
  }));
}
This batching provides a 10-50x performance improvement over sequential processing.
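The pattern generalizes to any async workload. The helper below is a sketch of the same idea, not the actual OpenCouncil code: items within a batch run concurrently, while batches run one after another to bound concurrent database connections.

```typescript
// Generic batching sketch: run `worker` over `items` in parallel
// chunks of `batchSize`, preserving input order in the results.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // All items in this batch run concurrently; batches run sequentially
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```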

Reprocessing transcripts

You can force reprocessing of existing transcripts:
Warning: Reprocessing with force: true will delete all existing speaker data, including manually corrected speaker identifications. Use with caution.
await requestTranscribe(
  youtubeUrl,
  councilMeetingId,
  cityId,
  { force: true }
);
When force: false (default), the system will throw an error if speaker segments already exist.

API reference

Key functions from src/lib/tasks/transcribe.ts:
  • requestTranscribe (function): Initiates transcription for a meeting video
  • handleTranscribeResult (function): Processes the transcription result from the task server

Next steps

Search transcripts

Learn how to search across all meeting transcripts

AI summaries

Generate summaries and insights from transcripts

Meeting highlights

Create shareable video clips from transcripts

Notifications

Set up alerts for transcript availability
