Full-text search

OpenCouncil provides powerful full-text search across all council meeting transcripts, agenda items, and speaker contributions. The search system uses Elasticsearch with semantic understanding to find relevant content even when exact keywords don’t match.

Search capabilities

Natural language

Search using natural language queries in Greek. The system understands context and extracts filters automatically.

Smart filters

Filter by city, speaker, party, topic, date range, and geographic location with automatic extraction from queries.

Semantic search

Find conceptually similar content even without exact keyword matches using AI-powered semantic understanding.

Geographic search

Search for subjects near specific locations with configurable distance radius.

How search works

The search system uses Elasticsearch with RRF (Reciprocal Rank Fusion) to combine multiple search strategies:

Query processing

Natural language queries are analyzed to extract filters like city names, person names, date ranges, and locations.

Multi-strategy search

The system runs multiple search retrievers in parallel:

Keyword search on subject names, descriptions, and transcripts
Semantic search using AI embeddings for conceptual matching
Geographic search for location-based queries

Result ranking

RRF combines results from all strategies, balancing keyword relevance with semantic similarity.

Data enrichment

Results are enriched with meeting details, speaker information, party affiliations, and location coordinates.
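To make the steps above concrete, here is a minimal sketch of how a keyword retriever and an optional semantic retriever could be wrapped in an Elasticsearch rrf retriever. The function name buildRrfRequest and its options object are hypothetical; the field names are the ones shown in the ranking sections of this page, and the geographic retriever is left out for brevity. The actual query builder in the project may differ.

```typescript
// Sketch only: buildRrfRequest and RrfOptions are illustrative names,
// not the project's real query builder.
interface RrfOptions {
  rankWindowSize?: number;
  rankConstant?: number;
  enableSemanticSearch?: boolean;
}

function buildRrfRequest(query: string, options: RrfOptions = {}) {
  const {
    rankWindowSize = 100,
    rankConstant = 60,
    enableSemanticSearch = false,
  } = options;

  // Keyword retriever: a standard query over the boosted text fields.
  const retrievers: object[] = [
    {
      standard: {
        query: {
          multi_match: {
            query,
            fields: ['name^4', 'description^3', 'location_text^3'],
          },
        },
      },
    },
  ];

  // Semantic retriever: added only when semantic search is requested.
  if (enableSemanticSearch) {
    retrievers.push({
      standard: {
        query: { semantic: { field: 'name.semantic', query } },
      },
    });
  }

  // RRF merges the ranked lists produced by each retriever.
  return {
    retriever: {
      rrf: {
        retrievers,
        rank_window_size: rankWindowSize,
        rank_constant: rankConstant,
      },
    },
  };
}
```

Calling buildRrfRequest('ανακύκλωση', { enableSemanticSearch: true }) yields a request body with two retrievers and the default RRF parameters.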

Search query structure

The search function takes a query string plus an optional config object. A minimal call looks like this:

import { search } from '@/lib/search';

const results = await search({
  query: 'ποδηλατόδρομος',
  config: {
    size: 10,
    from: 0
  }
});

Automatic filter extraction

The system automatically extracts filters from natural language queries:

City extraction

Queries mentioning city names are automatically filtered:

Query: “Τι συζητήθηκε στην Αθήνα για τα πάρκα;” (“What was discussed in Athens about parks?”)
Extracted: cityIds: ["athens-id"]

City names are normalized before matching, so differences in diacritics and Greek character forms do not prevent a match.

Date extraction

Date ranges and relative dates are parsed:

Query: “συνεδριάσεις από 1 Ιανουαρίου 2024” (“meetings from 1 January 2024”)
Extracted: dateRange: { start: "2024-01-01", end: ... }

Supports: specific dates, month names, relative terms (“last month”, “this year”).

Location extraction

Place names are geocoded to coordinates:

Query: “κοντά στην πλατεία Συντάγματος” (“near Syntagma Square”)
Extracted: locations: [{ point: { lat: 37.975, lon: 23.734 }, radius: 1 }]

Geocoding uses the Google Geocoding API; the default radius is configurable.

From src/lib/search/filters.ts
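As an illustration of the city-name normalization described above, the following sketch strips accents, lowercases, and unifies the two sigma forms before matching words against a small lookup table. The helper names, the stem-based matching, and the table contents (including the placeholder id for Thessaloniki) are assumptions made for this example; the real logic in src/lib/search/filters.ts may work quite differently.

```typescript
// Hypothetical helpers for illustration only.

// Strip accents, lowercase, and unify sigma forms so that
// "Αθήνα", "ΑΘΗΝΑ" and "αθηνα" all compare equal.
function normalizeGreek(text: string): string {
  return text
    .normalize('NFD')                 // split letters from their accent marks
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()
    .replace(/ς/g, 'σ');              // final sigma -> regular sigma
}

// Assumed lookup: normalized name stem -> city id (ids are placeholders).
const CITY_STEMS: Record<string, string> = {
  'αθην': 'athens-id',
  'θεσσαλονικ': 'thessaloniki-id',
};

function extractCityIds(query: string): string[] {
  // Split on anything that is not a Latin or Greek lowercase letter.
  const words = normalizeGreek(query)
    .split(/[^a-z\u03b1-\u03c9]+/)
    .filter(word => word.length > 0);

  const ids: string[] = [];
  for (const word of words) {
    for (const stem of Object.keys(CITY_STEMS)) {
      const id = CITY_STEMS[stem];
      // Prefix match tolerates Greek case endings (Αθήνα, Αθήνας, ...).
      if (word.startsWith(stem) && ids.indexOf(id) === -1) ids.push(id);
    }
  }
  return ids;
}

extractCityIds('Τι συζητήθηκε στην Αθήνα για τα πάρκα;'); // → ['athens-id']
```

Prefix matching is deliberately naive here: it also matches derived words that share the stem, so a production extractor would need a stricter dictionary or a language-aware matcher.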

Search ranking

Results are ranked using field boosting to prioritize important content:

{
  multi_match: {
    query: searchQuery,
    fields: [
      'name^4',           // Highest boost - subject titles
      'description^3',    // High boost - detailed descriptions
      'location_text^3'   // High boost when location is mentioned
    ],
    type: 'best_fields',
    operator: 'or'
  }
}

Nested queries for speaker segments and contributions use a boost of 2 to balance transcript content with subject metadata.
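A nested clause of that kind could look roughly like the sketch below. The nested path speaker_segments, the subfield speaker_segments.text, and the score_mode value are assumed for illustration, since the index mapping is not shown on this page; only the boost of 2 comes from the text above.

```typescript
// Sketch under assumed field names; check the real index mapping before use.
function buildSegmentQuery(query: string) {
  return {
    nested: {
      path: 'speaker_segments',          // assumed nested field
      query: {
        match: { 'speaker_segments.text': { query } },
      },
      score_mode: 'max',                 // assumed: score by the best-matching segment
      boost: 2,                          // lower than name^4 and description^3
    },
  };
}
```

Because the boost is lower than the boosts on name and description, a subject whose title matches the query should generally outrank one that matches only inside a transcript segment.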

RRF parameters

Rank fusion is controlled by:

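rank_window_size: 100 (default) - Number of top results to consider from each retriever
rank_constant: 60 (default) - Constant added to each rank before scoring; lower values give top-ranked results more weight, higher values give lower-ranked results more influence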

const results = await search({
  query: 'ανακύκλωση',
  config: {
    rankWindowSize: 150,  // Consider more results from each retriever
    rankConstant: 40      // Lower constant: top-ranked hits carry more weight
  }
});

From src/lib/search/query.ts:240-242
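The effect of the two parameters is easier to see with the RRF formula written out: each document scores the sum of 1 / (rank_constant + rank) over every retriever that returned it within the window. The sketch below is a standalone re-implementation for illustration, not the code Elasticsearch runs; rrfFuse and the document ids are made up.

```typescript
// Reciprocal Rank Fusion over several ranked lists of document ids.
function rrfFuse(
  rankings: string[][],
  rankConstant = 60,
  rankWindowSize = 100
): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    // Only the top rankWindowSize entries of each list take part.
    ranking.slice(0, rankWindowSize).forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (rankConstant + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

const keywordHits = ['x', 'p', 'q', 'y'];  // 'x' is the top keyword hit only
const semanticHits = ['r', 's', 't', 'y']; // 'y' is ranked 4th in both lists

rrfFuse([keywordHits, semanticHits], 1)[0][0];  // → 'x'
rrfFuse([keywordHits, semanticHits], 60)[0][0]; // → 'y'
```

With a constant of 1, a single first-place hit (1/2) beats a document ranked fourth in both lists (2/5). With the default of 60 the gap between ranks shrinks, so the document found by both retrievers (2/64) overtakes the single top hit (1/61).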

Semantic search

Set config.enableSemanticSearch to true to add AI-powered semantic matching to a search. It is off by default.
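A request with semantic search switched on might look like the sketch below. The SearchRequest type is a trimmed stand-in declared here only so the snippet is self-contained; the option name config.enableSemanticSearch is taken from the API reference on this page.

```typescript
// Trimmed stand-in for the project's SearchRequest type (illustration only).
interface SearchRequest {
  query: string;
  config?: {
    size?: number;
    from?: number;
    detailed?: boolean;
    enableSemanticSearch?: boolean;
  };
}

const semanticRequest: SearchRequest = {
  query: 'κυκλοφοριακή συμφόρηση', // "traffic congestion"
  config: {
    enableSemanticSearch: true,    // off by default
  },
};

// In the application this object would be passed to search() from '@/lib/search':
// const results = await search(semanticRequest);
```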

How semantic search works

When enabled, the system adds a semantic retriever that searches using AI embeddings:

{
  semantic: {
    query: searchQuery,
    field: 'name.semantic',
    boost: 2.0  // Subject names weighted higher
  }
},
{
  semantic: {
    query: searchQuery,
    field: 'description.semantic',
    boost: 1.5  // Descriptions weighted medium
  }
}

This finds results that are conceptually similar even if they use different words. For example, searching for “κυκλοφοριακή συμφόρηση” (traffic congestion) might also find subjects about “κίνηση” (traffic flow) or “μποτιλιάρισμα” (traffic jams).

Semantic search requires Elasticsearch with text embedding models configured. Ensure your Elasticsearch instance has the appropriate inference endpoint set up.

Search result types

The search API returns one of two result types, depending on the config.detailed option:

Light results (default)

Lightweight results for list views:

interface SearchResultLight {
  id: string;
  name: string;
  description: string;
  score: number;
  councilMeeting: { /* meeting details */ };
  location: { /* location with coordinates */ };
  matchedSpeakerSegmentIds: string[];
  // ... other subject fields
}

Fast and efficient for displaying search results in lists.

Detailed results

Full results with transcript text for detail views:

interface SearchResultDetailed extends SearchResultLight {
  speakerSegments: {
    id: string;
    startTimestamp: number;
    endTimestamp: number;
    person: Person | null;
    text: string;        // Full transcript text
    summary: { text: string } | null;
  }[];
  context: string | null;
}

Include full transcript text when you need to display content previews.

Only speaker segments with at least 100 characters and an identified person are included in detailed results.

From src/lib/search/types.ts and src/lib/search/index.ts:242-266
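The inclusion rule for speaker segments in detailed results (at least 100 characters and an identified person) can be sketched as a simple filter. The trimmed types and the names MIN_SEGMENT_LENGTH and filterDetailedSegments are hypothetical; the project's own implementation at the location cited above may apply the rule differently.

```typescript
// Trimmed types for illustration; the real ones live in src/lib/search/types.ts.
interface Person { id: string; name: string }
interface SpeakerSegment { id: string; person: Person | null; text: string }

const MIN_SEGMENT_LENGTH = 100; // threshold stated on this page

// Keep only segments long enough to be useful and attributed to a known speaker.
function filterDetailedSegments(segments: SpeakerSegment[]): SpeakerSegment[] {
  return segments.filter(
    segment => segment.person !== null && segment.text.length >= MIN_SEGMENT_LENGTH
  );
}
```

A short interjection, or a long segment whose speaker could not be identified, is dropped by this filter even if it matched the query.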

Retry logic

The search system includes automatic retry with exponential backoff:

export async function executeElasticsearchWithRetry<T>(
  operation: () => Promise<T>,
  operationName: string,
  maxRetries = 3
): Promise<T> {
  let lastError: Error;
  
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error as Error;
      
      if (attempt < maxRetries) {
        const delayMs = Math.pow(2, attempt) * 1000; // 2s after attempt 1, 4s after attempt 2
        await new Promise(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  
  throw lastError!;
}

From src/lib/search/retry.ts

By default a failed search is attempted up to 3 times in total, waiting 2s and then 4s between attempts. If the final attempt also fails, the last error is rethrown.
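Tracing the loop in executeElasticsearchWithRetry shows that a delay is scheduled only when another attempt remains, so maxRetries counts total attempts rather than extra retries. The helper below, whose name backoffDelays is made up for this example, reproduces just the delay schedule.

```typescript
// Delays (in ms) that the retry loop above would wait for a given maxRetries.
function backoffDelays(maxRetries = 3): number[] {
  const delays: number[] = [];
  // Mirrors `if (attempt < maxRetries)`: no wait after the final attempt.
  for (let attempt = 1; attempt < maxRetries; attempt++) {
    delays.push(Math.pow(2, attempt) * 1000);
  }
  return delays;
}

backoffDelays();  // → [2000, 4000]
backoffDelays(4); // → [2000, 4000, 8000]
```

To get an 8s wait before a fourth attempt, maxRetries has to be raised to 4.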

Configuration

Set up search in your environment:

# Elasticsearch connection
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_API_KEY=your_elasticsearch_api_key
ELASTICSEARCH_INDEX=opencouncil-subjects

# Google Geocoding for location search
GOOGLE_API_KEY=your_google_api_key
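A small startup check can catch a missing variable before the first query fails. This is a sketch: readSearchEnv and SearchEnv are hypothetical names, and treating GOOGLE_API_KEY as optional is an assumption based on it being needed only for location search.

```typescript
interface SearchEnv {
  url: string;
  apiKey: string;
  index: string;
  googleApiKey?: string; // assumed optional: only location search needs it
}

// Pass in process.env (or any plain object) so the function stays easy to test.
function readSearchEnv(env: Record<string, string | undefined>): SearchEnv {
  const required = ['ELASTICSEARCH_URL', 'ELASTICSEARCH_API_KEY', 'ELASTICSEARCH_INDEX'];
  const missing = required.filter(name => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    url: env.ELASTICSEARCH_URL as string,
    apiKey: env.ELASTICSEARCH_API_KEY as string,
    index: env.ELASTICSEARCH_INDEX as string,
    googleApiKey: env.GOOGLE_API_KEY,
  };
}
```

In a Node.js process this would typically be called once at startup as readSearchEnv(process.env).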

Search analytics

The system logs essential search analytics:

logEssential('Search Session Started', {
  query: request.query,
  filters: { cityIds, personIds, partyIds, topicIds, dateRange, hasLocations }
});

logEssential('Search Session Completed', {
  query: request.query,
  results: {
    totalHits,
    resultCount,
    took: `${response.took}ms`,
    topScore
  }
});

From src/lib/search/index.ts:42-94

Search logs use the [Search Analytics] prefix and are always visible, unlike verbose debug logs.

Performance tips

Use pagination

Always paginate results to avoid loading too much data:

const results = await search({
  query: searchTerm,
  config: {
    size: 20,  // Results per page
    from: page * 20  // Offset for pagination
  }
});
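The offset arithmetic is easy to get wrong by one page, so it can help to keep it in one place. The helpers below are hypothetical and assume zero-based page numbers, matching the page * 20 expression above.

```typescript
// Convert a zero-based page number into the size/from pair used by config.
function pageToConfig(page: number, pageSize = 20): { size: number; from: number } {
  const safePage = Math.max(0, Math.floor(page)); // guard against negatives and fractions
  return { size: pageSize, from: safePage * pageSize };
}

// Number of pages needed to show `total` results.
function totalPages(total: number, pageSize = 20): number {
  return Math.ceil(total / pageSize);
}

pageToConfig(2); // → { size: 20, from: 40 }
totalPages(45);  // → 3
```

The total field returned by the search function can be passed to totalPages to drive a pager.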

Filter early

Apply filters to reduce the search space before full-text search:

// Good: Narrow down first
await search({
  query: 'budget',
  cityIds: ['city-123'],
  dateRange: { start: '2024-01-01', end: '2024-12-31' }
});

// Less efficient: Search everything
await search({ query: 'budget' });

Use light results

Only request detailed results when you need full transcript text:

// For list views
await search({ query, config: { detailed: false } });

// Only for detail pages
await search({ query, config: { detailed: true } });

API reference

search: async function

Main search function from src/lib/search/index.ts

Parameters

request: SearchRequest (required)

query: string - The search query
cityIds: string[] - Filter by cities
personIds: string[] - Filter by speakers
partyIds: string[] - Filter by parties
topicIds: string[] - Filter by topics
dateRange: Date range filter with start and end timestamps
locations: LocationQuery[] - Geographic filters
config.size: number - Results per page (default: 10)
config.from: number - Offset for pagination (default: 0)
config.detailed: boolean - Include full text (default: false)
config.enableSemanticSearch: boolean - Use AI search (default: false)

Returns

{
  results: SearchResultLight[] | SearchResultDetailed[],
  total: number
}
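Because results can hold either type, calling code may want a type guard before reading transcript text. The sketch below uses trimmed copies of the two interfaces and a hypothetical isDetailed helper; it relies only on the fact, shown earlier on this page, that speakerSegments exists on detailed results and not on light ones.

```typescript
// Trimmed copies of the result types, for illustration only.
interface SearchResultLight {
  id: string;
  name: string;
  score: number;
}

interface SearchResultDetailed extends SearchResultLight {
  speakerSegments: { id: string; text: string }[];
  context: string | null;
}

// Narrow a result to the detailed type by checking for speakerSegments.
function isDetailed(
  result: SearchResultLight | SearchResultDetailed
): result is SearchResultDetailed {
  return 'speakerSegments' in result;
}

// Example: count transcript characters, treating light results as empty.
function transcriptLength(result: SearchResultLight | SearchResultDetailed): number {
  if (!isDetailed(result)) return 0;
  return result.speakerSegments.reduce((sum, segment) => sum + segment.text.length, 0);
}
```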

Next steps

Transcription

Learn how meetings are transcribed for search

AI summaries

Generate summaries from search results

Notifications

Get notified about new content matching your searches

API reference

Explore the full search API
