The MCP server provides two complementary tools for accessing resource data: query_resource_data for fast querying via the Tabular API, and download_and_parse_resource for direct file downloads.

Choosing the right tool

query_resource_data

Best for:
  • CSV and XLSX files
  • Files within size limits
  • Filtering and sorting
  • Quick data previews
  • Pagination through results
Limitations:
  • CSV files ≤ 100 MB
  • XLSX files ≤ 12.5 MB
  • Only tabular formats

download_and_parse_resource

Best for:
  • JSON and JSONL files
  • Large files (>100 MB)
  • Files exceeding Tabular API limits
  • External URLs
  • Full dataset analysis
Limitations:
  • Requires full download
  • Supports: CSV, CSV.GZ, JSON, JSONL
  • Default: 500 MB size limit

Using query_resource_data

The query_resource_data tool queries data through the Tabular API without downloading files.

Basic usage

query_resource_data(
    question: str,                  # Description of what you're looking for (required)
    resource_id: str,               # Resource ID from list_dataset_resources (required)
    page: int = 1,                  # Page number for pagination
    page_size: int = 20,            # Rows per page (max: 200)
    filter_column: str = None,      # Column name to filter
    filter_value: str = None,       # Value to filter by
    filter_operator: str = "exact", # Filter comparison operator
    sort_column: str = None,        # Column name to sort by
    sort_direction: str = "asc"     # Sort direction: "asc" or "desc"
)

Preview strategy

Always start with a small page_size to preview the data structure:
# First, preview the structure
query_resource_data(
    question="What columns are available?",
    resource_id="3b6b2281-b9d9-4959-ae9d-c2c166dff118",
    page_size=20  # Default
)

Filtering data

The Tabular API supports six filter operators:
  • exact: Exact match (default)
  • contains: Substring match
  • less: Less than or equal (≤)
  • greater: Greater than or equal (≥)
  • strictly_less: Strictly less than (<)
  • strictly_greater: Strictly greater than (>)
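The six operators map naturally onto Python comparisons. The sketch below is a hypothetical illustration of their semantics (it is not the Tabular API's actual implementation; the API applies filters server-side):

```python
import operator

# Illustrative mapping of the six filter operators to Python predicates.
# The real API decides per-column whether to compare as text or numbers;
# here we compare values as passed.
OPERATORS = {
    "exact": operator.eq,
    "contains": lambda cell, value: value in cell,
    "less": operator.le,               # <=
    "greater": operator.ge,            # >=
    "strictly_less": operator.lt,      # <
    "strictly_greater": operator.gt,   # >
}

def matches(cell, value, op="exact"):
    """Return True if `cell` passes the filter `op` against `value`."""
    return OPERATORS[op](cell, value)
```

For example, `matches("Saint-Denis", "Saint", "contains")` is true, while `matches(100000, 100000, "strictly_greater")` is false (use `greater` for ≥).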

Filter examples

# Exact match
query_resource_data(
    question="Find records for Paris",
    resource_id="resource-id",
    filter_column="ville",
    filter_value="Paris",
    filter_operator="exact"
)

# Substring search
query_resource_data(
    question="Find cities containing 'Saint'",
    resource_id="resource-id",
    filter_column="ville",
    filter_value="Saint",
    filter_operator="contains"
)

# Numeric comparison
query_resource_data(
    question="Find records with population over 100000",
    resource_id="resource-id",
    filter_column="population",
    filter_value="100000",
    filter_operator="strictly_greater"
)
Filters are applied on the server side by the Tabular API, so you only receive matching rows.

Sorting data

Sort results by any column in ascending or descending order:
# Sort ascending (default)
query_resource_data(
    question="Show cities sorted by population",
    resource_id="resource-id",
    sort_column="population",
    sort_direction="asc"
)

# Sort descending
query_resource_data(
    question="Show cities sorted by population (highest first)",
    resource_id="resource-id",
    sort_column="population",
    sort_direction="desc"
)

Pagination

For datasets with many rows, use pagination to retrieve data in chunks:
# Get first page
query_resource_data(
    question="Get first 50 rows",
    resource_id="resource-id",
    page=1,
    page_size=50
)

# Get second page
query_resource_data(
    question="Get next 50 rows",
    resource_id="resource-id",
    page=2,
    page_size=50
)

Pagination indicators

The tool provides helpful indicators:
  • Total rows: Complete row count in the dataset
  • Total pages: Number of pages at current page size
  • Retrieved: Number of rows in the current page
  • Next page hint: Suggestion to continue with page=N
For large datasets (>1000 rows), the tool will suggest using download_and_parse_resource instead of paginating through many pages.
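The indicators above are simple arithmetic over the row count. A minimal sketch (hypothetical helper, not the tool's code) of how they relate:

```python
import math

def pagination_info(total_rows: int, page: int, page_size: int) -> dict:
    """Compute the indicators reported alongside each page of results."""
    total_pages = math.ceil(total_rows / page_size)
    # Rows actually present on the requested page
    start = (page - 1) * page_size
    retrieved = max(0, min(page_size, total_rows - start))
    # Hint for the next page, if any remain
    next_page = page + 1 if page < total_pages else None
    return {
        "total_rows": total_rows,
        "total_pages": total_pages,
        "retrieved": retrieved,
        "next_page": next_page,
    }
```

With 105 rows and `page_size=50`, page 3 retrieves the final 5 rows and no next-page hint is given.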

Combining filters and sorting

You can filter and sort simultaneously:
query_resource_data(
    question="Show Paris records sorted by date",
    resource_id="resource-id",
    filter_column="ville",
    filter_value="Paris",
    filter_operator="exact",
    sort_column="date",
    sort_direction="desc"
)

Using download_and_parse_resource

The download_and_parse_resource tool downloads and parses files directly.

Basic usage

download_and_parse_resource(
    resource_id: str,        # Resource ID (required)
    max_rows: int = 20,      # Maximum rows to return
    max_size_mb: int = 500   # Maximum file size in MB
)

Preview strategy

Always start with the default max_rows to preview:
# First, preview the structure
download_and_parse_resource(
    resource_id="resource-id",
    max_rows=20  # Default - just preview
)
If you need more data, increase max_rows:
# Get all data (or up to max_rows)
download_and_parse_resource(
    resource_id="resource-id",
    max_rows=10000  # Adjust based on file size
)

Supported formats

The tool supports:
  • CSV: Comma-separated values
  • CSV.GZ: Compressed CSV files
  • JSON: JSON arrays or single objects
  • JSONL: Line-delimited JSON (also called NDJSON)
The tool automatically detects file format from the filename and Content-Type header.

CSV delimiter detection

For CSV files, the tool automatically detects the delimiter:
  1. Uses Python’s csv.Sniffer on the first 5 lines
  2. Supports: , (comma), ; (semicolon), \t (tab), | (pipe)
  3. Falls back to counting delimiter occurrences if sniffing fails
This works seamlessly with various CSV formats, including French CSV files that often use semicolons.
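The detection strategy above can be sketched with the standard library alone; the function name and candidate set here are illustrative, not the tool's actual code:

```python
import csv

# Delimiters the detection considers
CANDIDATES = ",;\t|"

def detect_delimiter(sample_lines: list[str]) -> str:
    """Guess the delimiter from the first few lines of a CSV file."""
    sample = "\n".join(sample_lines[:5])
    try:
        # csv.Sniffer inspects quoting and per-line delimiter consistency
        return csv.Sniffer().sniff(sample, delimiters=CANDIDATES).delimiter
    except csv.Error:
        # Fallback: pick the candidate that appears most often in the sample
        return max(CANDIDATES, key=sample.count)
```

A semicolon-delimited French export such as `"ville;population"` resolves to `";"`, while a standard export resolves to `","`.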

Gzip compression

Gzipped files are automatically decompressed:
# Works for .csv.gz files
download_and_parse_resource(
    resource_id="resource-id-for-csv-gz"
)
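Transparent decompression amounts to checking the gzip magic bytes before parsing. A minimal sketch under that assumption (illustrative, not the tool's implementation):

```python
import gzip

def read_maybe_gzip(raw: bytes) -> str:
    """Decompress gzip payloads transparently; pass plain text through."""
    # Gzip streams start with the magic bytes 0x1f 0x8b
    if raw[:2] == b"\x1f\x8b":
        return gzip.decompress(raw).decode("utf-8")
    return raw.decode("utf-8")
```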

JSON handling

The tool handles multiple JSON formats:
  • JSON arrays: [{...}, {...}]
  • Single objects: {...} (returned as a one-item list)
  • JSONL/NDJSON: One JSON object per line
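All three shapes can be normalized to a list of records. A hypothetical sketch of that normalization (names are illustrative):

```python
import json

def parse_json_payload(text: str) -> list:
    """Normalize a JSON array, single object, or JSONL text to a list of records."""
    try:
        data = json.loads(text)
        # Single object -> one-item list; array -> returned as-is
        return data if isinstance(data, list) else [data]
    except json.JSONDecodeError:
        # JSONL/NDJSON: one JSON object per non-empty line
        return [json.loads(line) for line in text.splitlines() if line.strip()]
```

A JSONL payload fails the whole-document parse (extra data after the first object) and falls through to line-by-line parsing.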

File size limits

You can adjust the maximum download size:
# Increase limit for large files
download_and_parse_resource(
    resource_id="resource-id",
    max_size_mb=1000  # Allow up to 1 GB
)

# Decrease limit for safety
download_and_parse_resource(
    resource_id="resource-id",
    max_size_mb=100   # Limit to 100 MB
)
The tool checks file size before completing the download. If a file exceeds max_size_mb, the download is aborted and an error is returned.
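The guard can be sketched as a simple pre-download check, assuming the server reports a size (e.g. via a Content-Length header); this is an illustration, not the tool's actual code:

```python
def check_size(content_length_bytes: int, max_size_mb: int = 500) -> None:
    """Raise before downloading if the reported size exceeds the limit."""
    size_mb = content_length_bytes / (1024 * 1024)
    if size_mb > max_size_mb:
        raise ValueError(f"File too large: {size_mb:.0f} MB (max: {max_size_mb} MB)")
```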

Format limitations

  • XLSX/XLS: Not supported by this tool. Use query_resource_data for Excel files under 12.5 MB.
  • XML: Format detected but parsing not yet implemented.
  • Other formats: Will be reported as unsupported.

Working with large datasets

Small datasets (<500 rows)

Use query_resource_data with adjusted page size:
query_resource_data(
    question="Get all data",
    resource_id="resource-id",
    page_size=200  # Maximum allowed
)
For datasets with 200-500 rows, paginate 2-3 times.

Medium datasets (500-1000 rows)

Decide based on your needs:
  • Specific data: Use query_resource_data with filters
  • Full analysis: Use download_and_parse_resource

Large datasets (>1000 rows)

For comprehensive analysis:
# Download and parse the full dataset
download_and_parse_resource(
    resource_id="resource-id",
    max_rows=50000  # Adjust based on file size
)
The query_resource_data tool will automatically suggest switching to download_and_parse_resource when it detects datasets over 1000 rows.

Handling errors

Resource not available via Tabular API

If you get this error with query_resource_data:
⚠️ Resource not available via Tabular API
The resource either:
  • Exceeds size limits (CSV >100 MB, XLSX >12.5 MB)
  • Is not tabular data
  • Has an unsupported format
Solution: Use download_and_parse_resource instead.

File too large

Error: File too large: 750 MB (max: 500 MB)
Solution: Increase the max_size_mb parameter:
download_and_parse_resource(
    resource_id="resource-id",
    max_size_mb=1000
)

Format not supported

⚠️ Format not supported for parsing
Solutions:
  • For XLSX files: Use query_resource_data if under 12.5 MB
  • For other formats: The file may need manual processing

Output format

Both tools return data in a consistent format:
Querying resource: [Resource Title]
Resource ID: [resource-id]
Dataset: [Dataset Title] (ID: [dataset-id])

Total rows: 1000
Retrieved: 20 row(s) from page 1
Columns: ville, population, code_postal

Data (20 rows):
  Row 1:
    ville: Paris
    population: 2161000
    code_postal: 75000
  Row 2:
    ville: Marseille
    population: 869815
    code_postal: 13000
  ...
Long values (>100 characters) are automatically truncated with "…" for readability.
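The truncation rule is a one-liner; a hypothetical helper showing the behavior:

```python
def truncate(value: str, limit: int = 100) -> str:
    """Truncate long values with an ellipsis for display."""
    return value if len(value) <= limit else value[:limit] + "…"
```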

Best practices

  1. Always preview first: Start with small page sizes or row counts
  2. Check resource info: Use get_resource_info to check Tabular API availability
  3. Use filters: Reduce data volume by filtering on the server side
  4. Choose the right tool:
    • Small tabular data → query_resource_data
    • Large files or JSON → download_and_parse_resource
  5. Monitor size: Be mindful of file sizes and adjust limits accordingly

Next steps

Working with datasets

Learn how to find and explore datasets

Usage metrics

Check dataset and resource statistics
