Evidence can connect to multiple data sources including databases, data warehouses, and flat files. This guide covers everything you need to know about data connections.

Supported Data Sources

Evidence supports these data sources out of the box:
  • Databases: PostgreSQL, MySQL, SQL Server, SQLite
  • Warehouses: Snowflake, BigQuery, Databricks, Redshift
  • Files: CSV, Parquet, DuckDB
  • Query Engines: Trino, Athena

Connection Configuration

Step 1: Create a Connection File

Data source connections are defined in YAML files in the sources/ directory. Each subdirectory represents a separate data source. Create a directory and connection file:
mkdir -p sources/my_warehouse
Create sources/my_warehouse/connection.yaml:
name: my_warehouse
type: postgres
options:
  host: database.example.com
  port: 5432
  database: analytics
  user: evidence_user
  password: ${EVIDENCE_PASSWORD}
  ssl: true
Use environment variables (like ${EVIDENCE_PASSWORD}) to keep credentials secure.
Step 2: Configure Environment Variables

Create a .env file in your project root:
EVIDENCE_PASSWORD=your_secure_password
SNOWFLAKE_USER=analytics_user
SNOWFLAKE_PASSWORD=another_secure_password
Important: Add .env to your .gitignore to keep credentials out of version control.
Step 3: Test Your Connection

Create a test query in your source directory, e.g. sources/my_warehouse/test.sql:
SELECT 1 as test_value
Run your Evidence app and check that the query executes successfully.

Connection Examples

DuckDB (Local Database)

DuckDB is perfect for local development and small to medium datasets:
sources/local/connection.yaml
name: local
type: duckdb
options:
  filename: ./local.duckdb
DuckDB can also query CSV and Parquet files directly:
SELECT * FROM read_csv_auto('data/sales.csv')
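Parquet files work the same way via DuckDB's read_parquet function; a brief sketch (the file path is illustrative):

```sql
-- Query a Parquet file directly; the path is an example
SELECT * FROM read_parquet('data/sales.parquet')
```

Glob patterns such as 'data/*.parquet' are also accepted, which is handy when data is split across multiple files.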

CSV Files

Evidence can read CSV files as a data source:
sources/csv/connection.yaml
name: csv
type: csv
options: {}
Place your CSV files in sources/csv/ and query them:
sources/csv/sales_data.sql
SELECT * FROM sales_data.csv

PostgreSQL

sources/postgres/connection.yaml
name: postgres_db
type: postgres
options:
  host: ${POSTGRES_HOST}
  port: 5432
  database: ${POSTGRES_DATABASE}
  user: ${POSTGRES_USER}
  password: ${POSTGRES_PASSWORD}
  ssl: true

Snowflake

sources/snowflake/connection.yaml
name: snowflake
type: snowflake
options:
  account: ${SNOWFLAKE_ACCOUNT}
  username: ${SNOWFLAKE_USER}
  password: ${SNOWFLAKE_PASSWORD}
  database: ANALYTICS
  warehouse: COMPUTE_WH
  schema: PUBLIC

BigQuery

sources/bigquery/connection.yaml
name: bigquery
type: bigquery
options:
  project_id: ${GCP_PROJECT_ID}
  credentials_path: ./credentials.json
Or use a service account key directly:
options:
  project_id: my-project
  client_email: ${GCP_CLIENT_EMAIL}
  private_key: ${GCP_PRIVATE_KEY}

Databricks

sources/databricks/connection.yaml
name: databricks
type: databricks
options:
  server_hostname: ${DATABRICKS_HOST}
  http_path: ${DATABRICKS_HTTP_PATH}
  access_token: ${DATABRICKS_TOKEN}

Working with Multiple Sources

Querying Multiple Databases

You can query different data sources on the same page:
---
title: Cross-Database Analytics
queries:
  - warehouse_sales: warehouse_sales.sql      # From Snowflake
  - app_events: app_events.sql                # From PostgreSQL
  - local_targets: local_targets.sql          # From DuckDB
---

Joining Across Sources

Use inline SQL to join data from different sources:
combined_metrics
SELECT 
  w.month,
  w.total_sales,
  a.total_events,
  t.target
FROM ${warehouse_sales} w
LEFT JOIN ${app_events} a ON w.month = a.month
LEFT JOIN ${local_targets} t ON w.month = t.month

Query Organization

Source-Specific Queries

Store queries alongside their connection configuration:
sources/
  snowflake/
    connection.yaml
    orders.sql
    customers.sql
  postgres/
    connection.yaml
    events.sql
    users.sql
Reference the full path in your pages:
queries:
  - orders: snowflake/orders.sql
  - events: postgres/events.sql

Shared Queries

Place reusable queries in the queries/ directory:
queries/
  orders_by_month.sql
  customer_summary.sql
  product_metrics.sql
Reference without a path:
queries:
  - orders_by_month: orders_by_month.sql

Real Example: E-commerce Database

Here’s a complete example from the Evidence project:
sources/ecommerce/connection.yaml
name: ecommerce
type: duckdb
options:
  filename: ./ecommerce.duckdb
sources/ecommerce/orders.sql
SELECT * FROM orders
sources/ecommerce/orders_by_month.sql
SELECT 
  DATE_TRUNC('month', order_datetime) as month,
  COUNT(*) as num_orders,
  SUM(sales) as total_sales,
  SUM(sales) / COUNT(*) as avg_order_value
FROM orders
GROUP BY DATE_TRUNC('month', order_datetime)
ORDER BY month
Use in your page:
---
queries:
  - orders_by_month: ecommerce/orders_by_month.sql
---

# E-commerce Dashboard

<LineChart 
  data={orders_by_month}
  x="month"
  y="total_sales"
  yAxisTitle="Revenue"
/>

Best Practices

Security

  • Never commit credentials - Use environment variables
  • Use read-only accounts - Evidence only needs SELECT permissions
  • Limit access scope - Grant access only to necessary schemas/tables
  • Rotate credentials - Update passwords regularly
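To illustrate the read-only-account bullet, here is a minimal PostgreSQL sketch; the role, database, and schema names are examples, not part of Evidence itself:

```sql
-- Create a read-only role for Evidence (names are illustrative)
CREATE USER evidence_user WITH PASSWORD 'change_me';
GRANT CONNECT ON DATABASE analytics TO evidence_user;
GRANT USAGE ON SCHEMA public TO evidence_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO evidence_user;
-- Also cover tables created in the future
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO evidence_user;
```

Other databases have equivalent grant statements; the point is that the Evidence account never needs write access.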

Performance

  • Pre-aggregate data - Create summary tables in your warehouse
  • Use materialized views - For complex, frequently-used queries
  • Limit result sets - Use LIMIT or date filters for large tables
  • Index appropriately - Ensure your database has proper indexes
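The pre-aggregation and materialized-view bullets can be sketched in PostgreSQL-style SQL; the table and column names below are assumptions borrowed from the e-commerce example above:

```sql
-- Pre-aggregate once in the warehouse so dashboards read a small table
CREATE MATERIALIZED VIEW monthly_sales AS
SELECT
  DATE_TRUNC('month', order_datetime) AS month,
  SUM(sales) AS total_sales
FROM orders
GROUP BY 1;

-- Refresh on your load schedule (e.g. after each nightly batch)
REFRESH MATERIALIZED VIEW monthly_sales;
```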

Organization

  • One source per directory - Keep connection files organized
  • Descriptive names - Use clear names like snowflake_prod or postgres_staging
  • Document connections - Add comments to YAML files explaining purpose
  • Version control - Commit connection files (without credentials)

Troubleshooting

Connection Fails

Check network access: Ensure your database allows connections from your IP
Verify credentials: Test credentials using a database client
Review SSL settings: Some databases require ssl: true or specific SSL modes

Queries Run Slowly

Add filters: Limit date ranges or row counts
SELECT * FROM orders 
WHERE order_date >= CURRENT_DATE - INTERVAL '90 days'
Create indexes: Work with your DBA to add indexes on frequently queried columns
Use query caching: Evidence caches query results during development
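Capping row counts is another quick filter while iterating on a page; a sketch against the example orders table:

```sql
-- Keep result sets small during development
SELECT * FROM orders
ORDER BY order_date DESC
LIMIT 10000
```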

Environment Variables Not Working

Check .env file location: Must be in project root
Restart dev server: Changes to .env require a restart
Use correct syntax: ${VARIABLE_NAME} in YAML files

Next Steps
