Data Sources
Evidence connects to databases and extracts data into a common storage format (Parquet) to enable querying across multiple data sources using SQL.
How Data Sources Work
Evidence uses a Universal SQL architecture that:
- Extracts data from your source databases into Parquet files
- Enables querying across multiple data sources using DuckDB SQL dialect
- Stores extracted data locally for fast query performance
This architecture allows you to query data from PostgreSQL, Snowflake, BigQuery, and other sources in a single SQL query.
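To make the cross-source idea concrete, here is a minimal sketch of a single DuckDB query joining tables extracted from two different sources. The source names pg_sales and sf_crm, along with the table and column names, are hypothetical; the source_name.table_name form follows the my_database.orders naming used elsewhere on this page.

```sql
-- Hypothetical sources: pg_sales (PostgreSQL) and sf_crm (Snowflake).
-- Both are assumed to be already extracted to Parquet, so the join runs locally in DuckDB.
select
    c.customer_name,
    count(*) as order_count,
    sum(o.amount) as total_amount
from pg_sales.orders as o
join sf_crm.customers as c
    on o.customer_id = c.customer_id
group by c.customer_name
order by total_amount desc
```

Because the join runs against the extracted files rather than the live databases, neither source needs to know about the other.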
Connecting a Data Source
To connect your development environment to a database:
Using the Settings UI
- Start your Evidence app with npm run dev
- Navigate to localhost:3000/settings
- Select your data source type, name it, and enter required credentials
- Test the connection to verify it works
Evidence saves your credentials locally in your development environment. Production credentials are managed via environment variables.
Supported Data Sources
Evidence supports a wide range of data sources.
SQL Databases:
- BigQuery
- Snowflake
- Redshift
- PostgreSQL / Timescale
- Microsoft SQL Server
- MySQL
- SQLite
- DuckDB
- MotherDuck
- Databricks
- Trino
- Cube
Other Sources:
- CSV files
- Google Sheets
- JavaScript data sources
- API connectors via plugins
Configuring Source Queries
For SQL data sources, you define which data to extract by adding .sql files to the /sources/[source_name]/ directory.
Example Source Query
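Create a file sources/my_database/orders.sql containing the query to run against the source database. Its results are extracted into a table that you can reference as my_database.orders.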
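As a sketch of what sources/my_database/orders.sql could contain, assume the source database has an orders table with the columns below. The table, columns, and date filter are hypothetical and only illustrate the shape of a source query.

```sql
-- sources/my_database/orders.sql
-- Runs against the source database, so it uses that database's own SQL dialect.
-- Table and column names are hypothetical.
select
    order_id,
    customer_id,
    order_date,
    amount
from orders
where order_date >= '2023-01-01'
```

Selecting only the columns and rows your pages need keeps the extracted Parquet file small.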
Running Sources
Extract data from your configured sources by running npm run sources.
Running Specific Sources
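For large data sources, you can run only what you need, for example npm run sources -- --changed to re-run only sources whose queries have changed, or npm run sources -- --sources [source_name] to run a single source.
Using Extracted Data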
Once extracted, query your data in Evidence pages using the DuckDB SQL dialect, referencing each extracted table as [source_name].[query_name] (for example, my_database.orders).
Build-Time Variables
You can parameterize source queries using environment variables with the EVIDENCE_VAR__ prefix, set for example in a .env file.
Build-time variables are only available in source queries, not in page queries or markdown files.
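The following sketch shows how such a variable might be used in a source query. It assumes a .env entry named EVIDENCE_VAR__client_id and assumes the variable is referenced as ${client_id}, that is, the name without the prefix. The file, table, and column names are hypothetical, and the exact interpolation syntax should be checked against the Evidence documentation for your version.

```sql
-- sources/my_database/customers.sql (hypothetical file and table)
-- Assumes .env contains the line: EVIDENCE_VAR__client_id=123
-- Assumption: ${client_id} is replaced with 123 when sources are run.
select *
from customers
where client_id = ${client_id}
```

Since the value is substituted when sources are run, changing it requires re-running the source before pages see the new data.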
Working with Large Data
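If you encounter memory errors when running sources, reduce the amount of data each source query extracts or increase the memory available to the Node.js process (for example, by setting NODE_OPTIONS="--max-old-space-size=4096").
Production Deployment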
In production, credentials are managed via environment variables, and each data source has its own set. See the deployment configuration documentation for details on setting up production credentials.