Installation
Install the Databricks connector.
Configuration
Configuration Parameters
- Databricks workspace hostname (e.g., dbc-a1b2c3d4-e5f6.cloud.databricks.com)
- HTTP path to your SQL Warehouse or cluster (e.g., /sql/1.0/warehouses/abc123def456)
- Personal access token for authentication
- Connection port (typically 443 for HTTPS)
Getting Connection Details
SQL Warehouse
- In your Databricks workspace, go to SQL Warehouses
- Click on your warehouse
- Go to the Connection Details tab
- Find:
  - Server hostname → host
  - HTTP path → path
- Create a personal access token in User Settings → Access Tokens
All-Purpose Cluster
- In your Databricks workspace, go to Compute
- Click on your cluster
- Go to Advanced Options → JDBC/ODBC
- Find the HTTP path
Features
Type Mapping
Databricks types are mapped to Evidence types:
- Numbers: BIGINT, INT, SMALLINT, TINYINT, DECIMAL, FLOAT, DOUBLE
- Strings: STRING, VARCHAR, CHAR, BINARY
- Dates: DATE, TIMESTAMP
- Booleans: BOOLEAN
Unity Catalog Support
Query tables across catalogs and schemas:
queries/unity_catalog.sql
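A minimal sketch of a three-level Unity Catalog reference. The catalog `main`, schema `sales`, table `orders`, and the column names are all hypothetical; substitute the names from your own workspace.

```sql
-- Hypothetical names: catalog `main`, schema `sales`, table `orders`.
-- Unity Catalog addresses tables as catalog.schema.table.
SELECT
  order_id,
  customer_id,
  order_total
FROM main.sales.orders
WHERE order_total > 100;
```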
Delta Lake Support
Query Delta tables with time travel:
queries/delta_time_travel.sql
queries/delta_timestamp.sql
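Delta time travel can address a table either by commit version or by timestamp. The sketch below shows both forms against a hypothetical Delta table `main.sales.orders`; the version number and timestamp are placeholders and must fall within the table's retained history.

```sql
-- Hypothetical Delta table: main.sales.orders.
-- Read the table as it was at a specific commit version (placeholder: 12).
SELECT COUNT(*) AS row_count
FROM main.sales.orders VERSION AS OF 12;

-- Read the table as it was at a point in time (placeholder timestamp).
SELECT COUNT(*) AS row_count
FROM main.sales.orders TIMESTAMP AS OF '2024-01-15 00:00:00';
```

Running `DESCRIBE HISTORY main.sales.orders` lists the versions and timestamps that are available to query.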
Example Queries
Basic Query
queries/sales_summary.sql
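As an illustration of what a basic summary query might look like, here is a sketch that aggregates orders by month. The table `main.sales.orders` and its columns `order_date` and `order_total` are assumptions, not names from this connector.

```sql
-- Hypothetical table and columns; adjust to your schema.
SELECT
  date_trunc('MONTH', order_date) AS order_month,
  COUNT(*) AS order_count,
  SUM(order_total) AS revenue
FROM main.sales.orders
GROUP BY date_trunc('MONTH', order_date)
ORDER BY order_month;
```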
Cross-Catalog Query
queries/cross_catalog.sql
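A cross-catalog query joins tables that live in different catalogs by fully qualifying each one. In this sketch, the catalogs `main` and `marketing`, their schemas, and all column names are hypothetical.

```sql
-- Hypothetical catalogs: `main` holds orders, `marketing` holds customer data.
SELECT
  c.region,
  SUM(o.order_total) AS revenue
FROM main.sales.orders AS o
JOIN marketing.crm.customers AS c
  ON o.customer_id = c.customer_id
GROUP BY c.region
ORDER BY revenue DESC;
```

The token used for the connection needs read access to both catalogs for this to succeed.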
Using Databricks Functions
queries/advanced_analytics.sql
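The following sketch combines a few built-in Databricks SQL functions: `date_trunc` for weekly bucketing, `percentile_approx` for an approximate median, and a window `SUM` for a running total. The table and column names are hypothetical.

```sql
-- Hypothetical table main.sales.orders with order_date and order_total columns.
WITH weekly AS (
  SELECT
    date_trunc('WEEK', order_date) AS order_week,
    percentile_approx(order_total, 0.5) AS median_order_total,
    SUM(order_total) AS revenue
  FROM main.sales.orders
  GROUP BY date_trunc('WEEK', order_date)
)
SELECT
  order_week,
  median_order_total,
  revenue,
  -- Running total of weekly revenue, ordered by week.
  SUM(revenue) OVER (ORDER BY order_week) AS running_revenue
FROM weekly
ORDER BY order_week;
```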
Performance Optimization
Use Appropriate Warehouse Size
Choose warehouse size based on query complexity:
- X-Small/Small: Simple queries, small datasets
- Medium/Large: Complex joins, aggregations
- X-Large/2X-Large: Heavy analytical workloads
Enable Auto-Stop
Configure SQL Warehouses to auto-stop after inactivity to save costs.
Partition Filtering
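A minimal sketch of a partition-pruned query, assuming a hypothetical table `main.sales.orders` that is partitioned by `order_date`:

```sql
-- Hypothetical table, assumed to be partitioned by order_date.
-- The range predicate on the partition column lets the engine skip
-- partitions that fall outside January 2024.
SELECT
  customer_id,
  SUM(order_total) AS revenue
FROM main.sales.orders
WHERE order_date >= DATE '2024-01-01'
  AND order_date < DATE '2024-02-01'
GROUP BY customer_id;
```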
Filter on partition columns so the engine can skip partitions it does not need to read.
Use Delta Table Optimizations
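One common Delta optimization is file compaction with `OPTIMIZE`, optionally combined with `ZORDER BY` on a frequently filtered column. The sketch below assumes a hypothetical Delta table `main.sales.orders` and assumes `customer_id` is a common filter column that is not a partition column.

```sql
-- Hypothetical Delta table; customer_id is an assumed frequent filter column.
-- Compact small files into larger ones.
OPTIMIZE main.sales.orders;

-- Optionally co-locate related rows to improve data skipping.
OPTIMIZE main.sales.orders ZORDER BY (customer_id);
```

Whether Z-ordering helps depends on the table layout and query patterns, so treat the second statement as something to evaluate rather than a default.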
Limit Result Sets
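A minimal sketch of a capped exploratory query, assuming a hypothetical table `main.sales.orders`:

```sql
-- Hypothetical table; returns at most 100 rows while exploring.
SELECT *
FROM main.sales.orders
LIMIT 100;
```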
For exploratory queries, use LIMIT to cap the number of rows returned.
Security Best Practices
Use Personal Access Tokens
Create tokens with appropriate expiration:
- User Settings → Access Tokens → Generate New Token
- Set a descriptive comment (e.g., “Evidence Analytics”)
- Set expiration (recommended: 90 days)
- Store securely in environment variables
Rotate Tokens Regularly
Update tokens before expiration to avoid connection failures.
Use Service Principals
For production deployments, use service principals instead of personal tokens:
connection.yaml
Troubleshooting
Authentication errors
- Verify the personal access token is valid and not expired
- Ensure the token has permission to access the warehouse/cluster
- Check that the token is stored correctly in environment variables
Connection timeout
- Verify the host and HTTP path are correct
- Ensure the SQL Warehouse or cluster is running
- Check network connectivity and firewall rules
- For SQL Warehouses, check if auto-stop is enabled and restart if needed
Table not found errors
- Verify the full table name with catalog and schema: catalog.schema.table
- Check permissions on the table/schema/catalog
- Use SHOW TABLES IN catalog.schema to list available tables
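To track down a missing table, it can help to walk the namespace from the top. A sketch, assuming a hypothetical catalog `main` and schema `sales`:

```sql
-- List the catalogs visible to the current token.
SHOW CATALOGS;

-- List schemas in a hypothetical catalog named `main`.
SHOW SCHEMAS IN main;

-- List tables in a hypothetical schema `main.sales`.
SHOW TABLES IN main.sales;
```

If a catalog or schema is absent from these listings, the cause is more likely a permissions gap than a misspelled table name.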
Permission denied
Grant appropriate permissions:
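In Unity Catalog, reading a table generally requires privileges at each level of the namespace. The sketch below grants them to a hypothetical principal `analyst@example.com` on hypothetical objects; it must be run by a user who is allowed to grant these privileges.

```sql
-- Hypothetical principal and objects; replace with your own.
GRANT USE CATALOG ON CATALOG main TO `analyst@example.com`;
GRANT USE SCHEMA ON SCHEMA main.sales TO `analyst@example.com`;
GRANT SELECT ON TABLE main.sales.orders TO `analyst@example.com`;
```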
Slow query performance
- Check SQL Warehouse size and scale up if needed
- Verify tables are optimized (run OPTIMIZE on Delta tables)
- Use EXPLAIN to analyze query execution plan
- Add partition filters to queries
- Consider creating materialized views for frequently accessed aggregations
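Two of the steps above can be sketched in SQL: inspecting a plan with `EXPLAIN`, and precomputing an aggregation as a materialized view. Table, view, and column names are hypothetical, and materialized views are only available on some workspace and warehouse configurations, so check that yours supports them.

```sql
-- Show the execution plan without running the query (hypothetical table).
EXPLAIN
SELECT
  customer_id,
  SUM(order_total) AS revenue
FROM main.sales.orders
WHERE order_date >= DATE '2024-01-01'
GROUP BY customer_id;

-- Precompute a frequently used aggregation (hypothetical view name).
CREATE MATERIALIZED VIEW main.sales.monthly_revenue AS
SELECT
  date_trunc('MONTH', order_date) AS order_month,
  SUM(order_total) AS revenue
FROM main.sales.orders
GROUP BY date_trunc('MONTH', order_date);
```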
Unity Catalog Migration
If migrating from Hive metastore to Unity Catalog, update table references to the three-level form catalog.schema.table:
queries/migrate_reference.sql
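A before-and-after sketch of such a reference change. In Unity Catalog workspaces the legacy metastore is exposed as the `hive_metastore` catalog; the schema, table, and target names below are hypothetical.

```sql
-- Before: legacy Hive metastore table (hypothetical schema and table).
SELECT * FROM hive_metastore.default.orders;

-- After: the same data addressed in Unity Catalog (hypothetical target names).
SELECT * FROM main.sales.orders;
```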