Overview
The Lead Intelligence Engine prevents duplicate CRM entries by checking whether a business URL already exists in Coda before inserting new records. This keeps your CRM clean and prevents wasted analysis on already-qualified leads.
The duplicate check is performed after AI evaluation but before CRM insertion, so you don’t waste Groq tokens on duplicates.
How It Works
core.py (lines 56-61)
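The following is a minimal sketch of the ordering described in the Overview: evaluate first, then look the URL up in Coda, then insert only when no match exists. It is not the engine's actual code. `process_lead`, `evaluate`, `insert_row`, and the in-memory `FakeCoda` class are hypothetical stand-ins; only `fetch_row_by_url()` is named in this guide.

```python
from typing import Callable, Optional


class FakeCoda:
    """In-memory stand-in for the Coda client (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.rows = {}  # maps URL -> stored analysis

    def fetch_row_by_url(self, url: str) -> Optional[dict]:
        # Exact string lookup, mirroring the matching behavior described in this guide.
        return self.rows.get(url)

    def insert_row(self, url: str, analysis: dict) -> None:
        self.rows[url] = analysis


def process_lead(url: str, evaluate: Callable[[str], dict], coda: FakeCoda) -> dict:
    """Evaluate a lead, then insert it only if its URL is not already stored."""
    analysis = evaluate(url)  # AI evaluation runs first
    if coda.fetch_row_by_url(url) is not None:
        # Duplicate: return the analysis for display but skip the insert.
        return {"status": "duplicate", "analysis": analysis}
    coda.insert_row(url, analysis)  # CRM insertion happens last
    return {"status": "inserted", "analysis": analysis}
```

Calling `process_lead` twice with the same URL returns `"inserted"` the first time and `"duplicate"` the second, and the store still holds a single row.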
Implementation Details
Coda Search Query
The fetch_row_by_url() method uses Coda’s search API:
coda_client.py (lines 38-62)
Coda query format:
column_name:"value"
Example: Business URL:"https://example.com"
Only fetch 1 row since we just need to know if any match exists.
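As an illustration of the query format above, here is a sketch that builds the filter string and a request URL for Coda's "list rows" endpoint with a limit of 1. The helper names `build_query` and `build_row_lookup_url` and the `doc_id`/`table_id` arguments are assumptions made for this example, not the contents of coda_client.py, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

CODA_API_BASE = "https://coda.io/apis/v1"


def build_query(column_name: str, value: str) -> str:
    """Build a filter in the column_name:"value" form shown above."""
    # json.dumps wraps the value in double quotes and escapes embedded quotes.
    return f"{column_name}:{json.dumps(value)}"


def build_row_lookup_url(doc_id: str, table_id: str, column_name: str, value: str) -> str:
    """Build a list-rows URL that filters on one column and returns at most one row."""
    params = {"query": build_query(column_name, value), "limit": 1}
    return f"{CODA_API_BASE}/docs/{doc_id}/tables/{table_id}/rows?{urlencode(params)}"


print(build_query("Business URL", "https://example.com"))
# Business URL:"https://example.com"
```

Coda's API reference also describes quoting a column name when it is used in place of a column ID, so it is worth confirming which form your table accepts.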
URL Matching Logic
The system performs exact string matching on the full URL:
Edge Cases
URL Normalization
The engine does NOT normalize URLs before checking duplicates:
Trailing Slashes
Scenario: User analyzes both:
- https://example.com
- https://example.com/
WWW Subdomain
Scenario: User analyzes both:
- https://example.com
- https://www.example.com
HTTP vs HTTPS
Scenario: User analyzes:
- http://example.com
- https://example.com
Query Parameters
Scenario: User analyzes:
- https://example.com
- https://example.com?utm_source=facebook
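The four scenarios above can be checked with a small illustration of exact string matching. The `existing` set is example data standing in for the URLs already stored in Coda; it is not how the engine stores or queries rows.

```python
# URLs already stored in the CRM (example data).
existing = {"https://example.com"}

variants = [
    "https://example.com/",                     # trailing slash
    "https://www.example.com",                  # www subdomain
    "http://example.com",                       # http instead of https
    "https://example.com?utm_source=facebook",  # query parameter
]

# Exact string comparison: none of the variants equals the stored URL.
missed = [url for url in variants if url not in existing]
print(len(missed))  # 4
```

Each variant differs from the stored string by at least one character, so an exact-match check reports all four as new leads.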
Performance
API Latency
Coda search typically takes:
- Average: 500-1,000ms
- 95th percentile: 1,500ms
- Timeout: 10s (configured)
Duplicate check adds ~1s to total pipeline latency, but prevents wasted Groq tokens and keeps CRM clean.
Coda API Limits
- Rate Limit: 100 requests per minute per API token
- Concurrency: Up to 10 concurrent requests
Error Handling
The system gracefully handles duplicate check failures:
coda_client.py (lines 60-62)
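A minimal sketch of this kind of fail-open handling, assuming the lookup is passed in as a callable: any failure is logged and reported as "not a duplicate", which matches the timeout behavior described under Common Errors. The function name `is_duplicate` and the `fetch_row` parameter are hypothetical and are not taken from coda_client.py.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def is_duplicate(url: str, fetch_row: Callable[[str], Optional[dict]]) -> bool:
    """Return True if a row exists for url; return False if the lookup fails."""
    try:
        return fetch_row(url) is not None
    except Exception as exc:  # e.g. a timeout, a 401, or a 404 from the API
        logger.warning("Duplicate check failed for %s: %s", url, exc)
        return False  # fail open: treat the URL as not a duplicate
```

The trade-off of failing open is that a lead may be inserted twice while Coda is unreachable; failing closed would instead block new leads during an outage.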
Common Errors
401 Unauthorized
404 Not Found
Cause: Invalid CODA_DOC_ID or CODA_TABLE_ID.
Fix: Get the correct IDs from the Coda table URL:
Timeout
Cause: Coda API slow or unresponsive.
Behavior: After the 10s timeout, the duplicate check returns False (not duplicate).
Fix: Retry the URL later, or increase the timeout in coda_client.py.
Monitoring Duplicates
CLI Output
When a duplicate is detected:
The AI evaluation still runs and displays results, but no CRM insertion occurs. This lets you verify the analysis even for duplicates.
Telegram Bot Output
The bot displays a different message for duplicates:
Manual Duplicate Resolution
If you need to re-analyze an existing lead:
Batch Deduplication
For processing large URL lists with duplicates:
batch_process.py
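One way to handle this, sketched under the assumption that the URL list fits in memory, is to drop repeated URLs locally before any of them reaches the engine. `unique_urls`, `run_batch`, and the `process` callable are hypothetical names for this example, not the contents of batch_process.py.

```python
from typing import Callable, Iterable, List


def unique_urls(urls: Iterable[str]) -> List[str]:
    """Drop blank entries and repeated URLs while preserving first-seen order."""
    seen = set()
    result = []
    for raw in urls:
        url = raw.strip()
        if url and url not in seen:
            seen.add(url)
            result.append(url)
    return result


def run_batch(urls: Iterable[str], process: Callable[[str], object]) -> list:
    """Call process once per distinct URL and collect the results."""
    return [process(url) for url in unique_urls(urls)]
```

This removes only exact repeats within the input list; URLs that are already in Coda are still caught by the engine's own duplicate check.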
CLI Usage Guide
Learn batch processing patterns and automation
Coda Column Requirement
For duplicate detection to work, your Coda table must have a column named exactly: Business URL
Coda Table Setup
When creating your Coda table, ensure these columns exist:

| Column Name | Type | Required |
|---|---|---|
| Business URL | Text | Yes |
| Business Name | Text | Yes |
| Business Type | Text | Yes |
| Primary Service | Text | Yes |
| Secondary Service | Text | No |
| Fit Score | Number | Yes |
| Reasoning | Text | Yes |
| Outreach Angle | Text | Yes |
Coda Integration Guide
Complete setup instructions for Coda CRM
Best Practices
URL Consistency
Train users to always use the same URL format:
- ✅ Always include https://
- ✅ Remove the www. prefix
- ✅ Remove trailing slashes
- ✅ Remove query parameters

Alternatively, normalize URLs in extractor.py before processing.
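A normalization step along these lines could be sketched as follows. The function `normalize_url` is a hypothetical helper, not existing engine code, and it applies the four rules listed above using only the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return an https URL without www., trailing slash, query string, or fragment."""
    url = url.strip()
    if "://" not in url:
        url = "https://" + url  # add a scheme so urlsplit can find the host
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path.rstrip("/")
    # Force https and drop the query string and fragment.
    return urlunsplit(("https", host, path, "", ""))


print(normalize_url("http://www.example.com/?utm_source=facebook"))
# https://example.com
```

These rules are deliberately aggressive: forcing https and dropping every query parameter can merge pages that are genuinely different, so the rule set should be adjusted to the kinds of sites you analyze.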
Bulk Upload
When importing historical leads, add them directly to Coda instead of processing through the engine. This populates the Business URL column for duplicate detection.
URL Validation
Validate URLs before analysis:
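A basic check might look like the sketch below, which accepts only absolute http or https URLs whose host contains a dot. `is_valid_url` is a hypothetical helper and the acceptance rule is an assumption; it confirms the URL's shape only and does not test whether the site is reachable.

```python
from urllib.parse import urlsplit


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs whose host contains a dot."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:  # raised for malformed input such as a bad IPv6 literal
        return False
    host = parts.hostname or ""
    return parts.scheme in ("http", "https") and "." in host
```

For example, `is_valid_url("https://example.com")` is true, while `"example.com"` (no scheme) and `"ftp://example.com"` (wrong scheme) are rejected.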
Re-qualification Period
Consider adding a “Last Analyzed” date column in Coda. Re-analyze leads after 6-12 months to detect changes in digital maturity.
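If such a column is added, the age check could be as simple as the sketch below. The helper `needs_reanalysis` and the 180-day default (roughly six months) are assumptions for illustration; the engine does not currently store this date.

```python
from datetime import date, timedelta


def needs_reanalysis(last_analyzed: date, today: date, max_age_days: int = 180) -> bool:
    """Return True when the last analysis is older than max_age_days."""
    return (today - last_analyzed) > timedelta(days=max_age_days)
```

Passing `today` explicitly keeps the function easy to test; in normal use it would be `date.today()`.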
Future Enhancements
Potential improvements to duplicate detection:
- Fuzzy Matching: Detect similar URLs with minor differences
- Domain-Level Deduplication: Treat blog.example.com and shop.example.com as the same business
- Business Name Matching: If the URL changes but the business name matches, flag as a potential duplicate
- Canonical URL Resolution: Follow redirects to find true destination before checking
Next Steps
Coda Integration
Complete guide to Coda setup and troubleshooting
CodaClient API
Programmatic usage of CRM functions
Architecture
How duplicate detection fits in the pipeline
CLI Usage
Batch processing with duplicate handling