
What is the Lead Intelligence Engine?

The Lead Intelligence Engine is an AI-powered lead qualification system designed for internal agency use. It analyzes business websites, enriches the analysis with domain knowledge, and automatically qualifies leads into your CRM.
It is a stateless, Python-based engine optimized for speed (target latency under 20 seconds per URL) and for accuracy through RAG-enhanced AI evaluation.

How It Works

The engine processes business URLs through a four-stage pipeline:
1. Web Extraction

Fetches and cleans HTML content from the target URL using BeautifulSoup, with a fallback to Jina for single-page applications (SPAs).

2. RAG Enrichment

Retrieves relevant context from your knowledge base (lead criteria, strategy frameworks, service definitions).

3. AI Evaluation

The Groq LLM analyzes the business, matches services, scores fit, and generates outreach angles.

4. CRM Sync

Automatically inserts qualified leads into Coda with duplicate prevention.
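
The four stages above can be sketched as a single stateless function. This is a minimal illustration, not the engine's actual code: `run_pipeline`, `LeadResult`, and the four stage callables are hypothetical names, and the stages are replaced with stand-ins so the example runs without network access or API keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LeadResult:
    """Hypothetical result container; the real engine's output shape may differ."""
    url: str
    fit_score: int
    synced: bool


def run_pipeline(
    url: str,
    extract: Callable[[str], str],
    retrieve: Callable[[str], str],
    evaluate: Callable[[str, str], int],
    sync: Callable[[str, int], bool],
) -> LeadResult:
    """Run the four stages in order; nothing is stored between calls."""
    page_text = extract(url)                   # 1. Web Extraction
    context = retrieve(page_text)              # 2. RAG Enrichment
    fit_score = evaluate(page_text, context)   # 3. AI Evaluation
    synced = sync(url, fit_score)              # 4. CRM Sync
    return LeadResult(url=url, fit_score=fit_score, synced=synced)


# Stand-in stages (assumptions for illustration only).
result = run_pipeline(
    "https://example.com",
    extract=lambda url: "Example bakery, family owned",
    retrieve=lambda text: "lead criteria: local businesses",
    evaluate=lambda text, ctx: 7,
    sync=lambda url, score: score >= 5,
)
print(result)
```

Passing the stages in as callables keeps the function free of shared state, which is one way to get the stateless behavior described above.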

Key Features

AI-Powered Analysis

Uses Groq LLM (llama-3.3-70b-versatile) with custom prompts for accurate business categorization.

RAG Context

Enriches evaluation with domain knowledge from your knowledge base for better accuracy.

Duplicate Prevention

Automatically checks Coda CRM by URL to prevent duplicate entries.

Multi-Interface

Available as a CLI for batch processing and as a Telegram bot for interactive analysis.

Industry Rules

Smart exclusion logic prevents selling redundant services (e.g., no marketing to marketing agencies).

Token Tracking

Displays AI token consumption for cost monitoring and quota management.
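
As an illustration of the Industry Rules feature, the exclusion logic could be expressed as a lookup from industry to services that should not be pitched. The rule table, the service names, and the `filter_services` function below are assumptions made for this sketch; only the "no marketing to marketing agencies" rule comes from the description above.

```python
from typing import Dict, List, Set

# Hypothetical rule table: industry -> services that would be redundant to pitch.
EXCLUDED_SERVICES: Dict[str, Set[str]] = {
    "marketing agency": {"marketing"},
}


def filter_services(industry: str, candidates: List[str]) -> List[str]:
    """Drop candidate services that the prospect's own industry already covers."""
    excluded = EXCLUDED_SERVICES.get(industry.strip().lower(), set())
    return [service for service in candidates if service.lower() not in excluded]


print(filter_services("Marketing Agency", ["marketing", "web design"]))
print(filter_services("Bakery", ["marketing", "web design"]))
```

Industries without an entry in the table pass through unfiltered, so adding a rule never affects unrelated prospects.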

Who Is This For?

  • Qualify inbound leads instantly by analyzing their business website. Get fit scores, service recommendations, and outreach angles automatically.
  • Research prospects at scale by processing lists of URLs. Build intelligence profiles for outreach campaigns.
  • Identify upsell opportunities by analyzing existing client websites for service gaps and maturity improvements.
  • Segment prospects by digital maturity, industry type, and service fit for targeted campaigns.

Architecture Overview

The Lead Intelligence Engine is built with a modular, stateless architecture:
┌─────────────┐
│   Input     │  CLI or Telegram Bot
│   (URL)     │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ LeadEngine  │  Orchestrates pipeline
│   (core)    │
└──────┬──────┘
       │
       ├──► Extractor ───► Web scraping + Jina fallback
       │
       ├──► RAG ─────────► Knowledge retrieval
       │
       ├──► Evaluator ───► Groq LLM + service matching
       │
       └──► CodaClient ──► Duplicate check + insert

Every URL is analyzed fresh, with no caching. This keeps each analysis up to date, but submitting the same URL again consumes tokens each time.
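
The CodaClient branch checks for an existing entry by URL before inserting. The sketch below shows one way such a check could work. The normalization rules (ignoring scheme, `www.`, and trailing slashes) are an assumption, since the source only says the check is by URL, and `InMemoryCRM` is a hypothetical stand-in for the real Coda table.

```python
from typing import Dict
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Build a comparison key: lowercase host without 'www.', path without trailing slash."""
    if "://" not in url:
        url = "https://" + url  # accept bare domains such as "example.com"
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


class InMemoryCRM:
    """Stand-in for the CRM table; a real client would query Coda instead."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def insert_if_new(self, url: str, record: dict) -> bool:
        key = normalize_url(url)
        if key in self._rows:
            return False  # duplicate: skip the insert
        self._rows[key] = record
        return True


crm = InMemoryCRM()
print(crm.insert_if_new("https://example.com", {"fit_score": 7}))
print(crm.insert_if_new("http://www.example.com/", {"fit_score": 7}))
```

Whether two spellings of the same site should count as one lead is a policy decision; stricter or looser normalization would change which entries are treated as duplicates.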

Performance Characteristics

  • Latency Target: < 20 seconds per URL
  • Token Usage: ~2,000-5,000 tokens per analysis (varies by website size)
  • Rate Limits: Telegram bot enforces 3 requests per minute per user
  • Concurrency: Stateless design allows parallel processing
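
The per-user limit of 3 requests per minute could be enforced with a sliding-window counter like the one sketched here. This is an assumption about the approach, not the bot's actual implementation; `SlidingWindowLimiter` and its parameters are hypothetical names.

```python
import time
from collections import defaultdict, deque
from typing import DefaultDict, Deque, Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per user within the last `window_seconds`."""

    def __init__(self, max_requests: int = 3, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: DefaultDict[int, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: int, now: Optional[float] = None) -> bool:
        # `now` can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
print([limiter.allow(user_id=1, now=t) for t in (0.0, 1.0, 2.0, 3.0)])
```

Rejected requests are not recorded, so a user who keeps retrying while limited is not penalized beyond the original window.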

Technology Stack

  • Language: Python 3.7+
  • AI Provider: Groq (llama-3.3-70b-versatile)
  • Web Scraping: BeautifulSoup4, requests, Jina AI
  • CRM Integration: Coda API
  • Bot Framework: python-telegram-bot
  • Knowledge Base: Local markdown files + JSON schemas

Next Steps

Installation

Set up your Python environment and configure credentials.

Quickstart

Analyze your first URL in under 2 minutes.

Architecture

Deep dive into how the pipeline components work together.

CLI Usage

Learn batch processing and automation patterns.
