Add research-powered enrichment columns to Extruct company tables.

Trigger Phrases

“enrich”, “add column”, “add data point”, “research column”, “enrich table”, “enrichment”, “add a field”, “run enrichment”, “monitor enrichment”

Environment

| Variable | Service |
| --- | --- |
| EXTRUCT_API_TOKEN | Extruct API |
Before making API calls, check that EXTRUCT_API_TOKEN is set by running test -n "$EXTRUCT_API_TOKEN" && echo "set" || echo "missing". If missing, ask the user to provide their Extruct API token and set it via export EXTRUCT_API_TOKEN=<value>. Do not proceed until confirmed.
Base URL: https://api.extruct.ai/v1
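The token check above can be wrapped in a small helper that fails fast and builds request headers. A minimal Python sketch; Bearer authorization is an assumption here, so confirm the scheme against the API reference:

```python
import os

BASE_URL = "https://api.extruct.ai/v1"

def auth_headers():
    """Build request headers, failing fast if EXTRUCT_API_TOKEN is unset."""
    token = os.environ.get("EXTRUCT_API_TOKEN")
    if not token:
        raise RuntimeError("EXTRUCT_API_TOKEN is missing; ask the user to export it.")
    # Bearer auth is an assumption; verify the scheme in the API reference.
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
```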

Official API Reference

Local copy: references/api_reference.md. Live docs: https://www.extruct.ai/docs.

Workflow

Step 1: Verify API reference

  1. Read local reference: references/api_reference.md
  2. Fetch live docs: https://www.extruct.ai/docs
  3. Compare endpoints, params, and response fields
  4. If discrepancies found:
    • Update the local reference file
    • Flag changes to the user before proceeding
  5. Proceed with the skill workflow

Step 2: Confirm the table

Get the table ID from the user (URL or ID). Fetch table metadata via GET /tables/{table_id}. Show the user: table name, row count, existing columns.
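This step can be sketched in Python (stdlib only). The metadata field names `name`, `row_count`, and `columns` are assumptions to confirm against the actual response:

```python
import json
import urllib.request

BASE_URL = "https://api.extruct.ai/v1"

def get_table(table_id, headers):
    """GET /tables/{table_id} and return the parsed JSON metadata."""
    req = urllib.request.Request(f"{BASE_URL}/tables/{table_id}", headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def summarize_table(meta):
    """One-line summary to show the user; field names are assumed."""
    cols = ", ".join(c["name"] for c in meta.get("columns", []))
    return f"{meta.get('name')}: {meta.get('row_count')} rows; columns: {cols}"
```
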

Step 3: Get column configs

Two paths:
  • Path A: From enrichment-design. The user already has column_configs ready. Confirm and proceed.
  • Path B: Design on the fly. Confirm with the user:
  1. What data point? - what to research (e.g. “funding stage”, “primary vertical”, “tech stack”)
  2. Output format - pick the right format:
| Format | When to use | Extra params |
| --- | --- | --- |
| text | Free-form research output | - |
| number / money | Numeric data (revenue, headcount) | - |
| select | Single choice from known categories | labels: [...] |
| multiselect | Multiple tags from known categories | labels: [...] |
| json | Structured multi-field data | output_schema: {...} |
| grade | 1-5 score | - |
| label | Single tag from list | labels: [...] |
| date | Date values | - |
| url / email / phone | Contact info | - |
  3. Agent type - default research_pro. Use llm when no web research is needed (e.g. classification from existing profile data).
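
As a sketch, Path B's output can be assembled with a small helper that produces a column_config in the shape used by the examples later in this document (here, a select column):

```python
def select_column_config(name, key, prompt, labels, agent_type="research_pro"):
    """Build one column_config entry for a select-format column."""
    return {
        "kind": "agent",
        "name": name,
        "key": key,
        "value": {
            "agent_type": agent_type,
            "prompt": prompt,
            "output_format": "select",
            "labels": list(labels),  # constrains the agent's output
        },
    }
```
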

Step 4: Write the prompt

Craft a clear prompt using {input} for the row’s domain value. Prompt guidelines:
  • Be specific about what to find
  • Specify the exact output format in the prompt (e.g. “Return ONLY a number in millions USD”)
  • Include fallback instruction (e.g. “If not found, return N/A”)
  • For select/multiselect, the labels constrain the output - the prompt should guide which label to pick
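
The guidelines above can be encoded in a small template helper. A sketch; the phrasing is illustrative, not a required format:

```python
def build_prompt(data_point, format_hint, fallback="If not found, return N/A."):
    """Compose a prompt: specific ask, explicit output format, and a fallback.
    {input} is left literal so Extruct can substitute the row's domain value."""
    return f"Find the {data_point} for {{input}}. {format_hint} {fallback}"
```
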

Step 5: Create the column(s)

Create columns via POST /tables/{table_id}/columns with the column_configs array.
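
A sketch of the request construction (stdlib only). The body key "column_configs" follows the wording above, but verify it against the API reference before running:

```python
import json
import urllib.request

BASE_URL = "https://api.extruct.ai/v1"

def build_create_columns_request(table_id, column_configs, headers):
    """Prepare POST /tables/{table_id}/columns with the column_configs array."""
    body = json.dumps({"column_configs": column_configs}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/tables/{table_id}/columns",
        data=body, headers=headers, method="POST",
    )

# Execute with: urllib.request.urlopen(build_create_columns_request(...))
```
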

Step 6: Trigger enrichment (only the new columns)

Run via POST /tables/{table_id}/run with { "mode": "new", "columns": [new_column_ids] }.
Always scope the run to the newly created column(s) only. Avoid broad or implicit run payloads when you only intend to enrich specific columns.
Report: run ID, rows queued, and table URL.
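
The run body from this step can be built with a minimal helper that keeps the scope explicit and refuses an empty column list:

```python
def scoped_run_payload(new_column_ids):
    """Build the run body scoped to the newly created columns only."""
    if not new_column_ids:
        raise ValueError("Refusing to build an unscoped run payload.")
    return {"mode": "new", "columns": list(new_column_ids)}
```
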

Step 7: Monitor progress

Poll the table data via GET /tables/{table_id}/data every 30 seconds. For each row, check the status field of the relevant column cells (done, pending, failed). Show the user:
  • Current % complete (done cells / total cells)
  • Number of failed cells (if any)
  • Estimated time remaining (based on rate so far)
Stop polling when all cells are done or failed.
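
The polling arithmetic above can be sketched as follows, where `statuses` holds the status strings of the new columns' cells (the ETA estimate is omitted since it depends on elapsed time):

```python
def enrichment_progress(statuses):
    """Summarize cell statuses ("done" / "pending" / "failed") for a run."""
    total = len(statuses)
    done = statuses.count("done")
    failed = statuses.count("failed")
    return {
        "percent_complete": round(100 * done / total) if total else 100,
        "failed": failed,
        "finished": done + failed == total,  # stop polling when True
    }
```
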

Step 8: Quality spot-check

After enrichment completes (or once 50%+ is done), fetch a sample of 5-10 enriched rows and display them for review. Present the results to the user as a table. Ask:
  • “Does the data quality look right?”
  • “Any columns returning garbage or N/A too often?”
  • “Should we adjust any prompts and re-run?”
If quality issues are found:
  1. Delete the problematic column
  2. Adjust the prompt
  3. Re-create and re-run

Output Formats Reference

| Format | Description | Use Case | Extra Params |
| --- | --- | --- | --- |
| text | Free-form text | Open-ended research, descriptions | - |
| number | Numeric value | Revenue, headcount, counts | - |
| money | Currency amount | Funding, ARR, deal size | - |
| select | Single choice | Categories, stages, maturity | labels: [...] |
| multiselect | Multiple choices | Tags, industries, use cases | labels: [...] |
| json | Structured data | Multiple related fields | output_schema: {...} |
| grade | 1-5 rating | Fit scores, quality ratings | - |
| label | Single tag | Status, tier, segment | labels: [...] |
| date | Date value | Launch dates, funding dates | - |
| url | URL | LinkedIn, website, blog | - |
| email | Email address | Contact emails | - |
| phone | Phone number | Contact phones | - |

Agent Types

| Agent Type | When to Use |
| --- | --- |
| research_pro | Needs web research (funding, news, launches, tech stack) |
| llm | Classification from existing company profile (no web research needed) |
| research_reasoning | Nuanced judgment requiring chain-of-thought (fit scoring, maturity assessment) |
| linkedin | People/org structure data from LinkedIn |

Example Column Configs

{
  "kind": "agent",
  "name": "Funding Stage",
  "key": "funding_stage",
  "value": {
    "agent_type": "research_pro",
    "prompt": "Find the latest funding stage for {input}. Choose from: Seed, Series A, Series B, Series C+, Bootstrapped, Unknown. Return only the label.",
    "output_format": "select",
    "labels": ["Seed", "Series A", "Series B", "Series C+", "Bootstrapped", "Unknown"]
  }
}

{
  "kind": "agent",
  "name": "Recent Launch",
  "key": "recent_launch",
  "value": {
    "agent_type": "research_pro",
    "prompt": "Find any product or feature launched by {input} in the last 6 months. Include product name, launch date, and brief description. If none found, return 'N/A'.",
    "output_format": "text"
  }
}

{
  "kind": "agent",
  "name": "Hypothesis 1 Fit",
  "key": "h1_fit",
  "value": {
    "agent_type": "research_reasoning",
    "prompt": "Grade how well {input} matches this hypothesis: 'Fragmented partner discovery - relies on events and intros to find non-indexed suppliers'. Look for: (1) fragmented supplier base, (2) limited online presence of targets, (3) reliance on manual outreach. Return 1-5 where 5 = perfect fit.",
    "output_format": "grade"
  }
}

Quality Checks

After enrichment completes, always perform a quality spot-check:
  • Sample 5-10 rows
  • Check for high N/A rates (>30% suggests prompt issues)
  • Check for irrelevant data (suggests agent type mismatch)
  • Check for format mismatches (suggests output_format mismatch)
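
The 30% guideline can be checked mechanically. A sketch over one column's sampled values; the set of strings treated as N/A is an assumption:

```python
def na_rate_check(sample_values, threshold=0.30):
    """Flag a column whose N/A rate in the sample exceeds the threshold."""
    na = sum(1 for v in sample_values if str(v).strip().upper() in {"N/A", "NA", ""})
    rate = na / len(sample_values) if sample_values else 0.0
    return {"na_rate": rate, "prompt_issue_suspected": rate > threshold}
```
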
