Overview
The Extract node (Data Extractor) extracts structured data from multiple similar elements on a web page. It’s ideal for scraping lists, tables, product cards, search results, and other repeating patterns. Results can be stored in context and optionally saved to CSV.
Configuration
Selector for the container elements. Each matching element will be processed.
Example: .product-card, tr.data-row, div.search-result
Supports variable interpolation: ${data.containerClass}
Type of selector for containers: css, xpath, text
Array of field definitions specifying what data to extract from each container.
Field Definition:
- name: Field name (becomes object key)
- selector: Element selector within container
- selectorType: css, xpath, or text
- extract: What to extract: text, attribute, or innerHTML
- attribute: Attribute name (required if extract is attribute)
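The field definition rules above can be checked mechanically. The following is a minimal sketch, not the node's actual validation: the function name validate_field is hypothetical, and it assumes a field definition is a plain mapping using the keys listed above.

```python
# Allowed values, taken from the field definition list above.
VALID_SELECTOR_TYPES = {"css", "xpath", "text"}
VALID_EXTRACT_TYPES = {"text", "attribute", "innerHTML"}


def validate_field(field: dict) -> list[str]:
    """Return the problems found in one field definition (empty list if none).

    Hypothetical helper: only the documented rules are checked, and keys that
    are absent are left alone because the source does not state defaults.
    """
    problems = []
    selector_type = field.get("selectorType")
    if selector_type is not None and selector_type not in VALID_SELECTOR_TYPES:
        problems.append(f"unknown selectorType: {selector_type}")
    extract = field.get("extract")
    if extract is not None and extract not in VALID_EXTRACT_TYPES:
        problems.append(f"unknown extract: {extract}")
    # Documented rule: attribute is required if extract is attribute.
    if extract == "attribute" and not field.get("attribute"):
        problems.append("attribute is required when extract is 'attribute'")
    return problems


title_field = {"name": "title", "selector": "h2", "selectorType": "css", "extract": "text"}
link_field = {"name": "url", "selector": "a", "selectorType": "css", "extract": "attribute"}

print(validate_field(title_field))
print(validate_field(link_field))
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a fields array at once.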
Context variable name to store the extracted data array.
Maximum number of containers to process. 0 means process all.
Wait for the first container element to be visible before extraction.
Maximum time in milliseconds to wait for elements.
If true, missing fields in containers will be set to null instead of throwing errors.
CSV Export
Enable CSV file export of extracted data.
Path to save the CSV file. Required if saveToCSV is enabled.
Supports variable interpolation: exports/data-${data.timestamp}.csv
CSV column delimiter: comma (,), semicolon (;), tab (\t), or custom.
Examples
Extract Product List
Extract Table Data
Extract Search Results
Search Results
Extract and Save to CSV
With CSV Export
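As a rough illustration of what the CSV export step amounts to, the sketch below resolves an interpolated path, creates missing parent directories, and writes the rows with a chosen delimiter. It is an approximation under stated assumptions, not the node's implementation: the helpers interpolate and save_rows_to_csv are hypothetical, the context is assumed to be a nested mapping such as {"data": {"timestamp": ...}}, and a temporary directory stands in for the project root.

```python
import csv
import re
import tempfile
from pathlib import Path


def interpolate(template: str, context: dict) -> str:
    """Replace ${a.b} placeholders with values looked up in a nested dict."""
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return re.sub(r"\$\{([\w.]+)\}", lookup, template)


def save_rows_to_csv(rows, path_template, context, project_root, delimiter=","):
    """Write extracted rows to CSV, resolving the path from the project root."""
    path = Path(project_root) / interpolate(path_template, context)
    # Mirror the documented behaviour: parent directories are created if missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys()) if rows else []
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter=delimiter)
        writer.writeheader()
        writer.writerows(rows)
    return path


rows = [{"name": "Blue Widget", "price": "9.99"}, {"name": "Red Widget", "price": "12.50"}]
context = {"data": {"timestamp": "20240101"}}

with tempfile.TemporaryDirectory() as project_root:
    saved = save_rows_to_csv(
        rows, "exports/data-${data.timestamp}.csv", context, project_root, delimiter=";"
    )
    relative = saved.relative_to(project_root).as_posix()
    lines = saved.read_text(encoding="utf-8").splitlines()

print(relative)
print(lines)
```

A semicolon delimiter is used here because it is a common choice when values themselves may contain commas.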
Extract with Limit
First 5 Items
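The interaction between the limit setting and the missing-field option can be sketched as follows. This is only a model of the documented behaviour: plain dictionaries stand in for matched container elements, and the names extract_rows and fail_silently are hypothetical because the source does not name the missing-field option.

```python
def extract_rows(containers, field_names, limit=0, fail_silently=True):
    """Build one row per container, honouring limit and missing-field handling."""
    # Documented rule: 0 means process all containers.
    selected = containers if limit == 0 else containers[:limit]
    rows = []
    for container in selected:
        row = {}
        for name in field_names:
            if name in container:
                row[name] = container[name]
            elif fail_silently:
                # Missing fields become null (None) instead of raising.
                row[name] = None
            else:
                raise KeyError(f"field {name!r} not found in container")
        rows.append(row)
    return rows


# Eight stand-in containers; the third one has no price element.
containers = [{"title": f"Item {i}", "price": f"{i}.00"} for i in range(1, 9)]
del containers[2]["price"]

first_five = extract_rows(containers, ["title", "price"], limit=5)
everything = extract_rows(containers, ["title", "price"], limit=0)

print(len(first_five), len(everything))
print(first_five[2])
```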
Accessing Extracted Data
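Since the extracted data is stored as an array under a context variable, with each field name becoming an object key, downstream steps can treat it as a list of records. The sketch below assumes the context behaves like a mapping and that the output variable was named products; both the variable name and the sample values are hypothetical.

```python
# Hypothetical context after extraction: one record per container,
# keyed by the field names from the field definitions.
context = {
    "products": [
        {"title": "Blue Widget", "price": "9.99", "url": "/p/1"},
        {"title": "Red Widget", "price": None, "url": "/p/2"},
    ]
}

products = context["products"]

# Read one field of the first record.
first_title = products[0]["title"]

# Count the records that were extracted.
count = len(products)

# Skip records whose field was missing and therefore set to null (None).
priced = [item for item in products if item["price"] is not None]

print(first_title, count, len(priced))
```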
Field Extract Types
| Type | Description | Use Case |
|---|---|---|
| text | Extract text content | Visible text, labels, descriptions |
| attribute | Extract attribute value | URLs (href), images (src), IDs (data-id) |
| innerHTML | Extract inner HTML | Rich content, formatted text |
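To make the difference between the three extract types concrete, the sketch below applies each to the same element. It uses Python's xml.etree.ElementTree on a small well-formed snippet as a stand-in for a browser DOM, so whitespace handling may differ from what the node returns; the helper extract_value is hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_value(element, extract, attribute=None):
    """Return the text, an attribute value, or the inner markup of an element."""
    if extract == "text":
        # Concatenate the text of the element and all of its descendants.
        return "".join(element.itertext()).strip()
    if extract == "attribute":
        return element.get(attribute)
    if extract == "innerHTML":
        # Leading text plus each serialized child (tostring includes tail text).
        inner = element.text or ""
        inner += "".join(ET.tostring(child, encoding="unicode") for child in element)
        return inner
    raise ValueError(f"unknown extract type: {extract}")


link = ET.fromstring('<a href="/p/1" data-id="1">Blue <b>Widget</b></a>')

as_text = extract_value(link, "text")
as_href = extract_value(link, "attribute", "href")
as_html = extract_value(link, "innerHTML")

print(as_text)
print(as_href)
print(as_html)
```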
Notes
The extractor processes each container element in sequence. For large datasets, consider using the limit parameter to control processing time.
CSV export automatically creates parent directories if they don’t exist. Relative paths are resolved from the project root.
Best Practices
Common Patterns
Pagination Loop
Conditional Extraction
Related Nodes
- Get Text - Extract from single elements
- Loop - Process extracted data
- JavaScript Code - Transform data
- API Request - Send extracted data
