Overview
The sfn load command provides a convenient way to bulk-load CSV data into ElasticSearch indices. This is useful for importing threat intelligence feeds, historical data, or custom datasets into SafeNetworking.
Command Syntax
Description
This command reads a CSV file and uses ElasticSearch’s bulk API to efficiently load all rows as documents into the specified index. The CSV headers are automatically mapped to document field names.
Arguments
Path to the CSV file to be loaded into ElasticSearch.
- Required: Yes
- Format: Must be a valid CSV file with headers in the first row
- Example: threat_data.csv or /path/to/data.csv
Name of the ElasticSearch index where documents will be stored.
- Required: Yes
- Convention: Use lowercase with hyphens (e.g., sfn-custom-data)
- Example: sfn-threat-intel
CSV File Format
The CSV file must have:
- Header row - First row containing field names
- Data rows - Subsequent rows with corresponding values
- Proper escaping - Commas and quotes properly escaped per CSV standards
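The requirements above can be checked before a load with a short script. The following is a minimal sketch using only Python's standard csv module. The function name check_csv and the sample data are hypothetical and are not part of SafeNetworking.

```python
import csv
import io


def check_csv(text):
    """Return a list of problems found in CSV text; an empty list means none were found."""
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    # The first row must exist and every field name must be non-blank.
    if not header or any(not name.strip() for name in header):
        return ["missing or blank header row"]
    if len(set(header)) != len(header):
        problems.append("duplicate header names")
    row_count = 0
    for row in reader:
        if not row:
            continue  # csv.reader yields [] for blank lines; skip them
        row_count += 1
        # Every data row should have one value per header field.
        if len(row) != len(header):
            problems.append(
                f"line {reader.line_num}: expected {len(header)} fields, got {len(row)}"
            )
    if row_count == 0:
        problems.append("no data rows")
    return problems


# Hypothetical sample: the second row has properly escaped commas and quotes,
# the third row is missing a field.
sample = 'indicator,description\n"evil.example","C2, stage ""two"""\nbad.example\n'
print(check_csv(sample))  # ['line 3: expected 2 fields, got 1']
```

For a file on disk, the same check could be applied to the file's contents read with open(path, newline="").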
Example CSV Structure
Usage Examples
Load Threat Intelligence Feed
Import a threat intelligence CSV into a custom index:
Load Historical DNS Data
Import historical DNS query logs:
Load Custom IoT Data
Import additional IoT threat data from an external source:
Load with Absolute Path
Import a file from a specific directory:
Expected Output
The command performs a bulk load operation. Upon successful completion:
- No output is displayed to the console
- All CSV rows are inserted as documents in ElasticSearch
- The index is created automatically if it doesn’t exist
Verifying the Load
After loading, verify the data using ElasticSearch queries or Kibana:
Document Mapping
Each CSV row becomes an ElasticSearch document:
CSV Input
ElasticSearch Document
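As an illustration of the mapping described in this section, the sketch below uses Python's csv.DictReader, which treats the first row as field names and turns each following row into a dictionary. The column names and values are hypothetical, and the actual command may add or convert fields.

```python
import csv
import io
import json

# Hypothetical sample input; column names and values are illustrative only.
csv_text = "domain,category,score\nevil.example,malware,90\n"

# DictReader uses the header row as the keys for every following row.
documents = list(csv.DictReader(io.StringIO(csv_text, newline="")))

for doc in documents:
    print(json.dumps(doc))
# {"domain": "evil.example", "category": "malware", "score": "90"}
```

Note that every value read from a CSV is a string ("90", not 90) unless it is converted explicitly; how ElasticSearch indexes such values depends on the index mapping.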
Common Use Cases
Import OSINT Threat Feeds
Load open-source threat intelligence feeds:
Migrate Data Between Environments
Transfer data from development to production:
Load Historical Analysis Results
Import previously analyzed security data:
Bulk Import Customer Data
Service providers can load customer-specific threat data:
Error Handling
File Not Found
Invalid CSV Format
If the CSV is malformed:
ElasticSearch Connection Issues
If ElasticSearch is unavailable:
Performance Considerations
The command uses ElasticSearch’s helpers.bulk() API for efficient batch loading, which is optimized for large datasets.
Optimal File Sizes
- Small: < 10MB - Loads instantly
- Medium: 10MB - 100MB - Loads in seconds
- Large: 100MB - 1GB - May take minutes
- Very Large: > 1GB - Consider chunking
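For very large files, one way to chunk the work is to stream rows and group them into fixed-size batches instead of reading the whole file into memory. The sketch below uses only the standard library. The helper names iter_actions and batched and the batch size are hypothetical, and the `_index`/`_source` action shape follows what the Python client's bulk helpers accept; whether sfn load builds its actions this way is an assumption.

```python
import csv
import io
from itertools import islice


def iter_actions(lines, index_name):
    """Lazily yield one bulk-style action per CSV row."""
    for row in csv.DictReader(lines):
        yield {"_index": index_name, "_source": row}


def batched(iterable, size):
    """Group any iterable into lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical in-memory data standing in for a large file
# opened with open(path, newline="").
lines = io.StringIO(
    "ip,label\n" + "".join(f"10.0.0.{i},test\n" for i in range(5)),
    newline="",
)
batch_sizes = [
    len(batch) for batch in batched(iter_actions(lines, "sfn-threat-intel"), 2)
]
print(batch_sizes)  # [2, 2, 1]
```

Each batch could then be sent in its own bulk request. Alternatively, the action generator can usually be handed directly to helpers.bulk(), which accepts any iterable of actions and splits it into chunks internally.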
Best Practices
Index Naming Convention
Follow SafeNetworking naming patterns:
Data Validation
Validate CSV data before loading:
Backup Before Loading
If loading into an existing index:
Notes
All documents are stored with the document type "type", as per the ElasticSearch conventions in use by SafeNetworking.