AbstractTask class.
Task Architecture
Processing Pipeline
Tasks are executed in a specific order defined in TaskInstaller.xml. Each item flows through the pipeline:
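The effect of a fixed task order can be sketched in plain Java. This is only an illustration: `SketchItem`, `SketchStage`, and `PipelineOrderSketch` are hypothetical stand-ins written for this example, not IPED classes, and the "category" logic is made up. The point is that a later stage can only see what an earlier stage has already written to the item.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an evidence item: a name plus a bag of attributes.
class SketchItem {
    final String name;
    final Map<String, Object> attributes = new LinkedHashMap<>();

    SketchItem(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for one pipeline stage (not IPED's AbstractTask).
interface SketchStage {
    void process(SketchItem item);
}

class PipelineOrderSketch {

    // Assigns a made-up category based on the file extension.
    static final SketchStage CATEGORIZE = item ->
            item.attributes.put("category", item.name.endsWith(".jpg") ? "Images" : "Other");

    // Depends on the category written by CATEGORIZE.
    static final SketchStage FLAG_IMAGES = item ->
            item.attributes.put("isImage", "Images".equals(item.attributes.get("category")));

    // Runs every stage on the item in list order, like an ordered task list.
    static void run(List<SketchStage> stages, SketchItem item) {
        for (SketchStage stage : stages) {
            stage.process(item);
        }
    }

    public static void main(String[] args) {
        SketchItem rightOrder = new SketchItem("photo.jpg");
        run(Arrays.asList(CATEGORIZE, FLAG_IMAGES), rightOrder);
        System.out.println(rightOrder.attributes); // {category=Images, isImage=true}

        SketchItem wrongOrder = new SketchItem("photo.jpg");
        run(Arrays.asList(FLAG_IMAGES, CATEGORIZE), wrongOrder);
        System.out.println(wrongOrder.attributes); // {isImage=false, category=Images}
    }
}
```

With the stages swapped, the dependent stage runs before the category exists and silently records the wrong answer, which is why the order in the installer file matters.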
Multi-threaded Execution
IPED creates multiple worker threads, and each worker has its own instance of each task:
- Each task instance processes items independently
- Shared data must use thread-safe objects in caseData.objectMap
- Processing order is important - some tasks depend on results from previous tasks
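The split between per-instance state and shared state can be sketched with JDK classes alone. In this sketch, `CountingTaskSketch` and `WorkerSketch` are hypothetical names, and the `shared` map merely plays the role of an object that a real task would keep in caseData.objectMap; none of it is IPED code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a task; each worker thread gets its own instance.
class CountingTaskSketch {
    // Per-instance state: only one worker touches it, so no locking is needed.
    private long localCount = 0;
    // Shared state: visible to every worker, so it must be thread-safe.
    private final Map<String, AtomicLong> shared;

    CountingTaskSketch(Map<String, AtomicLong> shared) {
        this.shared = shared;
    }

    void process(String extension) {
        localCount++;
        // computeIfAbsent is atomic on ConcurrentHashMap, so no counter is lost.
        shared.computeIfAbsent(extension, key -> new AtomicLong()).incrementAndGet();
    }

    long getLocalCount() {
        return localCount;
    }
}

class WorkerSketch {

    // Starts one thread per worker, each with its own task instance and a common map.
    static Map<String, AtomicLong> runWorkers(int workers, int itemsPerWorker) {
        Map<String, AtomicLong> shared = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            CountingTaskSketch task = new CountingTaskSketch(shared);
            threads[i] = new Thread(() -> {
                for (int j = 0; j < itemsPerWorker; j++) {
                    task.process(j % 2 == 0 ? "jpg" : "pdf");
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Map<String, AtomicLong> totals = runWorkers(4, 1000);
        System.out.println(totals.get("jpg").get()); // 2000
        System.out.println(totals.get("pdf").get()); // 2000
    }
}
```

Replacing the `ConcurrentHashMap` and `AtomicLong` with a plain `HashMap<String, Long>` would make the totals unreliable under concurrent workers, which is the failure the thread-safety rule is meant to prevent.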
Creating a Custom Task
Basic Structure
Extend AbstractTask and implement the required methods:
MyCustomTask.java
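As a rough sketch of the lifecycle such a task follows, the example below uses a local stand-in base class so that it compiles without IPED on the classpath. `SketchAbstractTask`, `MyCustomTaskSketch`, and the simplified method signatures (`init()`, `process(String)`, `finish()`, `isEnabled()`) are assumptions made for illustration; the real AbstractTask works on items and configuration objects, and its exact signatures should be checked against the IPED sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-in so the sketch is self-contained; NOT IPED's AbstractTask.
// Method names and signatures are simplified assumptions.
abstract class SketchAbstractTask {
    boolean isEnabled() {
        return true;
    }

    void init() {
    }

    abstract void process(String itemName);

    void finish() {
    }
}

class MyCustomTaskSketch extends SketchAbstractTask {
    final List<String> processed = new ArrayList<>();
    boolean finished = false;

    @Override
    void init() {
        // One-time setup before any item is seen.
        processed.clear();
        finished = false;
    }

    @Override
    void process(String itemName) {
        // Per-item work; here it only records the item name.
        processed.add(itemName);
    }

    @Override
    void finish() {
        // Cleanup after the last item.
        finished = true;
    }

    // Drives init -> process* -> finish, roughly as a worker would.
    static MyCustomTaskSketch runOn(List<String> itemNames) {
        MyCustomTaskSketch task = new MyCustomTaskSketch();
        if (task.isEnabled()) {
            task.init();
            for (String name : itemNames) {
                task.process(name);
            }
            task.finish();
        }
        return task;
    }

    public static void main(String[] args) {
        MyCustomTaskSketch task = runOn(Arrays.asList("a.txt", "b.jpg"));
        System.out.println(task.processed + " finished=" + task.finished);
    }
}
```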
AbstractTask Class Reference
The AbstractTask class provides these key members:
Real-World Example: Hash Task
Let’s examine a simplified version of IPED’s HashTask:
HashTask.java
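The core operation of a hashing task, reading an item's content as a stream and producing a hex digest, can be sketched with the JDK's `MessageDigest`. This is not IPED's HashTask: `HashSketch` and `hexDigest` are hypothetical names, and a real task would obtain the stream from the item and store the result on it rather than return a string.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HashSketch {

    // Streams the input through the named digest and returns lowercase hex.
    static String hexDigest(InputStream in, String algorithm) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[8192];
            int read;
            // Read in chunks so large files never need to fit in memory.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException | IOException e) {
            // Wrapped so callers in this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InputStream content = new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(hexDigest(content, "MD5")); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```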
Configuration Support
Tasks can have custom configuration files:
Accessing Item Properties
IItem Interface
The IItem interface provides access to all item properties:
Modifying Items
Sharing Data Between Threads
Using CaseData
Share objects across all task instances:
Thread-Safe Collections
Use Java concurrent collections:
Processing Subitems
Tasks can create new items (e.g., carved files):
Checking Dependencies
Ensure required tasks are enabled:
Installing Your Task
Best Practices
Task Ordering
Place your task at the right point in the pipeline:
- Before ParsingTask: If you need raw file content
- After ParsingTask: If you need extracted text
- After CategoryTask: If you need file categories
- Before IndexTask: To ensure attributes are indexed
Performance
- Cache expensive computations using hash values
- Use item.getLength() to skip empty files
- Check item.isDir() to skip directories
- Have processIgnoredItem() return false to skip ignored files
- Release resources in the finish() method
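The cheap early-exit checks from this list can be sketched as a single guard evaluated before any expensive work. `GuardItemSketch` is a hypothetical stand-in exposing only `isDir()` and `getLength()`; treating a null length as "nothing to process" is an assumption of this sketch, not documented IPED behavior.

```java
// Hypothetical stand-in exposing only the two properties the guard needs.
class GuardItemSketch {
    private final boolean dir;
    private final Long length;

    GuardItemSketch(boolean dir, Long length) {
        this.dir = dir;
        this.length = length;
    }

    boolean isDir() {
        return dir;
    }

    Long getLength() {
        return length;
    }
}

class GuardSketch {

    // Returns true only for non-directory items with a known, non-zero length.
    static boolean shouldProcess(GuardItemSketch item) {
        if (item.isDir()) {
            return false; // directories have no content to analyze
        }
        Long length = item.getLength();
        // Assumption: an unknown (null) length is treated like an empty file.
        return length != null && length > 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldProcess(new GuardItemSketch(true, 4096L)));  // false
        System.out.println(shouldProcess(new GuardItemSketch(false, 0L)));    // false
        System.out.println(shouldProcess(new GuardItemSketch(false, 1024L))); // true
    }
}
```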
Error Handling
- Catch exceptions so that one failing item does not stop the entire pipeline
- Log errors with LOGGER.error()
- Set error attributes: item.setExtraAttribute("error", message)
- Update statistics: stats.incErrors()
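The four points above fit together in one catch block, sketched here with stand-ins only. `ErrorItemSketch` and `ErrorHandlingSketch` are hypothetical; `java.util.logging` replaces the task's LOGGER, an `AtomicInteger` replaces the statistics object, and `riskyWork` is a made-up operation that fails on an empty name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for an item that can carry extra attributes.
class ErrorItemSketch {
    final String name;
    final Map<String, Object> extraAttributes = new HashMap<>();

    ErrorItemSketch(String name) {
        this.name = name;
    }

    void setExtraAttribute(String key, Object value) {
        extraAttributes.put(key, value);
    }
}

class ErrorHandlingSketch {
    // java.util.logging stands in for the task's own logger.
    private static final Logger LOGGER = Logger.getLogger(ErrorHandlingSketch.class.getName());

    // Stand-in for the statistics counter.
    final AtomicInteger errors = new AtomicInteger();

    // Made-up failing operation: rejects items with an empty name.
    private void riskyWork(ErrorItemSketch item) {
        if (item.name.isEmpty()) {
            throw new IllegalArgumentException("empty name");
        }
    }

    // Processes one item; a failure is recorded on the item and never propagates.
    void process(ErrorItemSketch item) {
        try {
            riskyWork(item);
        } catch (RuntimeException e) {
            LOGGER.log(Level.SEVERE, "Failed to process item '" + item.name + "'", e);
            item.setExtraAttribute("error", e.getMessage());
            errors.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        ErrorHandlingSketch task = new ErrorHandlingSketch();
        ErrorItemSketch bad = new ErrorItemSketch("");
        task.process(bad);
        task.process(new ErrorItemSketch("report.pdf"));
        System.out.println(bad.extraAttributes.get("error") + ", errors=" + task.errors.get());
    }
}
```

Because the exception is handled inside `process`, the second item is still processed after the first one fails.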
Testing
- Test with small dataset first
- Check TaskInstaller.xml order carefully
- Verify shared objects are thread-safe
- Test with multiple worker threads
- Monitor memory usage with large batches
Task Execution Control
Enable/Disable Tasks
Process Ignored Items
Process Queue-End Markers
Next Steps
Scripting
Create tasks with JavaScript or Python
Web API
Access processed cases remotely