Overview
The EntityExtractor class provides a unified interface for extracting entities from text using either cloud-based (Gemini) or local (Ollama) models. It eliminates code duplication by supporting all four entity types through a single generic implementation.
Class Definition
Constructor Parameters
The type of entity to extract. Supported values:
"people" - Person entities
"organizations" - Organization entities
"locations" - Location entities
"events" - Event entities
The domain configuration to use. Determines which prompts and schemas are loaded.
Raises
ValueError - If entity_type is not one of the supported values
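The validation described above can be sketched as follows. This is a stand-in class, not the real EntityExtractor: the class name EntityExtractorSketch, the domain parameter name, and its default value are assumptions made for illustration; only entity_type and the four supported values come from this page.

```python
SUPPORTED_ENTITY_TYPES = ("people", "organizations", "locations", "events")


class EntityExtractorSketch:
    """Hypothetical stand-in for EntityExtractor; only validation is shown."""

    def __init__(self, entity_type: str, domain: str = "default") -> None:
        # Reject unknown entity types up front, as the Raises section describes.
        if entity_type not in SUPPORTED_ENTITY_TYPES:
            raise ValueError(
                f"Unsupported entity_type {entity_type!r}; "
                f"expected one of: {', '.join(SUPPORTED_ENTITY_TYPES)}"
            )
        self.entity_type = entity_type
        self.domain = domain  # assumed parameter name and default


# A supported type constructs normally.
extractor = EntityExtractorSketch("people")

# An unsupported type raises ValueError.
try:
    EntityExtractorSketch("products")
except ValueError as exc:
    error_message = str(exc)
```

Failing in the constructor, rather than at extraction time, means a misconfigured entity type surfaces before any model call is made.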
Methods
extract_cloud()
Extract entities using cloud-based models (Gemini via Google AI).
The article text to extract entities from
The cloud model to use (defaults to gemini-2.0-flash-exp)
Sampling temperature for generation (0 = deterministic)
Optional suffix appended to the system prompt on retry attempts. Used by QC retry logic to guide the model toward fixing specific issues (e.g., “avoid generic names”).
List of extracted entities as dictionaries with type-specific fields
Example
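As a minimal sketch of how a repair hint might be appended to the system prompt on retry, assuming the prompt is a plain string. The helper name build_system_prompt, the separator, and the prompt wording are hypothetical and not taken from the module.

```python
from typing import Optional


def build_system_prompt(base_prompt: str, repair_hint: Optional[str] = None) -> str:
    """Hypothetical helper: append the repair hint only when one is given."""
    if not repair_hint:
        # First attempt: the system prompt is used unchanged.
        return base_prompt
    # Retry: the hint is added as a suffix (blank-line separator is assumed).
    return f"{base_prompt}\n\n{repair_hint}"


base = "Extract all people mentioned in the article as a JSON array."
first_attempt = build_system_prompt(base)
retry_attempt = build_system_prompt(base, "Avoid generic names such as 'the officer'.")
```

Keeping the hint as a suffix leaves the base prompt identical between attempts, so only the corrective instruction differs on retry.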
extract_local()
Extract entities using local models (Ollama with structured output via Instructor).
The article text to extract entities from
The local model to use (defaults to llama3.3:70b-instruct-q8_0)
Sampling temperature for generation (0 = deterministic)
Optional suffix appended to the system prompt on retry. See extract_cloud() for details.
List of extracted entities as dictionaries
Uses the same List[Entity] response model as cloud extraction so prompts (which teach bare JSON arrays) parse correctly in both modes.
extract()
Convenience method that routes to either extract_cloud() or extract_local() based on model type.
The article text to extract entities from
Either "gemini" for cloud or "ollama" for local
Specific model to use. If None, uses the appropriate default for the model type.
Sampling temperature for generation
List of extracted entities as dictionaries
Example
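A minimal sketch of the routing, using a stub class in place of the real EntityExtractor so that it runs without any model backend. The parameter names (text, model_type, model, temperature), the temperature default, and the ValueError for an unrecognised model type are assumptions; only the two default model names and the "gemini"/"ollama" values come from the descriptions above.

```python
from typing import Dict, List, Optional

# Defaults as stated in the extract_cloud() and extract_local() descriptions.
DEFAULT_CLOUD_MODEL = "gemini-2.0-flash-exp"
DEFAULT_LOCAL_MODEL = "llama3.3:70b-instruct-q8_0"


class RoutingSketch:
    """Hypothetical stand-in showing only the dispatch logic of extract()."""

    def extract_cloud(self, text: str, model: str, temperature: float) -> List[Dict[str, str]]:
        # Stub: a real implementation would call Gemini here.
        return [{"backend": "cloud", "model": model}]

    def extract_local(self, text: str, model: str, temperature: float) -> List[Dict[str, str]]:
        # Stub: a real implementation would call Ollama here.
        return [{"backend": "local", "model": model}]

    def extract(
        self,
        text: str,
        model_type: str,
        model: Optional[str] = None,
        temperature: float = 0.0,
    ) -> List[Dict[str, str]]:
        # When model is None, fall back to the default for the chosen backend.
        if model_type == "gemini":
            return self.extract_cloud(text, model or DEFAULT_CLOUD_MODEL, temperature)
        if model_type == "ollama":
            return self.extract_local(text, model or DEFAULT_LOCAL_MODEL, temperature)
        # Assumed behaviour for unknown values; the real method may differ.
        raise ValueError(f"Unknown model_type: {model_type!r}")


sketch = RoutingSketch()
cloud_result = sketch.extract("Some article text.", model_type="gemini")
local_result = sketch.extract("Some article text.", model_type="ollama", model="llama3.1:8b")
```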
Entity Type Configuration
The extractor internally maps entity types to their Pydantic models and list attribute names. The models are loaded from configs/<domain>/types/ using the domain configuration system.
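One way to picture such a mapping is a lookup table keyed by entity type. The sketch below is an assumption-laden illustration: it uses standard-library dataclasses as stand-ins for the Pydantic models, hard-codes the table instead of loading it from the domain configuration, and every class, field, and function name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


# Stand-ins for the real Pydantic models; the fields are illustrative only.
@dataclass
class Person:
    name: str


@dataclass
class Organization:
    name: str


@dataclass
class Location:
    name: str


@dataclass
class Event:
    title: str


# entity type -> (model class, name of the list attribute holding results)
ENTITY_CONFIG: Dict[str, Tuple[type, str]] = {
    "people": (Person, "people"),
    "organizations": (Organization, "organizations"),
    "locations": (Location, "locations"),
    "events": (Event, "events"),
}


def resolve_entity_config(entity_type: str) -> Tuple[type, str]:
    """Return the (model, list attribute) pair for a supported entity type."""
    try:
        return ENTITY_CONFIG[entity_type]
    except KeyError:
        raise ValueError(f"Unsupported entity_type: {entity_type!r}") from None


model_cls, list_attr = resolve_entity_config("events")
```

A single table like this is what lets one generic class serve all four entity types: the extraction code stays the same and only the looked-up model and attribute name change.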
Convenience Factory Functions
For backward compatibility, entity-specific factory functions are provided.
Integration with QC Retry
The repair_hint parameter enables single-attempt retry when extraction quality control detects severe issues.
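The retry flow can be sketched as below, under stated assumptions: the QC check is modelled as a function that returns a hint string when it finds a severe issue and None otherwise, and the extractor is replaced by a fake so the example runs offline. The names extract_with_qc_retry, fake_extract, and detect_generic_names are hypothetical; only the repair_hint parameter comes from this page.

```python
from typing import Callable, Dict, List, Optional

Entities = List[Dict[str, str]]


def extract_with_qc_retry(
    extract_fn: Callable[..., Entities],
    text: str,
    find_issue: Callable[[Entities], Optional[str]],
) -> Entities:
    """Run extraction once, then retry a single time if QC reports an issue."""
    entities = extract_fn(text, repair_hint=None)
    hint = find_issue(entities)
    if hint is None:
        return entities
    # Exactly one retry, with the QC finding passed along as the repair hint.
    return extract_fn(text, repair_hint=hint)


calls: List[Optional[str]] = []  # records the hint used on each attempt


def fake_extract(text: str, repair_hint: Optional[str] = None) -> Entities:
    # Fake extractor: returns a generic name unless a repair hint is supplied.
    calls.append(repair_hint)
    if repair_hint is None:
        return [{"name": "the officer"}]
    return [{"name": "Jane Doe"}]


def detect_generic_names(entities: Entities) -> Optional[str]:
    # Toy QC rule: flag placeholder-style names.
    generic = {"the officer", "a spokesperson"}
    if any(entity["name"].lower() in generic for entity in entities):
        return "Avoid generic names; use the person's actual name."
    return None


result = extract_with_qc_retry(fake_extract, "article text", detect_generic_names)
```

Limiting the flow to one retry bounds the cost of a bad extraction: the second result is returned as-is, even if it would fail QC again.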
Source Location
~/workspace/source/src/engine/extractors.py