Overview
The Common Internal Representation (CIR) is the normalized data format used throughout Mimir AIP. Every piece of data flowing through the platform, whether ingested from APIs, databases, files, or streams, is converted to CIR format before storage and processing.
CIR provides:
- Provenance tracking: Know where data came from and when
- Format independence: Work with data regardless of original format
- Schema inference: Automatically detect data structure
- Quality metrics: Track data quality indicators
- Consistency: Unified interface across all storage backends
Structure
A CIR object consists of three main blocks:
Source
Provenance information — where the data came from, when it was ingested, and in what format.
Data
The actual payload — can be JSON objects, arrays, CSV records, text, or binary data.
Metadata
Size, encoding, record count, inferred schema, and quality metrics.
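As a rough illustration of how the three blocks fit together, the sketch below models them as Go structs and prints one instance as JSON. The field names, JSON tags, and the `ToJSON` helper are assumptions made for this example; the authoritative definitions live in pkg/models/cir.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// NOTE: every field name and tag below is an assumption for illustration.

// Source holds provenance: where the data came from, when, and in what format.
type Source struct {
	Type      string    `json:"type"` // "api", "file", "database", or "stream"
	URI       string    `json:"uri"`
	Timestamp time.Time `json:"timestamp"`
	Format    string    `json:"format"` // "csv", "json", "xml", "text", or "binary"
}

// Metadata describes the payload itself.
type Metadata struct {
	Size        int64  `json:"size"`
	Encoding    string `json:"encoding,omitempty"`
	RecordCount int    `json:"record_count,omitempty"`
}

// CIR ties the three blocks together.
type CIR struct {
	Source   Source   `json:"source"`
	Data     any      `json:"data"`
	Metadata Metadata `json:"metadata"`
}

// ToJSON (hypothetical helper) renders the CIR as indented JSON for inspection.
func (c CIR) ToJSON() (string, error) {
	b, err := json.MarshalIndent(c, "", "  ")
	return string(b), err
}

func main() {
	c := CIR{
		Source: Source{
			Type:      "api",
			URI:       "https://api.example.com/users?id=1",
			Timestamp: time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC),
			Format:    "json",
		},
		Data:     map[string]any{"id": 1, "name": "Ada"},
		Metadata: Metadata{Encoding: "utf-8", RecordCount: 1},
	}
	out, err := c.ToJSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```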
Type Definition
pkg/models/cir.go:35
Source Block
The Source block captures provenance information for data lineage and debugging.
pkg/models/cir.go:43
Source Types
api
Data fetched from REST APIs or HTTP endpoints.
URI format: Full URL including query parameters
file
Data loaded from local or remote files.
URI format: File path or file:// URL
database
Data queried from SQL or NoSQL databases.
URI format: Connection string or database identifier
stream
Data ingested from real-time streams (Kafka, webhooks, etc.).
URI format: Stream endpoint or topic name
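The four source types and their URI conventions can be sketched as typed constants with a light sanity check. The `SourceType` type, the constant names, and `CheckSource` are hypothetical names introduced for this example and are not taken from the Mimir codebase; only the `api` rule ("full URL") is enforced, since the other URI formats are free-form.

```go
package main

import (
	"fmt"
	"net/url"
)

// SourceType and its constants are illustrative names, not Mimir's own.
type SourceType string

const (
	SourceTypeAPI      SourceType = "api"
	SourceTypeFile     SourceType = "file"
	SourceTypeDatabase SourceType = "database"
	SourceTypeStream   SourceType = "stream"
)

// CheckSource reports an error when the type is unknown, the URI is empty,
// or an api source is not given as a full URL.
func CheckSource(t SourceType, uri string) error {
	if uri == "" {
		return fmt.Errorf("source URI must not be empty")
	}
	switch t {
	case SourceTypeAPI:
		u, err := url.Parse(uri)
		if err != nil || u.Scheme == "" || u.Host == "" {
			return fmt.Errorf("api source needs a full URL, got %q", uri)
		}
	case SourceTypeFile, SourceTypeDatabase, SourceTypeStream:
		// Paths, connection strings, and topic names are free-form here.
	default:
		return fmt.Errorf("unknown source type %q", t)
	}
	return nil
}

func main() {
	fmt.Println(CheckSource(SourceTypeAPI, "https://api.example.com/v1/users?page=2"))
	fmt.Println(CheckSource(SourceTypeAPI, "users"))
	fmt.Println(CheckSource(SourceTypeStream, "orders-topic"))
	fmt.Println(CheckSource("ftp", "ftp://host/file"))
}
```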
Data Formats
| Format | Description | Example Use Cases |
|---|---|---|
| csv | Comma-separated values | Export files, spreadsheets |
| json | JSON objects or arrays | API responses, config files |
| xml | XML documents | SOAP APIs, legacy systems |
| text | Plain text | Logs, notes, unstructured data |
| binary | Binary data | Images, PDFs, arbitrary files |
Data Block
The Data block contains the actual payload in its native structure. The payload can take one of the following shapes:
- JSON Object: a single entity as a JSON object
- JSON Array
- CSV
- Text
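In Go terms, a payload held in the Data block could be inspected with a type switch. This is only a sketch: the Go type chosen for each shape (for example `[]map[string]string` for CSV records) and the `Describe` function are assumptions, not a description of how Mimir decodes payloads.

```go
package main

import "fmt"

// Describe reports which payload shape a Data value holds. The Go types
// matched for each shape are assumptions made for this example.
func Describe(data any) string {
	switch v := data.(type) {
	case map[string]any:
		return fmt.Sprintf("JSON object with %d fields", len(v))
	case []any:
		return fmt.Sprintf("JSON array with %d items", len(v))
	case []map[string]string:
		return fmt.Sprintf("CSV with %d records", len(v))
	case string:
		return fmt.Sprintf("text, %d bytes", len(v))
	case []byte:
		return fmt.Sprintf("binary, %d bytes", len(v))
	default:
		return fmt.Sprintf("unsupported payload type %T", v)
	}
}

func main() {
	payloads := []any{
		map[string]any{"id": 1, "name": "Ada"},
		[]any{map[string]any{"id": 1}, map[string]any{"id": 2}},
		[]map[string]string{{"id": "1", "name": "Ada"}},
		"2024-01-01 INFO service started",
		[]byte{0x89, 0x50, 0x4e, 0x47},
	}
	for _, p := range payloads {
		fmt.Println(Describe(p))
	}
}
```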
Metadata Block
The Metadata block provides information about the data itself.
pkg/models/cir.go:52
Schema Inference
Automatically detected structure of the data payload.
Quality Metrics
Data quality indicators calculated during ingestion.
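The exact inference rules and metric definitions are not spelled out here. As a minimal sketch, assuming records arrive as decoded JSON objects, schema inference can be approximated by mapping each field to a coarse type name, and one possible quality metric is completeness (the share of expected fields that are present and non-null). `InferSchema`, `Completeness`, and the type names they return are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// typeName maps a decoded JSON value to a coarse type label.
func typeName(v any) string {
	switch v.(type) {
	case nil:
		return "null"
	case bool:
		return "boolean"
	case float64, int, int64:
		return "number"
	case string:
		return "string"
	case []any:
		return "array"
	case map[string]any:
		return "object"
	default:
		return "unknown"
	}
}

// InferSchema maps each field to a type label. Null values are skipped, so a
// field that is always null is absent; conflicting types become "mixed".
func InferSchema(records []map[string]any) map[string]string {
	schema := map[string]string{}
	for _, rec := range records {
		for field, v := range rec {
			t := typeName(v)
			if t == "null" {
				continue
			}
			if prev, seen := schema[field]; !seen {
				schema[field] = t
			} else if prev != t {
				schema[field] = "mixed"
			}
		}
	}
	return schema
}

// Completeness is the fraction of expected field values that are present
// and non-null across all records.
func Completeness(records []map[string]any, fields []string) float64 {
	if len(records) == 0 || len(fields) == 0 {
		return 0
	}
	filled := 0
	for _, rec := range records {
		for _, f := range fields {
			if v, ok := rec[f]; ok && v != nil {
				filled++
			}
		}
	}
	return float64(filled) / float64(len(records)*len(fields))
}

func main() {
	records := []map[string]any{
		{"id": 1, "email": "ada@example.com"},
		{"id": 2, "email": nil},
	}
	schema := InferSchema(records)
	fields := make([]string, 0, len(schema))
	for f := range schema {
		fields = append(fields, f)
	}
	sort.Strings(fields)
	for _, f := range fields {
		fmt.Printf("%s: %s\n", f, schema[f])
	}
	fmt.Println("completeness:", Completeness(records, fields))
}
```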
Helper Functions
Mimir provides utility functions for working with CIR objects.
Creating CIR Objects
Accessing CIR Data
Validation
pkg/models/cir.go:78
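The helper code itself is not reproduced on this page. The sketch below shows what creation, access, and validation helpers might look like. Only the method names `Validate` and `UpdateSize` appear in this document; `NewCIR`, `DataAsMap`, the struct fields, the validation rules, and the choice to measure size as the length of the JSON encoding are all assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Trimmed-down, assumed shapes for the three blocks.
type Source struct {
	Type, URI, Format string
	Timestamp         time.Time
}

type Metadata struct{ Size int64 }

type CIR struct {
	Source   Source
	Data     any
	Metadata Metadata
}

// NewCIR (hypothetical constructor) stamps the ingestion time and sets the size.
func NewCIR(sourceType, uri, format string, data any) *CIR {
	c := &CIR{
		Source: Source{Type: sourceType, URI: uri, Format: format, Timestamp: time.Now().UTC()},
		Data:   data,
	}
	c.UpdateSize()
	return c
}

// UpdateSize recomputes Metadata.Size. Assumption: size is the byte length
// of the JSON encoding of Data.
func (c *CIR) UpdateSize() {
	b, err := json.Marshal(c.Data)
	if err != nil {
		c.Metadata.Size = 0
		return
	}
	c.Metadata.Size = int64(len(b))
}

// DataAsMap (hypothetical accessor) returns Data as a JSON object if it is one.
func (c *CIR) DataAsMap() (map[string]any, error) {
	m, ok := c.Data.(map[string]any)
	if !ok {
		return nil, fmt.Errorf("data is %T, not a JSON object", c.Data)
	}
	return m, nil
}

// Validate checks the fields this sketch treats as required.
func (c *CIR) Validate() error {
	switch {
	case c.Source.Type == "":
		return errors.New("source type is required")
	case c.Source.URI == "":
		return errors.New("source URI is required")
	case c.Source.Format == "":
		return errors.New("source format is required")
	case c.Data == nil:
		return errors.New("data is required")
	}
	return nil
}

func main() {
	c := NewCIR("api", "https://api.example.com/users/1", "json", map[string]any{"id": 1})
	fmt.Println("valid:", c.Validate() == nil, "size:", c.Metadata.Size)

	m, err := c.DataAsMap()
	if err != nil {
		panic(err)
	}
	m["name"] = "Ada" // mutate the payload, then refresh the metadata
	c.UpdateSize()
	fmt.Println("size after update:", c.Metadata.Size)
}
```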
Querying CIR Data
Storage plugins use CIRQuery to retrieve data from backends.
pkg/models/storage.go:91
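The fields of CIRQuery are not listed on this page. Purely as an illustration of the idea, a query could combine a source-type filter, field-equality filters, and a limit. The struct fields, the trimmed-down `CIR` type, and the `Matches` and `Run` helpers below are guesses for this example, not the definitions in pkg/models/storage.go.

```go
package main

import (
	"fmt"
	"reflect"
)

// CIR is trimmed down to the fields this sketch needs.
type CIR struct {
	SourceType string
	Data       map[string]any
}

// CIRQuery here is a guess at a minimal query shape, not the real type.
type CIRQuery struct {
	SourceType string         // empty means "any source type"
	Filters    map[string]any // field name -> required value
	Limit      int            // 0 means "no limit"
}

// Matches reports whether a CIR object satisfies every condition in the query.
func (q CIRQuery) Matches(c CIR) bool {
	if q.SourceType != "" && q.SourceType != c.SourceType {
		return false
	}
	for field, want := range q.Filters {
		got, ok := c.Data[field]
		if !ok || !reflect.DeepEqual(got, want) {
			return false
		}
	}
	return true
}

// Run applies the query to an in-memory slice, honoring Limit.
func Run(q CIRQuery, items []CIR) []CIR {
	var out []CIR
	for _, c := range items {
		if q.Limit > 0 && len(out) >= q.Limit {
			break
		}
		if q.Matches(c) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	items := []CIR{
		{SourceType: "api", Data: map[string]any{"status": "active", "id": 1}},
		{SourceType: "file", Data: map[string]any{"status": "active", "id": 2}},
		{SourceType: "api", Data: map[string]any{"status": "inactive", "id": 3}},
	}
	q := CIRQuery{SourceType: "api", Filters: map[string]any{"status": "active"}}
	for _, c := range Run(q, items) {
		fmt.Println("matched id:", c.Data["id"])
	}
}
```

Note that `reflect.DeepEqual` is strict about types, so a filter value of `1` (an `int`) would not match a decoded JSON number (a `float64`); a real backend would need to normalize values first.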
Storage Operations
All storage plugins implement the StoragePlugin interface:
pkg/models/storage.go:28
Store
Store a CIR object in the backend:
Retrieve
Query and retrieve CIR objects:
Update
Update existing CIR data:
Delete
Delete CIR data matching a query:
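To make the four operations concrete, here is a small in-memory stand-in for a storage backend. The method signatures are assumptions chosen for the example (the real interface is defined in pkg/models/storage.go), and `MemoryStore`, `matches`, and the simplified `CIR` and `CIRQuery` types are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// Simplified stand-ins for the real types.
type CIR struct {
	SourceURI string
	Data      map[string]any
}

type CIRQuery struct{ Filters map[string]any }

// StoragePlugin: the four operation names come from this page; the
// signatures are assumed.
type StoragePlugin interface {
	Store(c *CIR) error
	Retrieve(q CIRQuery) ([]*CIR, error)
	Update(q CIRQuery, updates map[string]any) (int, error)
	Delete(q CIRQuery) (int, error)
}

// MemoryStore keeps CIR objects in a slice; not safe for concurrent use.
type MemoryStore struct{ items []*CIR }

var _ StoragePlugin = (*MemoryStore)(nil) // compile-time interface check

func matches(c *CIR, q CIRQuery) bool {
	for field, want := range q.Filters {
		if got, ok := c.Data[field]; !ok || !reflect.DeepEqual(got, want) {
			return false
		}
	}
	return true
}

func (m *MemoryStore) Store(c *CIR) error {
	if c == nil || c.Data == nil {
		return errors.New("cannot store a CIR without data")
	}
	m.items = append(m.items, c)
	return nil
}

func (m *MemoryStore) Retrieve(q CIRQuery) ([]*CIR, error) {
	var out []*CIR
	for _, c := range m.items {
		if matches(c, q) {
			out = append(out, c)
		}
	}
	return out, nil
}

// Update sets the given fields on every matching object and returns the count.
func (m *MemoryStore) Update(q CIRQuery, updates map[string]any) (int, error) {
	n := 0
	for _, c := range m.items {
		if matches(c, q) {
			for k, v := range updates {
				c.Data[k] = v
			}
			n++
		}
	}
	return n, nil
}

// Delete removes every matching object and returns the count.
func (m *MemoryStore) Delete(q CIRQuery) (int, error) {
	kept, removed := m.items[:0], 0
	for _, c := range m.items {
		if matches(c, q) {
			removed++
			continue
		}
		kept = append(kept, c)
	}
	m.items = kept
	return removed, nil
}

func main() {
	var store StoragePlugin = &MemoryStore{}
	for i, status := range []string{"new", "new", "done"} {
		c := &CIR{SourceURI: "file:///data/tasks.csv", Data: map[string]any{"id": i, "status": status}}
		if err := store.Store(c); err != nil {
			panic(err)
		}
	}
	isNew := CIRQuery{Filters: map[string]any{"status": "new"}}
	updated, _ := store.Update(isNew, map[string]any{"status": "done"})
	deleted, _ := store.Delete(CIRQuery{Filters: map[string]any{"id": 0}})
	left, _ := store.Retrieve(CIRQuery{})
	fmt.Println("updated:", updated, "deleted:", deleted, "remaining:", len(left))
}
```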
Best Practices
Always Set Source URI
Include a descriptive, unique URI for data lineage tracking. Use full URLs for APIs, absolute paths for files.
Preserve Original Format
Store the original format in Source.Format even after conversion. This aids in debugging and re-processing.
Include Parameters
Store ingestion parameters (HTTP headers, query filters, pagination state) for reproducibility.
Calculate Quality Metrics
Add quality metrics during ingestion to enable data quality monitoring and alerting.
Update Size
Call cir.UpdateSize() after modifying data to keep metadata accurate.
Validate Before Storage
Call cir.Validate() before passing to storage plugins to catch errors early.
Complete Example
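As an end-to-end sketch that ties these practices together, the program below ingests a JSON API response, records the source URI, original format, and request parameters, computes a completeness metric, refreshes the size, and validates the result before it would be handed to a storage plugin. The `Ingest` function, the `Parameters` and `Quality` fields, and all struct layouts are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Assumed, simplified shapes for the three CIR blocks.
type Source struct {
	Type, URI, Format string
	Timestamp         time.Time
	Parameters        map[string]any // ingestion parameters kept for reproducibility
}

type Metadata struct {
	Size        int64
	RecordCount int
	Quality     map[string]float64
}

type CIR struct {
	Source   Source
	Data     any
	Metadata Metadata
}

// UpdateSize sets Size to the byte length of the JSON-encoded payload (assumption).
func (c *CIR) UpdateSize() {
	if b, err := json.Marshal(c.Data); err == nil {
		c.Metadata.Size = int64(len(b))
	}
}

// Validate checks the fields this sketch treats as required.
func (c *CIR) Validate() error {
	if c.Source.URI == "" || c.Source.Format == "" {
		return errors.New("source URI and format are required")
	}
	if c.Data == nil {
		return errors.New("data is required")
	}
	return nil
}

// Ingest (hypothetical) turns a JSON array response into a validated CIR.
func Ingest(uri string, params map[string]any, body []byte) (*CIR, error) {
	var records []map[string]any
	if err := json.Unmarshal(body, &records); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	// Completeness: share of field values that are non-null.
	filled, total := 0, 0
	for _, rec := range records {
		for _, v := range rec {
			total++
			if v != nil {
				filled++
			}
		}
	}
	completeness := 0.0
	if total > 0 {
		completeness = float64(filled) / float64(total)
	}
	c := &CIR{
		Source: Source{Type: "api", URI: uri, Format: "json", Timestamp: time.Now().UTC(), Parameters: params},
		Data:   records,
		Metadata: Metadata{
			RecordCount: len(records),
			Quality:     map[string]float64{"completeness": completeness},
		},
	}
	c.UpdateSize()
	if err := c.Validate(); err != nil {
		return nil, err
	}
	return c, nil
}

func main() {
	body := []byte(`[{"id":1,"email":"ada@example.com"},{"id":2,"email":null}]`)
	c, err := Ingest("https://api.example.com/v1/users?page=1", map[string]any{"page": 1}, body)
	if err != nil {
		panic(err)
	}
	fmt.Println("records:", c.Metadata.RecordCount)
	fmt.Println("completeness:", c.Metadata.Quality["completeness"])
	fmt.Println("size:", c.Metadata.Size)
}
```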
Next Steps
Storage Plugins
Learn how to implement custom storage plugins.
Pipeline Development
Create pipelines that produce and consume CIR data.
API Reference
Explore storage API endpoints.