
Overview

The Covariate class represents metadata associated with subjects in the knowledge graph. Although the class is named “Covariate” in the codebase, it is most often used to represent claims: specific factual statements or assertions about entities, extracted from the text. Covariates are flexible metadata containers, and a single subject (e.g., an entity) may have covariates of several types. Covariate inherits from the Identified base class, which provides the id and short_id fields.

Schema

Core fields

id
string
required
Unique identifier for the covariate/claim.
short_id
string | null
Human-readable ID used to refer to this covariate in prompts or texts displayed to users.
subject_id
string
required
The ID of the subject this covariate is associated with. Typically an entity ID.
subject_type
string
default:"entity"
The type of the subject. Defaults to “entity” but can represent other subject types.
covariate_type
string
default:"claim"
The type of covariate. Defaults to “claim” for factual assertions, but can represent other metadata types.

Relationships

text_unit_ids
string[]
List of text unit IDs in which the covariate information appears. Links the claim back to its source text chunks.
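Because text_unit_ids links each covariate to its source chunks, the link can also be followed in the other direction. The sketch below builds a reverse index from text unit ID to covariate IDs. It works on plain dictionaries shaped like the schema rather than on library objects; the records and the index_by_text_unit helper are hypothetical, and it assumes text_unit_ids may be absent or null.

```python
from collections import defaultdict

# Hypothetical covariate records as plain dictionaries (not library objects).
covariates = [
    {"id": "c1", "subject_id": "e1", "text_unit_ids": ["t1", "t2"]},
    {"id": "c2", "subject_id": "e1", "text_unit_ids": ["t2"]},
    {"id": "c3", "subject_id": "e2", "text_unit_ids": None},
]


def index_by_text_unit(records):
    """Map each text unit ID to the IDs of the covariates that cite it."""
    index = defaultdict(list)
    for record in records:
        # Assumption: text_unit_ids may be missing or null, so use an empty list.
        for text_unit_id in record.get("text_unit_ids") or []:
            index[text_unit_id].append(record["id"])
    return dict(index)


by_text_unit = index_by_text_unit(covariates)
print(by_text_unit)
```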

Metadata

attributes
object
Additional attributes containing the actual claim content and metadata. For claims, this typically includes fields like:
  • description: The claim text
  • status: Verification status
  • start_date / end_date: Temporal validity
  • source_text: Original text snippet
  • Any custom metadata fields

Example

{
  "id": "cov1234567-89ab-cdef-0123-456789abcdef",
  "short_id": "0",
  "subject_id": "e1234567-89ab-cdef-0123-456789abcdef",
  "subject_type": "entity",
  "covariate_type": "claim",
  "text_unit_ids": ["t1", "t2"],
  "attributes": {
    "description": "Microsoft was founded in 1975",
    "status": "verified",
    "start_date": "1975-04-04",
    "source_text": "Bill Gates and Paul Allen founded Microsoft on April 4, 1975",
    "confidence": 0.95
  }
}

Creating from dictionary

The Covariate class provides a from_dict() class method to create instances from dictionary data:
claim = Covariate.from_dict({
    "id": "cov1234567-89ab-cdef-0123-456789abcdef",
    "subject_id": "e1234567-89ab-cdef-0123-456789abcdef",
    "covariate_type": "claim",
    "text_unit_ids": ["t1", "t2"],
    "attributes": {
        "description": "Microsoft was founded in 1975",
        "status": "verified"
    }
})
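To make the mapping from dictionary keys to fields concrete, here is a minimal stand-in that mirrors the schema documented above. CovariateSketch is a hypothetical class written for illustration, not the library's Covariate, and the real from_dict() may handle missing keys differently. The sketch assumes that id and subject_id are required and that the other keys fall back to the documented defaults.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CovariateSketch:
    """Illustrative stand-in for the documented schema; not the library class."""

    id: str
    subject_id: str
    short_id: Optional[str] = None
    subject_type: str = "entity"
    covariate_type: str = "claim"
    text_unit_ids: Optional[list] = None
    attributes: Optional[dict] = None

    @classmethod
    def from_dict(cls, data: dict) -> "CovariateSketch":
        # Required keys raise KeyError when absent; optional keys use defaults.
        return cls(
            id=data["id"],
            subject_id=data["subject_id"],
            short_id=data.get("short_id"),
            subject_type=data.get("subject_type", "entity"),
            covariate_type=data.get("covariate_type", "claim"),
            text_unit_ids=data.get("text_unit_ids"),
            attributes=data.get("attributes"),
        )


claim = CovariateSketch.from_dict({
    "id": "cov1234567-89ab-cdef-0123-456789abcdef",
    "subject_id": "e1234567-89ab-cdef-0123-456789abcdef",
    "text_unit_ids": ["t1", "t2"],
    "attributes": {"description": "Microsoft was founded in 1975"},
})
print(claim.subject_type, claim.covariate_type)
```

In this sketch, omitting subject_type and covariate_type yields "entity" and "claim", matching the defaults listed in the schema.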

Common claim attributes

When using covariates as claims, the attributes dictionary typically contains:
attributes.description
string
The claim text itself: a factual statement about the subject.
attributes.status
string
Verification status of the claim (e.g., “verified”, “disputed”, “unverified”).
attributes.start_date
string
When the claim became true or valid.
attributes.end_date
string
When the claim ceased to be true or valid (if applicable).
attributes.source_text
string
The original text snippet from which the claim was extracted.
attributes.confidence
float
Confidence score for the claim extraction (0.0 to 1.0).
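Since attributes is a free-form object, it can help to check claim attributes before relying on them. The following is a rough sketch of such a check, not part of the library. The check_claim_attributes helper is hypothetical, the set of statuses is taken from the examples above and may not be exhaustive, and dates are assumed to be ISO 8601 strings (YYYY-MM-DD) as in the example record.

```python
from datetime import date

# Assumption: these are the example statuses from the docs, not a fixed enum.
ALLOWED_STATUSES = {"verified", "disputed", "unverified"}


def check_claim_attributes(attributes):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not attributes.get("description"):
        problems.append("description is missing or empty")

    status = attributes.get("status")
    if status is not None and status not in ALLOWED_STATUSES:
        problems.append(f"unexpected status: {status!r}")

    confidence = attributes.get("confidence")
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence out of range: {confidence}")

    for key in ("start_date", "end_date"):
        value = attributes.get(key)
        if value is not None:
            try:
                date.fromisoformat(value)  # assumes YYYY-MM-DD strings
            except (TypeError, ValueError):
                problems.append(f"{key} is not an ISO date: {value!r}")
    return problems


good = {"description": "Microsoft was founded in 1975",
        "status": "verified", "start_date": "1975-04-04", "confidence": 0.95}
bad = {"description": "", "status": "maybe", "confidence": 1.5}
print(check_claim_attributes(good))
print(check_claim_attributes(bad))
```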

Use cases

  • Fact extraction: Store specific facts and assertions about entities
  • Temporal tracking: Track time-sensitive information with start/end dates
  • Source attribution: Link claims back to original text via text_unit_ids
  • Metadata enrichment: Add structured metadata to entities beyond basic descriptions
  • Claim verification: Store and track verification status of extracted facts
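As an illustration of temporal tracking, the sketch below selects the claims that hold on a given day. It is a minimal example under stated assumptions: is_valid_on is a hypothetical helper, dates are ISO 8601 strings, a missing start_date or end_date is treated as an open bound, and the end date is inclusive. The second record is invented for the example.

```python
from datetime import date


def is_valid_on(attributes, day):
    """True if `day` falls inside the claim's start_date/end_date window."""
    start = attributes.get("start_date")
    end = attributes.get("end_date")
    # Assumption: a missing bound means the window is open on that side.
    if start is not None and date.fromisoformat(start) > day:
        return False
    if end is not None and date.fromisoformat(end) < day:
        return False
    return True


claims = [
    {"description": "Microsoft was founded in 1975",
     "start_date": "1975-04-04"},
    # Hypothetical claim with a closed validity window.
    {"description": "Example Corp was headquartered in Springfield",
     "start_date": "2001-01-01", "end_date": "2010-12-31"},
]

current = [c["description"] for c in claims if is_valid_on(c, date(2020, 1, 1))]
print(current)
```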

Multiple covariate types

While “claim” is the default and most common covariate type, the system supports multiple types of covariates per subject. This allows you to organize different kinds of metadata separately:
[
  {
    "covariate_type": "claim",
    "attributes": {"description": "Founded in 1975"}
  },
  {
    "covariate_type": "metric",
    "attributes": {"revenue": "198B", "year": "2023"}
  }
]
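One way to keep different kinds of metadata separate when reading them back is to group covariates by subject and then by type. This is a sketch over plain dictionaries, not a library API: group_by_subject_and_type and the subject IDs are hypothetical, and a record without covariate_type is assumed to be a "claim", following the documented default.

```python
from collections import defaultdict

# Hypothetical records; "e1" and "e2" are made-up subject IDs.
covariates = [
    {"subject_id": "e1", "covariate_type": "claim",
     "attributes": {"description": "Founded in 1975"}},
    {"subject_id": "e1", "covariate_type": "metric",
     "attributes": {"revenue": "198B", "year": "2023"}},
    {"subject_id": "e2",
     "attributes": {"description": "Hypothetical claim with no explicit type"}},
]


def group_by_subject_and_type(records):
    """Return {subject_id: {covariate_type: [attributes, ...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        # Assumption: a missing covariate_type falls back to "claim".
        kind = record.get("covariate_type", "claim")
        grouped[record["subject_id"]][kind].append(record.get("attributes") or {})
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {subject: dict(kinds) for subject, kinds in grouped.items()}


grouped = group_by_subject_and_type(covariates)
print(sorted(grouped["e1"]))
```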
