
Understanding version control in Infrahub

Infrahub implements immutable history—a foundational principle where data cannot be deleted or modified in place. Every change creates a new version while preserving all previous states. This approach mirrors version control systems like Git, providing infrastructure teams with complete traceability, risk-free rollbacks, and powerful temporal queries.

Why immutability matters for infrastructure

Infrastructure changes carry significant risk. A misconfigured router can take down an entire data center. A wrong IP allocation can cause routing loops. Understanding what changed, when, and why becomes critical when troubleshooting incidents or auditing compliance. Immutable history provides several essential capabilities:
  • Complete audit trail: Every modification is permanently recorded with who made the change, what changed, and when it occurred
  • Time travel queries: Access the exact state of your infrastructure at any point in history to understand past configurations
  • Risk-free rollbacks: Return to any previous state without data loss when issues are detected
  • Compliance and forensics: Meet regulatory requirements with immutable change history for audits
  • Parallel workflows: Enable multiple teams to work on infrastructure changes simultaneously using branches
  • Change verification: Review proposed changes before committing them to production environments

Core concepts

Immutability at the attribute level

Unlike systems that capture entire object snapshots, Infrahub’s immutable history operates at the attribute level. When you change a device’s hostname from “router-1” to “router-2”, Infrahub doesn’t store two complete device objects. Instead, it stores:
  • The original hostname attribute value with its timestamp
  • The new hostname attribute value with its timestamp
  • All other attributes remain unchanged, referencing their original values
This granular approach offers several advantages:
  • Storage efficiency: Only changed values consume storage, not entire object copies. If you change the hostname of a device with 50 attributes, Infrahub stores only one new attribute value.
  • Change clarity: You can see exactly which attributes changed in each commit without comparing full objects. Diffs show precise modifications rather than entire object replacements.
  • Performance optimization: Queries retrieve only the attributes needed at the requested time, avoiding the overhead of loading complete historical snapshots.
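To make the storage argument concrete, here is a minimal sketch in plain Python. It is not Infrahub's implementation: the `AttributeHistory` class and its method names are hypothetical, and it only illustrates that renaming a device appends one new value while the other attributes keep their single original entry.

```python
from collections import defaultdict
from datetime import datetime, timezone


class AttributeHistory:
    """Hypothetical append-only store: one list of (timestamp, value) per attribute."""

    def __init__(self):
        self._versions = defaultdict(list)

    def set(self, attribute, value, at):
        # Never overwrite: every change appends a new version.
        self._versions[attribute].append((at, value))

    def version_count(self, attribute):
        return len(self._versions[attribute])

    def total_versions(self):
        return sum(len(versions) for versions in self._versions.values())


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2024, 2, 1, tzinfo=timezone.utc)

device = AttributeHistory()
device.set("hostname", "router-1", t0)
for i in range(49):  # the 49 other attributes, each written once
    device.set(f"attr_{i}", "unchanged", t0)

device.set("hostname", "router-2", t1)  # rename the device

# 50 original values plus 1 new hostname value, not 2 x 50 values
print(device.total_versions())
```

A full-snapshot model would have stored 100 values for the same two states; here only the changed attribute gains a version.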

Timestamps and commits

Every change in Infrahub is organized into commits, each with an immutable timestamp. These commits capture:
  • The specific attributes that changed
  • The new values for those attributes
  • Who made the change (account information)
  • When the change occurred (precise timestamp)
  • Which branch the change occurred in
Timestamps in Infrahub use microsecond precision and are immutable: they can never be changed or deleted. This ensures a complete and coherent historical record.
The commit structure differs from Git’s commit model. Git commits represent snapshots of files at a point in time. Infrahub commits represent attribute-level changes in a graph database. However, both share the core principle: history is immutable and complete.
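As an illustration of what such a record carries, the sketch below models one change as a frozen Python dataclass. The `ChangeRecord` name and its field names are assumptions made for this example and do not reflect Infrahub's internal schema; the point is only that the record holds the listed fields and rejects modification after creation.

```python
from dataclasses import FrozenInstanceError, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical shape of one recorded change; field names are illustrative."""

    attribute: str        # which attribute changed
    new_value: str        # the new value for that attribute
    account: str          # who made the change
    branch: str           # which branch the change occurred in
    changed_at: datetime  # when it occurred, with microsecond precision


record = ChangeRecord(
    attribute="hostname",
    new_value="router-2",
    account="alice",
    branch="main",
    changed_at=datetime(2024, 2, 1, 0, 0, 0, 123456, tzinfo=timezone.utc),
)

try:
    record.changed_at = datetime.now(timezone.utc)  # attempt to rewrite history
    mutated = True
except FrozenInstanceError:
    mutated = False  # frozen dataclasses reject attribute assignment

print(mutated)
```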

Temporal queries

Infrahub’s temporal query system allows you to retrieve data from any point in time. When you query the database, you’re accessing a specific moment in the database’s history. By default, queries return the latest state, but you can specify any timestamp to see exactly how your infrastructure looked at that moment. This capability extends across all interfaces:
Web UI: Time navigation controls in the interface allow you to “rewind” to previous states. You can view how a device configuration looked last week or compare network topology changes over time.
GraphQL API: Timestamp parameters for historical queries enable programmatic access to past states:
query {
  InfraDevice(at: "2024-03-01T10:00:00Z") {
    edges {
      node {
        hostname {
          value
        }
      }
    }
  }
}
REST API: Temporal query support for retrieving historical data through standard REST endpoints with timestamp parameters.
Python SDK: Time-aware methods to access past states:
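import asyncio

from infrahub_sdk import InfrahubClient


async def main() -> None:
    client = InfrahubClient()
    # all() returns a list of nodes; get() expects exactly one match
    devices = await client.all(
        kind="InfraDevice",
        at="2024-03-01T10:00:00Z",
    )


asyncio.run(main())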

Architecture and implementation

Graph database storage model

Infrahub stores attribute history using a graph structure in Neo4j. Each attribute value becomes a node in the graph with a timestamp property. When you query for data at a specific time, Infrahub traverses the graph to find the most recent attribute value before or at that timestamp. For example, a device hostname changing three times creates this structure:
Device Node
  ├─ hostname Attribute
  │   ├─ "router-1" (created_at: 2024-01-01T00:00:00Z)
  │   ├─ "router-2" (created_at: 2024-02-01T00:00:00Z)
  │   └─ "router-3" (created_at: 2024-03-01T00:00:00Z)
Querying at 2024-02-15T00:00:00Z returns “router-2” because it’s the most recent value created at or before the requested time.
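The lookup rule can be sketched in a few lines of Python. This is a simplified stand-in that uses a sorted list and binary search rather than a graph traversal; `value_at` and the `utc` helper are hypothetical names introduced for this example.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# (created_at, value) pairs, kept sorted by timestamp
hostname_versions = [
    (utc(2024, 1, 1), "router-1"),
    (utc(2024, 2, 1), "router-2"),
    (utc(2024, 3, 1), "router-3"),
]


def value_at(versions, at):
    """Return the most recent value created at or before `at`, or None."""
    timestamps = [created_at for created_at, _ in versions]
    index = bisect_right(timestamps, at)  # count of versions with created_at <= at
    if index == 0:
        return None  # the attribute did not exist yet
    return versions[index - 1][1]


print(value_at(hostname_versions, utc(2024, 2, 15)))  # router-2
```

Using `bisect_right` makes the boundary inclusive: a query at exactly 2024-02-01T00:00:00Z also returns "router-2".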

Copy-on-write semantics

Branches in Infrahub use copy-on-write semantics. When you create a branch, Infrahub doesn’t duplicate the entire database. Instead, it maintains:
  • A pointer to the branch base (branched_from timestamp)
  • Only the delta of changes made within the branch
  • References to unchanged data in the parent branch
This approach means creating a branch has minimal overhead, and storage grows only with actual changes. If a branch contains 1000 objects and you modify 5 of them, Infrahub stores only the changes to those 5 objects. The other 995 are read from the parent branch.
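The overlay idea can be illustrated with Python's standard `collections.ChainMap`, which layers a writable mapping over a read-only fallback. This is an analogy, not Infrahub's mechanism: the object IDs and values are made up, and the sketch ignores the branched_from timestamp by reading the parent's current state.

```python
from collections import ChainMap

# Parent branch: 1000 objects (hypothetical IDs and values)
main = {f"device-{i}": f"router-{i}" for i in range(1000)}

# Creating a branch allocates only an empty delta layered over the parent.
feature_delta = {}
feature = ChainMap(feature_delta, main)

# Writes land in the first mapping (the delta); main is never touched.
for i in range(5):
    feature[f"device-{i}"] = f"core-{i}"

print(len(feature_delta))     # 5
print(feature["device-0"])    # core-0 (from the delta)
print(feature["device-999"])  # router-999 (falls through to main)
print(main["device-0"])       # router-0 (parent unchanged)
```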

Branch awareness and immutability

Schema elements support one of three branch modes: branch-aware (changes are local to a branch until merged), branch-agnostic (changes are global across all branches), or branch-local (changes stay in the branch and are never merged). This distinction affects how immutable history works:
  • Branch-aware objects: Each branch maintains its own history. Changes in a feature branch don’t affect the main branch until merged. Historical queries respect branch boundaries: querying a branch’s history shows only that branch’s view of the data.
  • Branch-agnostic objects: History is global across all branches. Changes are immediately visible everywhere. This mode is used for system-level configurations that should remain consistent.
  • Branch-local objects: History stays in the branch and is never merged. These objects support temporary data and branch-specific metadata.

Use cases and workflows

Historical analysis

View how your infrastructure looked at a specific point in time to troubleshoot issues or understand past decisions. For example:
  • Incident investigation: An outage occurred at 2:37 AM. Query the database state at 2:30 AM to see the configuration before the incident, then compare it with the state at 2:45 AM to identify what changed.
  • Capacity planning: Compare infrastructure growth over time by querying device counts and interface utilization at monthly intervals.
  • Change correlation: Determine if a recent schema change correlates with data quality issues by examining the schema and data state before and after the change.
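Once two historical states have been retrieved, the comparison step of an incident investigation is a plain dictionary diff. The sketch below assumes each state has already been flattened into a `{object_id: {attribute: value}}` mapping; `diff_snapshots` and the sample data are hypothetical.

```python
def diff_snapshots(before, after):
    """Compare two {object_id: {attribute: value}} snapshots.

    Returns (object_id, attribute, old_value, new_value) tuples; a missing
    object or attribute is reported with None on the corresponding side.
    """
    changes = []
    for object_id in sorted(before.keys() | after.keys()):
        old = before.get(object_id, {})
        new = after.get(object_id, {})
        for attribute in sorted(old.keys() | new.keys()):
            if old.get(attribute) != new.get(attribute):
                changes.append(
                    (object_id, attribute, old.get(attribute), new.get(attribute))
                )
    return changes


# Hypothetical states from before (2:30 AM) and after (2:45 AM) the outage
state_0230 = {"router-1": {"mtu": 9000, "role": "core"}}
state_0245 = {"router-1": {"mtu": 1500, "role": "core"}}

for change in diff_snapshots(state_0230, state_0245):
    print(change)  # ('router-1', 'mtu', 9000, 1500)
```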

Compliance auditing

Extract all changes performed within a specific time frame for regulatory compliance. Many industries require audit trails showing who changed what and when:
  • SOC 2 compliance: Generate reports showing all infrastructure changes in the audit period with timestamps and account information.
  • Change management validation: Prove that changes followed approval workflows by correlating proposed change records with committed changes.
  • Security investigations: Trace unauthorized or unexpected changes to their source, including the account that made the change and the exact time it occurred.
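Extracting an audit period reduces to a time-range filter over change records. The following sketch assumes the records have already been exported as `(changed_at, account, object_id, attribute)` tuples; that layout, the sample entries, and the `changes_between` helper are illustrative assumptions rather than an Infrahub export format.

```python
from datetime import datetime, timezone

# Hypothetical change log entries: (changed_at, account, object_id, attribute)
change_log = [
    (datetime(2024, 1, 5, tzinfo=timezone.utc), "alice", "router-1", "hostname"),
    (datetime(2024, 2, 10, tzinfo=timezone.utc), "bob", "router-2", "mtu"),
    (datetime(2024, 3, 20, tzinfo=timezone.utc), "alice", "switch-7", "role"),
]


def changes_between(log, start, end):
    """Return entries with start <= changed_at < end, oldest first."""
    return sorted(entry for entry in log if start <= entry[0] < end)


audit_period = changes_between(
    change_log,
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end=datetime(2024, 3, 1, tzinfo=timezone.utc),
)

# Accounts that made changes during the audit period
accounts = sorted({account for _, account, _, _ in audit_period})
print(len(audit_period), accounts)
```

The half-open interval (`start` inclusive, `end` exclusive) lets consecutive audit periods be queried without counting a boundary change twice.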

Change impact assessment

Compare infrastructure states before and after significant changes to understand impact:
  • Pre/post-migration analysis: Query the database state before a data center migration, perform the migration, then compare the new state to verify all objects migrated correctly.
  • Rollback verification: After rolling back a failed change, compare the current state with the pre-change state to confirm complete rollback.
  • A/B testing: Create two branches with different configurations, deploy them to test environments, and compare metrics to determine which performs better.

Knowledge preservation

Understand why specific configuration decisions were made, even as team members change. Commit messages and change history provide context:
  • Onboarding: New team members can explore the history of infrastructure decisions, understanding why certain choices were made.
  • Architecture evolution: Track how infrastructure architecture evolved over time, seeing the reasoning behind major changes.
  • Pattern recognition: Identify recurring issues by examining historical changes and their outcomes.

Implementation examples

Querying historical data in GraphQL

Retrieve device information as it existed on a specific date:
query {
  InfraDevice(
    at: "2024-01-15T00:00:00Z",
    hostname__value: "router-1"
  ) {
    edges {
      node {
        id
        hostname {
          value
          updated_at
        }
        location {
          node {
            name {
              value
            }
          }
        }
      }
    }
  }
}
This query returns the device state at January 15, 2024, including attribute values and relationships as they existed at that time.

Comparing states across time

Use the diff functionality to compare infrastructure states:
query {
  InfraDiffTree(
    branch: "main",
    time_from: "2024-01-01T00:00:00Z",
    time_to: "2024-01-31T00:00:00Z"
  ) {
    nodes {
      id
      action
      kind
      display_label
      attributes {
        name
        action
        value_old
        value_new
      }
    }
  }
}
This query shows all changes between two timestamps, including added, updated, and deleted objects with their specific attribute changes.

Accessing historical data in Python SDK

import asyncio
from datetime import datetime, timedelta, timezone

from infrahub_sdk import InfrahubClient


async def main() -> None:
    client = InfrahubClient()

    # Get current state; all() returns a list, get() expects a single match
    current_devices = await client.all(kind="InfraDevice")

    # Get state from last week, using a timezone-aware UTC timestamp
    week_ago = (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
    historical_devices = await client.all(kind="InfraDevice", at=week_ago)

    # Compare device counts
    print(f"Current devices: {len(current_devices)}")
    print(f"Devices last week: {len(historical_devices)}")
    print(f"Change: {len(current_devices) - len(historical_devices)}")


asyncio.run(main())

Design trade-offs

Storage growth

Immutability means data accumulates over time. Every attribute change adds storage. For high-change environments, this can be significant. Infrahub mitigates storage growth through:
  • Attribute-level granularity: Only changed attributes consume storage, not entire objects
  • Efficient graph storage: Neo4j’s compact storage format minimizes overhead
  • Compression: Historical data can be compressed since it’s rarely modified
Future Infrahub versions may support archiving or pruning very old history while maintaining audit compliance.

Query performance

Historical queries can be more expensive than current-state queries because they require traversing temporal relationships. Infrahub optimizes this through:
  • Indexing: Timestamp indexes enable efficient temporal lookups
  • Caching: Frequently accessed historical states are cached
  • Default behavior: Queries default to current state unless explicitly requesting historical data

Complexity vs. capability

Immutability adds system complexity: the database must track versions, timestamps, and branches. This complexity is worthwhile for infrastructure management, where understanding change history is critical. The ability to audit, roll back, and analyze changes outweighs the implementation complexity.

Relationship to branching

Version control and branching are complementary concepts:
  • Version control provides the time dimension: the ability to see how data evolved over time within a branch.
  • Branching provides the isolation dimension: the ability to work on changes in parallel without affecting production.
Together, they form a two-dimensional view of your infrastructure:
  • Horizontal axis: branches (isolation)
  • Vertical axis: time (history)
You can query any point in this two-dimensional space: “Show me how this device looked in the feature-network branch last Tuesday.”
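A query against this two-dimensional space takes a (branch, time) coordinate. The sketch below shows one way such a lookup could resolve, under the simplifying assumption that a branch sees its own changes first and otherwise falls back to main as of its fork point. The data, the `branched_from` mapping, and the function names are hypothetical, and Infrahub's actual resolution rules may differ.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical history: branch -> sorted list of (changed_at, hostname)
history = {
    "main": [(utc(2024, 1, 1), "router-1"), (utc(2024, 3, 1), "router-3")],
    "feature-network": [(utc(2024, 2, 10), "edge-1")],
}
# Hypothetical branch metadata: when each branch forked from main
branched_from = {"feature-network": utc(2024, 2, 1)}


def latest(versions, at):
    """Most recent value at or before `at`, or None."""
    candidates = [value for changed_at, value in versions if changed_at <= at]
    return candidates[-1] if candidates else None


def hostname_at(branch, at):
    """Resolve a value at a (branch, time) coordinate."""
    if branch != "main":
        own = latest(history[branch], at)
        if own is not None:
            return own
        # No change in the branch yet: read main as of the fork point
        at = min(at, branched_from[branch])
    return latest(history["main"], at)


print(hostname_at("feature-network", utc(2024, 2, 15)))  # edge-1
print(hostname_at("main", utc(2024, 2, 15)))             # router-1
```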
