BR-ACC uses a property graph data model built on Neo4j, designed to capture complex relationships between Brazilian public entities, contracts, politicians, and companies.

Graph Model Architecture

The BR-ACC graph consists of:
  • 58+ entity types (nodes) representing people, companies, contracts, sanctions, etc.
  • 25+ relationship types (edges) connecting entities based on legal, financial, and social connections
  • Property attributes storing metadata, timestamps, identifiers, and values
  • Constraints and indexes ensuring data integrity and query performance

Core Design Principles

BR-ACC uses multiple identification strategies:
  • Primary identifiers: CPF for persons, CNPJ for companies
  • Partial matching: CPF middle 6 digits, partial document numbers
  • Fuzzy matching: Name similarity with geographic constraints
  • SAME_AS relationships: Linking confirmed duplicate entities
  • POSSIBLE_SAME_AS: Probabilistic entity matches requiring review
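
The partial-matching strategy above can be illustrated with a minimal Python sketch. It assumes masked CPFs expose only digits 4 through 9 (for example `***.456.789-**`) and that a candidate match requires both the visible digits and an accent-insensitive name to agree. The function names are hypothetical and are not part of BR-ACC; whether such a match is stored as SAME_AS or POSSIBLE_SAME_AS is a policy decision this sketch does not cover.

```python
import re
import unicodedata


def cpf_middle6(cpf: str) -> str:
    """Return digits 4-9 of a full CPF such as '123.456.789-09'."""
    digits = re.sub(r"\D", "", cpf)
    if len(digits) != 11:
        raise ValueError("a full CPF has 11 digits")
    return digits[3:9]


def masked_middle6(masked: str) -> str:
    """Return the six visible digits of a masked CPF (assumed format '***.456.789-**')."""
    digits = re.sub(r"\D", "", masked)
    if len(digits) != 6:
        raise ValueError("expected exactly six visible digits")
    return digits


def normalize_name(name: str) -> str:
    """Uppercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.upper().split())


def is_candidate_match(full_cpf: str, full_name: str,
                       masked_cpf: str, partner_name: str) -> bool:
    """True when the visible CPF digits and the normalized names both agree."""
    same_digits = cpf_middle6(full_cpf) == masked_middle6(masked_cpf)
    same_name = normalize_name(full_name) == normalize_name(partner_name)
    return same_digits and same_name
```

Six digits alone are not unique, which is why the sketch also requires the name to agree; a real pipeline would likely add the geographic constraints mentioned above.
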
Many relationships and properties are time-bound:
  • Contract dates, sanction start/end dates
  • Snapshot-based company ownership (SOCIO_DE_SNAPSHOT with snapshot_date)
  • Election years, amendment dates
  • Source document retrieval timestamps
This enables temporal queries like “Who owned this company in 2022?” or “Was this person sanctioned when they won this contract?”
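
As a rough illustration of the "Who owned this company in 2022?" question, the sketch below treats SOCIO_DE_SNAPSHOT edges as flat `(partner, company, snapshot_date)` tuples and answers with the partners in the latest snapshot dated on or before the requested day. The tuple layout, the sample data, and the "latest snapshot wins" rule are assumptions made for the example, not a description of how BR-ACC resolves ownership.

```python
from datetime import date

# Hypothetical, flattened SOCIO_DE_SNAPSHOT edges: (partner, company, snapshot_date)
SNAPSHOTS = [
    ("Ana", "ACME", date(2021, 6, 30)),
    ("Bruno", "ACME", date(2021, 6, 30)),
    ("Ana", "ACME", date(2022, 6, 30)),
    ("Carla", "ACME", date(2023, 6, 30)),
]


def partners_as_of(snapshots, company, as_of):
    """Partners in the latest snapshot of `company` dated on or before `as_of`."""
    dates = [d for _, c, d in snapshots if c == company and d <= as_of]
    if not dates:
        return set()  # no snapshot exists yet for that day
    latest = max(dates)
    return {p for p, c, d in snapshots if c == company and d == latest}
```

With this data, asking for the end of 2022 returns only the partner listed in the 2022 snapshot, because the 2021 snapshot has been superseded and the 2023 one is not yet in effect.
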
Entities and relationships track their data sources:
  • SourceDocument nodes link to original data files
  • IngestionRun tracks ETL pipeline executions
  • Properties include source database IDs (e.g., tse_candidato_id, pncp_contract_id)
  • Multiple sources may contribute to the same entity
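
One way to picture several sources contributing to a single entity is the in-memory sketch below. The `Entity` class, the `merge_record` helper, and the "first source wins" conflict rule are all hypothetical; the source and run names are placeholders, and only `tse_candidato_id` is a property name taken from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    key: str                                         # e.g. a CPF or CNPJ
    properties: dict = field(default_factory=dict)
    sources: list = field(default_factory=list)      # provenance trail


def merge_record(entity, source_name, run_id, record):
    """Copy properties onto the entity and log which source supplied them."""
    for name, value in record.items():
        # Assumption: the first source to set a property wins.
        entity.properties.setdefault(name, value)
    entity.sources.append(
        {"source": source_name, "ingestion_run": run_id, "fields": sorted(record)}
    )
```

Keeping the provenance trail separate from the merged properties means a later conflict (two sources disagreeing on a name, say) can still be traced back to the run that produced each value.
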
The model includes privacy controls:
  • is_pep (Politically Exposed Person) flag for transparency requirements
  • CPF masking in public APIs
  • exposure_tier metadata for access control
  • User and Investigation nodes for private workspace management
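
CPF masking can be sketched in a few lines of Python. The output format (`***.456.789-**`, hiding the first three and last two digits) is an assumption for illustration; the page does not specify which digits the public APIs actually hide, and `mask_cpf` is a hypothetical name.

```python
import re


def mask_cpf(cpf: str) -> str:
    """Hide the first three and last two digits of a CPF (assumed mask format)."""
    digits = re.sub(r"\D", "", cpf)
    if len(digits) != 11:
        raise ValueError("a full CPF has 11 digits")
    return f"***.{digits[3:6]}.{digits[6:9]}-**"
```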

Graph Topology

The BR-ACC graph forms a heterogeneous network made up of several connected subgraphs.

Hub Entities

Certain entity types act as hubs connecting multiple subgraphs:
  • Company: Connects to contracts, partners, sanctions, donations, embargoes
  • Person: Connects to companies, elections, amendments, public offices, family
  • Contract: Connects companies to government entities and amendments

Query Patterns

The graph model supports several investigation patterns:

Network Analysis

Find all companies connected to a politician through partnerships, family, or donations
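
A minimal sketch of this pattern, written as a breadth-first search over an in-memory edge list rather than a query against the database. The relationship labels (`FAMILY`, `PARTNER`, `DONATION`), the `person:`/`company:` node-id prefixes, and the choice to ignore edge direction are all assumptions made for the example.

```python
from collections import deque

# Hypothetical edges: (source, relationship, target)
EDGES = [
    ("person:politician", "FAMILY", "person:cousin"),
    ("person:cousin", "PARTNER", "company:A"),
    ("company:B", "DONATION", "person:politician"),
    ("person:stranger", "PARTNER", "company:C"),
]


def connected_companies(edges, start, max_hops=2):
    """Companies reachable from `start` within `max_hops`, ignoring direction."""
    neighbors = {}
    for src, _rel, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    seen = {start}
    queue = deque([(start, 0)])
    found = set()
    while queue:
        node, hops = queue.popleft()
        if node.startswith("company:") and node != start:
            found.add(node)
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return found
```

The hop limit matters in practice: hub entities such as large companies connect to so many nodes that an unbounded traversal quickly reaches most of the graph.
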

Temporal Queries

Identify contracts won after a company was sanctioned or embargoed
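
The check below sketches this pattern on plain tuples. It reads "after a company was sanctioned" as "signed while a sanction was in force", with a missing end date meaning the sanction is still open; both that reading and the tuple layouts are assumptions for illustration.

```python
from datetime import date


def active_on(start, end, day):
    """True if `day` falls inside [start, end]; `end=None` means still open."""
    return start <= day and (end is None or day <= end)


def contracts_during_sanction(contracts, sanctions):
    """Return ids of contracts signed while the company had an active sanction.

    contracts: iterable of (company, contract_id, signed_date)
    sanctions: iterable of (company, start_date, end_date_or_None)
    """
    flagged = []
    for company, contract_id, signed in contracts:
        for s_company, start, end in sanctions:
            if s_company == company and active_on(start, end, signed):
                flagged.append(contract_id)
                break  # one active sanction is enough to flag the contract
    return flagged
```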

Aggregation

Calculate total contract values by company, region, or contracting organization
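
A small sketch of the aggregation, assuming contracts are available as dictionaries with hypothetical `company`, `region`, and `value` keys. `Decimal` is used instead of `float` so that currency totals are not affected by binary rounding.

```python
from collections import defaultdict
from decimal import Decimal


def total_by(contracts, key):
    """Sum contract values grouped by the given key (e.g. 'company' or 'region')."""
    totals = defaultdict(Decimal)  # Decimal() starts each group at zero
    for contract in contracts:
        totals[contract[key]] += contract["value"]
    return dict(totals)
```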

Pattern Detection

Detect self-dealing (amendments benefiting family companies) or contract concentration
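
The self-dealing case can be sketched as a join across three hypothetical edge lists: amendments (author, id, beneficiary company), family ties (unordered pairs of people), and partnerships (person, company). An amendment is flagged when the beneficiary company has a partner who is the author or a relative of the author. The data shapes and the function name are assumptions for the example.

```python
def self_dealing(amendments, family, partners):
    """Flag amendments whose beneficiary company is owned by the author or a relative.

    amendments: iterable of (author, amendment_id, company)
    family:     set of frozenset({person_a, person_b}) pairs
    partners:   iterable of (person, company)
    """
    owners = {}
    for person, company in partners:
        owners.setdefault(company, set()).add(person)

    hits = []
    for author, amendment_id, company in amendments:
        for owner in sorted(owners.get(company, ())):  # sorted for stable output
            if owner == author or frozenset((author, owner)) in family:
                hits.append((amendment_id, company, owner))
    return hits
```

Storing family ties as `frozenset` pairs makes the relationship symmetric, so the lookup works regardless of which person was recorded first.
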

Entity Lifecycle

Data Completeness

Not all entities have complete information:
  • Some persons lack a CPF and are identified only by name plus geographic hints
  • Partners may have only partial document numbers
  • Historical data may be incomplete or inconsistent
  • Offshore entities use non-Brazilian identifiers
The model accommodates partial data through optional properties and probabilistic matching.

Next Steps

Entity Types

Explore all 58+ entity types and their properties

Relationships

Learn about relationship types and their semantics

Schema Reference

View the complete Neo4j schema definition

Cypher Basics

Start querying the graph database
