Skip to main content

Overview

BR-ACC’s Neo4j database is optimized with indexes and constraints to ensure fast query performance. This guide shows how to leverage these optimizations.

Database Schema

BR-ACC automatically creates indexes and constraints during initialization.
All schema definitions are in schema_init.cypher and applied on database startup

Uniqueness Constraints

Constraints ensure data integrity and automatically create indexes:
// Person uniqueness
CREATE CONSTRAINT person_cpf_unique IF NOT EXISTS
  FOR (p:Person) REQUIRE p.cpf IS UNIQUE;

// Partner uniqueness
CREATE CONSTRAINT partner_id_unique IF NOT EXISTS
  FOR (p:Partner) REQUIRE p.partner_id IS UNIQUE;

// Company uniqueness
CREATE CONSTRAINT company_cnpj_unique IF NOT EXISTS
  FOR (c:Company) REQUIRE c.cnpj IS UNIQUE;

// Contract uniqueness
CREATE CONSTRAINT contract_contract_id_unique IF NOT EXISTS
  FOR (c:Contract) REQUIRE c.contract_id IS UNIQUE;

// Sanction uniqueness
CREATE CONSTRAINT sanction_sanction_id_unique IF NOT EXISTS
  FOR (s:Sanction) REQUIRE s.sanction_id IS UNIQUE;
Uniqueness constraints automatically create an index, so you don’t need separate index creation for these properties.

Property Indexes

Indexes accelerate queries that filter or sort by specific properties:

Person Indexes

// Name search
CREATE INDEX person_name IF NOT EXISTS
  FOR (p:Person) ON (p.name);

// Partial CPF matching
CREATE INDEX person_cpf_middle6 IF NOT EXISTS
  FOR (p:Person) ON (p.cpf_middle6);

CREATE INDEX person_cpf_partial IF NOT EXISTS
  FOR (p:Person) ON (p.cpf_partial);

// Electoral data
CREATE INDEX person_sq_candidato IF NOT EXISTS
  FOR (p:Person) ON (p.sq_candidato);

// Author key for amendments
CREATE INDEX person_author_key IF NOT EXISTS
  FOR (p:Person) ON (p.author_key);

// Public servant ID
CREATE INDEX person_servidor_id IF NOT EXISTS
  FOR (p:Person) ON (p.servidor_id);

Partner Indexes

CREATE INDEX partner_name IF NOT EXISTS
  FOR (p:Partner) ON (p.name);

CREATE INDEX partner_doc_partial IF NOT EXISTS
  FOR (p:Partner) ON (p.doc_partial);

// Composite index for combined lookups
CREATE INDEX partner_name_doc_partial IF NOT EXISTS
  FOR (p:Partner) ON (p.name, p.doc_partial);

Company Indexes

// Company name
CREATE INDEX company_razao_social IF NOT EXISTS
  FOR (c:Company) ON (c.razao_social);

// Economic sector
CREATE INDEX company_cnae_principal IF NOT EXISTS
  FOR (c:Company) ON (c.cnae_principal);

Contract Indexes

// Contract value for aggregations
CREATE INDEX contract_value IF NOT EXISTS
  FOR (c:Contract) ON (c.value);

// Contract date for temporal queries
CREATE INDEX contract_date IF NOT EXISTS
  FOR (c:Contract) ON (c.date);

// Contracting organization
CREATE INDEX contract_contracting_org IF NOT EXISTS
  FOR (c:Contract) ON (c.contracting_org);

// Contract description/object
CREATE INDEX contract_object IF NOT EXISTS
  FOR (c:Contract) ON (c.object);

Sanction Indexes

// Sanction type
CREATE INDEX sanction_type IF NOT EXISTS
  FOR (s:Sanction) ON (s.type);

// Sanction start date
CREATE INDEX sanction_date_start IF NOT EXISTS
  FOR (s:Sanction) ON (s.date_start);

Election Indexes

// Election year
CREATE INDEX election_year IF NOT EXISTS
  FOR (e:Election) ON (e.year);

// Composite index for detailed election queries
CREATE INDEX election_composite IF NOT EXISTS
  FOR (e:Election) ON (e.year, e.cargo, e.uf, e.municipio);

Amendment Indexes

CREATE INDEX amendment_function IF NOT EXISTS
  FOR (a:Amendment) ON (a.function);

CREATE INDEX amendment_value_committed IF NOT EXISTS
  FOR (a:Amendment) ON (a.value_committed);

Finance Indexes

CREATE INDEX finance_type IF NOT EXISTS
  FOR (f:Finance) ON (f.type);

CREATE INDEX finance_value IF NOT EXISTS
  FOR (f:Finance) ON (f.value);

CREATE INDEX finance_date IF NOT EXISTS
  FOR (f:Finance) ON (f.date);

CREATE INDEX finance_source IF NOT EXISTS
  FOR (f:Finance) ON (f.source);

Embargo Indexes

CREATE INDEX embargo_uf IF NOT EXISTS
  FOR (e:Embargo) ON (e.uf);

CREATE INDEX embargo_biome IF NOT EXISTS
  FOR (e:Embargo) ON (e.biome);

Health Facility Indexes

CREATE INDEX health_name IF NOT EXISTS
  FOR (h:Health) ON (h.name);

CREATE INDEX health_uf IF NOT EXISTS
  FOR (h:Health) ON (h.uf);

CREATE INDEX health_municipio IF NOT EXISTS
  FOR (h:Health) ON (h.municipio);

CREATE INDEX health_atende_sus IF NOT EXISTS
  FOR (h:Health) ON (h.atende_sus);

Full-Text Search Index

BR-ACC uses Neo4j’s full-text search for entity discovery:
CREATE FULLTEXT INDEX entity_search IF NOT EXISTS
  FOR (n:Person|Partner|Company|Health|Education|Contract|
       Amendment|Convenio|Embargo|PublicOffice|Inquiry|
       InquiryRequirement|MunicipalContract|MunicipalBid|
       MunicipalGazetteAct|JudicialCase|SourceDocument)
  ON EACH [n.name, n.razao_social, n.cpf, n.cnpj, 
           n.doc_partial, n.doc_raw, n.cnes_code, 
           n.object, n.contracting_org, n.convenente, 
           n.infraction, n.org, n.function, n.subject, 
           n.text, n.topic, n.case_number, n.url];
Use full-text search for fuzzy matching and multi-field searches. Use property indexes for exact matches.

Full-Text Search Usage

// Fuzzy search across multiple fields
CALL db.index.fulltext.queryNodes("entity_search", "Petrobras~")
YIELD node, score
RETURN node, score
ORDER BY score DESC
LIMIT 10

Relationship Indexes

BR-ACC indexes critical relationship properties:
// Partnership snapshot membership
CREATE INDEX socio_snapshot_membership_id IF NOT EXISTS
  FOR ()-[r:SOCIO_DE_SNAPSHOT]-() ON (r.membership_id);

// Partnership snapshot date
CREATE INDEX socio_snapshot_date IF NOT EXISTS
  FOR ()-[r:SOCIO_DE_SNAPSHOT]-() ON (r.snapshot_date);

Performance Best Practices

1. Use Indexes in WHERE Clauses

// Uses company_cnpj_unique constraint
MATCH (c:Company)
WHERE c.cnpj = $cnpj
RETURN c
Always specify the node label to enable index usage

2. Limit Relationship Depth

// Bounded depth
MATCH (c:Company)-[:VENCEU*1..3]->(ct:Contract)
WHERE c.cnpj = $cnpj
RETURN ct
LIMIT 100

3. Filter Early

// Filter at match time
MATCH (c:Company {cnpj: $cnpj})-[:VENCEU]->(ct:Contract)
WHERE ct.value > 1000000
RETURN ct

4. Use Composite Indexes

For queries that filter on multiple properties:
// Leverages election_composite index
MATCH (e:Election)
WHERE e.year = 2022 
  AND e.cargo = 'PRESIDENTE'
  AND e.uf = 'SP'
RETURN e

5. Prefer EXISTS() for Existence Checks

// Fast existence check
MATCH (c:Company)
WHERE EXISTS((c)-[:SANCIONADA]->(:Sanction))
RETURN c

6. Use CALL Subqueries for Complex Aggregations

MATCH (c:Company)
WHERE c.cnpj = $cnpj
CALL {
  WITH c
  MATCH (c)-[:VENCEU]->(ct:Contract)
  RETURN sum(ct.value) AS total_contracts
}
CALL {
  WITH c
  MATCH (c)-[:SANCIONADA]->(s:Sanction)
  RETURN count(s) AS total_sanctions
}
RETURN c, total_contracts, total_sanctions
CALL subqueries isolate aggregations and can improve query plan efficiency

Query Profiling

EXPLAIN - View Query Plan

EXPLAIN
MATCH (c:Company)
WHERE c.cnpj = $cnpj
RETURN c

PROFILE - Get Execution Statistics

PROFILE
MATCH (c:Company)-[:VENCEU]->(ct:Contract)
WHERE c.cnpj = $cnpj
RETURN count(ct) AS contract_count
Look for:
  • db hits - Lower is better
  • Index usage - Should see “NodeIndexSeek” or “NodeUniqueIndexSeek”
  • Rows - Understand cardinality at each step

Common Anti-Patterns

Avoid these common performance pitfalls

Anti-Pattern 1: COLLECT Before Filter

MATCH (c:Company)-[:VENCEU]->(ct:Contract)
WHERE c.cnpj = $cnpj AND ct.value > 1000000
RETURN count(ct)

Anti-Pattern 2: Unnecessary OPTIONAL MATCH

MATCH (c:Company)
WHERE c.cnpj = $cnpj
RETURN c,
       EXISTS((c)-[:SANCIONADA]->(:Sanction)) AS has_sanctions

Anti-Pattern 3: Cartesian Products

MATCH (c:Company)-[:VENCEU]->(ct:Contract)
WHERE c.cnpj = $cnpj AND ct.date > $date
RETURN c, ct

Monitoring Query Performance

Check Running Queries

CALL dbms.listQueries()
YIELD queryId, elapsedTimeMillis, query
WHERE elapsedTimeMillis > 1000
RETURN queryId, elapsedTimeMillis, query
ORDER BY elapsedTimeMillis DESC

Kill Long-Running Query

CALL dbms.killQuery($queryId)

Next Steps

Cypher Basics

Review fundamental query syntax

Common Patterns

Explore network and risk pattern queries

Build docs developers (and LLMs) love