Overview

This guide covers the basic usage of bormeparser: parsing BORME PDF files, extracting company data, and exporting the results to JSON.
Install bormeparser before following this guide.

Basic Workflow

1. Parse a BORME PDF file

The main entry point is the parse() function. You need to specify the file path and the section type:
import bormeparser
from bormeparser import SECCION

# Parse a BORME PDF file (Section A)
borme = bormeparser.parse('BORME-A-2015-27-10.pdf', SECCION.A)

print(borme)
# Output has the form: <Borme(YYYY-MM-DD) seccion:A provincia:NAME>
The parse() function returns a Borme object containing all the parsed data.
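Because parse() needs the section as well as the path, it can help to read the section from the filename. The sketch below uses only the standard library and assumes the naming pattern BORME-<section>-<year>-<issue>-<province code>.pdf, which is inferred from the example filenames in this guide rather than taken from bormeparser. The helper parse_borme_filename and the BormeFilename tuple are hypothetical names, not part of the library.

```python
import os
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from filenames such as 'BORME-A-2015-27-10.pdf':
# BORME-<section>-<year>-<issue number>-<two-digit province code>.pdf
_BORME_PDF_RE = re.compile(r"^BORME-([ABC])-(\d{4})-(\d+)-(\d{2})\.pdf$")


class BormeFilename(NamedTuple):
    seccion: str
    year: int
    issue: int
    province_code: str  # kept as a string to preserve leading zeros


def parse_borme_filename(path: str) -> Optional[BormeFilename]:
    """Split a BORME PDF filename into its parts, or return None if it does not match."""
    match = _BORME_PDF_RE.match(os.path.basename(path))
    if match is None:
        return None
    seccion, year, issue, province_code = match.groups()
    return BormeFilename(seccion, int(year), int(issue), province_code)


info = parse_borme_filename("BORME-A-2015-27-10.pdf")
print(info)
```

If the filename matches, info.seccion can be passed as the second argument to parse(). Files that follow a different naming scheme return None and need the section supplied by hand.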
2. Access basic information

The Borme object contains metadata about the bulletin:
# Access BORME metadata
print(f"Date: {borme.date}")
print(f"Section: {borme.seccion}")
print(f"Province: {borme.provincia}")
print(f"CVE: {borme.cve}")
print(f"Number: {borme.num}")

# Get announcement range
print(f"Announcements from {borme.anuncios_rango[0]} to {borme.anuncios_rango[1]}")
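The announcement range can double as a sanity check on a parse. The sketch below assumes that anuncios_rango is an inclusive (first, last) pair and that announcement IDs are consecutive; neither is confirmed by this guide. The helper name and the sample numbers are hypothetical.

```python
from typing import Tuple


def expected_announcement_count(anuncios_rango: Tuple[int, int]) -> int:
    """Return how many announcements an inclusive (first, last) ID range covers."""
    first, last = anuncios_rango
    if last < first:
        raise ValueError(f"invalid range: {first} > {last}")
    return last - first + 1


# Hypothetical range, used only to illustrate the arithmetic.
count = expected_announcement_count((57921, 57950))
print(count)
```

With a parsed bulletin you could compare expected_announcement_count(borme.anuncios_rango) against the number of announcements actually extracted and investigate any mismatch.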
3. Extract company announcements

Each BORME file contains multiple announcements (anuncios), one per company:
# Get all announcement IDs
ids = borme.get_anuncios_ids()
print(f"Total announcements: {len(ids)}")

# Get all announcements
anuncios = borme.get_anuncios()

# Access a specific announcement
anuncio = borme.get_anuncio(ids[0])
print(f"Company: {anuncio.empresa}")
print(f"Registry: {anuncio.registro}")
4. Extract acts and company data

Each announcement contains one or more acts (actos) representing registry actions:
# Iterate through announcements
for anuncio in borme.get_anuncios():
    print(f"\nCompany: {anuncio.empresa}")
    print(f"Registry: {anuncio.registro}")
    
    # Get acts for this company
    for acto_nombre, acto_valor in anuncio.get_actos():
        print(f"  {acto_nombre}: {acto_valor}")
Acts are either text-based (for example, a company's incorporation) or cargo-based (appointments and removals of officers):
from bormeparser.borme import BormeActoCargo, BormeActoTexto

# Check act types
for acto in anuncio.get_borme_actos():
    if isinstance(acto, BormeActoCargo):
        # Cargo acts have appointments
        print(f"Position act: {acto.name}")
        for cargo, nombres in acto.cargos.items():
            print(f"  {cargo}: {', '.join(nombres)}")
    elif isinstance(acto, BormeActoTexto):
        # Text acts have simple values
        print(f"Text act: {acto.name} = {acto.value}")
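The loop above lists positions per act. To see which positions each person holds, you can invert that mapping. This sketch relies only on the name and cargos attributes shown above and uses SimpleNamespace stand-ins so that it runs without a PDF. The function name, the act names, and the person name are illustrative assumptions.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Dict, Iterable, Set, Tuple


def positions_by_person(actos: Iterable) -> Dict[str, Set[Tuple[str, str]]]:
    """Map each person to the (act name, position) pairs they appear in."""
    result = defaultdict(set)
    for acto in actos:
        # Text acts are assumed to have no usable 'cargos' mapping and are skipped.
        cargos = getattr(acto, "cargos", None)
        if not cargos:
            continue
        for cargo, nombres in cargos.items():
            for nombre in nombres:
                result[nombre].add((acto.name, cargo))
    return dict(result)


# Stand-in objects that mimic the attributes used in this guide.
sample_actos = [
    SimpleNamespace(name="Nombramientos", cargos={"Adm. Unico": {"PEREZ GARCIA JUAN"}}),
    SimpleNamespace(name="Objeto social", value="Sample text value"),
]
people = positions_by_person(sample_actos)
print(people)
```

With real data, pass anuncio.get_borme_actos() instead of sample_actos.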
5. Export to JSON

Convert the parsed BORME data to JSON format for easy storage and processing:
# Export to JSON file
json_path = borme.to_json('output.json', pretty=True)
print(f"Saved to: {json_path}")

# The JSON file will contain:
# - BORME metadata (date, section, province, CVE)
# - All announcements with company data
# - All acts for each company
# - Registry information
You can also customize the export:
# Export without URL (no internet required)
borme.to_json('output.json', include_url=False)

# Export to specific directory
borme.to_json('/path/to/output/', pretty=True)

# Compact JSON (no indentation)
borme.to_json('compact.json', pretty=False)
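To check what an export contains, you can open it with the standard json module. The sketch below makes no assumption about the key names bormeparser writes; it only lists whatever top-level keys it finds. The helper name and the sample file are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def top_level_keys(json_path) -> List[str]:
    """Return the sorted top-level keys of a JSON file whose root is an object."""
    with Path(json_path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(data)


# Demonstration with a throwaway file; a real export will have different keys.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.json"
    sample_path.write_text(json.dumps({"second": 2, "first": 1}), encoding="utf-8")
    keys = top_level_keys(sample_path)
print(keys)
```

Calling top_level_keys('output.json') on a real export shows its actual layout before you write code that depends on it.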

Working with Provinces and Sections

bormeparser provides the SECCION and PROVINCIA enums for sections and provinces. The example below uses SECCION:
import bormeparser
from bormeparser import SECCION

# BORME sections
print(SECCION.A)  # 'A' - Registered acts
print(SECCION.B)  # 'B' - Other published acts  
print(SECCION.C)  # 'C' - Announcements

# Parse different sections
borme_a = bormeparser.parse('file-a.pdf', SECCION.A)
borme_b = bormeparser.parse('file-b.pdf', SECCION.B)

Downloading BORME Files

bormeparser can automatically download BORME files from the official source:
import datetime
from bormeparser import download_pdf, SECCION, PROVINCIA

# Download BORME for a specific date, section, and province
date = datetime.date(2015, 6, 1)
downloaded = download_pdf(
    date=date,
    filename='BORME-A-2015-101-29.pdf',
    seccion=SECCION.A,
    provincia=PROVINCIA.MALAGA
)

if downloaded:
    print("Downloaded successfully!")
else:
    print("File already exists")
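To download more than one day, you need a list of candidate dates. This sketch assumes the bulletin is not published on weekends and makes no attempt to handle public holidays, so some generated dates may have no BORME. The generator weekdays_between is a hypothetical helper that uses only the standard library.

```python
import datetime
from typing import Iterator


def weekdays_between(start: datetime.date, end: datetime.date) -> Iterator[datetime.date]:
    """Yield every Monday-to-Friday date from start to end, inclusive."""
    day = start
    while day <= end:
        if day.weekday() < 5:  # 0 = Monday, 4 = Friday
            yield day
        day += datetime.timedelta(days=1)


dates = list(weekdays_between(datetime.date(2015, 6, 1), datetime.date(2015, 6, 7)))
print(dates)
```

Each date can then be passed to download_pdf() together with a filename, section, and province, as in the example above.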

Working with BORME XML

BORME XML files contain daily summaries with metadata and download URLs:
import datetime
from bormeparser import BormeXML, SECCION, PROVINCIA

# Load XML from file
borme_xml = BormeXML.from_file('BORME-S-20150601.xml')

# Or load from date (downloads automatically)
borme_xml = BormeXML.from_date(datetime.date(2015, 6, 1))

# Get information
print(f"Date: {borme_xml.date}")
print(f"Number: {borme_xml.nbo}")
print(f"Is final: {borme_xml.is_final}")

# Get download URLs for specific section/province
urls = borme_xml.get_url_pdfs(seccion=SECCION.A, provincia=PROVINCIA.MADRID)
print(urls)

# Get all CVEs (document identifiers)
cves = borme_xml.get_cves(seccion=SECCION.A)
print(f"CVEs: {cves}")

# Download BORMEs from XML
borme_xml.download_borme(
    path='/tmp/bormes',
    seccion=SECCION.A,
    provincia=PROVINCIA.VALENCIA
)
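If you keep summary XML files on disk, a consistent local name makes them easy to find again. The sketch below derives a name in the style of 'BORME-S-20150601.xml' from a date. The pattern is inferred from the filename used above and is an assumption; the helper summary_xml_filename is not part of bormeparser.

```python
import datetime


def summary_xml_filename(date: datetime.date) -> str:
    """Build a local filename such as 'BORME-S-20150601.xml' for a given date."""
    return f"BORME-S-{date:%Y%m%d}.xml"


xml_name = summary_xml_filename(datetime.date(2015, 6, 1))
print(xml_name)
```

The result can be passed to BormeXML.from_file() once a file with that name exists locally.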

Loading from JSON

You can also load previously exported JSON files:
from bormeparser import Borme

# Load from JSON file
borme = Borme.from_json('BORME-A-2015-27-10.json')

# Now you can work with it as usual
print(f"Loaded BORME from {borme.date}")
for anuncio in borme.get_anuncios():
    print(f"Company: {anuncio.empresa}")
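Loading from JSON lets you avoid parsing the same PDF twice. The sketch below shows one way to choose between the two paths. It takes the parse and load functions as arguments so that it runs without bormeparser; load_or_parse and the fake callables are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_parse(
    pdf_path,
    json_path,
    parse_pdf: Callable[[str], T],
    load_json: Callable[[str], T],
) -> T:
    """Load from json_path if it exists; otherwise parse pdf_path."""
    if Path(json_path).exists():
        return load_json(str(json_path))
    return parse_pdf(str(pdf_path))


# Fake callables standing in for bormeparser.parse and Borme.from_json.
def fake_parse(path: str) -> str:
    return "parsed"


def fake_load(path: str) -> str:
    return "loaded"


with tempfile.TemporaryDirectory() as tmp:
    pdf_path = Path(tmp) / "source.pdf"
    json_path = Path(tmp) / "cached.json"
    first = load_or_parse(pdf_path, json_path, fake_parse, fake_load)
    json_path.write_text("{}", encoding="utf-8")  # simulate an earlier export
    second = load_or_parse(pdf_path, json_path, fake_parse, fake_load)
print(first, second)
```

With the library installed, you would pass lambda p: bormeparser.parse(p, SECCION.A) as parse_pdf and Borme.from_json as load_json. The helper does not write the JSON file itself, so call to_json() after a fresh parse if you want the next run to hit the cache.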

Complete Example

Here’s a complete example that downloads, parses, and exports BORME data:
import datetime
import bormeparser
from bormeparser import SECCION, PROVINCIA

# Configuration
date = datetime.date(2015, 6, 1)
provincia = PROVINCIA.MALAGA
seccion = SECCION.A

# Step 1: Download the PDF
filename = f'BORME-{seccion}-{date.isoformat()}-{provincia.code}.pdf'
print(f"Downloading {filename}...")

downloaded = bormeparser.download_pdf(
    date=date,
    filename=filename,
    seccion=seccion,
    provincia=provincia
)

# Step 2: Parse the PDF
print("Parsing PDF...")
borme = bormeparser.parse(filename, seccion)

# Step 3: Extract information
print("\nBORME Information:")
print(f"  Date: {borme.date}")
print(f"  Province: {borme.provincia}")
print(f"  Section: {borme.seccion}")
print(f"  CVE: {borme.cve}")
print(f"  Total announcements: {len(borme.get_anuncios())}")

# Step 4: Process announcements
print("\nSample companies:")
for i, anuncio in enumerate(borme.get_anuncios()[:5]):
    print(f"\n{i+1}. {anuncio.empresa}")
    print(f"   Registry: {anuncio.registro}")
    print(f"   Acts: {len(anuncio.actos)}")
    
    # Show first act
    if anuncio.actos:
        first_act = anuncio.actos[0]
        print(f"   First act: {first_act.name}")

# Step 5: Export to JSON
json_file = borme.to_json('borme_output.json', pretty=True)
print(f"\nExported to: {json_file}")

Type Signatures

bormeparser's classes can be used in type annotations for better IDE support:
from typing import Any, Dict, List

import bormeparser
from bormeparser.borme import Borme, BormeAnuncio

def process_borme(filename: str, seccion: str) -> Dict[str, Any]:
    """Parse BORME and return summary statistics."""
    borme: Borme = bormeparser.parse(filename, seccion)
    
    anuncios: List[BormeAnuncio] = borme.get_anuncios()
    
    return {
        'date': borme.date,
        'total_companies': len(anuncios),
        'province': str(borme.provincia)
    }

Next Steps

Now that you know the basics, you can:
  • Process multiple BORME files in batch
  • Build databases of company information
  • Track changes in corporate governance over time
  • Integrate BORME data into your applications
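As a starting point for batch processing, the sketch below collects the BORME PDFs in a directory. It assumes the files follow the BORME-<section>-... naming used in this guide. The helper find_borme_pdfs is hypothetical, and the demonstration creates empty placeholder files rather than real bulletins.

```python
import tempfile
from pathlib import Path
from typing import List


def find_borme_pdfs(directory, seccion: str = "A") -> List[Path]:
    """Return the PDFs for one section in a directory, sorted by path."""
    return sorted(Path(directory).glob(f"BORME-{seccion}-*.pdf"))


# Demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("BORME-A-2015-27-10.pdf", "BORME-A-2015-101-29.pdf", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [path.name for path in find_borme_pdfs(tmp)]
print(found)
```

Each returned path can be passed to bormeparser.parse() in a loop. The order is lexicographic, so issue 101 sorts before issue 27; sort by parsed issue number if chronological order matters.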
For production deployments with API access and enterprise support, consider using LibreBOR and the LibreBOR API.
