
Overview

The exporters module uses the Factory pattern to export job analysis data through a single interface. Two formats are supported: JSON and Excel (.xlsx).

Architecture

The module is organized around three components:
  • Abstract Base Class: DataExporter defines the export interface
  • Concrete Implementations: JSONExporter and ExcelExporter
  • Factory Pattern: ExporterFactory instantiates the appropriate exporter

DataExporter

Abstract base class that defines the export interface.

Class Definition

from abc import ABC, abstractmethod
from typing import Dict

class DataExporter(ABC):
    pass

Methods

exportar()

Abstract method that all exporters must implement.
@abstractmethod
def exportar(self, datos: Dict, ruta_salida: str) -> str:
    pass
Parameters:
datos (Dict, required): Dictionary containing job data to export. Expected structure:
  • termino_busqueda (str): Search term used
  • titulo_oferta (str): Job title
  • url (str): Job posting URL
  • habilidades (List[str]): Cleaned skills list
  • fecha_extraccion (str): Extraction timestamp
ruta_salida (str, required): File path where the exported data should be saved (including file extension)
Returns (str): The file path where data was successfully saved
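To illustrate the contract, the following is a minimal sketch of a new exporter built on this interface. TextExporter is hypothetical and not part of the module; DataExporter is redeclared only so the snippet runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Dict


class DataExporter(ABC):
    """Redeclared here so the sketch is self-contained."""

    @abstractmethod
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        """Write datos to ruta_salida and return the path written."""


class TextExporter(DataExporter):
    """Hypothetical exporter (not in the module): one 'key: value' line per field."""

    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        with open(ruta_salida, "w", encoding="utf-8") as archivo:
            for clave, valor in datos.items():
                archivo.write(f"{clave}: {valor}\n")
        # Honor the interface: return the path that was written.
        return ruta_salida
```

Instantiating DataExporter directly, or a subclass that does not override exportar(), raises TypeError.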

ExporterFactory

Factory class that instantiates the appropriate exporter based on format.

Class Definition

class ExporterFactory:
    """Fábrica que decide qué exportador instanciar."""
    # English: "Factory that decides which exporter to instantiate."

Methods

obtener_exportador()

Static factory method that returns an exporter instance for the specified format.
@staticmethod
def obtener_exportador(formato: str):
Parameters:
formato (str, required): Export format identifier. Supported values:
  • "json" - Returns a JSONExporter instance
  • "excel" - Returns an ExcelExporter instance
Returns (DataExporter): Instance of the appropriate exporter (JSONExporter or ExcelExporter)
Raises (ValueError): Raised if the formato parameter is not "json" or "excel"
Example Usage:
from exportadores.exporter_factory import ExporterFactory

# Get JSON exporter
json_exporter = ExporterFactory.obtener_exportador('json')

# Get Excel exporter
excel_exporter = ExporterFactory.obtener_exportador('excel')

# Invalid format raises ValueError
try:
    csv_exporter = ExporterFactory.obtener_exportador('csv')
except ValueError as e:
    print(f"Error: {e}")  # "Formato csv no soportado"
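The body of obtener_exportador() is not shown in this document. One plausible way to implement the documented behavior is a dictionary lookup, sketched below. The _EXPORTADORES registry and the stand-in exporter classes are assumptions made for illustration; the real class may use if/elif branches or normalize the format string differently.

```python
from abc import ABC, abstractmethod
from typing import Dict


class DataExporter(ABC):
    @abstractmethod
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        ...


# Stand-ins for the real exporters so the sketch runs on its own.
class JSONExporter(DataExporter):
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        return ruta_salida


class ExcelExporter(DataExporter):
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        return ruta_salida


class ExporterFactory:
    # Assumed registry mapping format identifiers to exporter classes.
    _EXPORTADORES = {"json": JSONExporter, "excel": ExcelExporter}

    @staticmethod
    def obtener_exportador(formato: str) -> DataExporter:
        try:
            clase = ExporterFactory._EXPORTADORES[formato]
        except KeyError:
            # Message modeled on the one shown in the usage example.
            raise ValueError(f"Formato {formato} no soportado") from None
        return clase()
```

With a registry like this, supporting another format means adding one dictionary entry; this sketch matches the format string exactly, so "JSON" would be rejected.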

JSONExporter

Exporter implementation for JSON format using Python’s built-in json module.

Class Definition

import json
from .base_exporter import DataExporter

class JSONExporter(DataExporter):
    """Exportador responsable de guardar los diccionarios de datos en formato .json."""
    # English: "Exporter responsible for saving data dictionaries in .json format."

Methods

exportar()

Saves job data to a JSON file with proper UTF-8 encoding and human-readable formatting.
def exportar(self, datos: Dict, ruta_salida: str) -> str:
Parameters:
datos (Dict, required): Dictionary containing job analysis data
ruta_salida (str, required): Output file path (should end with .json)
Returns (str): The file path where data was saved
Implementation Details:
  • Uses UTF-8 encoding to properly handle Spanish characters (á, é, í, ó, ú, ñ)
  • ensure_ascii=False preserves accented characters
  • indent=4 creates human-readable JSON with 4-space indentation
  • Prints confirmation message to console
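The details above can be sketched as a standalone function. This is an approximation of what JSONExporter.exportar() is described as doing, not the module's actual source; the function name exportar_json and the wording of the confirmation message are assumptions.

```python
import json
from typing import Dict


def exportar_json(datos: Dict, ruta_salida: str) -> str:
    """Standalone sketch of the behavior described for JSONExporter.exportar()."""
    # Open with an explicit UTF-8 encoding so the result does not depend on the OS default.
    with open(ruta_salida, "w", encoding="utf-8") as archivo:
        # ensure_ascii=False writes accented characters as-is instead of as escape sequences.
        json.dump(datos, archivo, ensure_ascii=False, indent=4)
    # Confirmation message; the exact wording used by the module is not documented.
    print(f"JSON saved to: {ruta_salida}")
    return ruta_salida
```

Without ensure_ascii=False, a value such as "Diseño" would still load back correctly, but the file would contain an escape sequence in place of the ñ and be harder to read.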
Example Usage:
from exportadores.json_exporter import JSONExporter

exporter = JSONExporter()

datos = {
    'termino_busqueda': 'Python Developer',
    'titulo_oferta': 'Senior Python Engineer',
    'url': 'https://linkedin.com/jobs/...',
    'habilidades': ['Python', 'Django', 'PostgreSQL'],
    'fecha_extraccion': '2026-03-07 14:30:00'
}

ruta = exporter.exportar(datos, 'datos_extraidos/job_20260307.json')
print(f"Saved to: {ruta}")
Output File Example:
{
    "termino_busqueda": "Python Developer",
    "titulo_oferta": "Senior Python Engineer",
    "url": "https://linkedin.com/jobs/...",
    "habilidades": [
        "Python",
        "Django",
        "PostgreSQL"
    ],
    "fecha_extraccion": "2026-03-07 14:30:00"
}

ExcelExporter

Exporter implementation for Excel format using pandas and openpyxl.

Class Definition

import pandas as pd
from .base_exporter import DataExporter

class ExcelExporter(DataExporter):
    """Exportador responsable de guardar los diccionarios de datos en formato .xlsx usando Pandas."""
    # English: "Exporter responsible for saving data dictionaries in .xlsx format using Pandas."

Methods

exportar()

Creates a two-sheet Excel workbook: one sheet of job metadata and one listing the extracted skills.
def exportar(self, datos: Dict, ruta_salida: str) -> str:
Parameters:
datos (Dict, required): Dictionary containing job analysis data
ruta_salida (str, required): Output file path (should end with .xlsx)
Returns (str): The file path where the Excel file was saved
Excel Structure: The generated Excel file contains two sheets.
The first sheet contains metadata about the job posting:
Campo | Valor
Término de Búsqueda | User’s search term
Título de la Oferta | Job title
URL | Job posting URL
Fecha de Extracción | Timestamp
Total de Habilidades | Count of skills
The second sheet contains a numbered list of all extracted skills, for example:
Número | Habilidad
1 | Python
2 | Django
3 | PostgreSQL
Implementation Details:
  • Uses pandas.DataFrame for data structuring
  • Uses openpyxl engine for writing .xlsx files
  • Writes each sheet as a table with a header row
  • Handles missing data with 'N/A' fallback values
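A rough sketch of how these pieces could fit together follows. It is not the module's source: the function names are hypothetical, the sheet names "Resumen" and "Habilidades" are placeholders (the actual names are not given here), and applying 'N/A' to every metadata field is an assumption. Running exportar_excel() requires pandas and openpyxl.

```python
from typing import Dict, List


def filas_resumen(datos: Dict) -> List[Dict]:
    """Rows for the metadata sheet, falling back to 'N/A' for missing keys."""
    habilidades = datos.get("habilidades", [])
    return [
        {"Campo": "Término de Búsqueda", "Valor": datos.get("termino_busqueda", "N/A")},
        {"Campo": "Título de la Oferta", "Valor": datos.get("titulo_oferta", "N/A")},
        {"Campo": "URL", "Valor": datos.get("url", "N/A")},
        {"Campo": "Fecha de Extracción", "Valor": datos.get("fecha_extraccion", "N/A")},
        {"Campo": "Total de Habilidades", "Valor": len(habilidades)},
    ]


def filas_habilidades(datos: Dict) -> List[Dict]:
    """Rows for the skills sheet, numbered from 1."""
    return [
        {"Número": i, "Habilidad": h}
        for i, h in enumerate(datos.get("habilidades", []), start=1)
    ]


def exportar_excel(datos: Dict, ruta_salida: str) -> str:
    # Imported here so the row helpers above work without pandas installed.
    import pandas as pd

    # One DataFrame per sheet; index=False keeps the row index out of the file.
    with pd.ExcelWriter(ruta_salida, engine="openpyxl") as writer:
        pd.DataFrame(filas_resumen(datos)).to_excel(
            writer, sheet_name="Resumen", index=False)  # placeholder sheet name
        pd.DataFrame(filas_habilidades(datos)).to_excel(
            writer, sheet_name="Habilidades", index=False)  # placeholder sheet name
    return ruta_salida
```

Keeping the row construction separate from the pandas call makes the fallback logic easy to check without writing a file.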
Example Usage:
from exportadores.excel_exporter import ExcelExporter

exporter = ExcelExporter()

datos = {
    'termino_busqueda': 'Data Scientist',
    'titulo_oferta': 'Senior Data Scientist',
    'url': 'https://linkedin.com/jobs/view/123456',
    'habilidades': [
        'Python',
        'Machine Learning',
        'TensorFlow',
        'SQL',
        'Statistics'
    ],
    'fecha_extraccion': '2026-03-07 14:30:00'
}

ruta = exporter.exportar(datos, 'datos_extraidos/job_20260307.xlsx')
print(f"Excel saved to: {ruta}")

Complete Usage Example

Here’s how to use the factory pattern to export data in multiple formats:
from datetime import datetime
from exportadores.exporter_factory import ExporterFactory

# Prepare job data
datos_completos = {
    'termino_busqueda': 'Full Stack Developer',
    'titulo_oferta': 'Full Stack Engineer - Remote',
    'url': 'https://linkedin.com/jobs/view/789012',
    'habilidades': [
        'React',
        'Node.js',
        'TypeScript',
        'MongoDB',
        'Docker',
        'AWS'
    ],
    'fecha_extraccion': datetime.now().strftime("%Y-%m-%d %H:%M:%S")
}

# Generate timestamp for filenames
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
base_path = f"datos_extraidos/linkedin_{timestamp}"

# Export to JSON
json_exporter = ExporterFactory.obtener_exportador('json')
json_path = json_exporter.exportar(datos_completos, f"{base_path}.json")
print(f"JSON exported: {json_path}")

# Export to Excel
excel_exporter = ExporterFactory.obtener_exportador('excel')
excel_path = excel_exporter.exportar(datos_completos, f"{base_path}.xlsx")
print(f"Excel exported: {excel_path}")

Dependencies

Standard library imports (no installation required):
import json
from typing import Dict
The Excel exporter requires pandas and openpyxl. Install with:
pip install pandas openpyxl

Best Practices

Use the Factory

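Obtain exporters through ExporterFactory.obtener_exportador() rather than instantiating them directly. Calling code then depends only on the DataExporter interface, so switching or adding a format changes the format string and nothing else at the call site.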

Consistent File Naming

Use timestamps in filenames to prevent overwriting previous exports:
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

Create Output Directory

Ensure the output directory exists before exporting:
import os
os.makedirs('datos_extraidos', exist_ok=True)

Handle Exceptions

Wrap export calls in try-except blocks so that file permission errors and disk space issues, both of which surface as OSError in Python, are reported instead of aborting the run.
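One way to apply this advice is a small wrapper around exportar(), sketched below. The helper exportar_seguro is hypothetical and not part of the module; it accepts any object with an exportar(datos, ruta_salida) method.

```python
from typing import Dict, Optional


def exportar_seguro(exportador, datos: Dict, ruta_salida: str) -> Optional[str]:
    """Hypothetical helper: return the saved path, or None if the write failed."""
    try:
        return exportador.exportar(datos, ruta_salida)
    except PermissionError as error:
        # Checked first because PermissionError is a subclass of OSError.
        print(f"No write permission for {ruta_salida}: {error}")
    except OSError as error:
        # Covers a full disk, a missing output directory, and other I/O failures.
        print(f"Could not write {ruta_salida}: {error}")
    return None
```

Returning None lets the caller decide whether to retry, skip the format, or stop; re-raising after logging is an equally reasonable choice when a failed export should halt the pipeline.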
