
Overview

The JobService class implements the Facade design pattern to provide a simplified interface for the entire job analysis workflow. It orchestrates interactions between the scraper, text cleaner, AI analyzer, and exporters.

JobService

Class Definition

import os
from datetime import datetime
from typing import Dict, List
from scraping.base_scraper import ScraperStrategy
from utilidades.text_cleaner import TextCleaner
from inteligencia_artificial.gpt_analyzer import AIAnalyzer
from exportadores.exporter_factory import ExporterFactory

class JobService:
    """Patrón Fachada: Orquesta el flujo entre el Scraper, el Limpiador, la IA y los Exportadores."""

Constructor

def __init__(self, scraper: ScraperStrategy):
    self.scraper = scraper
    self.analyzer = AIAnalyzer()
    self.directorio_salida = "datos_extraidos"
    
    if not os.path.exists(self.directorio_salida):
        os.makedirs(self.directorio_salida)
scraper (ScraperStrategy, required)
  Scraper implementation to use for data extraction, injected via the constructor (dependency injection). Typically an instance of LinkedInScraper.
Initialization:
  1. Stores the injected scraper instance
  2. Creates an AIAnalyzer instance for GPT integration
  3. Sets output directory to "datos_extraidos"
  4. Creates the output directory if it doesn’t exist
Example:
from logica_negocio.servicio_vacantes import JobService
from scraping.linkedin_scraper import LinkedInScraper

scraper = LinkedInScraper()
service = JobService(scraper)

Methods

procesar_busqueda()

Executes the complete extraction and cleaning workflow for a job search.
def procesar_busqueda(self, termino_busqueda: str) -> Dict:
termino_busqueda (str, required)
  Search term for job postings (e.g., "Python Developer", "Data Scientist")

Returns: Dict
  Dictionary containing the processed results.
  On success:
    • exito (bool): True
    • habilidades (List[str]): Cleaned skills list
    • titulo_oferta (str): Job title
    • datos_completos (Dict): Complete structured data including:
      • termino_busqueda (str): Original search term
      • titulo_oferta (str): Job title
      • url (str): Job posting URL
      • habilidades (List[str]): Cleaned skills
      • fecha_extraccion (str): Timestamp in "YYYY-MM-DD HH:MM:SS" format
  On failure:
    • exito (bool): False
    • mensaje (str): Error description
Workflow:
  1. Extract Data: delegates to the injected scraper's extraer_datos() method
  2. Check Success: returns immediately if extraction failed
  3. Clean Skills: uses TextCleaner.limpiar_habilidades() to filter and clean the raw extracted text
  4. Package Data: combines the cleaned data with metadata and a timestamp
  5. Return Results: returns a structured dictionary with all processed information
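The five steps above can be sketched as a standalone function. The stub scraper, the raw-data keys (titulo, url, habilidades_crudas), and the simplified cleaning are assumptions for illustration; the real method delegates to the injected ScraperStrategy and to TextCleaner.

```python
from datetime import datetime
from typing import Dict, List

# Hypothetical stand-in for the injected ScraperStrategy (keys are assumed).
class StubScraper:
    def extraer_datos(self, termino: str) -> Dict:
        return {"exito": True, "titulo": "Python Developer",
                "url": "https://example.com/job/1",
                "habilidades_crudas": ["  Python ", "Django", ""]}

def limpiar_habilidades(crudas: List[str]) -> List[str]:
    # Simplified cleaning: strip whitespace, drop empties
    # (the real TextCleaner.limpiar_habilidades does more filtering).
    return [h.strip() for h in crudas if h.strip()]

def procesar_busqueda(scraper, termino_busqueda: str) -> Dict:
    # 1. Extract Data: delegate to the injected scraper
    crudo = scraper.extraer_datos(termino_busqueda)
    # 2. Check Success: bail out early on failure
    if not crudo.get("exito"):
        return {"exito": False, "mensaje": crudo.get("mensaje", "Extraction failed")}
    # 3. Clean Skills
    habilidades = limpiar_habilidades(crudo["habilidades_crudas"])
    # 4. Package Data: combine cleaned data with metadata and a timestamp
    datos_completos = {
        "termino_busqueda": termino_busqueda,
        "titulo_oferta": crudo["titulo"],
        "url": crudo["url"],
        "habilidades": habilidades,
        "fecha_extraccion": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    }
    # 5. Return Results: structured dictionary
    return {"exito": True, "habilidades": habilidades,
            "titulo_oferta": crudo["titulo"], "datos_completos": datos_completos}

resultado = procesar_busqueda(StubScraper(), "Python Developer")
print(resultado["exito"], resultado["habilidades"])
```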
Example Usage:
from logica_negocio.servicio_vacantes import JobService
from scraping.linkedin_scraper import LinkedInScraper

service = JobService(LinkedInScraper())
resultado = service.procesar_busqueda("React Developer")

if resultado['exito']:
    print(f"Título: {resultado['titulo_oferta']}")
    print(f"Habilidades limpias: {len(resultado['habilidades'])}")
    print(f"Primera habilidad: {resultado['habilidades'][0]}")
    
    # Access complete data
    datos = resultado['datos_completos']
    print(f"URL: {datos['url']}")
    print(f"Extraído: {datos['fecha_extraccion']}")
else:
    print(f"Error: {resultado['mensaje']}")

generar_resumen_ia()

Generates an AI-powered summary of the job posting.
def generar_resumen_ia(self, titulo: str, habilidades: List[str]) -> str:
titulo (str, required)
  Job posting title

habilidades (List[str], required)
  List of cleaned skills. Should be the output from procesar_busqueda().

Returns: str
  Markdown-formatted summary generated by GPT, or an error message if AI analysis fails.
Implementation: This method is a simple delegation to the AIAnalyzer class:
return self.analyzer.generar_resumen(titulo, habilidades)
Example Usage:
service = JobService(LinkedInScraper())
resultado = service.procesar_busqueda("Senior Backend Developer")

if resultado['exito']:
    resumen = service.generar_resumen_ia(
        resultado['titulo_oferta'],
        resultado['habilidades']
    )
    print(resumen)

guardar_datos()

Exports job data to JSON and/or Excel formats.
def guardar_datos(self, datos: Dict, formato: str) -> List[str]:
datos (Dict, required)
  Complete job data dictionary (typically datos_completos from procesar_busqueda())

formato (str, required)
  Export format selection:
    • "1": JSON only
    • "2": Excel only
    • "3": Both JSON and Excel

Returns: List[str]
  List of file paths where data was successfully saved; contains one or two paths depending on the format selected.
Implementation Details:
  • Generates timestamped filename: linkedin_YYYYMMDD_HHMMSS
  • Saves to datos_extraidos/ directory
  • Uses ExporterFactory to obtain appropriate exporters
  • Returns all created file paths
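The details above can be condensed into a standalone sketch. The format-code mapping and the JSON writing are assumptions based on this reference; the real method obtains exporter objects from ExporterFactory, and the Excel branch is left as a comment here.

```python
import json
import os
import tempfile
from datetime import datetime
from typing import Dict, List

def guardar_datos(datos: Dict, formato: str, directorio: str) -> List[str]:
    os.makedirs(directorio, exist_ok=True)
    # Timestamped base name: linkedin_YYYYMMDD_HHMMSS
    base = "linkedin_" + datetime.now().strftime("%Y%m%d_%H%M%S")
    # Assumed mapping for the documented format codes "1"/"2"/"3"
    extensiones = {"1": ["json"], "2": ["xlsx"], "3": ["json", "xlsx"]}[formato]
    rutas = []
    for ext in extensiones:
        ruta = os.path.join(directorio, f"{base}.{ext}")
        if ext == "json":
            with open(ruta, "w", encoding="utf-8") as f:
                json.dump(datos, f, ensure_ascii=False, indent=2)
        # An "xlsx" exporter (obtained from ExporterFactory in the real code)
        # would write here; this sketch only records the path.
        rutas.append(ruta)
    return rutas

rutas = guardar_datos({"titulo_oferta": "Data Engineer"}, "3", tempfile.mkdtemp())
print(rutas)
```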
Example Usage:
service = JobService(LinkedInScraper())
resultado = service.procesar_busqueda("Data Engineer")

if resultado['exito']:
    # Save as JSON only
    rutas_json = service.guardar_datos(resultado['datos_completos'], '1')
    print(f"JSON saved: {rutas_json[0]}")
    
    # Save as Excel only
    rutas_excel = service.guardar_datos(resultado['datos_completos'], '2')
    print(f"Excel saved: {rutas_excel[0]}")
    
    # Save both formats
    rutas_ambos = service.guardar_datos(resultado['datos_completos'], '3')
    print(f"Saved to: {rutas_ambos}")
    # Output: ['datos_extraidos/linkedin_20260307_143022.json', 
    #          'datos_extraidos/linkedin_20260307_143022.xlsx']

Complete Workflow Example

Here’s a complete example showing the typical usage of JobService:
from logica_negocio.servicio_vacantes import JobService
from scraping.linkedin_scraper import LinkedInScraper

def analizar_vacante(termino: str):
    # 1. Initialize service with scraper
    service = JobService(LinkedInScraper())
    
    # 2. Process the search
    print(f"Buscando: {termino}...")
    resultado = service.procesar_busqueda(termino)
    
    if not resultado['exito']:
        print(f"❌ Error: {resultado['mensaje']}")
        return
    
    # 3. Display results
    print(f"\n✅ Título: {resultado['titulo_oferta']}")
    print(f"✅ Habilidades encontradas: {len(resultado['habilidades'])}")
    print(f"\nPrimeras 5 habilidades:")
    for i, hab in enumerate(resultado['habilidades'][:5], 1):
        print(f"  {i}. {hab}")
    
    # 4. Generate AI summary
    print("\n🤖 Generando resumen con IA...")
    resumen = service.generar_resumen_ia(
        resultado['titulo_oferta'],
        resultado['habilidades']
    )
    print(f"\n{resumen}")
    
    # 5. Save data in both formats
    print("\n💾 Guardando datos...")
    rutas = service.guardar_datos(resultado['datos_completos'], '3')
    for ruta in rutas:
        print(f"  ✓ {ruta}")

if __name__ == "__main__":
    analizar_vacante("Machine Learning Engineer")

Design Patterns

The JobService class demonstrates several software design patterns.

Facade Pattern

The service provides a simplified interface to a complex subsystem of scrapers, cleaners, analyzers, and exporters. Clients interact with a single JobService instance instead of managing multiple components.
Dependency Injection

The scraper is injected via the constructor rather than hardcoded. This allows:
  • Easy testing with mock scrapers
  • Swapping the LinkedIn scraper for other job sites
  • Better separation of concerns
# Can easily swap implementations
service1 = JobService(LinkedInScraper())
service2 = JobService(IndeedScraper())  # Future implementation
service3 = JobService(MockScraper())     # For testing
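A MockScraper for the testing case above might look like this; the return keys are assumptions modeled on the result fields documented for procesar_busqueda().

```python
from typing import Dict

# Hypothetical mock satisfying the ScraperStrategy interface:
# only the extraer_datos() method the service actually calls.
class MockScraper:
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        return {
            "exito": True,
            "titulo": f"Mock offer for {termino_busqueda}",
            "url": "https://example.com/mock",
            "habilidades_crudas": ["Python", "SQL"],
        }

mock = MockScraper()
datos = mock.extraer_datos("QA Engineer")
print(datos["titulo"])
```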
Delegation

The service orchestrates the workflow but delegates the actual work:
  • Scraping → ScraperStrategy
  • Cleaning → TextCleaner
  • AI Analysis → AIAnalyzer
  • Exporting → ExporterFactory

Dependencies

import os
from datetime import datetime
from typing import Dict, List
from scraping.base_scraper import ScraperStrategy
from utilidades.text_cleaner import TextCleaner
from inteligencia_artificial.gpt_analyzer import AIAnalyzer
from exportadores.exporter_factory import ExporterFactory

Best Practices

Use Dependency Injection

Always inject the scraper in the constructor rather than instantiating it inside the service. This improves testability and flexibility.

Check Success Flag

Always check the exito flag in the returned dictionary before accessing other fields to avoid KeyError exceptions.

Handle AI Errors

The AI summary may return error strings. Check for warning/error prefixes (⚠️, ❌) before displaying to users.
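For example, a small guard can screen summaries before display; the exact prefixes emitted by AIAnalyzer are an assumption based on this note.

```python
def es_error_ia(resumen: str) -> bool:
    # Treat summaries starting with the warning/error markers as failures
    return resumen.lstrip().startswith(("⚠️", "❌"))

print(es_error_ia("❌ Error: API key missing"))  # True
print(es_error_ia("## Resumen de la vacante"))   # False
```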

Timestamp Filenames

The service automatically generates timestamped filenames. This prevents overwriting previous extractions.
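The naming scheme can be reproduced with datetime.strftime; a fixed datetime is used here so the example is deterministic.

```python
from datetime import datetime

# The documented pattern: linkedin_YYYYMMDD_HHMMSS
marca = datetime(2026, 3, 7, 14, 30, 22)
nombre = "linkedin_" + marca.strftime("%Y%m%d_%H%M%S")
print(nombre)  # linkedin_20260307_143022
```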
