
Overview

The JobService class implements the Facade design pattern to provide a simplified interface for the entire job analysis workflow. It orchestrates interactions between the scraper, text cleaner, AI analyzer, and exporters.

JobService

Class Definition

import os
from datetime import datetime
from typing import Dict, List
from scraping.base_scraper import ScraperStrategy
from utilidades.text_cleaner import TextCleaner
from inteligencia_artificial.gpt_analyzer import AIAnalyzer
from exportadores.exporter_factory import ExporterFactory

class JobService:
    """Patrón Fachada: Orquesta el flujo entre el Scraper, el Limpiador, la IA y los Exportadores."""

Constructor

def __init__(self, scraper: ScraperStrategy):
    self.scraper = scraper
    self.analyzer = AIAnalyzer()
    self.directorio_salida = "datos_extraidos"
    
    if not os.path.exists(self.directorio_salida):
        os.makedirs(self.directorio_salida)
scraper (ScraperStrategy, required)
  Scraper implementation to use for data extraction, injected via the constructor (dependency injection). Typically an instance of LinkedInScraper.
Initialization:
  1. Stores the injected scraper instance
  2. Creates an AIAnalyzer instance for GPT integration
  3. Sets output directory to "datos_extraidos"
  4. Creates the output directory if it doesn’t exist
Example:
from logica_negocio.servicio_vacantes import JobService
from scraping.linkedin_scraper import LinkedInScraper

scraper = LinkedInScraper()
service = JobService(scraper)

Methods

procesar_busqueda()

Executes the complete extraction and cleaning workflow for a job search.
def procesar_busqueda(self, termino_busqueda: str) -> Dict:
termino_busqueda (str, required)
  Search term for job postings (e.g., "Python Developer", "Data Scientist")

Returns: Dict
  Dictionary containing the processed results.
  On success:
    • exito (bool): True
    • habilidades (List[str]): Cleaned skills list
    • titulo_oferta (str): Job title
    • datos_completos (Dict): Complete structured data including:
      • termino_busqueda (str): Original search term
      • titulo_oferta (str): Job title
      • url (str): Job posting URL
      • habilidades (List[str]): Cleaned skills
      • fecha_extraccion (str): Timestamp in "YYYY-MM-DD HH:MM:SS" format
  On failure:
    • exito (bool): False
    • mensaje (str): Error description
Workflow:
  1. Extract Data: delegates to the injected scraper's extraer_datos() method
  2. Check Success: returns immediately if extraction failed
  3. Clean Skills: uses TextCleaner.limpiar_habilidades() to filter and clean the raw extracted text
  4. Package Data: combines the cleaned data with metadata and a timestamp
  5. Return Results: returns a structured dictionary with all processed information
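The five steps above can be sketched as a standalone function. The stub scraper, the raw-data keys (titulo, url, habilidades_crudas), and the simplified cleaning are assumptions for illustration; the real method delegates to the injected ScraperStrategy and to TextCleaner.

```python
from datetime import datetime
from typing import Dict, List

# Hypothetical stand-in for the injected ScraperStrategy (keys are assumed).
class StubScraper:
    def extraer_datos(self, termino: str) -> Dict:
        return {"exito": True, "titulo": "Python Developer",
                "url": "https://example.com/job/1",
                "habilidades_crudas": ["  Python ", "Django", ""]}

def limpiar_habilidades(crudas: List[str]) -> List[str]:
    # Simplified cleaning: strip whitespace, drop empties
    # (the real TextCleaner.limpiar_habilidades does more filtering).
    return [h.strip() for h in crudas if h.strip()]

def procesar_busqueda(scraper, termino_busqueda: str) -> Dict:
    # 1. Extract Data: delegate to the injected scraper
    crudo = scraper.extraer_datos(termino_busqueda)
    # 2. Check Success: bail out early on failure
    if not crudo.get("exito"):
        return {"exito": False, "mensaje": crudo.get("mensaje", "Extraction failed")}
    # 3. Clean Skills
    habilidades = limpiar_habilidades(crudo["habilidades_crudas"])
    # 4. Package Data: combine cleaned data with metadata and a timestamp
    datos_completos = {
        "termino_busqueda": termino_busqueda,
        "titulo_oferta": crudo["titulo"],
        "url": crudo["url"],
        "habilidades": habilidades,
        "fecha_extraccion": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    }
    # 5. Return Results: structured dictionary
    return {"exito": True, "habilidades": habilidades,
            "titulo_oferta": crudo["titulo"], "datos_completos": datos_completos}

resultado = procesar_busqueda(StubScraper(), "Python Developer")
print(resultado["exito"], resultado["habilidades"])
```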
Example Usage:
from logica_negocio.servicio_vacantes import JobService
from scraping.linkedin_scraper import LinkedInScraper

service = JobService(LinkedInScraper())
resultado = service.procesar_busqueda("React Developer")

if resultado['exito']:
    print(f"Título: {resultado['titulo_oferta']}")
    print(f"Habilidades limpias: {len(resultado['habilidades'])}")
    print(f"Primera habilidad: {resultado['habilidades'][0]}")
    
    # Access complete data
    datos = resultado['datos_completos']
    print(f"URL: {datos['url']}")
    print(f"Extraído: {datos['fecha_extraccion']}")
else:
    print(f"Error: {resultado['mensaje']}")

generar_resumen_ia()

Generates an AI-powered summary of the job posting.
def generar_resumen_ia(self, titulo: str, habilidades: List[str]) -> str:
titulo (str, required)
  Job posting title

habilidades (List[str], required)
  List of cleaned skills. Should be the output from procesar_busqueda().

Returns: str
  Markdown-formatted summary generated by GPT, or an error message if AI analysis fails.
Implementation: This method is a simple delegation to the AIAnalyzer class:
return self.analyzer.generar_resumen(titulo, habilidades)
Example Usage:
service = JobService(LinkedInScraper())
resultado = service.procesar_busqueda("Senior Backend Developer")

if resultado['exito']:
    resumen = service.generar_resumen_ia(
        resultado['titulo_oferta'],
        resultado['habilidades']
    )
    print(resumen)

guardar_datos()

Exports job data to JSON and/or Excel formats.
def guardar_datos(self, datos: Dict, formato: str) -> List[str]:
datos (Dict, required)
  Complete job data dictionary (typically datos_completos from procesar_busqueda())

formato (str, required)
  Export format selection:
    • "1": JSON only
    • "2": Excel only
    • "3": Both JSON and Excel

Returns: List[str]
  List of file paths where data was successfully saved; contains one or two paths depending on the format selected.
Implementation Details:
  • Generates timestamped filename: linkedin_YYYYMMDD_HHMMSS
  • Saves to datos_extraidos/ directory
  • Uses ExporterFactory to obtain appropriate exporters
  • Returns all created file paths
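The details above can be condensed into a standalone sketch. The format-code mapping and the JSON writing are assumptions based on this reference; the real method obtains exporter objects from ExporterFactory, and the Excel branch is left as a comment here.

```python
import json
import os
import tempfile
from datetime import datetime
from typing import Dict, List

def guardar_datos(datos: Dict, formato: str, directorio: str) -> List[str]:
    os.makedirs(directorio, exist_ok=True)
    # Timestamped base name: linkedin_YYYYMMDD_HHMMSS
    base = "linkedin_" + datetime.now().strftime("%Y%m%d_%H%M%S")
    # Assumed mapping for the documented format codes "1"/"2"/"3"
    extensiones = {"1": ["json"], "2": ["xlsx"], "3": ["json", "xlsx"]}[formato]
    rutas = []
    for ext in extensiones:
        ruta = os.path.join(directorio, f"{base}.{ext}")
        if ext == "json":
            with open(ruta, "w", encoding="utf-8") as f:
                json.dump(datos, f, ensure_ascii=False, indent=2)
        # An "xlsx" exporter (obtained from ExporterFactory in the real code)
        # would write here; this sketch only records the path.
        rutas.append(ruta)
    return rutas

rutas = guardar_datos({"titulo_oferta": "Data Engineer"}, "3", tempfile.mkdtemp())
print(rutas)
```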
Example Usage:
service = JobService(LinkedInScraper())
resultado = service.procesar_busqueda("Data Engineer")

if resultado['exito']:
    # Save as JSON only
    rutas_json = service.guardar_datos(resultado['datos_completos'], '1')
    print(f"JSON saved: {rutas_json[0]}")
    
    # Save as Excel only
    rutas_excel = service.guardar_datos(resultado['datos_completos'], '2')
    print(f"Excel saved: {rutas_excel[0]}")
    
    # Save both formats
    rutas_ambos = service.guardar_datos(resultado['datos_completos'], '3')
    print(f"Saved to: {rutas_ambos}")
    # Output: ['datos_extraidos/linkedin_20260307_143022.json', 
    #          'datos_extraidos/linkedin_20260307_143022.xlsx']

Complete Workflow Example

Here’s a complete example showing the typical usage of JobService:
from logica_negocio.servicio_vacantes import JobService
from scraping.linkedin_scraper import LinkedInScraper

def analizar_vacante(termino: str):
    # 1. Initialize service with scraper
    service = JobService(LinkedInScraper())
    
    # 2. Process the search
    print(f"Buscando: {termino}...")
    resultado = service.procesar_busqueda(termino)
    
    if not resultado['exito']:
        print(f"❌ Error: {resultado['mensaje']}")
        return
    
    # 3. Display results
    print(f"\n✅ Título: {resultado['titulo_oferta']}")
    print(f"✅ Habilidades encontradas: {len(resultado['habilidades'])}")
    print(f"\nPrimeras 5 habilidades:")
    for i, hab in enumerate(resultado['habilidades'][:5], 1):
        print(f"  {i}. {hab}")
    
    # 4. Generate AI summary
    print("\n🤖 Generando resumen con IA...")
    resumen = service.generar_resumen_ia(
        resultado['titulo_oferta'],
        resultado['habilidades']
    )
    print(f"\n{resumen}")
    
    # 5. Save data in both formats
    print("\n💾 Guardando datos...")
    rutas = service.guardar_datos(resultado['datos_completos'], '3')
    for ruta in rutas:
        print(f"  ✓ {ruta}")

if __name__ == "__main__":
    analizar_vacante("Machine Learning Engineer")

Design Patterns

The JobService class demonstrates several software design patterns.

Facade Pattern

The service provides a simplified interface to a complex subsystem of scrapers, cleaners, analyzers, and exporters. Clients interact with a single JobService instance instead of managing multiple components.
Dependency Injection

The scraper is injected via the constructor rather than hardcoded. This allows:
  • Easy testing with mock scrapers
  • Swapping the LinkedIn scraper for other job sites
  • Better separation of concerns
# Can easily swap implementations
service1 = JobService(LinkedInScraper())
service2 = JobService(IndeedScraper())  # Future implementation
service3 = JobService(MockScraper())     # For testing
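A MockScraper for the testing case above might look like this; the return keys are assumptions modeled on the result fields documented for procesar_busqueda().

```python
from typing import Dict

# Hypothetical mock satisfying the ScraperStrategy interface:
# only the extraer_datos() method the service actually calls.
class MockScraper:
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        return {
            "exito": True,
            "titulo": f"Mock offer for {termino_busqueda}",
            "url": "https://example.com/mock",
            "habilidades_crudas": ["Python", "SQL"],
        }

mock = MockScraper()
datos = mock.extraer_datos("QA Engineer")
print(datos["titulo"])
```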
Delegation

The service orchestrates the workflow but delegates the actual work:
  • Scraping → ScraperStrategy
  • Cleaning → TextCleaner
  • AI Analysis → AIAnalyzer
  • Exporting → ExporterFactory

Dependencies

import os
from datetime import datetime
from typing import Dict, List
from scraping.base_scraper import ScraperStrategy
from utilidades.text_cleaner import TextCleaner
from inteligencia_artificial.gpt_analyzer import AIAnalyzer
from exportadores.exporter_factory import ExporterFactory

Best Practices

Use Dependency Injection

Always inject the scraper in the constructor rather than instantiating it inside the service. This improves testability and flexibility.

Check Success Flag

Always check the exito flag in the returned dictionary before accessing other fields to avoid KeyError exceptions.

Handle AI Errors

The AI summary may return error strings. Check for warning/error prefixes (⚠️, ❌) before displaying to users.
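For example, a small guard can screen summaries before display; the exact prefixes emitted by AIAnalyzer are an assumption based on this note.

```python
def es_error_ia(resumen: str) -> bool:
    # Treat summaries starting with the warning/error markers as failures
    return resumen.lstrip().startswith(("⚠️", "❌"))

print(es_error_ia("❌ Error: API key missing"))  # True
print(es_error_ia("## Resumen de la vacante"))   # False
```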

Timestamp Filenames

The service automatically generates timestamped filenames. This prevents overwriting previous extractions.
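The naming scheme can be reproduced with datetime.strftime; a fixed datetime is used here so the example is deterministic.

```python
from datetime import datetime

# The documented pattern: linkedin_YYYYMMDD_HHMMSS
marca = datetime(2026, 3, 7, 14, 30, 22)
nombre = "linkedin_" + marca.strftime("%Y%m%d_%H%M%S")
print(nombre)  # linkedin_20260307_143022
```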
