Design Patterns

The LinkedIn Job Analyzer implements several proven design patterns to achieve a flexible, maintainable, and extensible architecture. This page details each pattern, why it was chosen, and how it’s implemented.

1. Strategy Pattern

Purpose

The Strategy Pattern defines a family of algorithms, encapsulates each one, and makes them interchangeable. It lets the algorithm vary independently from clients that use it.

Implementation: ScraperStrategy

Location: scraping/base_scraper.py, scraping/linkedin_scraper.py

Problem Solved: The system needs to support multiple job platforms (LinkedIn, Indeed, Glassdoor) without coupling the business logic to specific scraping implementations.

Code Example

Abstract Strategy (scraping/base_scraper.py:4-10):
from abc import ABC, abstractmethod
from typing import Dict

class ScraperStrategy(ABC):
    """Base interface for any job-posting scraper."""
    
    @abstractmethod
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        """Must return a dict with the success flag, title, URL, and raw skills."""
        pass
Concrete Strategy (scraping/linkedin_scraper.py:15-148):
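from typing import Dict
from urllib.parse import quote_plus

from .base_scraper import ScraperStrategy

class LinkedInScraper(ScraperStrategy):
    """LinkedIn-specific implementation of the scraper strategy."""
    
    def __init__(self):
        self.driver = None

    def extraer_datos(self, termino_busqueda: str) -> Dict:
        try:
            # LinkedIn-specific implementation
            # Assumption: the excerpt omits how termino_codificado is built;
            # URL-encoding the search term is one plausible way to do it.
            termino_codificado = quote_plus(termino_busqueda)
            url = f"https://www.linkedin.com/jobs/search/?keywords={termino_codificado}"
            self._iniciar_navegador()
            self.driver.get(url)
            # ... scraping logic (sets titulo and habilidades) ...
            
            return {
                'exito': True,
                'titulo_oferta': titulo,
                'url': self.driver.current_url,
                'habilidades_brutas': habilidades
            }
        except Exception as e:
            # Failures are reported through the result dict, not raised
            return {'exito': False, 'mensaje': f"Error al extraer: {str(e)}"}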
Context Usage (flask_app.py:9-10):
# Easy to swap strategies
scraper = LinkedInScraper()  # Could be IndeedScraper(), GlassdoorScraper(), etc.
servicio = JobService(scraper=scraper)

Benefits

  1. Open/Closed Principle: Can add new scrapers without modifying existing code
  2. Dependency Inversion: JobService depends on abstraction, not concrete implementation
  3. Testability: Easy to mock scrapers in unit tests
  4. Runtime Flexibility: Can switch scrapers based on user selection or configuration
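The runtime selection mentioned in the last benefit could be sketched with a small lookup table. This is an illustration, not code from the project: `SCRAPERS`, `crear_scraper`, and the two `Fake...` classes are hypothetical names, and the fake scrapers return canned data instead of driving a browser.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class ScraperStrategy(ABC):
    """Mirrors the interface in scraping/base_scraper.py."""

    @abstractmethod
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        """Return a dict with exito, titulo_oferta, url, habilidades_brutas."""


class FakeLinkedInScraper(ScraperStrategy):
    """Hypothetical stand-in for LinkedInScraper (no browser involved)."""

    def extraer_datos(self, termino_busqueda: str) -> Dict:
        return {'exito': True, 'titulo_oferta': f"{termino_busqueda} (LinkedIn)",
                'url': 'https://example.com/linkedin', 'habilidades_brutas': []}


class FakeIndeedScraper(ScraperStrategy):
    """Hypothetical stand-in for a future IndeedScraper."""

    def extraer_datos(self, termino_busqueda: str) -> Dict:
        return {'exito': True, 'titulo_oferta': f"{termino_busqueda} (Indeed)",
                'url': 'https://example.com/indeed', 'habilidades_brutas': []}


# Hypothetical registry: platform name -> zero-argument scraper constructor
SCRAPERS: Dict[str, Callable[[], ScraperStrategy]] = {
    'linkedin': FakeLinkedInScraper,
    'indeed': FakeIndeedScraper,
}


def crear_scraper(plataforma: str) -> ScraperStrategy:
    """Pick a strategy by name, e.g. from a form field or a config value."""
    clave = plataforma.strip().lower()
    if clave not in SCRAPERS:
        raise ValueError(f"Plataforma {plataforma} no soportada")
    return SCRAPERS[clave]()


scraper = crear_scraper('Indeed')
resultado = scraper.extraer_datos('data engineer')
```

Because the caller only sees `ScraperStrategy`, the object returned by `crear_scraper` could be passed to `JobService(scraper=...)` unchanged.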

Future Extensions

# Easy to add new platforms
class IndeedScraper(ScraperStrategy):
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        # Indeed-specific implementation
        pass

class GlassdoorScraper(ScraperStrategy):
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        # Glassdoor-specific implementation
        pass

2. Factory Pattern

Purpose

The Factory Pattern provides an interface for creating objects without specifying their exact classes. It centralizes object creation logic and handles implementation selection.

Implementation: ExporterFactory

Location: exportadores/exporter_factory.py

Problem Solved: The system needs to create different exporter types (JSON and Excel today, with formats such as CSV possible later) based on user preference without coupling the business logic to specific exporter classes.

Code Example

Factory Class (exportadores/exporter_factory.py:4-14):
from .json_exporter import JSONExporter
from .excel_exporter import ExcelExporter

class ExporterFactory:
    """Factory that decides which exporter to instantiate."""
    
    @staticmethod
    def obtener_exportador(formato: str):
        if formato == 'json':
            return JSONExporter()
        elif formato == 'excel':
            return ExcelExporter()
        else:
            raise ValueError(f"Formato {formato} no soportado")
Product Interface (exportadores/base_exporter.py:4-7):
from abc import ABC, abstractmethod
from typing import Dict

class DataExporter(ABC):
    @abstractmethod
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        pass
Concrete Products: JSONExporter (exportadores/json_exporter.py:5-21):
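import json
from typing import Dict
from .base_exporter import DataExporter

class JSONExporter(DataExporter):
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        # ensure_ascii=False keeps accented characters readable in the file
        with open(ruta_salida, 'w', encoding='utf-8') as f:
            json.dump(datos, f, ensure_ascii=False, indent=4)
        return ruta_salida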
ExcelExporter (exportadores/excel_exporter.py:5-45):
from typing import Dict
import pandas as pd
from .base_exporter import DataExporter

class ExcelExporter(DataExporter):
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        # One numbered row per skill
        df_habilidades = pd.DataFrame({
            'Número': range(1, len(datos['habilidades']) + 1),
            'Habilidad': datos['habilidades']
        })
        
        with pd.ExcelWriter(ruta_salida, engine='openpyxl') as writer:
            df_habilidades.to_excel(writer, sheet_name='Habilidades', index=False)
        
        return ruta_salida
Client Usage (logica_negocio/servicio_vacantes.py:57-63):
def guardar_datos(self, datos: Dict, formato: str) -> List[str]:
    rutas = []
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    ruta_base = f"{self.directorio_salida}/linkedin_{timestamp}"
    
    # formato codes: '1' = JSON only, '2' = Excel only, '3' = both
    if formato in ['1', '3']:
        exportador_json = ExporterFactory.obtener_exportador('json')
        rutas.append(exportador_json.exportar(datos, f"{ruta_base}.json"))
        
    if formato in ['2', '3']:
        exportador_excel = ExporterFactory.obtener_exportador('excel')
        rutas.append(exportador_excel.exportar(datos, f"{ruta_base}.xlsx"))
        
    return rutas

Benefits

  1. Single Responsibility: Factory handles creation logic, clients focus on business logic
  2. Encapsulation: Hides complex instantiation details
  3. Flexibility: Easy to add new export formats
  4. Consistency: Centralized place to enforce exporter initialization rules

Future Extensions

class ExporterFactory:
    @staticmethod
    def obtener_exportador(formato: str):
        if formato == 'json':
            return JSONExporter()
        elif formato == 'excel':
            return ExcelExporter()
        elif formato == 'csv':  # New format
            return CSVExporter()
        elif formato == 'pdf':  # New format
            return PDFExporter()
        else:
            raise ValueError(f"Formato {formato} no soportado")
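A `CSVExporter` like the one referenced in the extended factory does not exist in the excerpts on this page. A minimal sketch, assuming it would write the same numbered skill list as `ExcelExporter` and use only the standard library, might look like this (the class body and the sample data are illustrative):

```python
import csv
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict


class DataExporter(ABC):
    """Mirrors the interface in exportadores/base_exporter.py."""

    @abstractmethod
    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        """Write datos to ruta_salida and return the path."""


class CSVExporter(DataExporter):
    """Hypothetical exporter: one numbered row per skill."""

    def exportar(self, datos: Dict, ruta_salida: str) -> str:
        # newline='' lets the csv module control line endings itself
        with open(ruta_salida, 'w', encoding='utf-8', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(['Número', 'Habilidad'])
            for numero, habilidad in enumerate(datos['habilidades'], start=1):
                writer.writerow([numero, habilidad])
        return ruta_salida


# Usage with invented data, written to a temporary directory
ruta = os.path.join(tempfile.mkdtemp(), 'habilidades.csv')
CSVExporter().exportar({'habilidades': ['Python', 'SQL']}, ruta)
```

Since the class implements `exportar` with the same signature as the other exporters, the only change clients would see is the new `'csv'` branch in the factory.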

3. Facade Pattern

Purpose

The Facade Pattern provides a simplified interface to a complex subsystem. It hides the complexity of multiple components behind a single, easy-to-use interface.

Implementation: JobService

Location: logica_negocio/servicio_vacantes.py

Problem Solved: Orchestrating complex workflows involving scraping, text cleaning, AI analysis, and data export without exposing implementation details to the presentation layer.

Code Example

Facade Class (logica_negocio/servicio_vacantes.py:9-65):
class JobService:
    """
    Facade pattern: orchestrates the flow between the scraper, the cleaner,
    the AI analyzer, and the exporters.
    """
    def __init__(self, scraper: ScraperStrategy):
        self.scraper = scraper
        self.analyzer = AIAnalyzer()
        self.directorio_salida = "datos_extraidos"
        
        if not os.path.exists(self.directorio_salida):
            os.makedirs(self.directorio_salida)

    def procesar_busqueda(self, termino_busqueda: str) -> Dict:
        # Coordinates multiple subsystems
        
        # 1. Scraping subsystem
        resultado = self.scraper.extraer_datos(termino_busqueda)
        if not resultado['exito']:
            return resultado
            
        # 2. Text cleaning subsystem
        habilidades_limpias = TextCleaner.limpiar_habilidades(
            resultado['habilidades_brutas']
        )
        
        # 3. Package data
        datos_completos = {
            'termino_busqueda': termino_busqueda,
            'titulo_oferta': resultado['titulo_oferta'],
            'url': resultado['url'],
            'habilidades': habilidades_limpias,
            'fecha_extraccion': datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        }
        
        return {
            'exito': True,
            'habilidades': habilidades_limpias,
            'titulo_oferta': resultado['titulo_oferta'],
            'datos_completos': datos_completos
        }

    def generar_resumen_ia(self, titulo: str, habilidades: List[str]) -> str:
        # Delegates to AI subsystem
        return self.analyzer.generar_resumen(titulo, habilidades)

    def guardar_datos(self, datos: Dict, formato: str) -> List[str]:
        # Delegates to export subsystem
        rutas = []
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        ruta_base = f"{self.directorio_salida}/linkedin_{timestamp}"
        
        if formato in ['1', '3']:
            exportador_json = ExporterFactory.obtener_exportador('json')
            rutas.append(exportador_json.exportar(datos, f"{ruta_base}.json"))
            
        if formato in ['2', '3']:
            exportador_excel = ExporterFactory.obtener_exportador('excel')
            rutas.append(exportador_excel.exportar(datos, f"{ruta_base}.xlsx"))
            
        return rutas
Client Usage (flask_app.py:17-27):
@app.route('/buscar', methods=['POST'])
def buscar():
    # Simple interface hides complex workflow
    termino = request.form.get('puesto')
    resultado = servicio.procesar_busqueda(termino)
    
    if resultado['exito']:
        rutas = servicio.guardar_datos(resultado['datos_completos'], '3')
        resultado['rutas_archivos'] = rutas
        return jsonify(resultado)
    return jsonify({"error": resultado['mensaje']}), 400

Benefits

  1. Simplified Interface: Flask routes don’t need to know about Selenium, BeautifulSoup, Pandas, or OpenAI
  2. Loose Coupling: Changes to subsystems don’t affect presentation layer
  3. Workflow Orchestration: Centralized place for business logic flow
  4. Error Handling: Failures reach the route through one result dictionary (exito, mensaje), so they are checked in a single place
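The failure path through the facade can be traced without Flask or Selenium. The sketch below uses hypothetical stand-ins (`FailingScraper`, `MiniFacade`, `manejar_busqueda`) that only mimic the shapes shown above; it is not the project's code.

```python
from typing import Dict, Tuple


class FailingScraper:
    """Hypothetical scraper that always reports a failure."""

    def extraer_datos(self, termino_busqueda: str) -> Dict:
        return {'exito': False, 'mensaje': 'Error al extraer: timeout'}


class MiniFacade:
    """Trimmed-down stand-in for JobService: scraping plus a trivial clean-up."""

    def __init__(self, scraper):
        self.scraper = scraper

    def procesar_busqueda(self, termino_busqueda: str) -> Dict:
        resultado = self.scraper.extraer_datos(termino_busqueda)
        if not resultado['exito']:
            # Short-circuit: hand the scraper's error dict back unchanged
            return resultado
        habilidades = [h.strip() for h in resultado['habilidades_brutas']]
        return {'exito': True, 'habilidades': habilidades,
                'titulo_oferta': resultado['titulo_oferta']}


def manejar_busqueda(servicio: MiniFacade, termino: str) -> Tuple[Dict, int]:
    """Framework-free stand-in for the /buscar route: returns (body, status)."""
    resultado = servicio.procesar_busqueda(termino)
    if resultado['exito']:
        return resultado, 200
    return {'error': resultado['mensaje']}, 400


cuerpo, estado = manejar_busqueda(MiniFacade(FailingScraper()), 'developer')
```

The caller never inspects the scraper: it only checks `exito` on whatever the facade returns.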

Subsystems Coordinated

JobService (Facade)
├── ScraperStrategy (Scraping subsystem)
│   └── LinkedInScraper
├── TextCleaner (Utilities subsystem)
├── AIAnalyzer (AI subsystem)
│   └── OpenAI API
└── ExporterFactory (Export subsystem)
    ├── JSONExporter
    └── ExcelExporter

4. Dependency Injection

Purpose

Dependency Injection is not one of the Gang of Four (GoF) patterns, but the design relies on it: JobService receives its scraper through the constructor instead of creating one itself.

Implementation

Constructor Injection (logica_negocio/servicio_vacantes.py:13-20):
class JobService:
    def __init__(self, scraper: ScraperStrategy):
        # The scraper is injected rather than created internally
        self.scraper = scraper
        # Note: the analyzer is still instantiated here, not injected
        self.analyzer = AIAnalyzer()
Setup (flask_app.py:8-10):
# Dependencies are wired at application startup
scraper = LinkedInScraper()
servicio = JobService(scraper=scraper)

Benefits

  1. Testability: Easy to inject mock scrapers for testing
  2. Flexibility: Runtime strategy selection
  3. Loose Coupling: JobService depends on abstraction, not concrete class

Example: Testing with Mocks

# In tests
class MockScraper(ScraperStrategy):
    def extraer_datos(self, termino_busqueda: str) -> Dict:
        return {
            'exito': True,
            'titulo_oferta': 'Test Job',
            'url': 'https://test.com',
            'habilidades_brutas': ['Python', 'Testing']
        }

# Inject mock for testing
servicio = JobService(scraper=MockScraper())
resultado = servicio.procesar_busqueda('developer')
assert resultado['exito'] is True
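In the constructor shown earlier, only the scraper is injected, while `AIAnalyzer()` is created inside `JobService`. If tests should also avoid the AI subsystem, the analyzer could be injected the same way. The following is a sketch under that assumption: `Analyzer`, `StubAnalyzer`, and `JobServiceSketch` are hypothetical names, not part of the codebase.

```python
from typing import List, Protocol


class Analyzer(Protocol):
    """Structural type: anything with a matching generar_resumen method."""

    def generar_resumen(self, titulo: str, habilidades: List[str]) -> str:
        ...


class StubAnalyzer:
    """Hypothetical test double that makes no network calls."""

    def generar_resumen(self, titulo: str, habilidades: List[str]) -> str:
        return f"{titulo}: {', '.join(habilidades)}"


class JobServiceSketch:
    """Simplified variant of JobService showing only the wiring."""

    def __init__(self, scraper, analyzer: Analyzer):
        # Both collaborators arrive from outside; nothing is created here
        self.scraper = scraper
        self.analyzer = analyzer

    def generar_resumen_ia(self, titulo: str, habilidades: List[str]) -> str:
        return self.analyzer.generar_resumen(titulo, habilidades)


servicio_sketch = JobServiceSketch(scraper=None, analyzer=StubAnalyzer())
resumen = servicio_sketch.generar_resumen_ia('Backend Developer', ['Python', 'SQL'])
```

A default such as `analyzer=None` that falls back to the real analyzer would keep existing call sites like `JobService(scraper=scraper)` working; that choice is left open here.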

5. Static Utility Pattern

Purpose

Provide stateless helper functions through static methods, avoiding unnecessary object instantiation.

Implementation: TextCleaner

Location: utilidades/text_cleaner.py

Code Example (utilidades/text_cleaner.py:4-45):
import re
from typing import List

class TextCleaner:
    """
    Utility class responsible for processing, cleaning, and filtering
    the text extracted from job postings.
    """
    
    @staticmethod
    def limpiar_habilidades(habilidades: List[str]) -> List[str]:
        limpiadas = []
        vistas = set()
        
        for habilidad in habilidades:
            # Clean whitespace
            limpia = re.sub(r'\s+', ' ', habilidad).strip()
            
            # Remove special characters
            limpia = re.sub(r'[^\w\s\+\#\.\-_,\(\)]', '', limpia)
            
            # Validate quality
            if len(limpia) < 3 or len(limpia) > 250:
                continue
            
            # Deduplicate
            if limpia.lower() not in vistas:
                vistas.add(limpia.lower())
                limpiadas.append(limpia)
                
        return limpiadas
Usage (logica_negocio/servicio_vacantes.py:29):
# No instantiation needed
habilidades_limpias = TextCleaner.limpiar_habilidades(resultado['habilidades_brutas'])

Benefits

  1. No State: Pure functions, no side effects
  2. Simplicity: No object lifecycle management
  3. Clear Intent: Signals this is a utility, not a business object
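To make the cleaning rules concrete, the snippet below reproduces `limpiar_habilidades` as shown in the excerpt above and runs it on an invented input list. The sample skills are illustrative only.

```python
import re
from typing import List


class TextCleaner:
    @staticmethod
    def limpiar_habilidades(habilidades: List[str]) -> List[str]:
        limpiadas = []
        vistas = set()

        for habilidad in habilidades:
            # Collapse runs of whitespace and trim the ends
            limpia = re.sub(r'\s+', ' ', habilidad).strip()
            # Drop characters outside the allowed set
            limpia = re.sub(r'[^\w\s\+\#\.\-_,\(\)]', '', limpia)
            # Skip entries that are too short or too long
            if len(limpia) < 3 or len(limpia) > 250:
                continue
            # Case-insensitive de-duplication, keeping the first spelling
            if limpia.lower() not in vistas:
                vistas.add(limpia.lower())
                limpiadas.append(limpia)

        return limpiadas


brutas = ['  Python  ', 'python', 'Machine   Learning!', 'C#', 'Go', 'SQL']
limpias = TextCleaner.limpiar_habilidades(brutas)
print(limpias)  # ['Python', 'Machine Learning', 'SQL']
```

Note that the three-character minimum also discards short but valid skill names such as 'C#' and 'Go', which may or may not be the intended behavior.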

Pattern Synergy

These patterns work together to create a cohesive architecture:
┌─────────────────────────────────────────────┐
│          Facade Pattern (JobService)        │
│  Simplifies complex workflow orchestration  │
└────────┬──────────────────────┬─────────────┘
         │                      │
         ▼                      ▼
┌────────────────────┐  ┌──────────────────┐
│  Strategy Pattern  │  │ Factory Pattern  │
│  (ScraperStrategy) │  │(ExporterFactory) │
└────────────────────┘  └──────────────────┘
         │                      │
         ▼                      ▼
    [LinkedIn]           [JSON, Excel]
    [Indeed]             [CSV, PDF]
    [Glassdoor]          [Markdown]
Combined Benefits:
  • High cohesion, low coupling
  • Easy to extend (new scrapers, exporters)
  • Easy to test (dependency injection)
  • Easy to understand (facade simplifies)
  • Easy to maintain (single responsibility)

Conclusion

The design patterns implemented in this system provide:
  1. Maintainability: Clear structure makes code easy to understand and modify
  2. Extensibility: New features can be added without changing existing code
  3. Testability: Components can be tested in isolation with mocks
  4. Flexibility: Runtime behavior can be configured through dependency injection
  5. Scalability: Components sit behind narrow interfaces, which should make it easier to split them into separate services later
These patterns are not merely theoretical: they solve concrete problems in the LinkedIn Job Analyzer and provide a solid foundation for future enhancements.
