Project Structure
This page documents the directory structure, module responsibilities, file organization, and naming conventions used in the LinkedIn Job Analyzer project.
Directory Layout
Module Responsibilities
1. exportadores/
Purpose: Handle data export to various file formats
Pattern: Factory Pattern + Strategy Pattern
Files:
base_exporter.py
Lines: 7 | Responsibility: Define exporter contract
json_exporter.py
Lines: 21 | Responsibility: Export data to JSON format
Key Features:
- UTF-8 encoding for international characters
- Pretty printing with 4-space indentation
- Returns file path for download links
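The JSON exporter features listed above can be sketched as follows. This is a minimal illustration, not the project's actual code: the base class name `BaseExporter`, the method name `exportar`, and its parameters are assumptions made for the example.

```python
import json
import os
from abc import ABC, abstractmethod


class BaseExporter(ABC):
    """Exporter contract (hypothetical name and signature)."""

    @abstractmethod
    def exportar(self, datos: dict, nombre_archivo: str) -> str:
        """Write datos to disk and return the path of the created file."""


class JSONExporter(BaseExporter):
    def __init__(self, directorio_salida: str) -> None:
        self.directorio_salida = directorio_salida

    def exportar(self, datos: dict, nombre_archivo: str) -> str:
        os.makedirs(self.directorio_salida, exist_ok=True)
        ruta = os.path.join(self.directorio_salida, f"{nombre_archivo}.json")
        # ensure_ascii=False keeps characters such as "ó" readable in the file;
        # indent=4 gives the pretty-printed layout described above.
        with open(ruta, "w", encoding="utf-8") as archivo:
            json.dump(datos, archivo, ensure_ascii=False, indent=4)
        # The caller can turn this path into a download link.
        return ruta
```

Returning the path rather than the serialized text keeps the web layer free of file-handling details.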
excel_exporter.py
Lines: 45 | Responsibility: Export data to Excel format
Key Features:
- Multiple sheets (“Información General”, “Habilidades”)
- Pandas DataFrame for structured data
- openpyxl engine for .xlsx format
exporter_factory.py
Lines: 14 | Responsibility: Create appropriate exporter instances
Supported Formats:
- 'json' → JSONExporter
- 'excel' → ExcelExporter
- Extensible for CSV, PDF, Markdown, etc.
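One way to map format keys to exporter classes is a small registry, sketched below. The method names `crear_exportador` and `registrar` are hypothetical, and the two exporter classes are empty stand-ins so the example runs on its own.

```python
class JSONExporter:
    """Stand-in for the real class in json_exporter.py."""


class ExcelExporter:
    """Stand-in for the real class in excel_exporter.py."""


class ExporterFactory:
    # Maps a format key to the exporter class that handles it.
    _exportadores = {
        "json": JSONExporter,
        "excel": ExcelExporter,
    }

    @classmethod
    def crear_exportador(cls, formato: str):
        """Return a new exporter instance for the requested format."""
        try:
            clase = cls._exportadores[formato.lower()]
        except KeyError:
            raise ValueError(f"Unsupported format: {formato!r}") from None
        return clase()

    @classmethod
    def registrar(cls, formato: str, clase) -> None:
        """Register an additional format (for example 'csv')."""
        cls._exportadores[formato.lower()] = clase
```

With a registry, supporting CSV or PDF means adding one entry instead of editing a chain of conditionals.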
2. inteligencia_artificial/
Purpose: AI-powered job analysis using Large Language Models
Files:
gpt_analyzer.py
Lines: 77 | Responsibility: OpenAI API integration
Key Features:
- Environment-based API key configuration
- Structured prompt engineering
- Error handling for API failures
- Token optimization (limits to 30 skills)
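The environment-based key lookup and the 30-skill cap might look like the sketch below. The function names, the prompt wording, and the variable name `OPENAI_API_KEY` are assumptions for illustration; the actual request to the OpenAI API is deliberately left out.

```python
import os
from typing import List

MAX_HABILIDADES = 30  # cap described above; keeps the prompt short


def obtener_api_key() -> str:
    """Read the API key from the environment instead of hard-coding it."""
    api_key = os.environ.get("OPENAI_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return api_key


def construir_prompt(termino_busqueda: str, habilidades: List[str]) -> str:
    """Build a structured prompt from at most MAX_HABILIDADES skills."""
    seleccion = habilidades[:MAX_HABILIDADES]
    lista = "\n".join(f"- {habilidad}" for habilidad in seleccion)
    return (
        f"Analyze the skills requested for '{termino_busqueda}' roles.\n"
        "Summarize the most relevant ones in Markdown.\n\n"
        f"{lista}"
    )
```

Truncating the list before building the prompt bounds the request size regardless of how many skills the scraper returns.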
3. logica_negocio/
Purpose: Business logic and workflow orchestration
Pattern: Facade Pattern
Files:
servicio_vacantes.py
Lines: 65 | Responsibility: Coordinate all subsystems
Key Methods:
Dependencies:
- scraping.base_scraper.ScraperStrategy
- utilidades.text_cleaner.TextCleaner
- inteligencia_artificial.gpt_analyzer.AIAnalyzer
- exportadores.exporter_factory.ExporterFactory
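A facade over these subsystems could be wired roughly as shown here. The constructor parameters, the `analizar_con_ia` method, and the `exportar`/`crear_exportador` signatures are assumptions; only `JobService`, `procesar_busqueda()`, `extraer_datos()`, `limpiar_habilidades()`, and `generar_resumen()` are names taken from this page.

```python
class JobService:
    """Facade: one entry point that hides scraping, cleaning, AI and export."""

    def __init__(self, scraper, limpiador, analizador, fabrica_exportadores):
        # Collaborators are injected, so each one can be swapped or stubbed.
        self.scraper = scraper
        self.limpiador = limpiador
        self.analizador = analizador
        self.fabrica_exportadores = fabrica_exportadores

    def procesar_busqueda(self, termino_busqueda: str, formato: str = "json") -> dict:
        # 1. Scrape, 2. clean, 3. export; callers see a single method.
        datos = self.scraper.extraer_datos(termino_busqueda)
        habilidades_limpias = self.limpiador.limpiar_habilidades(datos["habilidades"])
        datos_completos = {
            "termino": termino_busqueda,
            "habilidades": habilidades_limpias,
        }
        nombre_archivo = termino_busqueda.replace(" ", "_").lower()
        exportador = self.fabrica_exportadores.crear_exportador(formato)
        datos_completos["archivo"] = exportador.exportar(datos_completos, nombre_archivo)
        return datos_completos

    def analizar_con_ia(self, termino_busqueda: str, habilidades: list) -> str:
        # Hypothetical method: delegates to the AI analyzer.
        return self.analizador.generar_resumen(termino_busqueda, habilidades)
```

Because the web layer talks only to the facade, a change in how scraping or exporting works does not reach the routes.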
4. scraping/
Purpose: Extract job posting data from websites
Pattern: Strategy Pattern
Files:
base_scraper.py
Lines: 10 | Responsibility: Define scraper contract
linkedin_scraper.py
Lines: 148 | Responsibility: LinkedIn-specific web scraping
Key Features:
- Selenium WebDriver automation
- User-agent spoofing to avoid bot detection
- Cookie banner handling
- Modal dismissal (login prompts)
- “Ver más” button expansion
- BeautifulSoup HTML parsing
- Graceful error handling
Dependencies:
- Selenium WebDriver
- ChromeDriver (webdriver-manager)
- BeautifulSoup4
- Regular expressions
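The scraper contract and the way a second strategy plugs into it can be sketched without Selenium. `ScraperStrategy` and `extraer_datos()` are named on this page; the return shape, the `ScraperEstatico` class, and the `buscar` helper are hypothetical.

```python
from abc import ABC, abstractmethod


class ScraperStrategy(ABC):
    """Contract every scraper must fulfil."""

    @abstractmethod
    def extraer_datos(self, termino_busqueda: str) -> dict:
        """Return the data extracted for the given search term."""


class ScraperEstatico(ScraperStrategy):
    """Hypothetical strategy returning canned data, e.g. for tests."""

    def __init__(self, habilidades):
        self._habilidades = list(habilidades)

    def extraer_datos(self, termino_busqueda: str) -> dict:
        return {"termino": termino_busqueda, "habilidades": list(self._habilidades)}


def buscar(scraper: ScraperStrategy, termino_busqueda: str) -> dict:
    # Callers depend only on the contract, so strategies are interchangeable.
    return scraper.extraer_datos(termino_busqueda)
```

A scraper for another job site would subclass `ScraperStrategy` in the same way and could be passed wherever `LinkedInScraper` is used.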
5. static/
Purpose: Frontend assets served to the browser
Files:
script.js
Responsibility: Client-side interaction logic
Key Features:
- Form submission via AJAX
- Display extracted skills
- Trigger AI analysis
- Render markdown summaries
- Download links for exports
style.css
Responsibility: Visual styling
6. templates/
Purpose: Jinja2 HTML templates for Flask
Files:
index.html
Responsibility: Main user interface
Components:
- Search form for job title input
- Results display area
- AI analysis button
- File download section
7. utilidades/
Purpose: Reusable utility functions
Pattern: Static Utility
Files:
text_cleaner.py
Lines: 45 | Responsibility: Text processing and cleaning
Key Features:
- Whitespace normalization
- Special character filtering (preserves C#, C++)
- Length validation (3-250 characters)
- Noise word filtering
- Case-insensitive deduplication
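The cleaning steps above can be combined in a single static method, as in this sketch. The noise-word list, the exact set of preserved symbols, and the constant names are assumptions; the real `text_cleaner.py` may differ.

```python
import re
from typing import Iterable, List


class TextCleaner:
    PALABRAS_RUIDO = {"ver más", "show more"}  # hypothetical noise entries
    LONGITUD_MIN = 3
    LONGITUD_MAX = 250

    @staticmethod
    def limpiar_habilidades(habilidades: Iterable[str]) -> List[str]:
        vistos = set()
        resultado = []
        for texto in habilidades:
            # Collapse runs of whitespace into single spaces.
            texto = re.sub(r"\s+", " ", texto).strip()
            # Drop special characters but keep # + . - / so C# and C++ survive.
            texto = re.sub(r"[^\w\s#+.\-/]", "", texto).strip()
            if not TextCleaner.LONGITUD_MIN <= len(texto) <= TextCleaner.LONGITUD_MAX:
                continue
            clave = texto.lower()  # case-insensitive comparison key
            if clave in TextCleaner.PALABRAS_RUIDO or clave in vistos:
                continue
            vistos.add(clave)
            resultado.append(texto)
        return resultado
```

In this sketch a bare `C#` is only two characters, so with a strict 3-character minimum it is kept only as part of a longer entry such as `C# / .NET`.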
8. Root Files
flask_app.py
Lines: 42 | Responsibility: Web server and routing
Routes:
- GET / - Render search form
- POST /buscar - Execute scraping
- POST /analizar_ia - Request AI analysis
- GET /descargar/<filename> - Download exports
requirements.txt
Responsibility: Python package dependencies
Key Packages:
.env.example
Responsibility: Environment variable template
Naming Conventions
Python Files and Modules
Convention: snake_case
Examples:
- servicio_vacantes.py (business logic)
- text_cleaner.py (utility)
- linkedin_scraper.py (specific implementation)
Directories also use snake_case:
- logica_negocio/
- inteligencia_artificial/
Classes
Convention: PascalCase
Examples:
- JobService (facade)
- ScraperStrategy (interface)
- LinkedInScraper (implementation)
- ExporterFactory (factory)
- AIAnalyzer (service)
Suffixes:
- *Strategy - Strategy pattern interfaces
- *Factory - Factory pattern classes
- *Exporter - Exporter implementations
- *Service - Business logic facades
Methods
Convention: snake_case
Examples:
- procesar_busqueda() (business logic)
- extraer_datos() (scraping)
- limpiar_habilidades() (utility)
- generar_resumen() (AI)
Private methods are prefixed with _:
- _iniciar_navegador() (linkedin_scraper.py:21)
- _cerrar_navegador() (linkedin_scraper.py:34)
Variables
Convention: snake_case
Examples:
- termino_busqueda (parameters)
- habilidades_limpias (results)
- datos_completos (data structures)
- directorio_salida (configuration)
Constants
Convention: UPPER_SNAKE_CASE (implicit; the project defines few constants)
Example:
File Naming Patterns
Base Classes: base_*.py
- base_scraper.py
- base_exporter.py
Implementations: <platform>_<type>.py
- linkedin_scraper.py
- json_exporter.py
- excel_exporter.py
Factories: *_factory.py
- exporter_factory.py
Services: servicio_*.py
- servicio_vacantes.py
File Organization Principles
1. Separation of Concerns
Each directory represents a distinct concern:
- Scraping: Web automation and data extraction
- Business Logic: Workflow orchestration
- Export: Data persistence
- AI: External API integration
- Utilities: Stateless helpers
2. Layered Architecture
3. Dependency Direction
Dependencies flow downward:
- flask_app.py → logica_negocio/
- logica_negocio/ → scraping/, utilidades/, exportadores/, inteligencia_artificial/
- Lower layers have NO dependencies on upper layers
4. Abstract Before Concrete
In each module:
- base_*.py (abstract interface) first
- Concrete implementations second
- Factory (if applicable) last
How to Extend the System
Adding a New Scraper
1. Create concrete implementation:
Adding a New Export Format
1. Create exporter class:
Adding a New Utility Function
1. Add to existing utility class:
Adding a New Route
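A new route in flask_app.py would typically stay thin and delegate to the service layer. The sketch below assumes Flask is installed and uses a hypothetical `GET /formatos` endpoint with a hard-coded list; neither the route nor the `FORMATOS_SOPORTADOS` constant exists in the project as documented.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical constant; in the real app this could come from ExporterFactory.
FORMATOS_SOPORTADOS = ["json", "excel"]


@app.route("/formatos", methods=["GET"])
def listar_formatos():
    """Return the export formats the application can produce."""
    return jsonify({"formatos": FORMATOS_SOPORTADOS})
```

Keeping the handler to a few lines preserves the dependency direction: the route knows about the layer below it, never the other way around.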
Code Organization Best Practices
1. One Class Per File
Exceptions: exporter_factory.py imports multiple exporters but defines only the factory
2. Import Organization
3. File Size Guidelines
- Small (under 50 lines): Interfaces, utilities, factories
- Medium (50-150 lines): Services, exporters, scrapers
- Large (over 150 lines): Only when necessary (linkedin_scraper.py, at 148 lines, comes closest because it handles a complex workflow)
4. Comments in Spanish
Since the target audience is Spanish-speaking:
Testing Structure (Future)
Recommended test organization:
Conclusion
The project structure follows these principles:
- Modularity: Clear boundaries between subsystems
- Naming Consistency: Spanish terms for domain concepts, English for technical patterns
- Pattern Implementation: Directory structure reflects design patterns
- Extensibility: Easy to add scrapers, exporters, and utilities
- Maintainability: Small, focused files with single responsibilities