This guide will walk you through running your first job analysis from start to finish.

Prerequisites

Before starting, ensure you have:
  • Python 3.10 or higher installed
  • Google Chrome browser installed
  • An OpenAI API key (created from your OpenAI account dashboard)
  • Git installed on your system
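The list above can be checked with a short script before installing anything. This is a minimal sketch and not part of the project: the function name and the Chrome executable names are assumptions, and Chrome is often not on PATH on macOS or Windows, so a negative result for it is not conclusive.

```python
import shutil
import sys

# Executable names Chrome may use on PATH; these vary by platform (assumption).
CHROME_CANDIDATES = ("google-chrome", "google-chrome-stable", "chrome", "chromium")


def check_prerequisites(min_python=(3, 10)):
    """Return a dict mapping each prerequisite to True or False."""
    chrome_found = any(shutil.which(name) for name in CHROME_CANDIDATES)
    return {
        # Tuple comparison: (3, 11, 4, ...) >= (3, 10) is True.
        "python": sys.version_info >= min_python,
        "git": shutil.which("git") is not None,
        # False only means no known Chrome executable was found on PATH.
        "chrome_on_path": chrome_found,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'not found'}")
```

The OpenAI API key is not checked here because it is only configured in a later step.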

Installation Steps

1

Clone the Repository

Clone the project to your local machine:
git clone https://github.com/JuseAR27/Web_Scraping.git
cd Web_Scraping
2

Install Dependencies

Install all required Python packages:
pip install -r requirements.txt
This will install:
  • Flask (web framework)
  • Selenium (web automation)
  • BeautifulSoup4 (HTML parsing)
  • Pandas & Openpyxl (data export)
  • OpenAI (AI analysis)
  • python-dotenv (environment management)
3

Configure Your API Key

Create a .env file in the project root:
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
Replace your_openai_api_key_here with your actual OpenAI API key.
Never commit your .env file to version control. It should already be in .gitignore.
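To confirm the key was written correctly, a small standard-library check can read the file back. This is a sketch under assumptions: the helper names are hypothetical, and the application itself presumably loads the file through python-dotenv (listed in the dependencies) rather than through code like this.

```python
from pathlib import Path

# The placeholder text used in the echo command above.
PLACEHOLDER = "your_openai_api_key_here"


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_is_configured(path=".env"):
    """True when OPENAI_API_KEY is present, non-empty, and not the placeholder."""
    if not Path(path).is_file():
        return False
    key = read_env_file(path).get("OPENAI_API_KEY", "")
    return bool(key) and key != PLACEHOLDER
```

Calling api_key_is_configured() from the project root returns False while the placeholder is still in place, which catches the most common setup mistake.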
4

Start the Flask Server

Launch the application:
python flask_app.py
You should see output similar to:
* Serving Flask app 'flask_app'
* Debug mode: on
* Running on http://127.0.0.1:5000
5

Open the Web Interface

Open your web browser and navigate to:
http://127.0.0.1:5000
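If the page does not load, it can help to confirm that the server is answering at all. The following is a minimal sketch using only the standard library; the function name is hypothetical, and any HTTP response (including an error status) is treated as proof that the server is running.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:5000", timeout=3.0):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with a non-2xx status, so it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    print("server reachable" if server_is_up() else "server not reachable")
```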

Running Your First Analysis

1

Enter a Job Search Term

In the web interface, enter a job title in the search field. For example:
  • “Frontend Developer”
  • “Data Scientist”
  • “DevOps Engineer”
  • “Product Manager”
2

Extract Job Data

Click the “Extraer Datos” (Extract Data) button. The scraper will:
  • Open a Chrome browser instance (you’ll see it)
  • Navigate to LinkedIn
  • Search for the job title
  • Extract job information and skills
  • Close the browser automatically
The scraping process typically takes 10-30 seconds depending on your internet connection.
3

Review Extracted Data

Once extraction completes, you’ll see:
  • Job Title: The exact title from LinkedIn
  • Skills List: Cleaned and filtered required skills
  • Download Links: For JSON and Excel exports
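This guide does not describe how the skills list is cleaned and filtered. One plausible approach, shown purely as an illustration, is to trim whitespace, drop empty or unwanted entries, and remove case-insensitive duplicates. The function name and the blocklist parameter are assumptions, not the project's actual code.

```python
def clean_skills(raw_skills, blocklist=()):
    """Trim, drop empty and blocklisted entries, dedupe case-insensitively."""
    blocked = {item.casefold() for item in blocklist}
    seen = set()
    cleaned = []
    for skill in raw_skills:
        name = " ".join(skill.split())  # trim and collapse internal whitespace
        key = name.casefold()
        if not name or key in blocked or key in seen:
            continue
        seen.add(key)
        cleaned.append(name)  # keep the first spelling encountered
    return cleaned


if __name__ == "__main__":
    print(clean_skills(["React", " react ", "", "TypeScript"]))
```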
4

Generate AI Summary (Optional)

Click the “Generar Resumen Estratégico” (Generate Strategic Summary) button. The AI will analyze the extracted data and provide:
  • Professional summary of the position
  • Technology stack breakdown
  • Required experience level
  • Key insights and recommendations
This step uses your OpenAI API credits. Only generate summaries when needed.
5

Export Your Data

Download the extracted data in your preferred format:
JSON Format: Structured data for programmatic use
{
  "termino_busqueda": "Frontend Developer",
  "titulo_oferta": "Senior Frontend Engineer",
  "url": "https://linkedin.com/jobs/...",
  "habilidades": ["React", "TypeScript", "CSS"],
  "fecha_extraccion": "2026-03-07 14:30:45"
}
Excel Format: Spreadsheet for analysis and sharing
  • Organized in rows and columns
  • Easy to filter and sort
  • Shareable with non-technical stakeholders

Understanding the Output

Extracted Data Fields

Each job analysis contains:
Field              Description
-----------------  ------------------------------------
termino_busqueda   Your original search term
titulo_oferta      Actual job title from LinkedIn
url                Direct link to the job posting
habilidades        List of cleaned, extracted skills
fecha_extraccion   Timestamp of when data was extracted
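For programmatic use, these fields map naturally onto a small typed record. The sketch below assumes that a JSON export holds a single object with exactly these keys and that the timestamp follows the format of the sample value shown earlier ("2026-03-07 14:30:45"). The class name is hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JobRecord:
    termino_busqueda: str
    titulo_oferta: str
    url: str
    habilidades: list[str]
    fecha_extraccion: datetime

    @classmethod
    def from_json(cls, text):
        """Build a record from the text of one JSON export."""
        data = json.loads(text)
        return cls(
            termino_busqueda=data["termino_busqueda"],
            titulo_oferta=data["titulo_oferta"],
            url=data["url"],
            habilidades=list(data["habilidades"]),
            # Timestamp format inferred from the sample export (assumption).
            fecha_extraccion=datetime.strptime(
                data["fecha_extraccion"], "%Y-%m-%d %H:%M:%S"
            ),
        )
```

A missing key raises KeyError, which makes an unexpected export layout visible immediately instead of producing a partly filled record.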

File Locations

All exported files are saved in the datos_extraidos/ directory with timestamps:
datos_extraidos/
├── linkedin_20260307_143045.json
└── linkedin_20260307_143045.xlsx
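The naming pattern above (linkedin_YYYYMMDD_HHMMSS) can be reproduced or searched with a few lines of Python. This is a sketch with hypothetical helper names; it relies only on the pattern visible in the listing, not on the project's export code.

```python
from datetime import datetime
from pathlib import Path

EXPORT_DIR = Path("datos_extraidos")


def export_paths(when, export_dir=EXPORT_DIR):
    """Return the (json, xlsx) paths matching the timestamped pattern."""
    stem = f"linkedin_{when:%Y%m%d_%H%M%S}"
    return export_dir / f"{stem}.json", export_dir / f"{stem}.xlsx"


def latest_json_export(export_dir=EXPORT_DIR):
    """Return the newest JSON export, or None if there are none.

    Because the timestamp is zero-padded, sorting by name is chronological.
    """
    files = sorted(Path(export_dir).glob("linkedin_*.json"))
    return files[-1] if files else None


if __name__ == "__main__":
    print(export_paths(datetime.now()))
    print(latest_json_export())
```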

Common Use Patterns

Analyzing Multiple Jobs

  1. Search for a job title
  2. Extract and export data
  3. Repeat with different search terms
  4. Compare exported Excel files to identify common skills
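The comparison in the last step can also be automated. The sketch below reads the JSON exports instead of the Excel files, so that it needs only the standard library, and counts in how many exports each skill appears. It assumes one record per file with a habilidades list; the function name is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def skill_frequencies(export_dir="datos_extraidos"):
    """Count in how many JSON exports each skill appears."""
    counts = Counter()
    for path in sorted(Path(export_dir).glob("linkedin_*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        # A set per file counts each skill once per export; casefold makes
        # "React" and "react" count as the same skill.
        counts.update({skill.casefold() for skill in record.get("habilidades", [])})
    return counts


if __name__ == "__main__":
    for skill, n in skill_frequencies().most_common(10):
        print(f"{skill}: {n}")
```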

Tracking Skill Trends

  1. Run the same search periodically (weekly/monthly)
  2. Export results with timestamps
  3. Analyze how skill requirements evolve
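One simple way to see how requirements evolve is to diff the skills from an earlier and a later export of the same search. This is a minimal sketch with a hypothetical function name; it compares two plain lists of skills, such as the habilidades arrays from two JSON exports.

```python
def skill_changes(earlier, later):
    """Report which skills appeared, disappeared, or stayed between two runs."""
    old = {skill.casefold() for skill in earlier}
    new = {skill.casefold() for skill in later}
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "kept": sorted(old & new),
    }


if __name__ == "__main__":
    print(skill_changes(["React", "CSS", "jQuery"], ["react", "TypeScript", "CSS"]))
```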

Interview Preparation

  1. Search for your target position
  2. Generate AI summary for insights
  3. Review extracted skills list
  4. Focus preparation on most common requirements

Next Steps

Installation Guide

Learn about prerequisites, troubleshooting, and advanced setup

Architecture

Understand the project structure and design patterns

Need Help?

If you encounter issues:
  1. Check the Installation Guide for troubleshooting tips
  2. Verify your OpenAI API key is correctly set in .env
  3. Ensure Chrome browser is installed and up to date
  4. Check that all dependencies installed successfully
