Overview
Google Jobs is a job search aggregator that pulls listings from across the web (company sites, job boards, etc.). JobSpy JS scrapes Google Jobs using Playwright with headless Chrome to execute JavaScript and parse the dynamic search results.
Scraping Method
Playwright (headless Chrome) navigating https://www.google.com/search?udm=8
- Launches a real Chrome browser instance
- Executes JavaScript to load dynamic job listings
- Parses structured JSON data embedded in the page HTML
- Supports cursor-based pagination
- Requires Playwright Chromium installation
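As a rough illustration of the navigation target, the sketch below builds the search URL from a query string. The helper name buildGoogleJobsUrl is hypothetical, and passing the query through the standard q parameter is an assumption; only the base URL and udm=8 come from the text above.

```typescript
// Base endpoint and the udm=8 value are taken from the URL shown above.
const GOOGLE_JOBS_BASE = "https://www.google.com/search";

// Hypothetical helper: builds the URL a headless browser would navigate to.
// Sending the query as the `q` parameter is an assumption, not confirmed by the source.
function buildGoogleJobsUrl(query: string): string {
  const url = new URL(GOOGLE_JOBS_BASE);
  url.searchParams.set("udm", "8");
  url.searchParams.set("q", query); // URLSearchParams encodes spaces as "+"
  return url.toString();
}

console.log(buildGoogleJobsUrl("software engineer jobs near San Francisco"));
// https://www.google.com/search?udm=8&q=software+engineer+jobs+near+San+Francisco
```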
Installation
After installing jobspy-js, install Playwright’s Chromium browser (with the Playwright CLI: npx playwright install chromium).
Google Jobs is the slowest scraper due to browser automation. Expect ~5-10 seconds per page.
IP Requirements
Using proxies
Google Jobs-Specific Parameters
google_search_term
Google Jobs search is based on natural language queries. You can override the auto-generated query with a custom one:
If google_search_term is not provided, JobSpy builds a query from:
- search_term + “jobs”
- location (“near ” followed by the location)
- job_type (“Full time”, “Part time”, etc.)
- hours_old (“since yesterday”, “in the last week”, etc.)
- is_remote (“remote”)
Example of an auto-generated query:
"software engineer jobs Full time near San Francisco remote"
Query refinement tips
Example Usage
Basic search
Remote jobs posted recently
Custom search query
Filter by job type
With residential proxy
Supported Filters
| Filter | Support | Notes |
|---|---|---|
| search_term | ✅ | Job title or keywords (used in auto-generated query) |
| location | ✅ | City, state, or region (added as “near ” plus the location) |
| is_remote | ✅ | Appends “remote” to search query |
| job_type | ✅ | Appends “Full time”, “Part time”, etc. to query |
| hours_old | ✅ | Adds recency filter (“since yesterday”, “in the last week”) |
| google_search_term | ✅ | Override auto-generated query |
| distance | ❌ | Not supported (Google Jobs doesn’t expose this filter) |
| easy_apply | ❌ | Not supported |
Google Jobs filters are applied via natural language query construction, not API parameters. For best results, use the google_search_term override.
Returned Fields
Google Jobs returns minimal but structured data:
- id, title, company_name, location, job_url, date_posted
- description (short snippet, not full description)
- is_remote (inferred from description)
- job_type (extracted from description text)
- emails (extracted from description)
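To make the derived fields concrete, here is a minimal sketch of how emails and is_remote might be computed from the description snippet. The GoogleJob interface, its field types, and the helper names are hypothetical; the email pattern and the keyword check are simple stand-ins, not the heuristics JobSpy necessarily uses.

```typescript
// Hypothetical record shape; field names come from the list above, types are assumptions.
interface GoogleJob {
  id: string;
  title: string;
  company_name: string;
  location: string;
  job_url: string;
  date_posted: string | null;
  description: string;
  is_remote: boolean;
  job_type: string[];
  emails: string[];
}

// Deliberately loose email pattern; it does not cover every valid address.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Returns the unique email addresses found in the snippet, in order of appearance.
function extractEmails(description: string): string[] {
  return Array.from(new Set(description.match(EMAIL_PATTERN) ?? []));
}

// Assumed heuristic: a posting counts as remote if the snippet mentions "remote".
function inferRemote(description: string): boolean {
  return /\bremote\b/i.test(description);
}

// Fills in the two description-derived fields on an otherwise complete record.
function withDerivedFields(base: Omit<GoogleJob, "is_remote" | "emails">): GoogleJob {
  return {
    ...base,
    is_remote: inferRemote(base.description),
    emails: extractEmails(base.description),
  };
}
```

Since both values depend only on a short snippet, they can miss information that appears only in the full posting at job_url.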
Rate Limits & Best Practices
Use residential IPs/proxies
Google Jobs is very strict about bot detection. Always use:
- Your home/office internet (residential IP)
- High-quality residential proxies
Avoid:
- Cloud server IPs (AWS, GCP, Azure, DigitalOcean, etc.)
- Datacenter proxies
- Free/cheap VPNs
Keep results_wanted moderate
Each page load takes 5-10 seconds. Limit your results to avoid long wait times.
Avoid rapid successive runs
Google tracks browser sessions and may block repeated searches. Add delays between runs.
Headless mode
JobSpy runs Chrome in headless mode by default. If you encounter blocks, consider running in headful mode during development (requires modifying source code).
Troubleshooting
Empty results or blocks
Symptom: 0 results returned or browser hangs
Cause: Google detected bot traffic and blocked the request
Solutions:
- Use a residential IP or high-quality residential proxy
- Avoid running multiple searches in quick succession
- Try a different proxy or IP
- Simplify your search query
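One way to combine "avoid quick succession" and "try a different proxy" is a retry loop with growing, jittered delays. The sketch below is only an illustration: search is a hypothetical caller-supplied function wrapping whatever scraper call you use, and the delay values are arbitrary assumptions rather than limits documented by Google or JobSpy.

```typescript
// Exponential backoff with jitter. `rand` is injectable so the result is reproducible in tests.
function nextDelayMs(
  attempt: number,
  baseMs = 5000,
  maxMs = 120000,
  rand: () => number = Math.random,
): number {
  const exponential = Math.min(maxMs, baseMs * 2 ** attempt);
  // Scale to between 50% and 100% of the exponential delay.
  return Math.round(exponential * (0.5 + rand() / 2));
}

// Rotates through the proxy list by attempt number; returns undefined when the list is empty.
function pickProxy(proxies: string[], attempt: number): string | undefined {
  if (proxies.length === 0) return undefined;
  return proxies[attempt % proxies.length];
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries an empty search with a different proxy after a delay.
// `search` is hypothetical: it should run one scrape through the given proxy.
async function searchWithRetries<T>(
  search: (proxy?: string) => Promise<T[]>,
  proxies: string[],
  maxAttempts = 3,
): Promise<T[]> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const results = await search(pickProxy(proxies, attempt));
    if (results.length > 0) return results;
    if (attempt < maxAttempts - 1) await sleep(nextDelayMs(attempt));
  }
  return [];
}
```

An empty result can also mean the query genuinely has no matches, so it is worth keeping maxAttempts small.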
“Chromium not found” error
Symptom: browserType.launch: Executable doesn't exist
Solution: Install Playwright Chromium (npx playwright install chromium).
Slow performance
Symptom: Scraping takes several minutes
Cause: Google Jobs requires launching a real browser and waiting for JavaScript execution
Solutions:
- Reduce results_wanted
- Use a faster scraper (Indeed, LinkedIn, Glassdoor) for bulk data
- Run Google Jobs searches in parallel (separate processes)
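To judge how far to reduce results_wanted, a back-of-the-envelope estimate can help. The sketch below uses the 5-10 seconds per page figure from this page and assumes roughly 10 listings per page, which is only hinted at by the "at most 10 results" error message; the function name is hypothetical and browser startup time is not included.

```typescript
// Assumption: each results page yields about 10 listings.
const RESULTS_PER_PAGE = 10;
// Per-page load time range stated on this page.
const SECONDS_PER_PAGE = { min: 5, max: 10 };

// Rough lower and upper bound on scraping time for a given results_wanted.
function estimateRuntime(resultsWanted: number): {
  pages: number;
  minSeconds: number;
  maxSeconds: number;
} {
  const pages = Math.max(1, Math.ceil(resultsWanted / RESULTS_PER_PAGE));
  return {
    pages,
    minSeconds: pages * SECONDS_PER_PAGE.min,
    maxSeconds: pages * SECONDS_PER_PAGE.max,
  };
}

console.log(estimateRuntime(50));
// { pages: 5, minSeconds: 25, maxSeconds: 50 }
```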
Missing job descriptions
Symptom: description field contains only a short snippet
Note: This is expected. Google Jobs only shows short previews. Visit the job_url for full details, or use a different scraper (Indeed, LinkedIn) if you need full descriptions.
Initial cursor not found
Symptom: “initial cursor not found, try changing your query or there was at most 10 results”
Cause: Google didn’t return enough results to generate a pagination cursor
Solutions:
- Use broader search terms
- Remove restrictive filters (e.g., hours_old, job_type)
- Try a different location
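Removing restrictive filters can be done step by step instead of all at once. The sketch below generates progressively looser filter sets to try in order; SearchFilters and relaxationSteps are hypothetical names, and dropping hours_old before job_type is an arbitrary choice.

```typescript
// Hypothetical subset of the search options discussed on this page.
interface SearchFilters {
  search_term: string;
  location?: string;
  job_type?: string;
  hours_old?: number;
}

// Returns the original filters, then a copy without hours_old,
// then a copy without hours_old and job_type (skipping steps that change nothing).
function relaxationSteps(filters: SearchFilters): SearchFilters[] {
  const steps: SearchFilters[] = [{ ...filters }];
  const current: SearchFilters = { ...filters };
  if (current.hours_old !== undefined) {
    delete current.hours_old;
    steps.push({ ...current });
  }
  if (current.job_type !== undefined) {
    delete current.job_type;
    steps.push({ ...current });
  }
  return steps;
}
```

A caller would run the search with each step in turn and stop at the first one that returns a pagination cursor.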
CLI Examples
Performance Tips
- Combine with faster scrapers: Use Google Jobs for unique listings, but rely on Indeed/LinkedIn for bulk data
- Use specific queries: Narrower searches return faster
- Limit results: Request only what you need
- Proxy rotation: Use multiple residential proxies for parallel searches
Source Code
- Implementation: ~/workspace/source/src/scrapers/google/index.ts
- Key: google
- Site enum: Site.GOOGLE
