
Overview

The main entry point for jobspy-js. Scrapes one or more job boards concurrently and returns a unified, flattened result set. All sites are scraped in parallel via Promise.allSettled. If one site fails, the others still return results. Results are sorted by site name (alphabetical), then by date_posted descending (newest first).
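The documented ordering can be sketched as a comparator. This is a hypothetical illustration of the described behavior (site name ascending, then date_posted descending), not the library's actual internals:

```javascript
// Sketch of the documented sort order: site name ascending,
// then date_posted descending (newest first).
function compareJobs(a, b) {
  if (a.site !== b.site) return a.site < b.site ? -1 : 1;
  // Missing dates sort last within a site.
  const da = a.date_posted ?? "";
  const db = b.date_posted ?? "";
  return db < da ? -1 : db > da ? 1 : 0;
}

const jobs = [
  { site: "linkedin", date_posted: "2024-05-01" },
  { site: "indeed", date_posted: "2024-05-03" },
  { site: "indeed", date_posted: "2024-05-04" },
];
jobs.sort(compareJobs);
// Order: indeed 2024-05-04, indeed 2024-05-03, linkedin 2024-05-01
```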

Signature

async function scrapeJobs(
  params?: ScrapeJobsParams
): Promise<ScrapeJobsResult>

Parameters

params
ScrapeJobsParams
default:"{}"
Configuration object for the scraping operation. All fields are optional.

ScrapeJobsParams Properties

site_name
string | string[] | Site | Site[]
default:"all sites"
Job boards to scrape. Accepts enum values, string keys, or arrays. Site names are normalized — "ziprecruiter", "zip_recruiter", and "zip-recruiter" all work. Supported sites: linkedin, indeed, glassdoor, google, google_careers, zip_recruiter, bayt, naukri, bdjobs
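The normalization described above could be sketched as follows. The helper below is hypothetical, assuming lowercasing, hyphen/underscore equivalence, and a "ziprecruiter" alias:

```javascript
// Hypothetical sketch of site-name normalization.
const SITES = new Set([
  "linkedin", "indeed", "glassdoor", "google", "google_careers",
  "zip_recruiter", "bayt", "naukri", "bdjobs",
]);

function normalizeSite(name) {
  const key = name.toLowerCase().replace(/-/g, "_");
  if (SITES.has(key)) return key;
  // Assumed alias: "ziprecruiter" maps to "zip_recruiter"
  if (key === "ziprecruiter") return "zip_recruiter";
  throw new Error(`Unknown site: ${name}`);
}
```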
search_term
string
Job title or search query (e.g. "software engineer", "react developer")
google_search_term
string
Overrides search_term for the Google scraper only. Useful for customizing Google’s broader search syntax.
location
string
Job location (e.g. "San Francisco, CA", "London", "Remote")
distance
number
default:"50"
Search radius in miles from the specified location
is_remote
boolean
default:"false"
Filter for remote jobs only
job_type
string
Filter by employment type. Valid values:
  • "fulltime"
  • "parttime"
  • "contract"
  • "internship"
  • "temporary"
easy_apply
boolean
Filter for easy-apply jobs (supported on LinkedIn, Indeed, Glassdoor)
results_wanted
number
default:"15"
Maximum number of results per site. Total results may be up to results_wanted * number_of_sites.
country_indeed
string
default:"usa"
Country for Indeed and Glassdoor regional domains. See Country Support for the full list of 60+ supported countries.
proxies
string | string[]
Proxy server(s) for rotating requests. Accepts formats:
  • "host:port"
  • "user:pass@host:port"
  • "http://host:port"
  • "socks5://host:port"
Multiple proxies rotate round-robin.
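A minimal sketch of how the accepted formats could be normalized and rotated round-robin. Both helpers are hypothetical illustrations, assuming a default `http://` scheme for bare `host:port` entries:

```javascript
// Normalize a proxy string to a URL; bare "host:port" or
// "user:pass@host:port" entries get a default http:// scheme.
function toProxyUrl(p) {
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(p) ? p : `http://${p}`;
}

// Round-robin rotation over one proxy or a list of proxies.
function makeRotator(proxies) {
  const list = (Array.isArray(proxies) ? proxies : [proxies]).map(toProxyUrl);
  let i = 0;
  return () => list[i++ % list.length];
}

const next = makeRotator(["10.0.0.1:8080", "socks5://10.0.0.2:1080"]);
```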
description_format
string
default:"markdown"
Format for job descriptions. Valid values:
  • "markdown" — converts HTML to Markdown
  • "html" — preserves original HTML
  • "plain" — strips all markup
linkedin_fetch_description
boolean
default:"false"
Fetch full job descriptions from LinkedIn. Requires an extra HTTP request per job (slower).
indeed_fetch_description
boolean
default:"false"
Visit the Indeed job page or direct link to scrape the full description
linkedin_company_ids
number[]
Filter LinkedIn results to specific company IDs. Example: [1441, 1035] for Google and Microsoft.
offset
number
default:"0"
Skip the first N results (pagination offset)
hours_old
number
Only return jobs posted within the last N hours
enforce_annual_salary
boolean
default:"false"
Convert all salary figures to annual equivalents (hourly × 2080, monthly × 12, etc.)
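The conversion factors above can be sketched as a lookup table. The `daily` factor below is an assumption (5 workdays × 52 weeks) not stated in the parameter description:

```javascript
// Documented factors: hourly × 2080 (40 h × 52 weeks), monthly × 12.
const ANNUAL_FACTOR = {
  hourly: 2080,
  daily: 260, // assumption: 5 workdays × 52 weeks
  weekly: 52,
  monthly: 12,
  yearly: 1,
};

function toAnnual(amount, interval) {
  const f = ANNUAL_FACTOR[interval];
  return f === undefined ? amount : amount * f;
}
// toAnnual(50, "hourly") → 104000
```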
verbose
number
default:"0"
Logging verbosity level:
  • 0 — errors only
  • 1 — warnings
  • 2 — all logs
use_creds
boolean
default:"false"
Enable authenticated scraping fallback when anonymous access is blocked (e.g. LinkedIn 429s). Can also be set via JOBSPY_CREDS=1 environment variable.
credentials
ProviderCredentials
Pre-built credentials object. Takes precedence over individual credential fields below.
linkedin_username
string
LinkedIn username/email. Also reads from LINKEDIN_USERNAME environment variable.
linkedin_password
string
LinkedIn password. Also reads from LINKEDIN_PASSWORD environment variable.
indeed_username
string
Indeed username/email. Also reads from INDEED_USERNAME environment variable.
indeed_password
string
Indeed password. Also reads from INDEED_PASSWORD environment variable.
glassdoor_username
string
Glassdoor username/email. Also reads from GLASSDOOR_USERNAME environment variable.
glassdoor_password
string
Glassdoor password. Also reads from GLASSDOOR_PASSWORD environment variable.
ziprecruiter_username
string
ZipRecruiter username/email. Also reads from ZIPRECRUITER_USERNAME environment variable.
ziprecruiter_password
string
ZipRecruiter password. Also reads from ZIPRECRUITER_PASSWORD environment variable.
bayt_username
string
Bayt username/email. Also reads from BAYT_USERNAME environment variable.
bayt_password
string
Bayt password. Also reads from BAYT_PASSWORD environment variable.
naukri_username
string
Naukri username/email. Also reads from NAUKRI_USERNAME environment variable.
naukri_password
string
Naukri password. Also reads from NAUKRI_PASSWORD environment variable.
bdjobs_username
string
BDJobs username/email. Also reads from BDJOBS_USERNAME environment variable.
bdjobs_password
string
BDJobs password. Also reads from BDJOBS_PASSWORD environment variable.
profile
string
Named profile for deduplication tracking. When set, jobspy-js tracks seen job URLs and filters duplicates across runs.
state_file
string
Path to custom state file for profile tracking. Defaults to jobspy.json in current directory.
skip_dedup
boolean
default:"false"
Skip deduplication filtering (state is still updated for next run)
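The interplay of profile, skip_dedup, and seen-URL state can be sketched as below. This is a hypothetical model of the documented behavior (filter repeats by job_url; update state even when skipping), not the library's actual implementation:

```javascript
// Filter jobs already seen in a previous run; "seen" models the
// URL set persisted in the profile's state file.
function dedupe(jobs, seen, { skipDedup = false } = {}) {
  const fresh = [];
  for (const job of jobs) {
    const isNew = !seen.has(job.job_url);
    if (isNew || skipDedup) fresh.push(job);
    seen.add(job.job_url); // state is updated even when skipping
  }
  return fresh;
}

const seen = new Set(["https://example.com/a"]);
const run = dedupe(
  [{ job_url: "https://example.com/a" }, { job_url: "https://example.com/b" }],
  seen
);
// run contains only the "b" posting; seen now holds both URLs
```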

Return Value

jobs
FlatJobRecord[]
Array of job postings. Each job is a flat record with all nested structures expanded to top-level fields.
totalScraped
number
Total number of jobs scraped across all sites before deduplication
newCount
number
Number of new jobs after deduplication filtering (equals totalScraped when not using profiles)
profile
object
Profile information (only present when using profile parameter)
profile.name
string
Profile name
profile.lastRunAt
string | null
ISO timestamp of last run
profile.stateFile
string
Path to state file used for tracking

FlatJobRecord Fields

Each job in the jobs array contains:
id
string
Unique job ID with site prefix (e.g. "li-123", "in-abc")
site
string
Source site key (e.g. "linkedin", "indeed")
job_url
string
Canonical job URL on the board
job_url_direct
string
Direct employer/ATS URL (LinkedIn, Indeed, ZipRecruiter)
title
string
Job title
company
string
Company name
location
string
Formatted as "City, State, Country"
date_posted
string
ISO date "YYYY-MM-DD"
job_type
string
Comma-separated employment types (e.g. "fulltime, contract")
salary_source
string
How salary was obtained: "direct_data" or "description"
interval
string
Pay interval: "yearly", "monthly", "weekly", "daily", or "hourly"
min_amount
number
Minimum salary/pay amount
max_amount
number
Maximum salary/pay amount
currency
string
Currency code (e.g. "USD", "EUR")
is_remote
boolean
Whether the job is remote
job_level
string
Seniority level (e.g. "mid-senior level") — LinkedIn only
job_function
string
Job function category — LinkedIn only
listing_type
string
E.g. "sponsored" — LinkedIn, Indeed
emails
string
Comma-separated emails extracted from description
description
string
Full job description (format per description_format parameter)
company_industry
string
Industry classification — LinkedIn, Indeed
company_url
string
Company page on the job board — LinkedIn, Glassdoor
company_logo
string
Company logo URL — Indeed, Naukri
company_url_direct
string
Company’s own website URL — LinkedIn, Indeed
company_addresses
string
Company address(es) — Indeed
company_num_employees
string
Employee count range — Indeed
company_revenue
string
Revenue range — Indeed
company_description
string
Company description text — Indeed
skills
string
Comma-separated skill tags — Naukri
experience_range
string
Required experience (e.g. "3-5 years") — Naukri
company_rating
number
Company rating (e.g. 4.2) — Naukri
company_reviews_count
number
Number of company reviews — Naukri
vacancy_count
number
Number of open positions — Naukri
work_from_home_type
string
"Remote", "Hybrid", or "Work from office" — Naukri

Examples

import { scrapeJobs } from "jobspy-js";

const result = await scrapeJobs({
  site_name: ["indeed", "linkedin"],
  search_term: "software engineer",
  location: "San Francisco, CA",
  results_wanted: 20,
});

console.log(`Found ${result.jobs.length} jobs`);
for (const job of result.jobs) {
  console.log(`${job.title} at ${job.company} (${job.job_url})`);
}

Remote Jobs with Salary Filter

const result = await scrapeJobs({
  site_name: ["linkedin", "indeed"],
  search_term: "frontend developer",
  is_remote: true,
  enforce_annual_salary: true,
});

const wellPaid = result.jobs.filter(
  (j) => j.min_amount && j.min_amount >= 100000
);
console.log(`${wellPaid.length} remote jobs paying $100k+`);

With Credentials

const result = await scrapeJobs({
  site_name: ["linkedin"],
  search_term: "engineer",
  use_creds: true,
  linkedin_username: process.env.LINKEDIN_USERNAME,
  linkedin_password: process.env.LINKEDIN_PASSWORD,
});

With Profile Deduplication

// First run — returns all 50 jobs, saves state
const run1 = await scrapeJobs({
  site_name: ["linkedin", "indeed"],
  search_term: "react developer",
  location: "New York, NY",
  results_wanted: 50,
  profile: "frontend",
});
console.log(`Found ${run1.newCount} new jobs`);

// Second run (hours later) — returns only new postings
const run2 = await scrapeJobs({
  site_name: ["linkedin", "indeed"],
  search_term: "react developer",
  location: "New York, NY",
  results_wanted: 50,
  profile: "frontend",
});
console.log(`Found ${run2.newCount} new jobs (${run2.totalScraped - run2.newCount} duplicates filtered)`);

International Search

// Search Indeed and Glassdoor in Germany
const result = await scrapeJobs({
  site_name: ["indeed", "glassdoor"],
  search_term: "Softwareentwickler",
  location: "Berlin",
  country_indeed: "germany",
});

LinkedIn Company Filter

// Only jobs from specific LinkedIn company IDs
const result = await scrapeJobs({
  site_name: "linkedin",
  search_term: "product manager",
  linkedin_company_ids: [1441, 1035], // Google, Microsoft
  linkedin_fetch_description: true,
});

Recent Jobs Only

// Jobs posted in the last 24 hours
const result = await scrapeJobs({
  site_name: ["linkedin", "indeed"],
  search_term: "machine learning engineer",
  hours_old: 24,
  description_format: "plain",
});

Error Handling

scrapeJobs() uses Promise.allSettled internally, so individual scraper failures don’t crash the entire call. Failed scrapers are silently skipped — you’ll receive results from whichever sites succeeded.
try {
  const result = await scrapeJobs({
    site_name: ["linkedin", "indeed", "glassdoor"],
    search_term: "developer",
  });

  if (result.jobs.length === 0) {
    console.log("No jobs found — try broadening your search");
  }
} catch (err) {
  // Only throws if all scrapers fail or params are invalid
  console.error("Scrape failed:", err);
}
Set verbose: 2 to see detailed logs from each scraper:
const result = await scrapeJobs({
  search_term: "developer",
  verbose: 2,
});
