The job management tools allow you to monitor and control asynchronous scraping operations.
These tools are disabled in read-only mode. Configure readOnly: false to enable job management.
list_jobs
List all indexing jobs with optional status filtering.
MCP Tool Name: list_jobs
Source: src/tools/ListJobsTool.ts
Parameters
status (optional) - Filter jobs by status. Valid values:
"queued" - Job is waiting to start
"running" - Job is actively processing
"completed" - Job finished successfully
"failed" - Job encountered an error
"cancelling" - Job is in the process of being cancelled
"cancelled" - Job was successfully cancelled
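Because the filter accepts only these six strings, a caller may want to validate user input before sending it. The following is a minimal sketch, not part of the tool itself; `isPipelineJobStatus` and `buildListJobsArgs` are hypothetical helper names introduced for illustration.

```typescript
// The six status strings documented above.
const PIPELINE_JOB_STATUSES = [
  "queued",
  "running",
  "completed",
  "failed",
  "cancelling",
  "cancelled",
] as const;

type PipelineJobStatus = (typeof PIPELINE_JOB_STATUSES)[number];

// Hypothetical helper: narrows arbitrary input to a valid status string.
function isPipelineJobStatus(value: unknown): value is PipelineJobStatus {
  return (
    typeof value === "string" &&
    (PIPELINE_JOB_STATUSES as readonly string[]).includes(value)
  );
}

// Hypothetical helper: builds list_jobs arguments, dropping an invalid
// filter instead of sending it to the server.
function buildListJobsArgs(status?: unknown): { status?: PipelineJobStatus } {
  return isPipelineJobStatus(status) ? { status } : {};
}
```

Note that the check is case-sensitive, so "RUNNING" is rejected here even though the database status values use upper case.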
Response Structure
jobs - Array of job information objects
Each job object contains:
library - Library name being indexed
version - Version being indexed (null for unversioned)
dbStatus - Database status (may differ from pipeline status)
createdAt - ISO timestamp when job was created
startedAt - ISO timestamp when job started (null if not started)
finishedAt - ISO timestamp when job finished (null if still running)
error - Error message if job failed
progress - Progress information (only present when job is running)
progress.pages - Number of pages scraped so far
progress.totalDiscovered - Total URLs discovered during crawling
updatedAt - ISO timestamp of last update
errorMessage - Detailed error message from database
TypeScript Types
interface ListJobsInput {
  status?: PipelineJobStatus;
}
interface ListJobsToolResponse {
  jobs: JobInfo[];
}
interface JobInfo {
  id: string;
  library: string;
  version: string | null;
  status: PipelineJobStatus;
  dbStatus?: VersionStatus;
  createdAt: string;
  startedAt: string | null;
  finishedAt: string | null;
  error: string | null;
  progress?: {
    pages: number;
    totalPages: number;
    totalDiscovered: number;
  };
  updatedAt?: string;
  errorMessage?: string;
}
type PipelineJobStatus =
  | "queued"
  | "running"
  | "completed"
  | "failed"
  | "cancelling"
  | "cancelled";
type VersionStatus =
  | "COMPLETED"
  | "IN_PROGRESS"
  | "FAILED"
  | "QUEUED"
  | "CANCELLED";
See src/tools/GetJobInfoTool.ts:15-36 for the JobInfo type definition.
Example Requests
List All Jobs
{
"name": "list_jobs",
"arguments": {}
}
List Running Jobs
{
"name": "list_jobs",
"arguments": {
"status": "running"
}
}
List Failed Jobs
{
"name": "list_jobs",
"arguments": {
"status": "failed"
}
}
Example Response
{
  "jobs": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "library": "react",
      "version": "18.2.0",
      "status": "running",
      "dbStatus": "IN_PROGRESS",
      "createdAt": "2024-03-15T10:30:00.000Z",
      "startedAt": "2024-03-15T10:30:05.000Z",
      "finishedAt": null,
      "error": null,
      "progress": {
        "pages": 45,
        "totalPages": 100,
        "totalDiscovered": 67
      },
      "updatedAt": "2024-03-15T10:35:12.000Z"
    },
    {
      "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
      "library": "typescript",
      "version": "5.4.2",
      "status": "completed",
      "dbStatus": "COMPLETED",
      "createdAt": "2024-03-15T09:00:00.000Z",
      "startedAt": "2024-03-15T09:00:03.000Z",
      "finishedAt": "2024-03-15T09:15:42.000Z",
      "error": null,
      "updatedAt": "2024-03-15T09:15:42.000Z"
    }
  ]
}
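Since progress is only present while a job is running, any code that displays it has to handle its absence. Here is a small sketch of one way to turn the progress object into a percentage; `progressPercent` and the trimmed `JobLike` interface are illustrative assumptions, not part of the server's API.

```typescript
// Minimal subset of the JobInfo fields used by this sketch.
interface JobProgress {
  pages: number;
  totalPages: number;
  totalDiscovered: number;
}

interface JobLike {
  status: string;
  progress?: JobProgress;
}

// Hypothetical helper: percentage of pages scraped so far. Returns null
// when progress is absent or totalPages is zero (avoids dividing by zero).
function progressPercent(job: JobLike): number | null {
  if (!job.progress || job.progress.totalPages <= 0) return null;
  const ratio = job.progress.pages / job.progress.totalPages;
  // Clamp to 100 in case pages temporarily exceeds totalPages.
  return Math.round(Math.min(ratio, 1) * 100);
}
```

For the running job in the example response above this yields 45, and for the completed job (which has no progress object) it yields null.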
MCP Output
When called through MCP (see src/mcp/mcpServer.ts:312):
Current Jobs:
- ID: 550e8400-e29b-41d4-a716-446655440000
Status: running
Library: react
Version: 18.2.0
Created: 2024-03-15T10:30:00.000Z
Started: 2024-03-15T10:30:05.000Z
- ID: 7c9e6679-7425-40de-944b-e07fc1f90ae7
Status: completed
Library: typescript
Version: 5.4.2
Created: 2024-03-15T09:00:00.000Z
Started: 2024-03-15T09:00:03.000Z
Finished: 2024-03-15T09:15:42.000Z
If no jobs found:
get_job_info
Get detailed information for a specific job.
MCP Tool Name: get_job_info
Source: src/tools/GetJobInfoTool.ts
Parameters
jobId (required) - ID of the job to retrieve
Response Structure
job - Complete job information (see JobInfo type above)
TypeScript Types
interface GetJobInfoInput {
  jobId: string;
}
interface GetJobInfoToolResponse {
  job: JobInfo;
}
See src/tools/GetJobInfoTool.ts:9-43 for complete type definitions.
Example Request
{
"name": "get_job_info",
"arguments": {
"jobId": "550e8400-e29b-41d4-a716-446655440000"
}
}
Example Response
{
  "job": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "library": "react",
    "version": "18.2.0",
    "status": "running",
    "dbStatus": "IN_PROGRESS",
    "createdAt": "2024-03-15T10:30:00.000Z",
    "startedAt": "2024-03-15T10:30:05.000Z",
    "finishedAt": null,
    "error": null,
    "progress": {
      "pages": 45,
      "totalPages": 100,
      "totalDiscovered": 67
    },
    "updatedAt": "2024-03-15T10:35:12.000Z"
  }
}
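The startedAt and finishedAt timestamps can be combined to work out how long a job has been running. The sketch below is one possible approach under the nullability rules described for JobInfo; `runtimeMs` is a hypothetical helper name, not something the server provides.

```typescript
// Only the timestamp fields of JobInfo are needed for this sketch.
interface JobTimestamps {
  startedAt: string | null;
  finishedAt: string | null;
}

// Hypothetical helper: elapsed run time in milliseconds.
// - Returns null if the job has not started yet.
// - Uses finishedAt when present, otherwise measures up to nowMs.
function runtimeMs(job: JobTimestamps, nowMs: number = Date.now()): number | null {
  if (job.startedAt === null) return null;
  const start = Date.parse(job.startedAt);
  const end = job.finishedAt !== null ? Date.parse(job.finishedAt) : nowMs;
  return end - start;
}
```

Passing the current time as a parameter keeps the function deterministic, which makes it easier to test than calling Date.now() inline.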
Error Cases
Invalid Job ID:
{
"error": "Job ID is required and must be a non-empty string."
}
Job Not Found:
{
"error": "Job with ID 550e8400-e29b-41d4-a716-446655440000 not found."
}
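A client may want to distinguish these two error cases, for example to stop polling when a job no longer exists. The following sketch classifies errors by matching the message text shown above. This is an assumption-laden approach: the messages are not a stable contract, and `classifyJobError` is a hypothetical helper.

```typescript
type JobErrorKind = "not_found" | "invalid_input" | "other";

// Hypothetical helper: maps the documented error messages to a coarse kind.
// Matching on message text is brittle; treat "other" as the safe default.
function classifyJobError(message: string): JobErrorKind {
  if (message.startsWith("Job ID is required")) return "invalid_input";
  if (message.startsWith("Job with ID") && message.endsWith("not found.")) {
    return "not_found";
  }
  return "other";
}
```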
MCP Output
When called through MCP (see src/mcp/mcpServer.ts:357):
Job Info:
- ID: 550e8400-e29b-41d4-a716-446655440000
Status: running
Library: react@18.2.0
Created: 2024-03-15T10:30:00.000Z
Started: 2024-03-15T10:30:05.000Z
cancel_job
Cancel a queued or running job.
MCP Tool Name: cancel_job
Source: src/tools/CancelJobTool.ts
This is a destructive operation. Cancelled jobs cannot be resumed.
Parameters
jobId (required) - UUID of the job to cancel
Response Structure
message - Outcome message describing the cancellation result
finalStatus - Final status of the job after cancellation attempt
TypeScript Types
interface CancelJobInput {
  jobId: string;
}
interface CancelJobResult {
  message: string;
  finalStatus: string;
}
See src/tools/CancelJobTool.ts:9-22 for complete type definitions.
Example Request
{
"name": "cancel_job",
"arguments": {
"jobId": "550e8400-e29b-41d4-a716-446655440000"
}
}
Example Responses
Successfully Cancelled
{
"message": "Cancellation requested for job 550e8400-e29b-41d4-a716-446655440000. Current status: cancelling.",
"finalStatus": "cancelling"
}
Already Completed
{
"message": "Job 550e8400-e29b-41d4-a716-446655440000 is already completed. No action taken.",
"finalStatus": "completed"
}
Already Failed
{
"message": "Job 550e8400-e29b-41d4-a716-446655440000 is already failed. No action taken.",
"finalStatus": "failed"
}
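Because a successful call can mean either "cancellation started" or "nothing to do", callers usually need to inspect the result rather than treat every response the same way. Below is a rough sketch of such a check based only on the example responses above; `interpretCancel` and the `CancelOutcome` labels are hypothetical.

```typescript
interface CancelJobResult {
  message: string;
  finalStatus: string;
}

type CancelOutcome = "no_action" | "pending" | "cancelled" | "unknown";

// Hypothetical helper: summarizes a cancel_job result.
// Assumes the "No action taken." suffix shown in the examples above.
function interpretCancel(result: CancelJobResult): CancelOutcome {
  if (result.message.endsWith("No action taken.")) return "no_action";
  if (result.finalStatus === "cancelling") return "pending"; // shutdown still in progress
  if (result.finalStatus === "cancelled") return "cancelled";
  return "unknown";
}
```

A "pending" outcome suggests following up with get_job_info until the status reaches cancelled.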
Error Cases
Invalid Job ID:
{
"error": "Job ID is required and must be a non-empty string."
}
Job Not Found:
{
"error": "Job with ID 550e8400-e29b-41d4-a716-446655440000 not found."
}
Cancellation Failed:
{
"error": "Failed to cancel job 550e8400-e29b-41d4-a716-446655440000: Internal error."
}
MCP Output
When called through MCP (see src/mcp/mcpServer.ts:390):
Cancellation requested for job 550e8400-e29b-41d4-a716-446655440000. Current status: cancelling.
Implementation Details
The tool checks job status before attempting cancellation:
async execute(input: CancelJobInput): Promise<CancelJobResult> {
  if (!input.jobId || typeof input.jobId !== "string" || input.jobId.trim() === "") {
    throw new ValidationError(
      "Job ID is required and must be a non-empty string.",
      this.constructor.name,
    );
  }
  try {
    // Retrieve the job first
    const job = await this.pipeline.getJob(input.jobId);
    if (!job) {
      throw new ToolError(
        `Job with ID ${input.jobId} not found.`,
        this.constructor.name,
      );
    }
    // Check if job is in a final state
    if (
      job.status === PipelineJobStatus.COMPLETED ||
      job.status === PipelineJobStatus.FAILED ||
      job.status === PipelineJobStatus.CANCELLED
    ) {
      return {
        message: `Job ${input.jobId} is already ${job.status}. No action taken.`,
        finalStatus: job.status,
      };
    }
    // Attempt cancellation
    await this.pipeline.cancelJob(input.jobId);
    // Re-fetch to confirm status change
    const updatedJob = await this.pipeline.getJob(input.jobId);
    const finalStatus = updatedJob?.status ?? "UNKNOWN (job disappeared?)";
    return {
      message: `Cancellation requested for job ${input.jobId}. Current status: ${finalStatus}.`,
      finalStatus,
    };
  } catch (error) {
    logger.error(`❌ Error cancelling job ${input.jobId}: ${error}`);
    // A caught value is `unknown` under strict TypeScript, so narrow it before reading .message
    const reason = error instanceof Error ? error.message : String(error);
    throw new ToolError(
      `Failed to cancel job ${input.jobId}: ${reason}`,
      this.constructor.name,
    );
  }
}
See src/tools/CancelJobTool.ts:45 for the complete implementation.
Cancellation Behavior
- Queued Jobs: Removed from queue immediately, status becomes cancelled
- Running Jobs: Graceful shutdown initiated, status becomes cancelling
- Final State Jobs: No action taken, returns current status
Status Transitions
queued → cancelled
running → cancelling → cancelled
completed (no change)
failed (no change)
cancelled (no change)
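The distinction between final and non-final states can be captured in a small predicate, which is handy when deciding whether calling cancel_job is worthwhile. This is an illustrative sketch mirroring the final-state check in the implementation above; `isFinalState` and `isCancellable` are hypothetical names.

```typescript
type PipelineJobStatus =
  | "queued"
  | "running"
  | "completed"
  | "failed"
  | "cancelling"
  | "cancelled";

// States in which cancel_job takes no action.
const FINAL_STATES: ReadonlySet<PipelineJobStatus> = new Set<PipelineJobStatus>([
  "completed",
  "failed",
  "cancelled",
]);

// Hypothetical helper: true once a job can no longer change status.
function isFinalState(status: PipelineJobStatus): boolean {
  return FINAL_STATES.has(status);
}

// Hypothetical helper: true when a cancel request would start a cancellation.
// "cancelling" is excluded because a cancellation is already underway.
function isCancellable(status: PipelineJobStatus): boolean {
  return status === "queued" || status === "running";
}
```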
Resource Endpoints
Job data is also available through MCP resources (when not in read-only mode):
docs://jobs
List all jobs (supports ?status=<status> query parameter).
docs://jobs/{jobId}
Get a specific job by ID.
See src/mcp/mcpServer.ts:543-638 for resource implementation.
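As a rough illustration, the two resource URIs can be built with small helpers like the ones below. The helper names `jobsResourceUri` and `jobResourceUri` are assumptions for this sketch; only the URI shapes come from the endpoints listed above.

```typescript
type PipelineJobStatus =
  | "queued"
  | "running"
  | "completed"
  | "failed"
  | "cancelling"
  | "cancelled";

// Hypothetical helper: URI for the job list, with an optional status filter.
function jobsResourceUri(status?: PipelineJobStatus): string {
  return status
    ? `docs://jobs?status=${encodeURIComponent(status)}`
    : "docs://jobs";
}

// Hypothetical helper: URI for a single job. The ID is percent-encoded
// so unexpected characters cannot alter the path.
function jobResourceUri(jobId: string): string {
  return `docs://jobs/${encodeURIComponent(jobId)}`;
}
```

How these URIs are then read depends on the MCP client in use, so that step is left out here.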
Usage Examples
Monitor Job Progress
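// Assumes scrapeDocs and getJobInfo are client-side wrappers around the MCP tools.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
// Start scraping
const { jobId } = await scrapeDocs({
  library: "react",
  version: "18.2.0",
  url: "https://react.dev/",
});
// Poll while the job is queued or running
let job;
do {
  await sleep(5000); // Wait 5 seconds
  const result = await getJobInfo({ jobId });
  job = result.job;
  if (job.progress) {
    console.log(`Progress: ${job.progress.pages}/${job.progress.totalPages}`);
  }
} while (job.status === "running" || job.status === "queued");
if (job.status === "completed") {
  console.log("✅ Scraping completed successfully!");
} else if (job.status === "failed") {
  console.error(`❌ Scraping failed: ${job.error}`);
} else {
  // The job was cancelled (or is cancelling) while we were polling
  console.warn(`Scraping stopped with status: ${job.status}`);
}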
Cancel Long-Running Job
// List running jobs
const { jobs } = await listJobs({ status: "running" });
// Find job taking too long
const longRunningJob = jobs.find((job) => {
  // startedAt can be null; new Date(null) would be the epoch and always look "too long"
  if (!job.startedAt) return false;
  const runtime = Date.now() - new Date(job.startedAt).getTime();
  return runtime > 30 * 60 * 1000; // 30 minutes
});
if (longRunningJob) {
  console.log(`Cancelling long-running job: ${longRunningJob.id}`);
  await cancelJob({ jobId: longRunningJob.id });
}
Retry Failed Jobs
// List failed jobs
const { jobs } = await listJobs({ status: "failed" });
// Retry each failed job
for (const job of jobs) {
  console.log(`Retrying ${job.library}@${job.version}`);
  // Get original scrape parameters (would need to be stored)
  await scrapeDocs({
    library: job.library,
    version: job.version,
    url: "...", // Original URL
  });
}
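Since JobInfo does not include the original URL, retrying requires the caller to have saved the scrape parameters at submission time. One possible way to do that is a small in-memory store keyed by library and version. `ScrapeParamStore` and `ScrapeParams` are hypothetical names for this sketch, and a real setup would likely need persistent storage.

```typescript
// Parameters remembered for a later retry (shape assumed for this sketch).
interface ScrapeParams {
  library: string;
  version: string | null;
  url: string;
}

// Builds a lookup key; unversioned libraries use an empty version part.
const keyOf = (library: string, version: string | null): string =>
  `${library}@${version ?? ""}`;

// Hypothetical in-memory store; contents are lost when the process exits.
class ScrapeParamStore {
  private readonly params = new Map<string, ScrapeParams>();

  // Call this when submitting a scrape job.
  remember(p: ScrapeParams): void {
    this.params.set(keyOf(p.library, p.version), p);
  }

  // Call this when retrying; returns undefined if nothing was saved.
  lookup(library: string, version: string | null): ScrapeParams | undefined {
    return this.params.get(keyOf(library, version));
  }
}
```

In the retry loop, a lookup that returns undefined would mean the job has to be skipped or its URL supplied manually.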