Ingestion is a multi-step workflow: upload files, validate the data, optionally apply field mappings, then confirm ingestion. The ingestion step itself runs asynchronously and reports real-time progress via Server-Sent Events (SSE).
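The steps above can be sketched as a small client-side helper. This is a minimal sketch, not an official client: the endpoint paths are taken from the example requests on this page, while `prepare_session` and the `post` transport callable are hypothetical names introduced for illustration. Mapping only when the status is "needs_mapping" is also an assumption about how the statuses are meant to be used.

```python
from typing import Callable, Optional

# Hypothetical transport: takes (path, json_body) and returns the parsed JSON response.
Post = Callable[[str, Optional[dict]], dict]


def prepare_session(post: Post, files: list, mapping: Optional[dict] = None) -> str:
    """Upload, validate, and optionally map fields; return a session_id ready to ingest."""
    upload = post("/api/knowledge-base/upload", {"files": files})
    session_id = upload["session_id"]

    report = post(f"/api/knowledge-base/validate/{session_id}", None)

    # Assumption: field mapping is only attempted when validation asks for it.
    if report["status"] == "needs_mapping" and mapping:
        report = post(
            f"/api/knowledge-base/validate/{session_id}/map-fields",
            {"mapping": mapping},
        )

    if report["status"] != "valid":
        raise ValueError(
            f"session {session_id} is not ready: status={report['status']}, "
            f"errors={report.get('error_count', 0)}"
        )
    return session_id
```

The returned `session_id` would then be sent to the ingest endpoint. Injecting the transport keeps the flow testable without a network connection.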

Upload Incident Files

Upload incident files and create an upload session for validation. Accepts JSON or CSV content as base64-encoded strings.

Request Body

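  • files (array, required): Array of file objects to upload. Each object carries filename, size, and content (the base64-encoded file body), as in the example request below.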

Response

  • success (boolean): Whether the upload was successful
  • session_id (string): Unique identifier for this upload session (UUID)
  • status (string): Current status of the session (e.g., “pending_validation”)
  • incident_count (integer): Number of incidents detected in the uploaded files
  • file_metadata (array): Metadata about uploaded files (filenames, sizes, record counts)
  • preview (array): First 10 records from the uploaded data for preview

Example Request

curl -X POST https://api.example.com/api/knowledge-base/upload \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "filename": "incidents.json",
        "size": 4096,
        "content": "W3siaW5jaWRlbnRfaWQiOiAiSU5DMDAxIiwgInRpdGxlIjogIlNlcnZlciBkb3duIiwgImRlc2NyaXB0aW9uIjogIlByb2R1Y3Rpb24gc2VydmVyIHVucmVzcG9uc2l2ZSJ9XQ=="
      }
    ]
  }'

Example Response

{
  "success": true,
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending_validation",
  "incident_count": 150,
  "file_metadata": [
    {
      "filename": "incidents.json",
      "size": 4096,
      "record_count": 150
    }
  ],
  "preview": [
    {
      "incident_id": "INC001",
      "title": "Server down",
      "description": "Production server unresponsive"
    }
  ]
}
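Because file content travels as a base64-encoded string, the request body has to be built before posting. The following is a minimal sketch of that step; `make_file_object` is a hypothetical helper, and treating `size` as the byte length of the raw file is an assumption, since this page does not define it.

```python
import base64
import json


def make_file_object(filename: str, raw: bytes) -> dict:
    """Build one entry of the "files" array from raw file bytes."""
    return {
        "filename": filename,
        "size": len(raw),  # assumption: byte length of the unencoded file
        "content": base64.b64encode(raw).decode("ascii"),
    }


records = [
    {
        "incident_id": "INC001",
        "title": "Server down",
        "description": "Production server unresponsive",
    }
]
raw = json.dumps(records).encode("utf-8")
payload = {"files": [make_file_object("incidents.json", raw)]}

# The JSON text below is what would be sent as the request body.
print(json.dumps(payload, indent=2))
```

For CSV input the same helper applies: read the file in binary mode and pass the bytes through unchanged.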

Validate Upload Session

Run schema validation on an upload session to identify missing required fields and data quality issues.

Path Parameters

  • session_id (string, required): Upload session ID returned from the upload endpoint

Response

  • success (boolean): Whether validation completed successfully
  • session_id (string): Upload session identifier
  • status (string): Validation status: “valid”, “invalid”, or “needs_mapping”
  • total_records (integer): Total number of records in the upload
  • valid_count (integer): Number of records that passed validation
  • error_count (integer): Number of validation errors found
  • errors (array): Detailed validation errors
  • preview (array): Preview of validated records
  • file_fields (object): Detected fields in each uploaded file (used for field mapping)

Example Request

curl -X POST https://api.example.com/api/knowledge-base/validate/550e8400-e29b-41d4-a716-446655440000 \
  -H "Authorization: Bearer $TOKEN"

Example Response

{
  "success": true,
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "invalid",
  "total_records": 150,
  "valid_count": 145,
  "error_count": 5,
  "errors": [
    {
      "file": "incidents.json",
      "row": 23,
      "field": "incident_id",
      "message": "Required field 'incident_id' is missing",
      "value": null
    }
  ],
  "preview": [...],
  "file_fields": {
    "incidents.json": ["incident_id", "title", "description", "action_taken"]
  }
}
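A client typically needs to turn a validation report into a decision. The sketch below groups errors by field and picks a follow-up action from `status`. The function names and the returned action labels are hypothetical, and the mapping from status to action is an assumption based on the three status values listed above.

```python
from collections import Counter


def errors_by_field(report: dict) -> dict:
    """Count validation errors per field name."""
    return dict(Counter(err["field"] for err in report.get("errors", [])))


def next_step(report: dict) -> str:
    """Suggest a follow-up action (labels are illustrative, not part of the API)."""
    status = report["status"]
    if status == "valid":
        return "ingest"
    if status == "needs_mapping":
        return "map-fields"
    return "fix-data"


# Trimmed copy of the example response above (preview and file_fields omitted).
report = {
    "status": "invalid",
    "total_records": 150,
    "valid_count": 145,
    "error_count": 5,
    "errors": [
        {
            "file": "incidents.json",
            "row": 23,
            "field": "incident_id",
            "message": "Required field 'incident_id' is missing",
            "value": None,
        }
    ],
}

print(errors_by_field(report))
print(next_step(report))
```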

Apply Field Mapping

Apply field mappings to transform source field names to the expected schema and re-validate the data.

Path Parameters

  • session_id (string, required): Upload session ID

Request Body

  • mapping (object, required): Map of source field names to target schema fields. Example: {"inc_id": "incident_id", "summary": "title"}

Response

Returns a validation report (same structure as validate endpoint) with the field mappings applied.

Example Request

curl -X POST https://api.example.com/api/knowledge-base/validate/550e8400-e29b-41d4-a716-446655440000/map-fields \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "mapping": {
      "inc_id": "incident_id",
      "summary": "title",
      "desc": "description",
      "resolution": "action_taken"
    }
  }'

Example Response

{
  "success": true,
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "valid",
  "total_records": 150,
  "valid_count": 150,
  "error_count": 0,
  "errors": [],
  "preview": [...]
}
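To make the mapping semantics concrete, the sketch below renames keys on a single record locally and checks whether a mapping covers the required fields before it is sent. This illustrates the idea only and is not the server's implementation; `apply_mapping` and `unmapped_required` are hypothetical helpers.

```python
REQUIRED_FIELDS = {"incident_id", "title", "description"}


def apply_mapping(record: dict, mapping: dict) -> dict:
    """Rename source keys to target schema keys; unmapped keys pass through unchanged."""
    return {mapping.get(key, key): value for key, value in record.items()}


def unmapped_required(source_fields: list, mapping: dict) -> set:
    """Return required target fields that no source field maps to."""
    mapped = {mapping.get(name, name) for name in source_fields}
    return REQUIRED_FIELDS - mapped


mapping = {
    "inc_id": "incident_id",
    "summary": "title",
    "desc": "description",
    "resolution": "action_taken",
}
source_fields = ["inc_id", "summary", "desc", "resolution"]

print(apply_mapping({"inc_id": "INC001", "summary": "Server down"}, mapping))
print(unmapped_required(source_fields, mapping))
```

The `source_fields` list would normally come from the `file_fields` entry of a validation response.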

Confirm and Ingest

Confirm a validated upload session and begin asynchronous ingestion into the vector database. Returns a Server-Sent Events (SSE) stream with real-time progress updates.

Request Body

  • session_id (string, required): Validated upload session ID
  • notes (string): Optional notes about this ingestion (stored in version metadata)

Response (Server-Sent Events)

The endpoint returns an SSE stream with the following event types:

Event: progress

  • batch (integer): Current batch number being processed
  • totalBatches (integer): Total number of batches to process
  • incidents (array): Array of incident IDs processed in this batch
  • message (string): Human-readable progress message

Event: complete

  • success (boolean): Always true on successful completion
  • version_id (string): UUID of the newly created dataset version
  • version_number (integer): Incremental version number
  • collection_name (string): Qdrant collection name where data was ingested
  • incident_count (integer): Total number of incidents ingested

Event: error

  • success (boolean): Always false on error
  • message (string): User-friendly error message

Example Request

curl -X POST https://api.example.com/api/knowledge-base/ingest \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "notes": "Q4 2024 incident data upload"
  }'

Example SSE Stream

event: progress
data: {"batch": 0, "totalBatches": 0, "message": "Starting ingestion..."}

event: progress
data: {"batch": 1, "totalBatches": 30, "incidents": ["INC001", "INC002", "INC003"], "message": "Batch 1 of 30 processed"}

event: progress
data: {"batch": 2, "totalBatches": 30, "incidents": ["INC004", "INC005"], "message": "Batch 2 of 30 processed"}

event: complete
data: {"success": true, "version_id": "a1b2c3d4-e5f6-4a5b-8c9d-0e1f2a3b4c5d", "version_number": 5, "collection_name": "past_issues_v2", "incident_count": 150}
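A stream like the one above can be consumed by splitting it into events on blank lines. The following is a minimal sketch of a parser for the `event:` and `data:` fields only. `parse_sse` is a hypothetical helper, it assumes every `data:` payload is JSON as in the examples, and it ignores other SSE fields such as `id:` and `retry:`.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, parsed_json_data) for each event in an SSE line stream."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
    # Lenient choice: flush a trailing event even without a final blank line.
    if data:
        yield event, json.loads("\n".join(data))


stream = [
    "event: progress",
    'data: {"batch": 1, "totalBatches": 30, "incidents": ["INC001", "INC002", "INC003"], "message": "Batch 1 of 30 processed"}',
    "",
    "event: complete",
    'data: {"success": true, "version_number": 5, "incident_count": 150}',
    "",
]

for name, payload in parse_sse(stream):
    if name == "progress":
        print(f"{payload['batch']}/{payload['totalBatches']}: {payload['message']}")
    elif name == "complete":
        print(f"ingested {payload['incident_count']} incidents")
    elif name == "error":
        print(f"failed: {payload['message']}")
```

In practice the `lines` argument would be the decoded lines of the streaming HTTP response body rather than a list.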

Required Schema

All incidents must include the following required fields:
  • incident_id (string): Unique identifier for the incident
  • title (string): Short title or summary
  • description (string): Detailed incident description
Optional fields that enhance search and retrieval:
  • action_taken (string): Resolution or mitigation steps
  • opened_at (string): Timestamp when incident was opened
  • updated_at (string): Timestamp when incident was last updated
  • impacted_application (string): Affected application or service
  • root_cause (string): Root cause analysis
  • mitigation (string): Mitigation strategy
  • accountable_party (string): Team or person responsible
  • source_system (string): Source system (defaults to “ServiceNow”)
  • repeat_incident (string): Whether this is a repeat incident
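A quick local check against the required fields can catch obvious problems before uploading. The sketch below is a client-side pre-check under stated assumptions, not the server's validation logic: `missing_required` is a hypothetical helper, and treating empty or non-string values as missing is an assumption.

```python
REQUIRED_FIELDS = ("incident_id", "title", "description")


def missing_required(record: dict) -> list:
    """Return the required fields that are absent, non-string, or blank in a record."""
    return [
        field
        for field in REQUIRED_FIELDS
        # Assumption: blank strings and non-string values count as missing.
        if not isinstance(record.get(field), str) or not record[field].strip()
    ]


records = [
    {"incident_id": "INC001", "title": "Server down", "description": "Production server unresponsive"},
    {"title": "Disk full", "description": ""},
]

for row, record in enumerate(records):
    problems = missing_required(record)
    if problems:
        print(f"row {row}: missing {', '.join(problems)}")
```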
