
Overview

Sessions are the foundational concept in the midPilot Connector Generator. A session represents a complete workflow instance that tracks all state, configuration, and results as you progress through discovery, scraping, digesting, and code generation. Each session maintains:
  • Configuration data: Application details, API URLs, and processing parameters
  • Documentation items: Scraped or uploaded API documentation chunks
  • Job tracking: References to all jobs executed within the session
  • Results: Outputs from each processing stage (discovery, schema extraction, code generation)
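Putting these pieces together, a session can be pictured as a single JSON document. The sketch below is illustrative only; the top-level field names follow the Session schema on this page, while the contents of `data` are made-up placeholders:

```python
# Illustrative shape of a session record as a plain Python dict.
# Top-level keys match the Session schema; the values are placeholders.
session = {
    "sessionId": "550e8400-e29b-41d4-a716-446655440000",  # UUID v4
    "createdAt": "2024-01-01T12:00:00Z",                  # ISO 8601
    "updatedAt": "2024-01-01T12:05:00Z",                  # ISO 8601
    "data": {
        "discoveryJobId": "placeholder-job-id",  # job tracking
        "documentationItems": [],                # scraped/uploaded chunks
        "generatedCode": {},                     # codegen results
    },
}
```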

Session Lifecycle

1. Create Session: Initialize a new session to begin the connector generation workflow.
2. Add Documentation: Upload documentation files or run discovery/scraping jobs.
3. Process & Extract: Execute digester jobs to extract schema information.
4. Generate Code: Run code generation to produce connector code.
5. Retrieve Results: Access generated code and metadata from session data.
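The lifecycle above can be sketched as a sequence of HTTP calls. This sketch is transport-agnostic: it assumes an injected `request(method, path)` callable that returns parsed JSON, and it only touches the session endpoints documented on this page (the job-execution endpoints themselves are out of scope here):

```python
def run_workflow(request):
    """Walk the session lifecycle using an injected `request` callable.

    `request(method, path)` is assumed to return a parsed JSON body.
    Only the session endpoints documented on this page are used.
    """
    # 1. Create a session and remember its ID for every later call.
    session = request("POST", "/session")
    sid = session["sessionId"]

    # 2. Add documentation (a multipart file upload in practice).
    request("POST", f"/session/{sid}/documentation")

    # 3-4. Processing and code generation run as jobs; inspect them via:
    jobs = request("GET", f"/session/{sid}/jobs")

    # 5. Retrieve results from the session payload.
    result = request("GET", f"/session/{sid}")
    return sid, jobs, result["data"]
```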

Session Schema

Sessions follow a consistent data structure defined in the system:
src/common/session/schema.py
class Session(BaseModel):
    sessionId: UUID       # Unique identifier (UUID v4)
    createdAt: str        # ISO 8601 timestamp
    updatedAt: str        # ISO 8601 timestamp
    data: Dict[str, Any]  # Arbitrary session payload

Session Data Structure

The data field contains all session state and results. Common keys include:
| Key | Type | Description |
| --- | --- | --- |
| discoveryInput | Object | Discovery job input parameters |
| discoveryJobId | UUID | ID of the discovery job |
| discoveryOutput | Object | Discovered candidate URLs |
| scrapeInput | Object | Scraping configuration |
| scrapeJobId | UUID | ID of the scraping job |
| documentationItems | Array | All documentation chunks |
| digesterInput | Object | Schema extraction parameters |
| digesterJobId | UUID | ID of the digester job |
| objectClasses | Array | Extracted object class schemas |
| codegenInput | Object | Code generation parameters |
| codegenJobId | UUID | ID of the codegen job |
| generatedCode | Object | Generated connector code |
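Because each key appears only once its stage has run, clients should read these keys defensively. A minimal sketch of pulling the per-stage outputs out of a fetched session (key names as in the table above; the payload shape is illustrative):

```python
def extract_results(session: dict) -> dict:
    """Pull per-stage outputs out of a session payload.

    Uses .get() because each key exists only after its stage has run.
    """
    data = session.get("data", {})
    return {
        "candidates": data.get("discoveryOutput"),   # discovered URLs
        "docs": data.get("documentationItems", []),  # documentation chunks
        "schemas": data.get("objectClasses", []),    # extracted classes
        "code": data.get("generatedCode"),           # generated connector code
    }
```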

API Operations

Create a New Session

curl -X POST "http://localhost:8000/session" \
  -H "Content-Type: application/json"

Create Session with Specific ID

You can create a session with a predetermined ID:
curl -X POST "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000" \
  -H "Content-Type: application/json"
Returns 409 Conflict if the session ID already exists.

Get Session Data

Retrieve all data associated with a session:
curl -X GET "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000"

Update Session Data

Merge new data into the session (partial update):
curl -X PATCH "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "customMetadata": "value",
      "processingStage": "discovery"
    }
  }'
The update operation merges the provided data with existing session data. Existing keys are preserved unless explicitly overwritten.
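That merge behavior can be modeled as a shallow dict merge, where patch keys win and untouched keys survive (a sketch of the semantics described above; whether the server also merges nested objects recursively is not specified here):

```python
def merge_session_data(existing: dict, patch: dict) -> dict:
    """Shallow merge: keys in `patch` win, untouched keys in `existing` survive."""
    return {**existing, **patch}

merged = merge_session_data(
    {"processingStage": "created", "discoveryJobId": "job-1"},
    {"processingStage": "discovery", "customMetadata": "value"},
)
# discoveryJobId is preserved; processingStage is overwritten.
```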

Check Session Exists

Verify if a session exists without retrieving data:
HEAD /session/{session_id}
curl -I "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000"

# Returns 204 No Content if exists
# Returns 404 Not Found if doesn't exist
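In client code, the two status codes above map naturally onto a boolean. A sketch with an injected `head` callable, so it works with any HTTP client:

```python
def session_exists(session_id: str, head) -> bool:
    """Return True iff HEAD /session/{session_id} yields 204 No Content.

    `head(path)` is assumed to return the response status code.
    204 means the session exists; 404 means it does not.
    """
    status = head(f"/session/{session_id}")
    if status == 204:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected status {status} for session {session_id}")
```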

Delete Session

Remove a session and all associated data:
curl -X DELETE "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000"
Deleting a session removes all associated jobs, documentation items, and generated results. This operation cannot be undone.

Documentation Items

Sessions store documentation chunks as DocumentationItem objects:
src/common/session/schema.py
class DocumentationItem(BaseModel):
    id: UUID                        # Unique identifier
    page_id: Optional[UUID]         # Page grouping identifier
    source: str                     # "scraper" or "upload"
    scrape_job_ids: Optional[list[UUID]]
    url: Optional[str]              # Source URL (for scraped docs)
    summary: Optional[str]          # AI-generated summary
    content: str                    # Raw documentation text
    metadata: Dict[str, Any]        # Additional metadata

Upload Documentation

Add documentation files to a session:
curl -X POST "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000/documentation" \
  -F "file=@docs.pdf"
Documentation is automatically chunked and processed with LLM analysis. Each chunk becomes a separate DocumentationItem with extracted metadata like tags, categories, and endpoint counts.
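To give a feel for what chunking means here, below is a toy chunker that splits text into bounded pieces on paragraph boundaries. This is purely illustrative; the service's own chunking and LLM analysis are more sophisticated, and none of this code reflects its actual implementation:

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Toy chunker: split on paragraph boundaries, capping chunk size.

    Illustrative only -- not the service's actual chunking algorithm.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)   # current chunk is full; start a new one
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```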

Retrieve Documentation

Get all documentation items in a session:
curl -X GET "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000/documentation"

List Session Jobs

Retrieve all jobs associated with a session:
curl -X GET "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000/jobs"

Best Practices

Store the session ID returned from creation. You’ll need it for all subsequent operations in the workflow.
Sessions support caching of previous results. Use usePreviousSessionData: true in job inputs to reuse compatible outputs from previous jobs, significantly reducing processing time and LLM costs.
Delete sessions after completing the connector generation workflow to free up storage. Sessions accumulate significant data from documentation processing and intermediate results.
Always check job status before proceeding to the next stage. Failed jobs may leave the session in an inconsistent state requiring manual intervention or session restart.
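The "check job status before proceeding" advice above can be sketched as a polling loop. The `get_status` callable is injected so the sketch stays client-agnostic, and the status names ("completed", "failed") are assumptions, not confirmed API values:

```python
import time

def wait_for_job(job_id: str, get_status, poll_seconds: float = 2.0,
                 timeout: float = 600.0) -> str:
    """Poll a job until it reaches a terminal state.

    `get_status(job_id)` returns the job's current status string.
    The status names used here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status == "completed":
            return status
        if status == "failed":
            # Surface failures early so the session isn't left mid-workflow.
            raise RuntimeError(f"job {job_id} failed; inspect before retrying")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```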

Workflow

Learn about the complete connector generation workflow

Job Status

Track job progress and handle job lifecycle
