
Overview

Sessions are the foundational concept in the midPilot Connector Generator. A session represents a complete workflow instance that tracks all state, configuration, and results as you progress through discovery, scraping, digesting, and code generation. Each session maintains:
  • Configuration data: Application details, API URLs, and processing parameters
  • Documentation items: Scraped or uploaded API documentation chunks
  • Job tracking: References to all jobs executed within the session
  • Results: Outputs from each processing stage (discovery, schema extraction, code generation)
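Putting these pieces together, a session can be pictured as a single JSON document. The sketch below is illustrative only; the top-level field names follow the Session schema on this page, while the contents of `data` are made-up placeholders:

```python
# Illustrative shape of a session record as a plain Python dict.
# Top-level keys match the Session schema; the values are placeholders.
session = {
    "sessionId": "550e8400-e29b-41d4-a716-446655440000",  # UUID v4
    "createdAt": "2024-01-01T12:00:00Z",                  # ISO 8601
    "updatedAt": "2024-01-01T12:05:00Z",                  # ISO 8601
    "data": {
        "discoveryJobId": "placeholder-job-id",  # job tracking
        "documentationItems": [],                # scraped/uploaded chunks
        "generatedCode": {},                     # codegen results
    },
}
```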

Session Lifecycle

1. Create Session: Initialize a new session to begin the connector generation workflow.
2. Add Documentation: Upload documentation files or run discovery/scraping jobs.
3. Process & Extract: Execute digester jobs to extract schema information.
4. Generate Code: Run code generation to produce connector code.
5. Retrieve Results: Access generated code and metadata from session data.
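The lifecycle above can be sketched as a sequence of HTTP calls. This sketch is transport-agnostic: it assumes an injected `request(method, path)` callable that returns parsed JSON, and it only touches the session endpoints documented on this page (the job-execution endpoints themselves are out of scope here):

```python
def run_workflow(request):
    """Walk the session lifecycle using an injected `request` callable.

    `request(method, path)` is assumed to return a parsed JSON body.
    Only the session endpoints documented on this page are used.
    """
    # 1. Create a session and remember its ID for every later call.
    session = request("POST", "/session")
    sid = session["sessionId"]

    # 2. Add documentation (a multipart file upload in practice).
    request("POST", f"/session/{sid}/documentation")

    # 3-4. Processing and code generation run as jobs; inspect them via:
    jobs = request("GET", f"/session/{sid}/jobs")

    # 5. Retrieve results from the session payload.
    result = request("GET", f"/session/{sid}")
    return sid, jobs, result["data"]
```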

Session Schema

Sessions follow a consistent data structure defined in the system:
src/common/session/schema.py
class Session(BaseModel):
    sessionId: UUID       # Unique identifier (UUID v4)
    createdAt: str        # ISO 8601 timestamp
    updatedAt: str        # ISO 8601 timestamp
    data: Dict[str, Any]  # Arbitrary session payload

Session Data Structure

The data field contains all session state and results. Common keys include:
| Key | Type | Description |
| --- | --- | --- |
| discoveryInput | Object | Discovery job input parameters |
| discoveryJobId | UUID | ID of the discovery job |
| discoveryOutput | Object | Discovered candidate URLs |
| scrapeInput | Object | Scraping configuration |
| scrapeJobId | UUID | ID of the scraping job |
| documentationItems | Array | All documentation chunks |
| digesterInput | Object | Schema extraction parameters |
| digesterJobId | UUID | ID of the digester job |
| objectClasses | Array | Extracted object class schemas |
| codegenInput | Object | Code generation parameters |
| codegenJobId | UUID | ID of the codegen job |
| generatedCode | Object | Generated connector code |
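Because each key appears only once its stage has run, clients should read these keys defensively. A minimal sketch of pulling the per-stage outputs out of a fetched session (key names as in the table above; the payload shape is illustrative):

```python
def extract_results(session: dict) -> dict:
    """Pull per-stage outputs out of a session payload.

    Uses .get() because each key exists only after its stage has run.
    """
    data = session.get("data", {})
    return {
        "candidates": data.get("discoveryOutput"),   # discovered URLs
        "docs": data.get("documentationItems", []),  # documentation chunks
        "schemas": data.get("objectClasses", []),    # extracted classes
        "code": data.get("generatedCode"),           # generated connector code
    }
```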

API Operations

Create a New Session

curl -X POST "http://localhost:8000/session" \
  -H "Content-Type: application/json"

Create Session with Specific ID

You can create a session with a predetermined ID:
curl -X POST "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000" \
  -H "Content-Type: application/json"
Returns 409 Conflict if the session ID already exists.

Get Session Data

Retrieve all data associated with a session:
curl -X GET "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000"

Update Session Data

Merge new data into the session (partial update):
curl -X PATCH "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "customMetadata": "value",
      "processingStage": "discovery"
    }
  }'
The update operation merges the provided data with existing session data. Existing keys are preserved unless explicitly overwritten.
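That merge behavior can be modeled as a shallow dict merge, where patch keys win and untouched keys survive (a sketch of the semantics described above; whether the server also merges nested objects recursively is not specified here):

```python
def merge_session_data(existing: dict, patch: dict) -> dict:
    """Shallow merge: keys in `patch` win, untouched keys in `existing` survive."""
    return {**existing, **patch}

merged = merge_session_data(
    {"processingStage": "created", "discoveryJobId": "job-1"},
    {"processingStage": "discovery", "customMetadata": "value"},
)
# discoveryJobId is preserved; processingStage is overwritten.
```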

Check Session Exists

Verify if a session exists without retrieving data:
HEAD /session/{session_id}
curl -I "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000"

# Returns 204 No Content if exists
# Returns 404 Not Found if doesn't exist
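In client code, the two status codes above map naturally onto a boolean. A sketch with an injected `head` callable, so it works with any HTTP client:

```python
def session_exists(session_id: str, head) -> bool:
    """Return True iff HEAD /session/{session_id} yields 204 No Content.

    `head(path)` is assumed to return the response status code.
    204 means the session exists; 404 means it does not.
    """
    status = head(f"/session/{session_id}")
    if status == 204:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected status {status} for session {session_id}")
```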

Delete Session

Remove a session and all associated data:
curl -X DELETE "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000"
Deleting a session removes all associated jobs, documentation items, and generated results. This operation cannot be undone.

Documentation Items

Sessions store documentation chunks as DocumentationItem objects:
src/common/session/schema.py
class DocumentationItem(BaseModel):
    id: UUID                        # Unique identifier
    page_id: Optional[UUID]         # Page grouping identifier
    source: str                     # "scraper" or "upload"
    scrape_job_ids: Optional[list[UUID]]
    url: Optional[str]              # Source URL (for scraped docs)
    summary: Optional[str]          # AI-generated summary
    content: str                    # Raw documentation text
    metadata: Dict[str, Any]        # Additional metadata

Upload Documentation

Add documentation files to a session:
curl -X POST "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000/documentation" \
  -F "file=@docs.pdf"
Documentation is automatically chunked and processed with LLM analysis. Each chunk becomes a separate DocumentationItem with extracted metadata like tags, categories, and endpoint counts.
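To give a feel for what chunking means here, below is a toy chunker that splits text into bounded pieces on paragraph boundaries. This is purely illustrative; the service's own chunking and LLM analysis are more sophisticated, and none of this code reflects its actual implementation:

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Toy chunker: split on paragraph boundaries, capping chunk size.

    Illustrative only -- not the service's actual chunking algorithm.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)   # current chunk is full; start a new one
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```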

Retrieve Documentation

Get all documentation items in a session:
curl -X GET "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000/documentation"

List Session Jobs

Retrieve all jobs associated with a session:
curl -X GET "http://localhost:8000/session/550e8400-e29b-41d4-a716-446655440000/jobs"

Best Practices

Store the session ID returned from creation. You’ll need it for all subsequent operations in the workflow.
Sessions support caching of previous results. Use usePreviousSessionData: true in job inputs to reuse compatible outputs from previous jobs, significantly reducing processing time and LLM costs.
Delete sessions after completing the connector generation workflow to free up storage. Sessions accumulate significant data from documentation processing and intermediate results.
Always check job status before proceeding to the next stage. Failed jobs may leave the session in an inconsistent state requiring manual intervention or session restart.
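The "check job status before proceeding" advice above can be sketched as a polling loop. The `get_status` callable is injected so the sketch stays client-agnostic, and the status names ("completed", "failed") are assumptions, not confirmed API values:

```python
import time

def wait_for_job(job_id: str, get_status, poll_seconds: float = 2.0,
                 timeout: float = 600.0) -> str:
    """Poll a job until it reaches a terminal state.

    `get_status(job_id)` returns the job's current status string.
    The status names used here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status == "completed":
            return status
        if status == "failed":
            # Surface failures early so the session isn't left mid-workflow.
            raise RuntimeError(f"job {job_id} failed; inspect before retrying")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```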

Workflow

Learn about the complete connector generation workflow

Job Status

Track job progress and handle job lifecycle
