Pricing Extraction

POST /api/v1/transform

The transformation endpoint extracts pricing information from a SaaS website URL and converts it into the Pricing2YAML format. This is an asynchronous operation that returns a task ID for status polling.

Extraction can take several minutes depending on the complexity of the pricing page and the number of validation iterations required.

Endpoint Details

URL: http://localhost:8001/api/v1/transform
Method: POST
Content-Type: application/json

Request Body

url

string

required

The full URL of the SaaS pricing page to extract data from. Example: https://slack.com/pricing

model

string

default:"gpt-5.2"

The OpenAI model to use for extraction. Common options:

gpt-5.2 (default)
gpt-4o
gpt-3.5-turbo

temperature

float

default:"0.7"

Controls randomness in model responses. Range: 0.0 to 1.0

Lower values (0.0-0.3): More deterministic and focused
Higher values (0.7-1.0): More creative but less consistent

max_tries

integer

default:"50"

Maximum number of validation and fixing iterations. The system will attempt to validate and correct the extracted YAML up to this many times.

base_url

string

default:"https://api.openai.com/v1"

Custom endpoint URL for OpenAI-compatible APIs. Useful for:

Self-hosted LLM endpoints
Azure OpenAI Service
Other OpenAI-compatible providers

better_model

string

default:"gpt-5.2"

Model to use for higher-quality refinement passes.
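The parameters above can be assembled into a request body as in the following minimal sketch. The helper name build_transform_payload and the client-side checks are assumptions made for illustration; the service may validate its input differently.

```python
from urllib.parse import urlparse

# Defaults copied from the parameter list above.
DEFAULTS = {
    "model": "gpt-5.2",
    "temperature": 0.7,
    "max_tries": 50,
    "base_url": "https://api.openai.com/v1",
    "better_model": "gpt-5.2",
}


def build_transform_payload(url, **overrides):
    """Return a request body for POST /api/v1/transform (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be a full http(s) URL, got {url!r}")

    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # Later entries win, so explicit overrides replace the defaults.
    payload = {"url": url, **DEFAULTS, **overrides}

    # Assumed client-side checks based on the documented ranges.
    if not 0.0 <= payload["temperature"] <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if not isinstance(payload["max_tries"], int) or payload["max_tries"] < 1:
        raise ValueError("max_tries must be a positive integer")
    return payload


payload = build_transform_payload("https://slack.com/pricing", temperature=0.2)
print(payload)
```

Only url is required, so omitting every override yields a body that relies entirely on the documented defaults.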

Response

The endpoint immediately returns a response with a task ID:

task_id

string

required

Unique identifier for the transformation task. Use this to poll for status and retrieve the result.

status

string

required

Initial status of the task. Will be "pending" when first created.

message

string

Human-readable message about the task status.

Example Request

curl -X POST http://localhost:8001/api/v1/transform \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://buffer.com/pricing",
    "model": "gpt-5.2",
    "temperature": 0.7,
    "max_tries": 50
  }'

Example Response

{
  "task_id": "a3f7e8c9-4b2d-4f1e-8a9c-7d3e5f6a8b2c",
  "status": "pending",
  "message": "Transformation started"
}
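The same request can be issued from Python using only the standard library. This is a sketch rather than an official client: the function names are made up for illustration, and submit needs a running service at the host and port given under Endpoint Details.

```python
import json
import urllib.request

API_BASE = "http://localhost:8001/api/v1"  # host and port from Endpoint Details


def make_transform_request(payload):
    """Build (but do not send) the POST request for /transform."""
    return urllib.request.Request(
        f"{API_BASE}/transform",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_task_id(body):
    """Extract task_id from a JSON response body like the example above."""
    data = json.loads(body)
    if "task_id" not in data:
        raise ValueError(f"response has no task_id: {data}")
    return data["task_id"]


def submit(payload, timeout=30):
    """Send the request and return the task ID (requires a running service)."""
    request = make_transform_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_task_id(resp.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to check without a live server.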

Checking Task Status

Use the returned task_id to poll for completion:

Endpoint: GET /api/v1/transform/status/{task_id}

Status Response

status

string

required

Current status of the task:

"pending" - Task is queued or in progress
"completed" - Task finished successfully
"error" - Task failed

result_file

string

Path to the generated YAML file (only present when status is "completed")

error

string

Error message if the task failed (only present when status is "error")

Polling Example

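import time

import requests

task_id = "a3f7e8c9-4b2d-4f1e-8a9c-7d3e5f6a8b2c"
status_url = f"http://localhost:8001/api/v1/transform/status/{task_id}"

while True:
    response = requests.get(status_url, timeout=30)
    content_type = response.headers.get("Content-Type", "")

    # On completion the endpoint returns the YAML file itself
    # (Content-Type: application/x-yaml) rather than a JSON status body.
    # Checking the content type avoids saving a JSON body as YAML.
    if response.status_code == 200 and "yaml" in content_type:
        with open("pricing.yaml", "wb") as f:
            f.write(response.content)
        print("Pricing data saved to pricing.yaml")
        break

    # Any other response is expected to be a JSON status body.
    data = response.json()

    if data.get("status") == "error":
        print(f"Error: {data.get('error')}")
        break

    print(f"Status: {data.get('status')}")
    time.sleep(5)  # Poll every 5 seconds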
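Because extraction can run for several minutes, a polling loop usually benefits from an overall deadline. The sketch below keeps the HTTP call out of the loop: fetch_status is a hypothetical caller-supplied function that returns the parsed status dictionary, and the 900-second default timeout is an arbitrary choice, not a documented limit.

```python
import time


def poll_until_done(fetch_status, interval=5.0, timeout=900.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports "completed" or "error".

    fetch_status is a caller-supplied function (hypothetical) returning a
    dict with at least a "status" key, as in the Status Response section.
    Raises TimeoutError if the next poll would fall past the deadline.
    """
    deadline = clock() + timeout
    while True:
        data = fetch_status()
        if data["status"] in ("completed", "error"):
            return data
        if clock() + interval > deadline:
            raise TimeoutError(
                f"task still {data['status']!r} after {timeout} seconds"
            )
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.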

Extraction Process

The A-MINT service performs the following steps during transformation:

Fetch HTML Content

Uses Selenium WebDriver to load the pricing page and extract the rendered HTML. Source: src/amint/extractors/web_driver.py

Convert to Markdown

Transforms HTML into structured Markdown using LLM-powered conversion. Normalizes table separators and removes excessive formatting.

Extract Plans

Identifies pricing tiers, costs, and billing cycles. Extracts configuration such as currency and billing period.

Extract Features

Identifies features and categorizes them:

DOMAIN (core functionality)
INTEGRATION (external services)
SUPPORT (customer service)
AUTOMATION (workflow automation)
GUARANTEE (SLAs, compliance)
INFORMATION (analytics, reporting)
MANAGEMENT (admin controls)

Extract Usage Limits

Finds numeric quotas and thresholds:

RENEWABLE (monthly resets)
NON_RENEWABLE (permanent limits)
TIME_DRIVEN (time-based quotas)
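When post-processing extracted data, the feature categories and usage-limit types listed above can be modeled as enumerations. This is only an illustrative sketch: the class names and the unknown_labels helper are assumptions, and the enum values simply repeat the short descriptions given in this section.

```python
from enum import Enum


class FeatureType(Enum):
    DOMAIN = "core functionality"
    INTEGRATION = "external services"
    SUPPORT = "customer service"
    AUTOMATION = "workflow automation"
    GUARANTEE = "SLAs, compliance"
    INFORMATION = "analytics, reporting"
    MANAGEMENT = "admin controls"


class UsageLimitType(Enum):
    RENEWABLE = "monthly resets"
    NON_RENEWABLE = "permanent limits"
    TIME_DRIVEN = "time-based quotas"


def unknown_labels(labels, enum_cls):
    """Return the labels that are not member names of enum_cls."""
    return [label for label in labels if label not in enum_cls.__members__]


# "BILLING" is a made-up label used to show a mismatch.
print(unknown_labels(["DOMAIN", "BILLING"], FeatureType))
```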

Extract Add-ons

Identifies optional extensions and overage costs.

Validate & Fix

Iteratively validates the YAML against the Analysis API and fixes errors. Continues for up to max_tries iterations, stopping as soon as the YAML is valid or giving up once the limit is reached.
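The control flow of this step can be pictured with the sketch below. It is not A-MINT's actual implementation: validate and fix are hypothetical stand-ins for the Analysis API call and the LLM-based correction, and the return shape is chosen for illustration.

```python
def validate_and_fix(yaml_text, validate, fix, max_tries=50):
    """Return (yaml_text, is_valid, attempts_used).

    validate(text) -> list of error strings (empty when valid)
    fix(text, errors) -> corrected text
    Both callables are hypothetical stand-ins, not A-MINT functions.
    """
    for attempt in range(1, max_tries + 1):
        errors = validate(yaml_text)
        if not errors:
            return yaml_text, True, attempt
        # Skip the fix on the last attempt: its result would never be validated.
        if attempt < max_tries:
            yaml_text = fix(yaml_text, errors)
    return yaml_text, False, max_tries


# Toy stand-ins: the document is "valid" once it mentions a currency.
def toy_validate(text):
    return [] if "currency" in text else ["missing currency"]


def toy_fix(text, errors):
    return text + "currency: USD\n"


print(validate_and_fix("saasName: buffer\n", toy_validate, toy_fix, max_tries=3))
```

A larger max_tries gives the loop more chances to converge at the cost of more validation and LLM calls.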

Save YAML

Writes the final validated YAML to the output directory. Files are saved as output/{uuid}.yaml

Error Handling

Common errors and their meanings:

Failed to fetch HTML content

The WebDriver couldn’t load the pricing page. Possible causes:

Invalid URL
Website blocking automated access
Network connectivity issues
Page requires JavaScript that failed to execute

YAML validation failed after maximum iterations

The extracted YAML couldn’t be validated even after max_tries attempts. This can happen when:

The pricing page structure is too complex
The LLM model consistently misinterprets the content
The page contains ambiguous or inconsistent information

Solution: Try increasing max_tries or using a more capable model.

Task not found

The provided task_id doesn’t exist. This can occur if:

The task ID is incorrect
The task expired (tasks are stored in memory)
The service restarted

Output Format

When the task completes successfully, the status endpoint returns the YAML file directly with:

Content-Type: application/x-yaml
Filename: pricing_{uuid}.yaml

YAML Structure

See the A-MINT Overview for the complete Pricing2YAML specification. The extracted YAML includes:

syntaxVersion: '2.1'
saasName: buffer
version: '2024-01-15'
currency: USD
url: https://buffer.com/pricing

features:
  # BOOLEAN, TEXT, or NUMERIC features
  # with descriptions and type classifications

usageLimits:
  # Numeric quotas linked to features
  # with value types and renewal periods

plans:
  # Named pricing tiers with costs
  # and feature/limit overrides

addOns:
  # Optional extensions with pricing
  # and plan availability
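After downloading a result, a quick sanity check is to confirm that the top-level keys from the example above are present. The sketch assumes the YAML has already been parsed into a dictionary (for instance with a YAML library of your choice); the key list mirrors the example only, and the full specification may treat some of these keys as optional.

```python
# Top-level keys as they appear in the example above (not the full spec).
EXPECTED_TOP_LEVEL_KEYS = (
    "syntaxVersion", "saasName", "version", "currency", "url",
    "features", "usageLimits", "plans", "addOns",
)


def missing_top_level_keys(pricing):
    """Return the expected keys absent from a parsed pricing document."""
    return [key for key in EXPECTED_TOP_LEVEL_KEYS if key not in pricing]


# Hand-written stand-in for a parsed document; addOns is left out on purpose.
example = {
    "syntaxVersion": "2.1",
    "saasName": "buffer",
    "version": "2024-01-15",
    "currency": "USD",
    "url": "https://buffer.com/pricing",
    "features": {},
    "usageLimits": {},
    "plans": {},
}
print(missing_top_level_keys(example))
```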

Logging and Debugging

A-MINT logs detailed information during extraction:

Application logs: /app/logs/amint_api.log
Transformation logs: /app/logs/transformation_logs.csv

The CSV log includes:

transformation_call_id: Unique ID for the transformation
timestamp: Start time
response_time: Total processing time in seconds
raw_html_length: Size of original HTML
cleaned_html_length: Size after cleaning
llm_call_ids: Comma-separated list of LLM API calls made
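The transformation log can be summarized with Python's csv module, as in the sketch below. It assumes the file starts with a header row using the column names listed above; the sample rows, including the timestamp format, are invented for illustration.

```python
import csv
import io
from statistics import mean


def summarize_log(csv_file):
    """Return (row_count, mean_response_time_seconds, mean_llm_calls)."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return 0, 0.0, 0.0
    times = [float(row["response_time"]) for row in rows]
    # llm_call_ids is a comma-separated list inside a single CSV field.
    calls = [
        len([cid for cid in row["llm_call_ids"].split(",") if cid.strip()])
        for row in rows
    ]
    return len(rows), mean(times), mean(calls)


# Made-up sample data; a real log lives at /app/logs/transformation_logs.csv.
sample = io.StringIO(
    "transformation_call_id,timestamp,response_time,"
    "raw_html_length,cleaned_html_length,llm_call_ids\n"
    't1,2024-01-15T10:00:00,120.0,50000,12000,"a,b,c"\n'
    't2,2024-01-15T11:00:00,240.0,80000,20000,"d"\n'
)
print(summarize_log(sample))
```

For a file on disk, open it with open(path, newline="") and pass the file object to summarize_log.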

Performance Considerations

Extraction is computationally expensive and can take 2-10 minutes per pricing page depending on complexity.

Factors affecting performance:

Page complexity: More plans and features take longer to extract
Model speed: Faster models like gpt-3.5-turbo reduce latency
Validation iterations: Complex pages may require many fix attempts
Network latency: OpenAI API call overhead

Optimization tips:

Use a lower temperature (0.0-0.3) for more deterministic results
Reduce max_tries if you’re willing to accept partial extractions
Consider caching results for frequently accessed pricing pages
Use faster models for initial extraction, better models for refinement
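The caching tip can be illustrated with a small in-memory cache keyed by pricing-page URL. This is a sketch under stated assumptions: PricingCache is not part of A-MINT, extract is a hypothetical callable that runs a full transformation and returns the YAML text, and the 24-hour lifetime is an arbitrary default.

```python
import time


class PricingCache:
    """In-memory cache of extraction results keyed by pricing-page URL."""

    def __init__(self, extract, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._extract = extract   # hypothetical callable: url -> YAML text
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}        # url -> (expires_at, yaml_text)

    def get(self, url):
        """Return cached YAML for url, re-extracting when missing or expired."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]
        yaml_text = self._extract(url)
        self._entries[url] = (now + self._ttl, yaml_text)
        return yaml_text
```

Since pricing pages change infrequently, even a short lifetime can avoid repeating a multi-minute extraction for popular URLs.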

Related Endpoints

POST /api/v1/fix

Fix and validate an existing YAML file without re-extracting from HTML

Analysis API

Validate and analyze extracted YAML files

Next Steps

A-MINT Overview

Learn more about A-MINT’s architecture and capabilities

MCP Tools

See how Harvey agent uses iPricing to call A-MINT
