
Overview

The extract() function is the primary API for Struktur. It takes artifacts, a JSON schema, and an extraction strategy, then returns validated, structured data.

Function signature

export const extract = async <T>(
  options: ExtractionOptions<T>,
): Promise<ExtractionResult<T>>

Parameters

options
ExtractionOptions<T>, required
Configuration object for the extraction process. The examples on this page pass artifacts, schema, and strategy, plus an optional events object.

Returns

Promise<ExtractionResult<T>>
A promise that resolves to the extraction result. In the examples on this page, the result exposes data, usage, and error.
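
The result shape can be pictured roughly as follows. This is only a sketch inferred from the fields used in the examples on this page (data, usage, error); the names ExtractionResultSketch, UsageSketch, and isSuccess are hypothetical, and the library's real ExtractionResult<T> declaration may differ.

```typescript
// Hypothetical shapes inferred from this page's examples.
// They are NOT copied from the library's type declarations.
interface UsageSketch {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

interface ExtractionResultSketch<T> {
  data: T; // on failure this is null cast to T, so check `error` first
  usage: UsageSketch; // all zeros when the extraction failed
  error?: Error; // assumed Error-like, since the examples read `error.message`
}

// Returns true when no error was reported, meaning `data` is safe to read.
function isSuccess<T>(result: ExtractionResultSketch<T>): boolean {
  return result.error === undefined || result.error === null;
}
```

Checking error before touching data matters because a failed result still carries a data field typed as T.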

Basic example

import { extract, simple } from "@mateffy/struktur";
import type { JSONSchemaType } from "ajv";
import { google } from "@ai-sdk/google";

type Output = { title: string; description: string };

const schema: JSONSchemaType<Output> = {
  type: "object",
  properties: {
    title: { type: "string" },
    description: { type: "string" },
  },
  required: ["title", "description"],
  additionalProperties: false,
};

const result = await extract({
  artifacts: [myArtifact],
  schema,
  strategy: simple({ model: google("gemini-2.0-flash-exp") }),
});

if (result.error) {
  console.error("Extraction failed:", result.error);
} else {
  console.log(result.data.title);
  console.log("Used", result.usage.totalTokens, "tokens");
}

Parallel extraction example

import { extract, parallel } from "@mateffy/struktur";
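import { google } from "@ai-sdk/google";

// `schema` is the JSON schema from the basic example; `multiPageDocument`
// stands for an array of artifacts prepared elsewhere (not shown).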
const result = await extract({
  artifacts: multiPageDocument,
  schema,
  strategy: parallel({
    model: google("gemini-2.0-flash-exp"),
    mergeModel: google("gemini-2.0-flash-exp"),
    chunkSize: 10_000,
    concurrency: 4,
  }),
});

With event handlers

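import { extract, simple } from "@mateffy/struktur";
import { google } from "@ai-sdk/google";

// `artifacts` and `schema` are defined as in the basic example.
const model = google("gemini-2.0-flash-exp");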

const result = await extract({
  artifacts,
  schema,
  strategy: simple({ model }),
  events: {
    onStep: ({ step, total, label }) => {
      console.log(`Step ${step}/${total}: ${label}`);
    },
    onTokenUsage: ({ inputTokens, outputTokens, model }) => {
      console.log(`${model}: ${inputTokens} in, ${outputTokens} out`);
    },
  },
});
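
Instead of logging each usage event, a handler can accumulate totals per model. The sketch below is self-contained and does not call the library. TokenUsageEvent mirrors only the fields destructured in the onTokenUsage example above, createUsageTracker is a hypothetical helper, and it assumes that model is a string and that each event reports the usage of a single call rather than a running total.

```typescript
// Hypothetical payload type: only the fields used in the example above.
interface TokenUsageEvent {
  inputTokens: number;
  outputTokens: number;
  model: string; // assumption: the model is identified by a string
}

interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
}

// Builds a handler that sums token usage per model, plus a getter for the totals.
function createUsageTracker() {
  const totals = new Map<string, UsageTotals>();

  return {
    // Assumes each event carries per-call usage, not a cumulative count.
    onTokenUsage(event: TokenUsageEvent): void {
      const current = totals.get(event.model) ?? { inputTokens: 0, outputTokens: 0 };
      totals.set(event.model, {
        inputTokens: current.inputTokens + event.inputTokens,
        outputTokens: current.outputTokens + event.outputTokens,
      });
    },
    // Returns zeros for a model that has not reported any usage yet.
    totalsFor(model: string): UsageTotals {
      return totals.get(model) ?? { inputTokens: 0, outputTokens: 0 };
    },
  };
}
```

Under these assumptions, the tracker would be wired in as events: { onTokenUsage: tracker.onTokenUsage }. The handler closes over its own state and does not rely on this, so passing it as a bare function reference is safe.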

Error handling

The extract() function catches all errors and returns them in the result object rather than throwing:

const result = await extract({ artifacts, schema, strategy });

if (result.error) {
  // Extraction failed - handle the error
  console.error("Failed:", result.error.message);
  // result.data will be null (cast to T)
  // result.usage will show { inputTokens: 0, outputTokens: 0, totalTokens: 0 }
} else {
  // Success - use result.data
  console.log(result.data);
}
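
Callers that prefer exceptions can wrap the result in a small helper. This is a sketch under assumptions: ResultLike and unwrapOrThrow are hypothetical names that are not part of the library, and the result is modelled only by the data and error fields used above.

```typescript
// Minimal stand-in for the result shape used on this page (hypothetical type).
interface ResultLike<T> {
  data: T;
  error?: Error | null;
}

// Returns `data` on success; rethrows the reported failure as an exception.
function unwrapOrThrow<T>(result: ResultLike<T>): T {
  if (result.error) {
    throw new Error(`Extraction failed: ${result.error.message}`);
  }
  return result.data;
}
```

With this helper, const data = unwrapOrThrow(await extract({ artifacts, schema, strategy })) either yields the extracted data or throws, which suits code that already handles failures with try/catch.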
