Delta Sharing APIs return responses in two formats: Parquet Format (default) and Delta Format (advanced features). Responses use newline-delimited JSON (NDJSON) where each line contains a single JSON object.

Response Format Selection

The format is controlled by the delta-sharing-capabilities header:
delta-sharing-capabilities: responseformat=parquet
# or
delta-sharing-capabilities: responseformat=delta;readerfeatures=deletionvectors
Format Selection:
  • No header: Defaults to Parquet format
  • responseformat=parquet: Parquet format (compatible with all connectors)
  • responseformat=delta: Delta format (enables advanced features like deletion vectors, column mapping)
  • responseformat=delta,parquet: Server chooses based on table features
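As a rough illustration of the selection rules above, the header value can be assembled from the formats a client accepts. The helper name `build_capabilities_header` is hypothetical and not part of any Delta Sharing client library; joining several reader features with commas is also an assumption, since the examples above show only one.

```python
def build_capabilities_header(response_formats, reader_features=()):
    """Return a value for the delta-sharing-capabilities header.

    response_formats: formats the client accepts, e.g. ["delta", "parquet"].
    reader_features: Delta reader features the client can handle (optional).
    """
    parts = ["responseformat=" + ",".join(response_formats)]
    if reader_features:
        # Assumption: multiple features are comma-separated.
        parts.append("readerfeatures=" + ",".join(reader_features))
    return ";".join(parts)


headers = {
    "delta-sharing-capabilities": build_capabilities_header(
        ["delta"], ["deletionvectors"]
    )
}
print(headers["delta-sharing-capabilities"])
```

Omitting the header entirely remains the simplest way to request Parquet format.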

Parquet Format Responses

The Parquet format is used by Delta Sharing v1.0 and later. It’s the default format and compatible with all existing Delta Sharing connectors.

JSON Wrapper Object

Each line contains exactly one of these fields:
Field | Type | Description
protocol | Object | Protocol versioning information
metaData | Object | Table metadata (schema, partitions, etc.)
file | Object | Individual data file reference
add | Object | Add file action (for CDF/streaming)
cdf | Object | Change data file reference
remove | Object | Remove file action (for CDF/streaming)
endStreamAction | Object | End of stream marker (optional)
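A minimal sketch of reading such a response, assuming the whole body is already available as a string. The function name `parse_response` and the sample payloads are illustrative only.

```python
import json


def parse_response(body):
    """Yield (wrapper_field, payload) for each non-empty NDJSON line."""
    for line in body.splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        if len(obj) != 1:
            raise ValueError(f"expected exactly one wrapper field, got {sorted(obj)}")
        kind, payload = next(iter(obj.items()))
        yield kind, payload


# Hypothetical three-line response built for the example.
sample = "\n".join([
    json.dumps({"protocol": {"minReaderVersion": 1}}),
    json.dumps({"metaData": {"id": "t1", "format": {"provider": "parquet"},
                             "schemaString": "{}", "partitionColumns": []}}),
    json.dumps({"file": {"url": "https://example.invalid/f", "id": "f1",
                         "partitionValues": {}, "size": 573}}),
])

for kind, payload in parse_response(sample):
    print(kind, sorted(payload))
```

A real client would more likely stream the HTTP body line by line instead of buffering it, but the per-line handling stays the same.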

Protocol Object

Defines the minimum protocol version clients must support.
{
  "protocol": {
    "minReaderVersion": 1
  }
}
  • protocol.minReaderVersion (integer, required): Minimum protocol version required to read the table. Currently always 1; it will increase only for changes that are not forward-compatible.
Clients can safely ignore unrecognized fields in responses that support their reader version.
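One way a client might enforce this, sketched under the assumption that it implements reader version 1. The constant `SUPPORTED_READER_VERSION` and the function `check_protocol` are hypothetical names.

```python
SUPPORTED_READER_VERSION = 1  # assumption: highest version this client implements


def check_protocol(protocol):
    """Return the required reader version, or raise if the client is too old."""
    required = protocol["minReaderVersion"]
    if required > SUPPORTED_READER_VERSION:
        raise RuntimeError(
            f"table requires reader version {required}, "
            f"client supports {SUPPORTED_READER_VERSION}"
        )
    return required


print(check_protocol({"minReaderVersion": 1}))
```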

Metadata Object

Contains comprehensive table metadata including schema, partitioning, and configuration.
{
  "metaData": {
    "id": "f8d5c169-3d01-4ca3-ad9e-7dc3355aedb2",
    "name": "customer_table",
    "description": "Customer data table",
    "location": "s3://my-bucket/tables/customer",
    "auxiliaryLocations": [
      "s3://secondary-bucket/tables/customer"
    ],
    "accessModes": ["url", "dir"],
    "format": {
      "provider": "parquet"
    },
    "schemaString": "{\"type\":\"struct\",\"fields\":[{\"name\":\"id\",\"type\":\"long\",\"nullable\":false,\"metadata\":{}},{\"name\":\"name\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}}]}",
    "partitionColumns": ["date"],
    "configuration": {
      "enableChangeDataFeed": "true"
    },
    "version": 123,
    "size": 1048576,
    "numFiles": 42
  }
}
  • metaData.id (string, required): Unique identifier for the table (typically a UUID).
  • metaData.name (string): User-provided table name.
  • metaData.description (string): User-provided table description.
  • metaData.location (string): Root directory of the table, where the Delta log resides. Required for tables that support the dir access mode.
  • metaData.auxiliaryLocations (array): Additional storage locations for table files. Most tables use only the root location.
  • metaData.accessModes (array): Supported access modes: ["url"], ["dir"], or both. If absent, assume URL-only access.
  • metaData.format (object, required): Data file encoding specification.
  • metaData.schemaString (string, required): Serialized JSON string representing the table schema. Parse this string to get the Schema Object.
  • metaData.partitionColumns (array, required): Array of column names used for partitioning. Empty array if the table is not partitioned.
  • metaData.configuration (object): Map of table configuration options (e.g., {"enableChangeDataFeed": "true"}).
  • metaData.version (long): Table version this metadata corresponds to (returned for versioned/CDF queries).
  • metaData.size (long): Total table size in bytes (if available in the Delta log).
  • metaData.numFiles (long): Total number of files in the table (if available in the Delta log).
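Because schemaString is JSON embedded in JSON, it needs a second parse. The sketch below shows that step together with the URL-only default for accessModes; `summarize_metadata` is a hypothetical helper and the input dictionary is a trimmed-down example.

```python
import json


def summarize_metadata(meta):
    """Extract column names, partition columns and access modes from metaData."""
    schema = json.loads(meta["schemaString"])  # second parse: string -> schema object
    return {
        "columns": [field["name"] for field in schema["fields"]],
        "partition_columns": meta["partitionColumns"],
        # An absent accessModes field means URL-only access.
        "access_modes": meta.get("accessModes", ["url"]),
    }


meta = {
    "id": "f8d5c169-3d01-4ca3-ad9e-7dc3355aedb2",
    "format": {"provider": "parquet"},
    "schemaString": json.dumps({
        "type": "struct",
        "fields": [
            {"name": "id", "type": "long", "nullable": False, "metadata": {}},
            {"name": "name", "type": "string", "nullable": True, "metadata": {}},
        ],
    }),
    "partitionColumns": [],
}
summary = summarize_metadata(meta)
print(summary)
```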

File Object

Represents a single data file in the table with a pre-signed URL.
{
  "file": {
    "url": "https://bucket.s3.us-west-2.amazonaws.com/table/date%3D2021-04-28/part-00000-591723a8.snappy.parquet?X-Amz-Expires=900...",
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "partitionValues": {
      "date": "2021-04-28"
    },
    "size": 573,
    "stats": "{\"numRecords\":1,\"minValues\":{\"eventTime\":\"2021-04-28T23:33:48.719Z\"},\"maxValues\":{\"eventTime\":\"2021-04-28T23:33:48.719Z\"},\"nullCount\":{\"eventTime\":0}}",
    "version": 123,
    "timestamp": 1652140800000,
    "expirationTimestamp": 1652144400000
  }
}
  • file.url (string, required): HTTPS pre-signed URL to read the file. URLs may differ across responses for the same file.
  • file.id (string, required): Unique identifier guaranteed to be the same across requests. Use as cache key.
  • file.partitionValues (object, required): Map of partition column to value. Empty object {} for non-partitioned tables. See Partition Value Serialization.
  • file.size (long, required): File size in bytes.
  • file.stats (string): Serialized JSON string with file statistics. Parse to get the Statistics Object. May be absent.
  • file.version (long): Table version of the file (for versioned queries).
  • file.timestamp (long): Unix timestamp in milliseconds of the table version (for versioned queries).
  • file.expirationTimestamp (long): Unix timestamp in milliseconds when the URL expires.
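The following sketch combines two of the fields above: file.id as a stable cache key and expirationTimestamp as a freshness check. The class `UrlCache`, the function `url_is_usable`, and the 60-second safety margin are assumptions made for illustration.

```python
import time


def url_is_usable(file_obj, now_ms=None, safety_margin_ms=60_000):
    """Return True if the pre-signed URL is not about to expire."""
    expires = file_obj.get("expirationTimestamp")
    if expires is None:
        return True  # assumption: without expiry information, just try the URL
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms + safety_margin_ms < expires


class UrlCache:
    """Cache file objects by file.id, since URLs can change between responses."""

    def __init__(self):
        self._by_id = {}

    def put(self, file_obj):
        self._by_id[file_obj["id"]] = file_obj

    def get_url(self, file_id, now_ms=None):
        entry = self._by_id.get(file_id)
        if entry is not None and url_is_usable(entry, now_ms):
            return entry["url"]
        return None  # caller should request a fresh URL


cache = UrlCache()
cache.put({
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "url": "https://example.invalid/part-00000.snappy.parquet",
    "expirationTimestamp": 1652144400000,
})
print(cache.get_url("591723a8-6a27-4240-a90e-57426f4736d2", now_ms=1652140800000))
```

Keying on the URL itself would defeat the cache, because the same file can come back with a different signed URL on every request.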

Data Change File Actions

Used in streaming and CDF queries to represent changes to the table.
Add: Represents a file added to the table.
{
  "add": {
    "url": "https://...",
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "partitionValues": {"date": "2021-04-28"},
    "size": 573,
    "timestamp": 1652140800000,
    "version": 1,
    "stats": "{...}",
    "expirationTimestamp": 1652144400000
  }
}
Fields: Same as File object, but timestamp and version are required.
CDF: Represents a change data feed file.
{
  "cdf": {
    "url": "https://.../table/_change_data/date%3D2021-04-28/part-00000.snappy.parquet?...",
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "partitionValues": {"date": "2021-04-28"},
    "size": 689,
    "timestamp": 1652141000000,
    "version": 1,
    "expirationTimestamp": 1652144600000
  }
}
CDF files contain three metadata columns:
  • _change_type: insert, update_preimage, update_postimage, or delete
  • _commit_version: Table version of the change
  • _commit_timestamp: Unix timestamp in milliseconds
Remove: Represents a file removed from the table.
{
  "remove": {
    "url": "https://...",
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "partitionValues": {"date": "2021-04-28"},
    "size": 573,
    "timestamp": 1652140800000,
    "version": 1,
    "expirationTimestamp": 1652144400000
  }
}
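To show how these actions might be consumed, the sketch below replays add and remove actions in version order to track which files are live. This is an illustrative simplification, not the behavior of any particular connector; `apply_changes` and the minimal payloads are hypothetical.

```python
def apply_changes(live_files, actions):
    """Replay (kind, payload) actions onto a dict of live files keyed by id."""
    # sorted() is stable, so actions within one version keep their response order.
    ordered = sorted(actions, key=lambda action: action[1]["version"])
    for kind, payload in ordered:
        if kind == "add":
            live_files[payload["id"]] = payload
        elif kind == "remove":
            live_files.pop(payload["id"], None)
        # cdf actions describe row-level changes and leave the file set untouched here.
    return live_files


actions = [
    ("add", {"id": "f1", "version": 1}),
    ("add", {"id": "f2", "version": 2}),
    ("remove", {"id": "f1", "version": 2}),
    ("cdf", {"id": "c1", "version": 2}),
]
print(sorted(apply_changes({}, actions)))
```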

EndStreamAction

Optional action marking the end of a response stream.
{
  "endStreamAction": {
    "refreshToken": "Server-Encoded-Refresh-Token",
    "nextPageToken": "page-token-123",
    "minUrlExpirationTimestamp": 1652140800000
  }
}
  • endStreamAction.refreshToken (string): Token for refreshing pre-signed URLs in snapshot queries.
  • endStreamAction.nextPageToken (string): Token for fetching the next page in paginated queries.
  • endStreamAction.minUrlExpirationTimestamp (long): Minimum URL expiration timestamp across all files in the response.
  • endStreamAction.errorMessage (string): Error message for failures during HTTP streaming. The client must fail the query when this field is present.
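A hedged sketch of how a client could act on these fields once it reaches the end of a response. It takes the EndStreamAction fields as a dictionary, or None when the server sent no such action; `finish_stream` is a hypothetical name.

```python
def finish_stream(end_action):
    """Return the next page token, or None when there is nothing more to fetch."""
    if end_action is None:
        return None  # the action is optional, so its absence is not an error
    if end_action.get("errorMessage"):
        # The query must fail if the server reports an error mid-stream.
        raise RuntimeError("streaming error: " + end_action["errorMessage"])
    return end_action.get("nextPageToken")


print(finish_stream({"nextPageToken": "page-token-123",
                     "minUrlExpirationTimestamp": 1652140800000}))
```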

Delta Format Responses

Delta format enables support for advanced Delta Lake features (deletion vectors, column mapping, etc.). Responses contain Delta actions that can be used to construct a local delta log.

JSON Wrapper Object (Delta)

Each line contains exactly one of:
Field | Type | Description
protocol | Object | Delta protocol wrapper
metaData | Object | Delta metadata wrapper
file | Object | Delta single action wrapper

Protocol (Delta Format)

{
  "protocol": {
    "deltaProtocol": {
      "minReaderVersion": 3,
      "minWriterVersion": 7,
      "readerFeatures": ["deletionVectors"],
      "writerFeatures": ["deletionVectors"]
    }
  }
}
  • protocol.deltaProtocol (object, required): Standard Delta Protocol object. Must be parsed by a Delta library.

Metadata (Delta Format)

{
  "metaData": {
    "version": 20,
    "size": 123456,
    "numFiles": 5,
    "location": "s3://bucket/table",
    "auxiliaryLocations": ["s3://bucket-aux/table"],
    "accessModes": ["url", "dir"],
    "deltaMetadata": {
      "id": "f8d5c169-3d01-4ca3-ad9e-7dc3355aedb2",
      "partitionColumns": ["date"],
      "format": {"provider": "parquet"},
      "schemaString": "{...}",
      "configuration": {
        "enableChangeDataFeed": "true",
        "delta.enableDeletionVectors": "true"
      }
    }
  }
}
  • metaData.deltaMetadata (object, required): Standard Delta Metadata object. Must be parsed by a Delta library.
  • metaData.version (long): Table version (for versioned/CDF queries).
  • metaData.size (long): Table size in bytes.
  • metaData.numFiles (long): Number of files in the table.
  • metaData.location (string): Root directory (required for dir access).
  • metaData.auxiliaryLocations (array): Additional storage locations.
  • metaData.accessModes (array): Supported access modes.

File (Delta Format)

Wraps a Delta single action (Add, Remove, or CDC).
{
  "file": {
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "deletionVectorFileId": "dv-abc123",
    "version": 123,
    "timestamp": 1652140800000,
    "expirationTimestamp": 1652144400000,
    "deltaSingleAction": {
      "add": {
        "path": "https://bucket.s3.amazonaws.com/table/part-00000.snappy.parquet?...",
        "partitionValues": {"date": "2021-04-28"},
        "size": 573,
        "modificationTime": 1652140000000,
        "dataChange": true,
        "stats": "{...}",
        "deletionVector": {
          "storageType": "u",
          "pathOrInlineDv": "vBn[lx{q8@P<9BNH/isA",
          "offset": 1,
          "sizeInBytes": 36,
          "cardinality": 2
        }
      }
    }
  }
}
  • file.id (string, required): Unique file identifier (consistent across requests).
  • file.deletionVectorFileId (string): Unique identifier for the deletion vector file (if present).
  • file.version (long): Table version (for versioned queries).
  • file.timestamp (long): Unix timestamp in milliseconds.
  • file.expirationTimestamp (long): URL expiration timestamp.
  • file.deltaSingleAction (object, required): Standard Delta single action (add, remove, or cdc) with path replaced by a pre-signed URL. Must be parsed by a Delta library.
Important: In Delta format, the path field in the deltaSingleAction contains the pre-signed URL, not the relative file path.
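The sketch below only illustrates where the pre-signed URL sits inside the wrapper; a real client would hand deltaSingleAction to a Delta library as noted above. The function name `presigned_url` is hypothetical.

```python
def presigned_url(file_obj):
    """Return (action_kind, url) from a Delta-format file wrapper."""
    action = file_obj["deltaSingleAction"]
    for kind in ("add", "remove", "cdc"):
        if kind in action:
            # In Delta format, path holds the pre-signed URL, not a relative path.
            return kind, action[kind]["path"]
    raise ValueError(f"unrecognized deltaSingleAction: {sorted(action)}")


wrapper = {
    "id": "591723a8-6a27-4240-a90e-57426f4736d2",
    "deltaSingleAction": {
        "add": {
            "path": "https://example.invalid/table/part-00000.snappy.parquet",
            "partitionValues": {"date": "2021-04-28"},
            "size": 573,
        }
    },
}
print(presigned_url(wrapper))
```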

Schema Object

Table schemas use a subset of Spark SQL’s JSON schema representation.

Struct Type

{
  "type": "struct",
  "fields": [
    {
      "name": "id",
      "type": "long",
      "nullable": false,
      "metadata": {"comment": "Unique identifier"}
    },
    {
      "name": "data",
      "type": {
        "type": "struct",
        "fields": [
          {"name": "value", "type": "string", "nullable": true, "metadata": {}}
        ]
      },
      "nullable": true,
      "metadata": {}
    }
  ]
}
Struct Field:
  • name: Column name
  • type: Type name (primitive, struct, array, or map)
  • nullable: Whether column can be null
  • metadata: JSON map with additional info (e.g., comments)

Primitive Types

Type | Description
string | UTF-8 string
long | 8-byte signed integer
integer | 4-byte signed integer
short | 2-byte signed integer
byte | 1-byte signed integer
float | 4-byte floating-point
double | 8-byte floating-point
boolean | true or false
binary | Binary data
date | Calendar date (year-month-day)
timestamp | Microsecond precision timestamp (no timezone)
decimal | Fixed precision/scale decimal (max 38 digits)

Complex Types

{
  "type": "array",
  "elementType": "integer",
  "containsNull": false
}
  • elementType: Type of array elements
  • containsNull: Whether array can contain nulls
{
  "type": "map",
  "keyType": "string",
  "valueType": "integer",
  "valueContainsNull": true
}
  • keyType: Type of map keys
  • valueType: Type of map values
  • valueContainsNull: Whether values can be null
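Since a type is either a primitive name (a string) or a nested object, a schema is naturally handled with a small recursive function. The sketch below renders any type as a compact string; `describe` and its output notation are illustrative choices, not a format defined by Delta Sharing.

```python
def describe(type_def):
    """Render a schema type (primitive name or nested object) as a compact string."""
    if isinstance(type_def, str):
        return type_def  # primitive type such as "long" or "string"
    kind = type_def["type"]
    if kind == "struct":
        inner = ", ".join(
            f'{field["name"]}: {describe(field["type"])}' for field in type_def["fields"]
        )
        return f"struct<{inner}>"
    if kind == "array":
        return f'array<{describe(type_def["elementType"])}>'
    if kind == "map":
        return f'map<{describe(type_def["keyType"])}, {describe(type_def["valueType"])}>'
    raise ValueError(f"unsupported type: {kind}")


schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": False, "metadata": {}},
        {"name": "tags", "type": {"type": "array", "elementType": "string",
                                  "containsNull": False},
         "nullable": True, "metadata": {}},
    ],
}
print(describe(schema))
```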

Partition Value Serialization

Partition values in partitionValues maps are serialized as strings:
Type | Format | Example
string | No translation | "value"
numeric | String representation | "123"
date | {year}-{month}-{day} | "1970-01-01"
timestamp | {year}-{month}-{day} {hour}:{minute}:{second} | "1970-01-01 00:00:00"
boolean | "true" or "false" | "true"
An empty string for any type represents a null partition value.
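A minimal sketch of reversing this serialization on the client, assuming the column's primitive type name is known from the schema. The function `parse_partition_value` is hypothetical, and it handles only the timestamp layout shown in the table.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_partition_value(raw, type_name):
    """Convert a serialized partition value back to a Python value."""
    if raw == "":
        return None  # empty string means null for every type
    if type_name == "string":
        return raw
    if type_name in ("byte", "short", "integer", "long"):
        return int(raw)
    if type_name in ("float", "double"):
        return float(raw)
    if type_name.startswith("decimal"):
        return Decimal(raw)
    if type_name == "boolean":
        return raw == "true"
    if type_name == "date":
        return date.fromisoformat(raw)
    if type_name == "timestamp":
        return datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    raise ValueError(f"unsupported partition type: {type_name}")


print(parse_partition_value("2021-04-28", "date"))
```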

Per-file Statistics

File objects may include statistics in the stats field (serialized JSON string).

Global Statistics

Statistic | Description
numRecords | Total number of records in file

Per-column Statistics

Statistics mirror the data schema:
Statistic | Description
nullCount | Number of null values per column
minValues | Minimum value per column
maxValues | Maximum value per column
{
  "numRecords": 1000,
  "minValues": {"id": 1, "price": 9.99},
  "maxValues": {"id": 1000, "price": 999.99},
  "nullCount": {"id": 0, "price": 5}
}
Statistics are optional and may be missing. Servers use them for query optimization and file pruning.
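File pruning with these statistics can be sketched as a range check against minValues and maxValues. The example is client-side and simplified: `may_contain` is a hypothetical helper, it handles only top-level columns, and it assumes the lookup value is comparable with the stored statistics.

```python
import json


def may_contain(file_obj, column, value):
    """Return False only when the stats prove the file cannot hold the value."""
    raw = file_obj.get("stats")
    if not raw:
        return True  # no statistics: the file cannot be ruled out
    stats = json.loads(raw)
    low = stats.get("minValues", {}).get(column)
    high = stats.get("maxValues", {}).get(column)
    if low is None or high is None:
        return True
    return low <= value <= high


files = [
    {"id": "a", "stats": json.dumps({"numRecords": 1000, "minValues": {"id": 1},
                                     "maxValues": {"id": 1000}, "nullCount": {"id": 0}})},
    {"id": "b", "stats": json.dumps({"numRecords": 10, "minValues": {"id": 1001},
                                     "maxValues": {"id": 2000}, "nullCount": {"id": 0}})},
    {"id": "c"},  # stats missing, so the file must be kept
]
candidates = [f["id"] for f in files if may_contain(f, "id", 1500)]
print(candidates)
```

Treating missing statistics as "may contain" keeps the pruning safe: a file is skipped only when its recorded range excludes the value.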

