Delta Sharing organizes data using a three-level hierarchy: Shares contain Schemas, which contain Tables. This structure provides flexible access control and logical grouping of related datasets.

Hierarchical Structure

The data model follows a clear hierarchy:
Share (vaccine_share)
└── Schema (acme_vaccine_data)
    ├── Table (vaccine_ingredients)
    └── Table (vaccine_patients)
Each level serves a specific purpose in organizing and controlling access to data.

Shares

A share is the top-level logical grouping used to distribute data to recipients. Shares define the access boundary for data sharing.

Key Characteristics

Access Control

Recipients can access all resources within a share they’re granted access to

Multi-Recipient

A single share can be shared with one or multiple recipients

Multi-Schema

A share may contain multiple schemas for organizing related data

Unique Identity

Each share has an optional immutable ID (UUID format recommended)

Share Metadata

Shares include the following metadata:
Field        Type                 Required  Description
name         String               Yes       Share name (max 255 chars, case-insensitive)
id           String               No        Unique immutable identifier (UUID recommended)
displayName  String               No        Human-friendly name for display (max 255 chars)
comment      String               No        Description or notes (max 65536 chars)
properties   Map<String, String>  No        Custom key-value metadata (max 50 pairs)
The id field, when provided, remains immutable throughout the share’s lifecycle, enabling stable references even if the share name changes.

Example Share

{
  "name": "vaccine_share",
  "id": "edacc4a7-6600-4fbb-85f3-a62a5ce6761f",
  "displayName": "Vaccine Share",
  "comment": "A sample share containing vaccine-related datasets",
  "properties": {
    "owner": "vaccine-team",
    "region": "us-west-2",
    "created_date": "2024-01-15"
  }
}
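
The limits in the Share Metadata table can be checked before a share definition is published. The following is a minimal sketch, not part of any Delta Sharing library: the function name `check_share_metadata` and the wording of the problem messages are hypothetical, and only the limits themselves come from the table above.

```python
from typing import List

# Limits copied from the Share Metadata table.
MAX_NAME_CHARS = 255
MAX_COMMENT_CHARS = 65536
MAX_PROPERTY_PAIRS = 50


def check_share_metadata(share: dict) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # "name" is the only required field.
    name = share.get("name")
    if not isinstance(name, str) or not name:
        problems.append("name is required")
    elif len(name) > MAX_NAME_CHARS:
        problems.append("name is longer than 255 characters")

    # Optional fields are checked only when present.
    display_name = share.get("displayName")
    if display_name is not None and len(display_name) > MAX_NAME_CHARS:
        problems.append("displayName is longer than 255 characters")

    comment = share.get("comment")
    if comment is not None and len(comment) > MAX_COMMENT_CHARS:
        problems.append("comment is longer than 65536 characters")

    properties = share.get("properties") or {}
    if len(properties) > MAX_PROPERTY_PAIRS:
        problems.append("properties has more than 50 pairs")

    return problems
```

Passing the example share above to `check_share_metadata` should return an empty list, since every field is within its limit.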

Schemas

A schema is a logical grouping of tables within a share. Schemas help organize related tables and provide namespace separation.

Key Characteristics

  • Namespace: Provides logical separation between different table collections
  • Organization: Groups related tables together for easier discovery
  • Hierarchical: Belongs to exactly one share
  • Case-Insensitive: Schema names are case-insensitive across the protocol

Schema Metadata

Field  Type    Required  Description
name   String  Yes       Schema name (max 255 chars, no periods)
share  String  Yes       Parent share name

Example Schema

{
  "name": "acme_vaccine_data",
  "share": "vaccine_share"
}
Schema names must not contain the period (.) character to avoid conflicts with table references.
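
To illustrate why the period restriction matters, the sketch below parses a dotted table reference. It assumes references take the form `share.schema.table`, which this page does not spell out; `TableRef` and `parse_table_ref` are hypothetical names. Because schema and table names cannot contain periods, splitting from the right is unambiguous even when the share name itself contains one.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    share: str
    schema: str
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Split an assumed 'share.schema.table' reference into its three parts."""
    # Split at most twice from the right: the last two parts are the schema
    # and table (which cannot contain periods); the rest is the share name.
    parts = ref.rsplit(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected share.schema.table, got {ref!r}")
    return TableRef(*parts)
```

If schema names were allowed to contain periods, a reference such as `a.b.c.d` could be split in more than one way.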

Tables

A table represents a Delta Lake table or a view on top of a Delta Lake table. Tables are the actual data containers that recipients access.

Key Characteristics

Delta Format

Shared tables are backed by Delta Lake, with data files stored in Parquet format

Versioned

Tables track version history for time travel queries

Partitioned

Support for partition columns to optimize queries

Stateful

Include per-file statistics for query optimization

Table Metadata

Field               Type           Required  Description
name                String         Yes       Table name (max 255 chars, no periods)
schema              String         Yes       Parent schema name
share               String         Yes       Parent share name
id                  String         No        Unique table identifier within share (UUID)
shareId             String         No        Immutable share identifier
location            String         No*       Root directory path (required for dir access)
auxiliaryLocations  Array<String>  No        Additional storage locations
accessModes         Array<String>  No        Supported access modes (url, dir)

* The location field is required when the table supports directory-based access mode.

Example Table

{
  "name": "vaccine_patients",
  "schema": "acme_vaccine_data",
  "share": "vaccine_share",
  "id": "c48f3e19-2c29-4ea3-b6f7-3899e53338fa",
  "shareId": "edacc4a7-6600-4fbb-85f3-a62a5ce6761f",
  "location": "s3://deltasharing/vaccine_share/acme_vaccine_data/vaccine_patients",
  "accessModes": ["url", "dir"]
}
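
The conditional requirement on `location` can be expressed as a small check. This is a sketch under the assumption that table metadata has already been decoded into a Python dict; `supports_mode` and `check_location` are hypothetical helper names.

```python
def supports_mode(table: dict, mode: str) -> bool:
    """True if the table lists the given access mode ("url" or "dir")."""
    return mode in (table.get("accessModes") or [])


def check_location(table: dict) -> None:
    """Raise if the table offers directory-based access without a location."""
    if supports_mode(table, "dir") and not table.get("location"):
        raise ValueError(
            f"table {table.get('name')!r} lists 'dir' access but has no location"
        )
```

A table that lists only `url` access passes the check with or without a `location`.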

Table Schema and Format

Each table has a detailed schema definition and format specification:

Schema Definition

Table schemas use a JSON representation compatible with Apache Spark SQL:
{
  "type": "struct",
  "fields": [
    {
      "name": "eventTime",
      "type": "timestamp",
      "nullable": true,
      "metadata": {}
    },
    {
      "name": "date",
      "type": "date",
      "nullable": true,
      "metadata": {}
    },
    {
      "name": "patient_id",
      "type": "long",
      "nullable": false,
      "metadata": {
        "comment": "Unique patient identifier"
      }
    }
  ]
}
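
Because the schema is plain JSON, a client can inspect it with a standard JSON parser. The sketch below lists each field and picks out the non-nullable columns; `list_fields` and `required_columns` are hypothetical names, and the code assumes top-level fields with primitive types like those in the example above.

```python
import json
from typing import List, Tuple


def list_fields(schema_json: str) -> List[Tuple[str, str, bool]]:
    """Return (name, type, nullable) for each top-level field of a struct schema."""
    schema = json.loads(schema_json)
    if schema.get("type") != "struct":
        raise ValueError("top-level schema must be a struct")
    return [(f["name"], f["type"], f["nullable"]) for f in schema["fields"]]


def required_columns(schema_json: str) -> List[str]:
    """Names of the fields declared with nullable set to false."""
    return [name for name, _, nullable in list_fields(schema_json) if not nullable]
```

For the example schema above, `required_columns` would return only `patient_id`, the one field declared with `"nullable": false`.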

Supported Data Types

  • string: UTF-8 encoded text
  • long: 8-byte signed integer
  • integer: 4-byte signed integer
  • short: 2-byte signed integer
  • byte: 1-byte signed integer
  • float: 4-byte floating-point
  • double: 8-byte floating-point
  • boolean: true/false
  • binary: Binary data
  • date: Calendar date (year-month-day)
  • timestamp: Microsecond precision timestamp
  • decimal: Fixed precision decimal numbers

Partition Columns

Tables can be partitioned to optimize query performance:
{
  "metaData": {
    "partitionColumns": ["date", "region"],
    "schemaString": "..."
  }
}
Partition values are serialized as strings:
Type       Format                                         Example
date       {year}-{month}-{day}                           2021-04-28
timestamp  {year}-{month}-{day} {hour}:{minute}:{second}  2021-04-28 23:33:48
numeric    String representation                          123
boolean    true or false                                  true
null       Empty string                                   ""
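
A client that needs typed partition values has to reverse this serialization. The following is a minimal sketch using the type names from the Supported Data Types list; `parse_partition_value` is a hypothetical helper. It handles only the formats shown in the table, so timestamps with fractional seconds and decimal values are out of scope.

```python
from datetime import date, datetime


def parse_partition_value(raw: str, type_name: str):
    """Convert a serialized partition value back into a Python value."""
    # The empty string encodes null for every type.
    if raw == "":
        return None
    if type_name == "string":
        return raw
    if type_name == "date":
        return date.fromisoformat(raw)
    if type_name == "timestamp":
        return datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    if type_name in ("byte", "short", "integer", "long"):
        return int(raw)
    if type_name in ("float", "double"):
        return float(raw)
    if type_name == "boolean":
        if raw not in ("true", "false"):
            raise ValueError(f"invalid boolean partition value: {raw!r}")
        return raw == "true"
    raise ValueError(f"unsupported partition type: {type_name}")
```

One consequence of the encoding is that an empty string in a string-typed partition column cannot be told apart from null; this sketch returns `None` in that case.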

Naming Conventions

All Delta Sharing objects must follow these naming rules:
All Objects
  • Maximum 255 characters
  • Case-insensitive
  • Cannot contain:
    • Space ( )
    • Forward slash (/)
    • ASCII control characters (00-1F hex)
    • DELETE character (7F hex)
Tables and Schemas Only
  • Additionally cannot contain period (.)

Valid Examples

valid_share_name
Vaccine_Data_2024
acme-vaccine-data

Invalid Examples

invalid share name  # Contains space
invalid/share       # Contains forward slash
table.name          # Period not allowed for tables/schemas
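
The naming rules above can be sketched as a validator. The names `is_valid_name` and `same_name` are hypothetical, rejecting empty names is an assumption this page does not state, and length is counted in Python characters rather than bytes.

```python
import re

# Characters forbidden in every object name: space, forward slash,
# ASCII control characters (00-1F hex), and DELETE (7F hex).
_FORBIDDEN = re.compile(r"[ /\x00-\x1f\x7f]")


def is_valid_name(name: str, kind: str = "share") -> bool:
    """Check a name against the rules; kind is "share", "schema", or "table"."""
    if not name or len(name) > 255:  # rejecting empty names is an assumption
        return False
    if _FORBIDDEN.search(name):
        return False
    # Schemas and tables additionally cannot contain a period.
    if kind in ("schema", "table") and "." in name:
        return False
    return True


def same_name(a: str, b: str) -> bool:
    """Names are case-insensitive, so compare them case-folded."""
    return a.casefold() == b.casefold()
```

With these rules, `table.name` is acceptable as a share name but not as a schema or table name.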

Querying the Hierarchy

Clients can navigate the hierarchy using REST APIs:
1. List Shares
GET {prefix}/shares
Discover all accessible shares

2. List Schemas
GET {prefix}/shares/vaccine_share/schemas
Find schemas within a share

3. List Tables
GET {prefix}/shares/vaccine_share/schemas/acme_vaccine_data/tables
Discover tables within a schema

4. Query Table
POST {prefix}/shares/vaccine_share/schemas/acme_vaccine_data/tables/vaccine_patients/query
Access table data
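
The four endpoint paths share a common shape, so a client can build them from the object names. This sketch only constructs URLs and sends no requests; the helper names are hypothetical, and percent-encoding each name is a defensive assumption rather than something this page requires.

```python
from urllib.parse import quote


def _segment(name: str) -> str:
    # Percent-encode a name so it is safe as a single URL path segment.
    return quote(name, safe="")


def shares_url(prefix: str) -> str:
    return f"{prefix.rstrip('/')}/shares"


def schemas_url(prefix: str, share: str) -> str:
    return f"{shares_url(prefix)}/{_segment(share)}/schemas"


def tables_url(prefix: str, share: str, schema: str) -> str:
    return f"{schemas_url(prefix, share)}/{_segment(schema)}/tables"


def query_url(prefix: str, share: str, schema: str, table: str) -> str:
    return f"{tables_url(prefix, share, schema)}/{_segment(table)}/query"
```

For example, with a placeholder prefix such as `https://sharing.example.com/delta-sharing`, `tables_url(prefix, "vaccine_share", "acme_vaccine_data")` yields the path used in the List Tables step.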

Pagination Support

All list operations support pagination for large result sets:
{
  "items": [...],
  "nextPageToken": "eyJvZmZzZXQiOjEwMH0="
}
Query Parameters:
  • maxResults: Maximum items per page (optional)
  • pageToken: Token from previous response to get next page
The server may return fewer items than maxResults even if more are available. Always check for nextPageToken to determine if additional pages exist.
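
The pagination rule above amounts to a loop that stops only when `nextPageToken` is absent, never based on page size. The sketch below takes the page-fetching step as a function argument so it stays independent of any HTTP library; `iter_all_items` and the in-memory `make_fake_fetcher` are hypothetical, and the fake's offset-style tokens are purely illustrative since real tokens should be treated as opaque.

```python
from typing import Callable, Iterator, List, Optional


def iter_all_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages, following nextPageToken until it is absent."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("items", [])
        token = page.get("nextPageToken")
        if not token:
            return


def make_fake_fetcher(names: List[str], page_size: int) -> Callable[[Optional[str]], dict]:
    """In-memory stand-in for a list endpoint, for demonstration only."""
    def fetch_page(token: Optional[str]) -> dict:
        start = int(token) if token else 0
        end = start + page_size
        page = {"items": [{"name": n} for n in names[start:end]]}
        if end < len(names):
            page["nextPageToken"] = str(end)
        return page
    return fetch_page
```

In a real client, `fetch_page` would issue the list request with `pageToken` set to the token it receives (and optionally `maxResults`) and return the decoded JSON body.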

Complete Example

Here’s a full example showing the hierarchy:
// Share
{
  "name": "vaccine_share",
  "id": "edacc4a7-6600-4fbb-85f3-a62a5ce6761f",
  "displayName": "Vaccine Share"
}

// Schema
{
  "name": "acme_vaccine_data",
  "share": "vaccine_share"
}

// Table
{
  "name": "vaccine_patients",
  "schema": "acme_vaccine_data",
  "share": "vaccine_share",
  "location": "s3://deltasharing/vaccine_share/acme_vaccine_data/vaccine_patients",
  "accessModes": ["url", "dir"]
}

// Table Metadata
{
  "metaData": {
    "id": "c48f3e19-2c29-4ea3-b6f7-3899e53338fa",
    "format": {"provider": "parquet"},
    "schemaString": "{\"type\":\"struct\",\"fields\":[...]}",
    "partitionColumns": ["date"],
    "configuration": {
      "enableChangeDataFeed": "true"
    }
  }
}
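
Note that `schemaString` is itself a JSON document encoded as a string, so reading it takes a second decoding step, and configuration values such as `enableChangeDataFeed` are strings rather than booleans. The following sketch shows one way to handle both; `summarize_metadata` and the shape of its return value are hypothetical, and checking that every partition column appears in the schema is an added sanity check, not a documented requirement.

```python
import json


def summarize_metadata(raw: str) -> dict:
    """Decode a table metadata document and its nested schemaString."""
    meta = json.loads(raw)["metaData"]

    # Second decode: schemaString holds JSON text, not a JSON object.
    schema = json.loads(meta["schemaString"])
    columns = [field["name"] for field in schema["fields"]]

    partition_columns = meta.get("partitionColumns", [])
    unknown = [c for c in partition_columns if c not in columns]
    if unknown:
        raise ValueError(f"partition columns missing from schema: {unknown}")

    # Configuration values arrive as strings, so compare against "true".
    config = meta.get("configuration", {})
    return {
        "id": meta.get("id"),
        "columns": columns,
        "partition_columns": partition_columns,
        "change_data_feed": config.get("enableChangeDataFeed", "false") == "true",
    }
```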

Next Steps

Access Modes

Learn about URL-based and directory-based access patterns

Protocol Overview

Understand the REST API and authentication

Profile Files

Configure recipient access with profile files

API Reference

Explore detailed API documentation
