
Storage Pipeline

Synapse SDK uses a three-phase pipeline to store data with cryptographic verification and on-chain provenance. Understanding this flow helps you optimize uploads and debug issues.

Overview

The storage pipeline ensures your data is:
  1. Stored on physical storage
  2. Replicated across multiple providers
  3. Committed on-chain with PDP verification
Client Data

  STORE  ────> Primary Provider (Endorsed)
    │               │
    │               │ PieceCID calculated
    │               │ Data written to disk
    │               │
    ↓               ↓
  PULL   <─────────── Secondary Provider(s)
                    │   (fetch from primary)


  COMMIT ────> Smart Contracts
                    │   - Create data set
                    │   - Register pieces
                    │   - Start PDP verification
                    │   - Set up payment rails


           On-chain provenance + proofs

Phase 1: Store

Upload data to the primary storage provider.

What Happens

1. PieceCID Calculation

The SDK or storage provider calculates the Filecoin PieceCID:
  • Binary merkle tree of 128 KiB chunks
  • Padded to next power of 2
  • Last 32 bytes = root hash (used in contracts)
This is the content-addressed identifier for your data.
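The padding step can be sketched numerically. This is an illustrative calculation, assuming Filecoin's standard Fr32 expansion (127 raw bytes become 128 padded bytes) before the power-of-two round-up; the SDK performs this internally.

```typescript
// Illustrative: estimate the padded piece size for a raw payload.
// Assumes Fr32 expansion (127 raw bytes -> 128 padded bytes),
// then rounds up to the next power of two. 128 bytes is the
// minimum piece size.
function paddedPieceSize(rawBytes: number): number {
  const fr32 = Math.ceil((rawBytes * 128) / 127)
  let size = 128
  while (size < fr32) size *= 2
  return size
}

paddedPieceSize(127)  // -> 128
paddedPieceSize(1000) // -> 1024: 1000 raw bytes expand to 1008, round up
```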
2. Streaming Upload

Data is streamed to the provider’s Curio API:
POST /pdp/piece/uploads         // Create session
PUT  /pdp/piece/uploads/{uuid}  // Stream data
POST /pdp/piece/uploads/{uuid}  // Finalize
For large files, streaming avoids loading everything into memory.
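To illustrate why streaming matters, here is a small self-contained re-chunking helper (not part of the SDK): it turns an arbitrary byte stream into fixed-size buffers that can be sent one at a time, mirroring the 128 KiB leaf size used for the PieceCID.

```typescript
// Re-chunk an incoming byte stream into fixed-size buffers so an
// upload can send one buffer at a time instead of holding the
// whole file in memory. The default mirrors the 128 KiB leaf size.
async function* rechunk(
  source: AsyncIterable<Uint8Array> | Iterable<Uint8Array>,
  size = 128 * 1024,
): AsyncGenerator<Uint8Array> {
  let pending = new Uint8Array(0)
  for await (const part of source) {
    const merged = new Uint8Array(pending.length + part.length)
    merged.set(pending)
    merged.set(part, pending.length)
    pending = merged
    while (pending.length >= size) {
      yield pending.slice(0, size)
      pending = pending.slice(size)
    }
  }
  if (pending.length > 0) yield pending
}
```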
3. Storage Confirmation

Provider writes data to disk and returns the PieceCID. No on-chain state yet.

Example: Store Only

import { Synapse } from '@filoz/synapse-sdk'

const synapse = await Synapse.create({ chain, transport, account })

// Create a context for a specific provider
const context = await synapse.storage.createContext()

// Store data (no on-chain commitment)
const data = new TextEncoder().encode('Hello, Filecoin!')
const { pieceCid, size } = await context.store(data, {
  onProgress: (bytesUploaded) => {
    console.log(`Uploaded: ${bytesUploaded} bytes`)
  },
})

console.log('PieceCID:', pieceCid)
console.log('Size:', size)

// Data is on the provider but NOT on-chain
// The provider may garbage collect it if not committed
Data stored but not committed may be garbage collected by the provider. Always commit within a reasonable timeframe (minutes to hours).

Phase 2: Pull

Secondary providers fetch data from the primary via SP-to-SP transfer.

Why Pull?

Instead of uploading the same data multiple times from your client:
  • Bandwidth Efficiency: Upload once, providers replicate
  • Faster: Storage providers typically have better interconnectivity than clients
  • Cost Effective: Reduces client egress costs

How It Works

1. Presign for Commit

Create EIP-712 signature authorizing the piece addition:
const extraData = await secondary.presignForCommit([
  { pieceCid, pieceMetadata: { category: 'documents' } }
])
This signature serves a dual purpose:
  • Authorization for the pull (Curio validates via estimateGas)
  • Authorization for the on-chain commit
2. SP-to-SP Transfer

Secondary provider requests the piece from primary:
await secondary.pull({
  pieces: [pieceCid],
  from: (cid) => primary.getPieceUrl(cid),
  extraData,
})
The Curio API endpoint:
POST /pdp/piece/pull
{
  "pieceCid": "baga6ea...",
  "url": "https://primary-sp.example/pdp/piece/baga6ea...",
  "extraData": "0x..."
}
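For reference, the request body above can be assembled like this. This is only a sketch — the SDK's pull() builds and signs the request for you — and the field names and URL pattern simply mirror the JSON shown:

```typescript
// Build the body for POST /pdp/piece/pull, mirroring the JSON above.
// Assumes the primary serves pieces at /pdp/piece/{pieceCid},
// as in the example URL.
function buildPullRequest(pieceCid: string, primaryBase: string, extraData: string) {
  return {
    pieceCid,
    url: `${primaryBase}/pdp/piece/${pieceCid}`,
    extraData,
  }
}
```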
3. Validation

Secondary provider:
  • Fetches data from primary
  • Verifies PieceCID matches
  • Validates extraData signature via estimateGas
  • Writes to local storage
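The "verifies PieceCID matches" step is conceptually a recompute-and-compare. The sketch below uses plain SHA-256 as a stand-in for the real PieceCID merkle computation, just to show the shape of the check:

```typescript
import { createHash } from 'node:crypto'

// Illustrative verification: after fetching the piece, the secondary
// recomputes the content address and compares it to the expected one.
// The real check recomputes the PieceCID merkle root; plain SHA-256
// stands in here to keep the sketch self-contained.
function verifyDigest(data: Uint8Array, expectedHex: string): boolean {
  const actual = createHash('sha256').update(data).digest('hex')
  return actual === expectedHex
}
```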

Example: Manual Pull

const [primary, secondary] = await synapse.storage.createContexts({ count: 2 })

// Store on primary
const { pieceCid } = await primary.store(data)

// Presign for commit (reusable signature)
const extraData = await secondary.presignForCommit([{ pieceCid }])

// Pull to secondary
const pullResult = await secondary.pull({
  pieces: [pieceCid],
  from: (cid) => primary.getPieceUrl(cid),
  extraData,
  onProgress: (cid, status) => {
    console.log(`Pull status: ${status}`) // 'pending', 'active', 'complete'
  },
})

if (pullResult.status === 'complete') {
  console.log('Secondary has the data')
}
The pull operation is idempotent. If it fails mid-transfer, you can retry safely.

Phase 3: Commit

Register pieces on-chain and start PDP verification.

What Happens On-Chain

1. Transaction Submission

For each provider, call the FWSS contract.

New data set:
createDataSetAndAddPieces(
  pieceDigests[],      // 32-byte piece roots
  pieceSizes[],        // Piece sizes in bytes
  metadata,            // Dataset metadata
  pieceMetadata[],     // Per-piece metadata
  extraData            // EIP-712 signature
)
Existing data set:
addPieces(
  dataSetId,
  pieceDigests[],
  pieceSizes[],
  pieceMetadata[],
  extraData
)
2. Signature Validation

FWSS contract validates the EIP-712 signature:
  • Recovers signer address
  • Verifies signer is the payer
  • Checks signature matches piece data
3. PDPVerifier Callback

FWSS calls PDPVerifier.submitProofSet():
  • Registers pieces for verification
  • Schedules first challenge window
  • PDPVerifier callbacks to FWSS for payment setup
4. Payment Rails

FWSS creates continuous payment rails:
  • PDP rail: Base storage cost (per epoch)
  • CDN rail: (if withCDN=true) Egress charges
  • Cache miss rail: (if withCDN=true) Provider egress
Payments flow automatically from your deposit to the provider.
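As a back-of-the-envelope sketch of how a per-epoch rail rate relates to a daily storage price (the rate here is made up; real pricing comes from the provider and contracts):

```typescript
// Filecoin epochs are 30 seconds, so there are 2880 epochs per day.
const EPOCHS_PER_DAY = (24 * 60 * 60) / 30

// Convert an illustrative price (per TiB per day) into a per-epoch
// payment for a piece of the given size.
function perEpochCost(pricePerTiBPerDay: number, sizeBytes: number): number {
  const tib = sizeBytes / 2 ** 40
  return (pricePerTiBPerDay * tib) / EPOCHS_PER_DAY
}

perEpochCost(2880, 2 ** 40) // -> 1: 1 TiB at 2880/day costs 1 per epoch
```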

Example: Commit

const result = await primary.commit({
  pieces: [{ pieceCid, pieceMetadata: { filename: 'example.txt' } }],
  onSubmitted: (txHash) => {
    console.log('Transaction submitted:', txHash)
    // Can show user a block explorer link
  },
})

console.log('Data set ID:', result.dataSetId)
console.log('Piece IDs:', result.pieceIds)
console.log('Is new dataset:', result.isNewDataSet)

Transaction Confirmation

Filecoin has a 30-second block time. Expect:
  • Transaction submission: Instant (returns hash)
  • First confirmation: ~30 seconds
  • Finality: ~60-90 seconds (2-3 blocks)
Always wait for transaction confirmation before considering data committed. Use callbacks or wait for the promise to resolve.
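A generic way to wait out the finality window described above is to poll the chain head. The sketch below takes an injected block-number getter so it works with any client library:

```typescript
// Poll the chain head until the transaction's block has the
// required number of confirmations. The block-number getter is
// injected so any client library can supply it.
async function waitForConfirmations(
  txBlock: number,
  confirmations: number,
  getBlockNumber: () => Promise<number>,
  pollMs = 5000, // Filecoin blocks arrive every ~30s; poll faster than that
): Promise<void> {
  while ((await getBlockNumber()) - txBlock + 1 < confirmations) {
    await new Promise((resolve) => setTimeout(resolve, pollMs))
  }
}
```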

Combined Flow: upload()

The high-level upload() method orchestrates all three phases:
const result = await synapse.storage.upload(data, {
  count: 2, // 1 primary + 1 secondary
  callbacks: {
    // Phase 0: Provider selection
    onProviderSelected: (provider) => {
      console.log('Selected:', provider.id)
    },
    
    // Phase 1: Store
    onProgress: (bytesUploaded) => {
      console.log('Uploaded:', bytesUploaded)
    },
    onStored: (providerId, pieceCid) => {
      console.log('Stored on', providerId)
    },
    
    // Phase 2: Pull
    onPullProgress: (providerId, pieceCid, status) => {
      console.log(`SP ${providerId} pull:`, status)
    },
    onCopyComplete: (providerId, pieceCid) => {
      console.log('Copied to', providerId)
    },
    onCopyFailed: (providerId, pieceCid, error) => {
      console.error('Copy failed:', error)
    },
    
    // Phase 3: Commit
    onPiecesAdded: (txHash, providerId, pieces) => {
      console.log('Tx submitted:', txHash)
    },
    onPiecesConfirmed: (dataSetId, providerId, pieces) => {
      console.log('Confirmed:', dataSetId)
    },
  },
})

console.log('Upload complete!')
console.log('Copies:', result.copies.length)
console.log('Failures:', result.failures)

Failure Handling

Store Failure (Fatal)

If the primary store fails, upload() throws StoreError:
try {
  await synapse.storage.upload(data)
} catch (error) {
  if (error instanceof StoreError) {
    console.error('Failed to store on primary:', error.providerId)
    console.error('Endpoint:', error.endpoint)
    console.error('Cause:', error.cause)
  }
}

Pull Failure (Non-Fatal)

Secondary pull failures are reported but don’t throw:
const result = await synapse.storage.upload(data, { count: 3 })

if (result.failures.length > 0) {
  console.warn('Some copies failed:')
  for (const failure of result.failures) {
    console.warn(`Provider ${failure.providerId}: ${failure.error}`)
  }
}

// Check if you got enough copies
if (result.copies.length < 2) {
  throw new Error('Insufficient redundancy')
}
The SDK automatically retries with different providers (up to 5 attempts per secondary).
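The retry behavior described above can be sketched as a simple fallback loop. This is illustrative, not the SDK's actual implementation:

```typescript
// Try up to maxAttempts candidate providers for one secondary copy,
// moving to the next candidate on failure and rethrowing the last
// error if every attempt fails.
async function pullWithFallback<T>(
  candidates: Array<() => Promise<T>>,
  maxAttempts = 5,
): Promise<T> {
  let lastError: unknown
  for (const attempt of candidates.slice(0, maxAttempts)) {
    try {
      return await attempt()
    } catch (err) {
      lastError = err
    }
  }
  throw lastError
}
```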

Commit Failure (Partial)

If commit fails on the primary, upload() throws CommitError:
try {
  await synapse.storage.upload(data)
} catch (error) {
  if (error instanceof CommitError) {
    console.error('Data stored but not committed on-chain')
    console.error('You can retry the commit with the same pieceCid')
  }
}
If commit fails on a secondary, it’s reported in result.failures.

Optimization Tips

1. Batch Multiple Files

For multiple files, use split operations to reduce on-chain transactions:
const context = await synapse.storage.createContext()

// Store all files
const stored = await Promise.all(
  files.map(file => context.store(file))
)

// Commit in one transaction
await context.commit({
  pieces: stored.map(s => ({ pieceCid: s.pieceCid })),
})
This creates one transaction instead of N transactions.

2. Pre-calculate PieceCID

If you already have the PieceCID (e.g., from a previous calculation):
import * as Piece from '@filoz/synapse-core/piece'

const pieceCid = await Piece.calculate(data)

await context.store(data, { pieceCid })
// Provider verifies but doesn't recalculate

3. Reuse Contexts

Contexts are cached and reused automatically:
// First upload creates contexts
const result1 = await synapse.storage.upload(data1)

// Second upload reuses the same contexts (same providers)
const result2 = await synapse.storage.upload(data2)
This reduces provider selection overhead and keeps data on the same providers.

Next Steps

Multi-Copy Upload

Deep dive into redundancy and failure handling

Provider Selection

How the SDK chooses storage providers

Storage Operations Guide

Practical examples of uploads and downloads

Split Operations Example

See the pipeline in action with real code
