Storage Pipeline
Synapse SDK uses a three-phase pipeline to store data with cryptographic verification and on-chain provenance. Understanding this flow helps you optimize uploads and debug issues.
Overview
The storage pipeline ensures your data is:
Stored on physical storage
Replicated across multiple providers
Committed on-chain with PDP verification
```
Client Data
     ↓
  STORE ────> Primary Provider (Endorsed)
     │              │
     │              │ PieceCID calculated
     │              │ Data written to disk
     │              │
     ↓              ↓
  PULL <─────────── Secondary Provider(s)
     │              (fetch from primary)
     │
     ↓
 COMMIT ────> Smart Contracts
     │        - Create data set
     │        - Register pieces
     │        - Start PDP verification
     │        - Set up payment rails
     │
     ↓
On-chain provenance + proofs
```
Phase 1: Store
Upload data to the primary storage provider.
What Happens
PieceCID Calculation
The SDK or storage provider calculates the Filecoin PieceCID:
Binary merkle tree of 128 KiB chunks
Padded to next power of 2
Last 32 bytes = root hash (used in contracts)
This is the content-addressed identifier for your data.
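The padding step above can be illustrated in isolation. This is a sketch of the size arithmetic only; the constant and helper names are ours, not SDK API, and the actual merkle construction lives in the SDK and provider.

```typescript
// Illustrative sketch of the padded-size arithmetic behind PieceCID.
// CHUNK_SIZE and these helper names are ours, not part of the SDK API.
const CHUNK_SIZE = 128 * 1024 // 128 KiB leaves of the binary merkle tree

// Smallest power of two >= n (the tree must be a full binary tree).
function nextPowerOfTwo(n: number): number {
  let p = 1
  while (p < n) p *= 2
  return p
}

// Number of 128 KiB leaves after padding the payload to a power of two.
function paddedLeafCount(payloadBytes: number): number {
  const chunks = Math.ceil(payloadBytes / CHUNK_SIZE)
  return nextPowerOfTwo(chunks)
}
```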
Streaming Upload
Data is streamed to the provider's Curio API:

```
POST /pdp/piece/uploads          // Create session
PUT  /pdp/piece/uploads/{uuid}   // Stream data
POST /pdp/piece/uploads/{uuid}   // Finalize
```
For large files, streaming avoids loading everything into memory.
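The memory benefit comes from never materializing the full payload at once. A minimal sketch of the idea; this generator is illustrative, not the SDK's internal streaming code:

```typescript
// Illustrative: yield a large payload in fixed-size slices so only one
// slice needs to be handled at a time. Not the SDK's actual implementation.
function* chunked(data: Uint8Array, chunkSize = 128 * 1024): Generator<Uint8Array> {
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    yield data.subarray(offset, Math.min(offset + chunkSize, data.length))
  }
}

// Each slice could be written to the PUT request body as it is produced.
```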
Storage Confirmation
Provider writes data to disk and returns the PieceCID. No on-chain state yet.
Example: Store Only
```typescript
import { Synapse } from '@filoz/synapse-sdk'

const synapse = await Synapse.create({ chain, transport, account })

// Create a context for a specific provider
const context = await synapse.storage.createContext()

// Store data (no on-chain commitment)
const data = new TextEncoder().encode('Hello, Filecoin!')
const { pieceCid, size } = await context.store(data, {
  onProgress: (bytesUploaded) => {
    console.log(`Uploaded: ${bytesUploaded} bytes`)
  },
})

console.log('PieceCID:', pieceCid)
console.log('Size:', size)

// Data is on the provider but NOT on-chain
// The provider may garbage collect it if not committed
```
Data stored but not committed may be garbage collected by the provider. Always commit within a reasonable timeframe (minutes to hours).
Phase 2: Pull
Secondary providers fetch data from the primary via SP-to-SP transfer.
Why Pull?
Instead of uploading the same data multiple times from your client:
Bandwidth Efficiency : Upload once, providers replicate
Faster : Storage providers have better interconnectivity
Cost Effective : Reduces client egress costs
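The bandwidth saving is easy to quantify. A back-of-the-envelope helper (the function is ours, not SDK API):

```typescript
// Client egress with and without SP-to-SP replication (illustrative only).
function clientEgressBytes(payloadBytes: number, copies: number, usePull: boolean): number {
  // With pull: the client uploads once; secondaries fetch from the primary.
  // Without: the client uploads the payload to every provider itself.
  return usePull ? payloadBytes : payloadBytes * copies
}
```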
How It Works
Presign for Commit
Create an EIP-712 signature authorizing the piece addition:

```typescript
const extraData = await secondary.presignForCommit([
  { pieceCid, pieceMetadata: { category: 'documents' } },
])
```
This signature serves a dual purpose:
Authorization for the pull (Curio validates via estimateGas)
Authorization for the on-chain commit
SP-to-SP Transfer
The secondary provider requests the piece from the primary:

```typescript
await secondary.pull({
  pieces: [pieceCid],
  from: (cid) => primary.getPieceUrl(cid),
  extraData,
})
```
The Curio API endpoint is `POST /pdp/piece/pull`:

```json
{
  "pieceCid": "baga6ea...",
  "url": "https://primary-sp.example/pdp/piece/baga6ea...",
  "extraData": "0x..."
}
```
Validation
Secondary provider:
Fetches data from primary
Verifies PieceCID matches
Validates extraData signature via estimateGas
Writes to local storage
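The four steps above can be sketched as a pure pipeline. Everything here (the function name and the injected fetch/hash/verify/write callbacks) is illustrative; the real checks run inside Curio, not in application code:

```typescript
// Illustrative sketch of the secondary's pull-side checks. All names
// here are ours; the real logic lives in Curio.
type Hex = string

async function pullAndValidate(
  expectedPieceCid: Hex,
  fetchFromPrimary: () => Promise<Uint8Array>,
  computePieceCid: (data: Uint8Array) => Promise<Hex>,
  validateExtraData: () => Promise<boolean>, // e.g. via estimateGas
  writeToStorage: (data: Uint8Array) => Promise<void>,
): Promise<void> {
  const data = await fetchFromPrimary()          // 1. fetch from primary
  const actual = await computePieceCid(data)     // 2. recompute PieceCID
  if (actual !== expectedPieceCid) {
    throw new Error(`PieceCID mismatch: ${actual}`)
  }
  if (!(await validateExtraData())) {            // 3. check signature
    throw new Error('extraData signature rejected')
  }
  await writeToStorage(data)                     // 4. persist locally
}
```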
Example: Manual Pull
```typescript
const [primary, secondary] = await synapse.storage.createContexts({ count: 2 })

// Store on primary
const { pieceCid } = await primary.store(data)

// Presign for commit (reusable signature)
const extraData = await secondary.presignForCommit([{ pieceCid }])

// Pull to secondary
const pullResult = await secondary.pull({
  pieces: [pieceCid],
  from: (cid) => primary.getPieceUrl(cid),
  extraData,
  onProgress: (cid, status) => {
    console.log(`Pull status: ${status}`) // 'pending', 'active', 'complete'
  },
})

if (pullResult.status === 'complete') {
  console.log('Secondary has the data')
}
```
The pull operation is idempotent. If it fails mid-transfer, you can retry safely.
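Because pulls are idempotent, a simple retry wrapper is safe. A sketch under that assumption; the helper name and backoff policy are ours, not part of the SDK:

```typescript
// Illustrative retry wrapper for an idempotent operation such as pull().
// Helper name and fixed-delay policy are ours, not SDK API.
async function withRetries<T>(
  op: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      return await op()
    } catch (error) {
      lastError = error
      if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs))
    }
  }
  throw lastError
}

// Usage: a failed transfer leaves no inconsistent state on the secondary,
// so wrapping the pull call is safe:
// const pullResult = await withRetries(() =>
//   secondary.pull({ pieces: [pieceCid], from: (cid) => primary.getPieceUrl(cid), extraData })
// )
```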
Phase 3: Commit
Register pieces on-chain and start PDP verification.
What Happens On-Chain
Transaction Submission
For each provider, call the FWSS contract.

New data set:

```solidity
createDataSetAndAddPieces(
  pieceDigests[],   // 32-byte piece roots
  pieceSizes[],     // Piece sizes in bytes
  metadata,         // Dataset metadata
  pieceMetadata[],  // Per-piece metadata
  extraData         // EIP-712 signature
)
```

Existing data set:

```solidity
addPieces(
  dataSetId,
  pieceDigests[],
  pieceSizes[],
  pieceMetadata[],
  extraData
)
```
Signature Validation
FWSS contract validates the EIP-712 signature:
Recovers signer address
Verifies signer is the payer
Checks signature matches piece data
PDPVerifier Callback
FWSS calls PDPVerifier.submitProofSet():
Registers pieces for verification
Schedules first challenge window
PDPVerifier calls back into FWSS for payment setup
Payment Rails
FWSS creates continuous payment rails:
PDP rail : Base storage cost (per epoch)
CDN rail : (if withCDN=true) Egress charges
Cache miss rail : (if withCDN=true) Provider egress
Payments flow automatically from your deposit to the provider.
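As a rough mental model, the per-epoch drain on your deposit is the sum of the active rails. The rates below are made-up placeholders, not real network pricing; this only shows how the rails compose:

```typescript
// Illustrative only: rates are placeholders, not real network prices.
interface RailRates {
  pdpPerEpoch: bigint       // base storage cost
  cdnPerEpoch: bigint       // egress rail, only when withCDN=true
  cacheMissPerEpoch: bigint // provider egress rail, only when withCDN=true
}

function totalPerEpoch(rates: RailRates, withCDN: boolean): bigint {
  const cdn = withCDN ? rates.cdnPerEpoch + rates.cacheMissPerEpoch : 0n
  return rates.pdpPerEpoch + cdn
}
```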
Example: Commit
```typescript
const result = await primary.commit({
  pieces: [{ pieceCid, pieceMetadata: { filename: 'example.txt' } }],
  onSubmitted: (txHash) => {
    console.log('Transaction submitted:', txHash)
    // Can show user a block explorer link
  },
})

console.log('Data set ID:', result.dataSetId)
console.log('Piece IDs:', result.pieceIds)
console.log('Is new dataset:', result.isNewDataSet)
```
Transaction Confirmation
Filecoin has a 30-second block time. Expect:
Transaction submission : Instant (returns hash)
First confirmation : ~30 seconds
Finality : ~60-90 seconds (2-3 blocks)
Always wait for transaction confirmation before considering data committed. Use callbacks or wait for the promise to resolve.
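A minimal confirmation-wait sketch, assuming a client with block-lookup calls. The `Client` interface and both method names are hypothetical, not SDK API:

```typescript
// Illustrative: wait until `confirmations` blocks exist from the block
// that included the transaction. `Client` is a hypothetical interface.
interface Client {
  getTransactionBlock: (txHash: string) => Promise<number | null>
  getBlockNumber: () => Promise<number>
}

async function waitForConfirmations(
  client: Client,
  txHash: string,
  confirmations = 2, // ~60s at Filecoin's 30-second block time
  pollMs = 5000,
): Promise<void> {
  for (;;) {
    const included = await client.getTransactionBlock(txHash)
    if (included !== null) {
      const head = await client.getBlockNumber()
      if (head - included + 1 >= confirmations) return
    }
    await new Promise((r) => setTimeout(r, pollMs))
  }
}
```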
Combined Flow: upload()
The high-level upload() method orchestrates all three phases:
```typescript
const result = await synapse.storage.upload(data, {
  count: 2, // 1 primary + 1 secondary
  callbacks: {
    // Phase 0: Provider selection
    onProviderSelected: (provider) => {
      console.log('Selected:', provider.id)
    },
    // Phase 1: Store
    onProgress: (bytesUploaded) => {
      console.log('Uploaded:', bytesUploaded)
    },
    onStored: (providerId, pieceCid) => {
      console.log('Stored on', providerId)
    },
    // Phase 2: Pull
    onPullProgress: (providerId, pieceCid, status) => {
      console.log(`SP ${providerId} pull:`, status)
    },
    onCopyComplete: (providerId, pieceCid) => {
      console.log('Copied to', providerId)
    },
    onCopyFailed: (providerId, pieceCid, error) => {
      console.error('Copy failed:', error)
    },
    // Phase 3: Commit
    onPiecesAdded: (txHash, providerId, pieces) => {
      console.log('Tx submitted:', txHash)
    },
    onPiecesConfirmed: (dataSetId, providerId, pieces) => {
      console.log('Confirmed:', dataSetId)
    },
  },
})

console.log('Upload complete!')
console.log('Copies:', result.copies.length)
console.log('Failures:', result.failures)
```
Failure Handling
Store Failure (Fatal)
If the primary store fails, upload() throws StoreError:
```typescript
import { StoreError } from '@filoz/synapse-sdk'

try {
  await synapse.storage.upload(data)
} catch (error) {
  if (error instanceof StoreError) {
    console.error('Failed to store on primary:', error.providerId)
    console.error('Endpoint:', error.endpoint)
    console.error('Cause:', error.cause)
  }
}
```
Pull Failure (Non-Fatal)
Secondary pull failures are reported but don’t throw:
```typescript
const result = await synapse.storage.upload(data, { count: 3 })

if (result.failures.length > 0) {
  console.warn('Some copies failed:')
  for (const failure of result.failures) {
    console.warn(`Provider ${failure.providerId}: ${failure.error}`)
  }
}

// Check if you got enough copies
if (result.copies.length < 2) {
  throw new Error('Insufficient redundancy')
}
```
The SDK automatically retries with different providers (up to 5 attempts per secondary).
Commit Failure (Partial)
If commit fails on the primary, upload() throws CommitError:
```typescript
import { CommitError } from '@filoz/synapse-sdk'

try {
  await synapse.storage.upload(data)
} catch (error) {
  if (error instanceof CommitError) {
    console.error('Data stored but not committed on-chain')
    console.error('You can retry the commit with the same pieceCid')
  }
}
```
If commit fails on a secondary, it’s reported in result.failures.
Optimization Tips
1. Batch Multiple Files
For multiple files, use split operations to reduce on-chain transactions:
```typescript
const context = await synapse.storage.createContext()

// Store all files
const stored = await Promise.all(
  files.map((file) => context.store(file))
)

// Commit in one transaction
await context.commit({
  pieces: stored.map((s) => ({ pieceCid: s.pieceCid })),
})
```
This creates one transaction instead of N transactions.
2. Pre-calculate PieceCID
If you already have the PieceCID (e.g., from a previous calculation):
```typescript
import * as Piece from '@filoz/synapse-core/piece'

const pieceCid = await Piece.calculate(data)
await context.store(data, { pieceCid })
// Provider verifies but doesn't recalculate
```
3. Reuse Contexts
Contexts are cached and reused automatically:
// First upload creates contexts
const result1 = await synapse . storage . upload ( data1 )
// Second upload reuses the same contexts (same providers)
const result2 = await synapse . storage . upload ( data2 )
This reduces provider selection overhead and keeps data on the same providers.
Next Steps
Multi-Copy Upload Deep dive into redundancy and failure handling
Provider Selection How the SDK chooses storage providers
Storage Operations Guide Practical examples of uploads and downloads
Split Operations Example See the pipeline in action with real code