Compute-to-Data (C2D) enables algorithms to run on datasets without the data ever leaving the owner's infrastructure. This preserves data privacy while still enabling valuable analytics and AI/ML operations.
Overview
Compute-to-Data solves the data privacy paradox: how to gain insights from sensitive data without exposing it.
With C2D, you can:
Run algorithms on datasets without downloading them
Maintain complete data privacy and control
Comply with data protection regulations (GDPR, HIPAA, etc.)
Monetize sensitive data safely
Execute AI/ML workloads on distributed data
For Data Owners: Keep your data private while enabling others to run analytics and gain insights.
For Algorithm Developers: Access valuable datasets without needing to download or store sensitive information.
How Compute-to-Data Works
Select Dataset and Algorithm
Choose a dataset that supports compute and an algorithm to run against it.
Configure Compute Environment
Select compute resources and environment specifications for job execution.
Order and Pay
Purchase access to both the dataset and algorithm (if required).
Job Execution
The algorithm runs in an isolated container with access only to the specified dataset.
Retrieve Results
Download the algorithm output and logs after job completion.
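The five steps above can be sketched as one linear flow. Every name below is an illustrative stub that only records what happens at each stage; none of these are the market's real helpers (those appear in the sections that follow).

```typescript
// Illustrative stubs for the five C2D stages; each just records its step.
const log: string[] = []
const selectAssets = () => log.push('select dataset + algorithm')
const configureEnvironment = () => log.push('choose compute environment')
const orderAndPay = () => log.push('order dataset + algorithm access')
const executeJob = () => log.push('run algorithm in isolated container')
const retrieveResults = () => log.push('download output + logs')

// The workflow is strictly sequential: each stage depends on the previous one.
function runComputeToData(): string[] {
  selectAssets()
  configureEnvironment()
  orderAndPay()
  executeJob()
  retrieveResults()
  return log
}
```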
Starting a Compute Job
Component Architecture
The Compute component handles the entire C2D workflow:
```tsx
// From: src/components/Asset/AssetActions/Compute/index.tsx
export default function Compute({
  accountId,
  signer,
  asset,
  dtBalance,
  file,
  isAccountIdWhitelisted
}: {
  accountId: string
  signer: Signer
  asset: AssetExtended
  dtBalance: string
  file: FileInfo
  isAccountIdWhitelisted: boolean
}): ReactElement {
  const [selectedAlgorithmAsset, setSelectedAlgorithmAsset] =
    useState<AssetExtended>()
  const [selectedComputeEnv, setSelectedComputeEnv] =
    useState<ComputeEnvironment>()
  const [computeEnvs, setComputeEnvs] = useState<ComputeEnvironment[]>()
  const [jobs, setJobs] = useState<ComputeJobMetaData[]>([])
  const [isOrdering, setIsOrdering] = useState(false)
  const [isOrdered, setIsOrdered] = useState(false)

  // Initialize compute environments
  const initializeComputeEnvironment = useCallback(async () => {
    const computeEnvs = await getComputeEnvironments(
      asset.services[0].serviceEndpoint,
      asset.chainId
    )
    setComputeEnvs(computeEnvs || [])
  }, [asset])

  useEffect(() => {
    initializeComputeEnvironment()
  }, [initializeComputeEnvironment])

  return (
    <Formik
      initialValues={getInitialValues(
        asset,
        selectedAlgorithmAsset,
        selectedComputeEnv,
        false,
        false
      )}
      validationSchema={getComputeValidationSchema(
        asset.services[0].consumerParameters,
        selectedAlgorithmAsset?.services[0].consumerParameters,
        selectedAlgorithmAsset?.metadata?.algorithm?.consumerParameters
      )}
      onSubmit={onSubmit}
    >
      <FormStartComputeDataset
        algorithms={algorithmList}
        selectedAlgorithmAsset={selectedAlgorithmAsset}
        setSelectedAlgorithm={setSelectedAlgorithmAsset}
        isLoading={isOrdering}
        computeEnvs={computeEnvs}
        setSelectedComputeEnv={setSelectedComputeEnv}
      />
    </Formik>
  )
}
```
Initialize Provider for Compute
Before starting a job, the provider must be initialized to verify permissions and calculate costs:
```typescript
// From: src/@utils/provider.ts
export async function initializeProviderForCompute(
  dataset: AssetExtended,
  algorithm: AssetExtended,
  accountId: string,
  computeEnv: ComputeEnvironment = null
): Promise<ProviderComputeInitializeResults> {
  const computeAsset: ComputeAsset = {
    documentId: dataset.id,
    serviceId: dataset.services[0].id,
    transferTxId: dataset.accessDetails.validOrderTx
  }
  const computeAlgo: ComputeAlgorithm = {
    documentId: algorithm.id,
    serviceId: algorithm.services[0].id,
    transferTxId: algorithm.accessDetails.validOrderTx
  }

  const validUntil = getValidUntilTime(
    computeEnv?.maxJobDuration,
    dataset.services[0].timeout,
    algorithm.services[0].timeout
  )

  try {
    return await ProviderInstance.initializeCompute(
      [computeAsset],
      computeAlgo,
      computeEnv?.id,
      validUntil,
      customProviderUrl || dataset.services[0].serviceEndpoint,
      accountId
    )
  } catch (error) {
    LoggerInstance.error('[Initialize Provider] Error:', error.message)
    return null
  }
}
```
The validUntil parameter ensures compute jobs respect timeout limits set by both the dataset and algorithm publishers.
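As a rough sketch of that idea: the job's expiry window is bounded by whichever limit is smallest among the environment's maximum job duration and the two publishers' timeouts. The helper below is a hypothetical reimplementation for illustration only; the real logic lives in `getValidUntilTime`.

```typescript
// Hypothetical sketch: derive a validUntil timestamp (seconds since epoch)
// from the compute environment and the publishers' timeouts.
// A timeout of 0 or undefined is treated as "no limit".
function sketchValidUntil(
  maxJobDuration?: number, // seconds, from the compute environment
  datasetTimeout?: number, // seconds, set by the dataset publisher
  algoTimeout?: number // seconds, set by the algorithm publisher
): number {
  const limits = [maxJobDuration, datasetTimeout, algoTimeout].filter(
    (t): t is number => typeof t === 'number' && t > 0
  )
  // With no limits at all, fall back to a default window (assumption: 24h).
  const windowSeconds = limits.length > 0 ? Math.min(...limits) : 24 * 60 * 60
  return Math.floor(Date.now() / 1000) + windowSeconds
}
```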
Price and Fee Calculation
C2D jobs involve multiple fee components:
```tsx
// From: src/components/Asset/AssetActions/Compute/index.tsx
async function initPriceAndFees() {
  try {
    if (!selectedComputeEnv || !selectedComputeEnv.id)
      throw new Error(`Error getting compute environment!`)

    const initializedProvider = await initializeProviderForCompute(
      asset,
      selectedAlgorithmAsset,
      accountId || ZERO_ADDRESS,
      selectedComputeEnv
    )
    if (
      !initializedProvider ||
      !initializedProvider?.datasets ||
      !initializedProvider?.algorithm
    )
      throw new Error(`Error initializing provider for the compute job!`)

    // Set dataset price
    await setDatasetPrice(initializedProvider?.datasets?.[0]?.providerFee)
    // Set algorithm price
    await setAlgoPrice(initializedProvider?.algorithm?.providerFee)
    // Set compute fees
    const sanitizedResponse = await setComputeFees(initializedProvider)
    setInitializedProviderResponse(sanitizedResponse)
  } catch (error) {
    setError(error.message)
    LoggerInstance.error(`[compute] ${error.message}`)
  }
}
```
| Fee | Description |
| --- | --- |
| Dataset Fee | Payment to access and use the dataset for computation. |
| Algorithm Fee | Payment to use the algorithm (if not owned). |
| Provider Fee | Infrastructure and execution costs charged by the compute provider. |
| Market Fee | Platform fee for marketplace services. |
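The total a consumer pays is simply the sum of these components. The shape below is an assumption for illustration; the market computes these amounts from the provider-initialization response rather than from a flat struct like this.

```typescript
// Illustrative only: field names are assumptions, not the market's API.
interface ComputeJobFees {
  datasetFee: number
  algorithmFee: number
  providerFee: number
  marketFee: number
}

// Sum all four components to get the total cost of a C2D job.
function totalJobCost(fees: ComputeJobFees): number {
  return fees.datasetFee + fees.algorithmFee + fees.providerFee + fees.marketFee
}
```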
Starting the Compute Job
Once prices are confirmed and orders placed, the compute job can start:
```tsx
// From: src/components/Asset/AssetActions/Compute/index.tsx
async function startJob(userCustomParameters: {
  dataServiceParams?: UserCustomParameters
  algoServiceParams?: UserCustomParameters
  algoParams?: UserCustomParameters
}): Promise<void> {
  try {
    setIsOrdering(true)
    setIsOrdered(false)
    setError(undefined)

    const computeService = getServiceByName(asset, 'compute')
    const computeAlgorithm: ComputeAlgorithm = {
      documentId: selectedAlgorithmAsset.id,
      serviceId: selectedAlgorithmAsset.services[0].id,
      algocustomdata: userCustomParameters?.algoParams,
      userdata: userCustomParameters?.algoServiceParams
    }

    // Verify dataset is orderable with this algorithm
    const allowed = await isOrderable(
      asset,
      computeService.id,
      computeAlgorithm,
      selectedAlgorithmAsset
    )
    if (!allowed)
      throw new Error(
        'Dataset is not orderable in combination with selected algorithm.'
      )

    // Order algorithm
    const algorithmOrderTx = await handleComputeOrder(
      signer,
      selectedAlgorithmAsset,
      algoOrderPriceAndFees,
      accountId,
      initializedProviderResponse.algorithm,
      hasAlgoAssetDatatoken,
      selectedComputeEnv.consumerAddress
    )
    if (!algorithmOrderTx) throw new Error('Failed to order algorithm.')

    // Order dataset
    const datasetOrderTx = await handleComputeOrder(
      signer,
      asset,
      datasetOrderPriceAndFees,
      accountId,
      initializedProviderResponse.datasets[0],
      hasDatatoken,
      selectedComputeEnv.consumerAddress
    )
    if (!datasetOrderTx) throw new Error('Failed to order dataset.')

    // Start compute job
    const computeAsset: ComputeAsset = {
      documentId: asset.id,
      serviceId: asset.services[0].id,
      transferTxId: datasetOrderTx,
      userdata: userCustomParameters?.dataServiceParams
    }
    computeAlgorithm.transferTxId = algorithmOrderTx

    const output: ComputeOutput = {
      publishAlgorithmLog: true,
      publishOutput: true
    }
    const response = await ProviderInstance.computeStart(
      asset.services[0].serviceEndpoint,
      signer,
      selectedComputeEnv?.id,
      computeAsset,
      computeAlgorithm,
      newAbortController(),
      null,
      output
    )
    if (!response) throw new Error('Error starting compute job.')

    setIsOrdered(true)
    setRefetchJobs(!refetchJobs)
  } catch (error) {
    LoggerInstance.error('[Compute] Error:', error.message)
    setError(error.message)
  } finally {
    setIsOrdering(false)
  }
}
```
Always verify that the algorithm is allowed to run on the selected dataset. Publishers can restrict which algorithms can access their data.
Monitoring Compute Jobs
Track your running and completed compute jobs:
```tsx
// From: src/components/Asset/AssetActions/Compute/index.tsx
const fetchJobs = useCallback(
  async (type: string) => {
    if (!chainIds || chainIds.length === 0 || !accountId) {
      return
    }
    try {
      type === 'init' && setIsLoadingJobs(true)
      const computeJobs = await getComputeJobs(
        asset?.chainId ? [asset.chainId] : chainIds,
        address,
        asset,
        newCancelToken()
      )
      setJobs(computeJobs.computeJobs)
      setIsLoadingJobs(!computeJobs.isLoaded)
    } catch (error) {
      LoggerInstance.error(error.message)
      setIsLoadingJobs(false)
    }
  },
  [address, accountId, asset, chainIds]
)

useEffect(() => {
  fetchJobs('init')

  // Periodic refresh for jobs every 10 seconds
  const jobsInterval = setInterval(() => fetchJobs('repeat'), 10000)
  return () => {
    clearInterval(jobsInterval)
  }
}, [refetchJobs])
```
Job Status Display
```tsx
<ComputeHistory
  title="Your Compute Jobs"
  refetchJobs={() => setRefetchJobs(!refetchJobs)}
>
  <ComputeJobs
    minimal
    jobs={jobs}
    isLoading={isLoadingJobs}
    refetchJobs={() => setRefetchJobs(!refetchJobs)}
  />
</ComputeHistory>
```
Jobs are automatically refreshed every 10 seconds to show real-time progress updates.
Algorithm Publishing for C2D
When publishing algorithms for compute-to-data, you can configure privacy settings:
```tsx
// From: src/components/Publish/Metadata/index.tsx
{values.metadata.type === 'algorithm' && (
  <>
    <Field
      {...getFieldContent('dockerImage', content.metadata.fields)}
      component={Input}
      name="metadata.dockerImage"
      options={dockerImageOptions}
    />
    {values.metadata.dockerImage === 'custom' && (
      <>
        <Field
          {...getFieldContent('dockerImageCustom', content.metadata.fields)}
          component={Input}
          name="metadata.dockerImageCustom"
        />
        <Field
          {...getFieldContent('dockerImageChecksum', content.metadata.fields)}
          component={Input}
          name="metadata.dockerImageCustomChecksum"
        />
        <Field
          {...getFieldContent(
            'dockerImageCustomEntrypoint',
            content.metadata.fields
          )}
          component={Input}
          name="metadata.dockerImageCustomEntrypoint"
        />
      </>
    )}
  </>
)}
```
Algorithm Privacy
Algorithms can be set to private mode, preventing downloads while allowing execution:
```tsx
{asset.services[0].type === 'compute' && (
  <Alert
    text={
      "This algorithm has been set to private by the publisher and can't be downloaded. You can run it against any allowed datasets though!"
    }
    state="info"
  />
)}
```
Consumer Parameters
Both datasets and algorithms can accept custom parameters at runtime:
```tsx
// Parse and pass consumer parameters to compute job
const userCustomParameters = {
  dataServiceParams: parseConsumerParameterValues(
    values?.dataServiceParams,
    asset.services[0].consumerParameters
  ),
  algoServiceParams: parseConsumerParameterValues(
    values?.algoServiceParams,
    selectedAlgorithmAsset?.services[0].consumerParameters
  ),
  algoParams: parseConsumerParameterValues(
    values?.algoParams,
    selectedAlgorithmAsset?.metadata?.algorithm?.consumerParameters
  )
}

await startJob(userCustomParameters)
```
Consumer parameters enable dynamic algorithm behavior without modifying the algorithm code.
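The core idea behind this parsing can be sketched as follows. This is a hypothetical reimplementation for illustration, not the market's `parseConsumerParameterValues`: user-supplied form values are matched against the parameters a service declares, falling back to declared defaults.

```typescript
// Hypothetical sketch: merge form values with a service's declared
// consumer parameters. Field names mirror the DDO concept loosely.
interface ConsumerParameter {
  name: string
  default?: string
}

function sketchParseParams(
  formValues: Record<string, string> = {},
  declared: ConsumerParameter[] = []
): Record<string, string> {
  const result: Record<string, string> = {}
  for (const param of declared) {
    // Use the user-supplied value when present, otherwise the declared default.
    const value = formValues[param.name] ?? param.default
    if (value !== undefined) result[param.name] = value
  }
  return result
}
```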
Whitelist Access Control
Datasets can restrict access to specific wallet addresses:
```tsx
// From: src/components/Asset/AssetActions/Compute/WhitelistIndicator.tsx
{accountId && (
  <WhitelistIndicator
    accountId={accountId}
    isAccountIdWhitelisted={isAccountIdWhitelisted}
  />
)}
```
If a dataset has a whitelist enabled, only approved addresses can start compute jobs. Contact the dataset publisher to request access.
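Conceptually, the check behind `isAccountIdWhitelisted` boils down to membership in an allow list. The sketch below is an assumption about the shape of that check, not the market's actual credential logic:

```typescript
// Minimal sketch of an address allow-list check. The credential shape is an
// assumption; the market derives isAccountIdWhitelisted from the asset's DDO.
interface AllowCredential {
  type: string
  values: string[]
}

function sketchIsWhitelisted(
  accountId: string,
  allow?: AllowCredential[]
): boolean {
  // No allow-list configured means the asset is open to everyone.
  if (!allow || allow.length === 0) return true
  // Addresses are compared case-insensitively (hex checksum casing varies).
  return allow.some(
    (c) =>
      c.type === 'address' &&
      c.values.some((v) => v.toLowerCase() === accountId.toLowerCase())
  )
}
```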
Compute Environments
Compute environments define the resources and specifications for job execution:
```typescript
interface ComputeEnvironment {
  id: string
  desc: string
  consumerAddress: string
  cpuNumber: number
  cpuType: string
  gpuNumber: number
  gpuType: string
  ramGB: number
  diskGB: number
  maxJobDuration: number
  priceMin: number
}
```
Users select the appropriate environment based on their algorithm’s requirements:
```tsx
<FormStartComputeDataset
  computeEnvs={computeEnvs}
  setSelectedComputeEnv={setSelectedComputeEnv}
  // ... other props
/>
```
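A reasonable selection strategy is to pick the cheapest environment that still satisfies the algorithm's resource needs. The helper below is an assumption, not a market API; its `Env` type mirrors a subset of the `ComputeEnvironment` fields above.

```typescript
// Sketch: choose the cheapest environment meeting resource requirements.
// Env mirrors a subset of ComputeEnvironment; pickEnvironment is hypothetical.
interface Env {
  id: string
  cpuNumber: number
  gpuNumber: number
  ramGB: number
  priceMin: number
}

function pickEnvironment(
  envs: Env[],
  req: { minCpus: number; minRamGB: number; needsGpu?: boolean }
): Env | undefined {
  return envs
    .filter(
      (e) =>
        e.cpuNumber >= req.minCpus &&
        e.ramGB >= req.minRamGB &&
        (!req.needsGpu || e.gpuNumber > 0)
    )
    // Prefer the cheapest qualifying environment to keep job costs down.
    .sort((a, b) => a.priceMin - b.priceMin)[0]
}
```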
Best Practices
Verify Compatibility
Ensure your algorithm is compatible with the dataset’s compute environment and restrictions.
Test with Free Datasets
Start with free or test datasets to validate your algorithm before running on paid datasets.
Monitor Job Progress
Regularly check job status and logs to catch errors early.
Optimize Resource Usage
Choose the minimum compute environment that meets your needs to reduce costs.
Handle Timeouts
Design algorithms to complete within the dataset’s timeout limits.
See Also
Asset Publishing: Learn how to publish datasets with compute support.
Data Marketplace: Explore the marketplace for datasets and algorithms.
Compute Jobs Guide: Step-by-step guide to running Compute-to-Data jobs.
GDPR Compliance: Maintain compliance with data protection regulations.