Overview
Panel matching (also called identity matching or join key exchange) allows two or more DataProviders to:
- Identify overlapping users across their panels
- Compute joint measurements on matched users
- Maintain privacy through cryptographic protocols
Panel matching uses Multi-Party Computation (MPC) protocols to ensure that individual user identifiers are never revealed to other parties.
Exchange Workflows
An ExchangeWorkflow encodes the panel matching protocol between parties. Exchange workflows define:
- Participants - Which DataProviders are involved
- Steps - The sequence of cryptographic operations
- Exchange format - How data is structured and transmitted
- Privacy guarantees - What information remains private
Workflow Components
A typical exchange workflow includes:
Initialization Phase
Parties agree on:
- The matching protocol to use
- Privacy parameters (epsilon, delta)
- Cryptographic keys for the exchange
- Data format and encoding
Data Preparation
Each DataProvider:
- Extracts user identifiers from their panel
- Normalizes and hashes identifiers
- Encrypts identifiers using the agreed protocol
- Adds differential privacy noise if required
Exchange Steps
Multi-party computation steps:
- Round 1: Each party sends encrypted identifiers
- Round 2: Parties perform homomorphic operations
- Round 3: Final decryption reveals only matches
- Throughout: identifiers that do not match are never revealed
Result Computation
After matching:
- Matched identifiers are used to join datasets
- Measurements are computed on the joined data
- Results include only aggregates, not individual records
Panel Matching Protocols
The API supports several panel matching protocols:
Private Set Intersection (PSI)
PSI allows parties to find common elements in their datasets without revealing non-matching elements.
How it works:
- Each party encrypts their user identifiers
- Encrypted sets are exchanged
- Cryptographic operations identify matches
- Only matching identifiers are revealed to both parties
Use cases:
- Two-party reach measurements
- Cross-publisher frequency analysis
- Attribution across platforms
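The encrypt, exchange, and compare steps above can be illustrated with a toy commutative-encryption PSI. This is a minimal sketch, not the API's protocol: the function names (`keygen`, `hash_to_group`, `encrypt`, `psi_intersection`) are hypothetical, both parties run in one process, and the modulus is a small textbook choice that is not secure for real use.

```python
import hashlib
import math
import secrets

# Toy modulus (assumption): the Mersenne prime 2**127 - 1. Not secure for production.
P = 2**127 - 1

def keygen() -> int:
    """Pick a secret exponent that is invertible mod P - 1, so encryption is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
        if math.gcd(k, P - 1) == 1:
            return k

def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P - 1] via SHA-256."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def encrypt(value: int, key: int) -> int:
    """Commutative encryption: (v**a)**b == (v**b)**a mod P."""
    return pow(value, key, P)

def psi_intersection(a_ids, b_ids):
    """Return the identifiers of party A that also appear in party B's set."""
    key_a, key_b = keygen(), keygen()
    # Each party encrypts its own identifiers and sends them to the other party.
    a_once = [encrypt(hash_to_group(x), key_a) for x in a_ids]
    b_once = [encrypt(hash_to_group(x), key_b) for x in b_ids]
    # Each party applies its own key to the values it received.
    a_twice = [encrypt(v, key_b) for v in a_once]   # computed by B, order preserved
    b_twice = {encrypt(v, key_a) for v in b_once}   # computed by A
    # Doubly encrypted values are equal exactly when the underlying identifiers are equal.
    return {x for x, v in zip(a_ids, a_twice) if v in b_twice}

matches = psi_intersection(
    ["a@example.com", "b@example.com", "c@example.com"],
    ["b@example.com", "c@example.com", "d@example.com"],
)
```

In this sketch neither party sees the other's raw identifiers, only values encrypted under a key it does not hold. A real deployment would rely on a vetted PSI library and a properly chosen group.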
Private Join and Compute
An extension of PSI that enables computation on matched records:
- Join phase: Identify matching users via PSI
- Compute phase: Run aggregation queries on matched data
- Privacy: Individual records remain private
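To make the join and compute phases concrete, the sketch below computes, in the clear, the kind of aggregate such a protocol would output: the number of matched users and the sum of one party's values over them. It shows only the target result, not the cryptography that keeps individual records private. The function name `intersection_sum` and the input shapes are assumptions for illustration.

```python
from typing import Dict, Set, Tuple

def intersection_sum(a_ids: Set[str], b_values: Dict[str, float]) -> Tuple[int, float]:
    """Return (matched user count, sum of B's values over matched users).

    Plaintext reference only: a real protocol would compute the same
    aggregate without either party seeing the other's records.
    """
    matched = a_ids.intersection(b_values)          # join phase
    total = sum(b_values[user] for user in matched)  # compute phase
    return len(matched), total

count, total = intersection_sum(
    {"u1", "u2", "u3"},
    {"u2": 5.0, "u3": 7.5, "u4": 1.0},
)
```

Only `count` and `total` would leave the protocol. The per-user values for `u2` and `u3` stay hidden from the other party.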
Homomorphic Encryption-based Matching
Uses homomorphic encryption to perform matching operations:
- Identifiers are encrypted with homomorphic properties
- Comparison operations can be performed on encrypted data
- Results are decrypted only after aggregation
Advantages:
- Stronger privacy guarantees
- Supports complex matching rules
- Can handle multi-party scenarios
Trade-offs:
- Higher computational cost
- More complex setup
Implementing Panel Matching
Define the exchange workflow
Work with other DataProviders to define the exchange workflow.
Key decisions:
- Which protocol to use (PSI, Private Join and Compute, etc.)
- Privacy parameters
- Expected data volumes
- Acceptable latency
Configure exchange parameters
Set up the exchange workflow configuration.
Example parameters:
epsilon: 0.01 (privacy loss budget)
delta: 1e-12 (failure probability)
matchingThreshold: 0.95 (minimum match confidence)
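One way to keep these parameters consistent between partners is to validate them in code before the exchange starts. The sketch below is illustrative: `ExchangeParameters` and its field names are hypothetical and do not reflect the API's configuration schema. Only the three example values come from this page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExchangeParameters:
    """Hypothetical container for the example parameters above."""
    epsilon: float             # privacy loss budget
    delta: float               # failure probability
    matching_threshold: float  # minimum match confidence

    def __post_init__(self) -> None:
        # Reject values that cannot be valid privacy or confidence parameters.
        if self.epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if not 0 <= self.delta < 1:
            raise ValueError("delta must be in [0, 1)")
        if not 0 <= self.matching_threshold <= 1:
            raise ValueError("matching_threshold must be in [0, 1]")

params = ExchangeParameters(epsilon=0.01, delta=1e-12, matching_threshold=0.95)
```

Making the object immutable (`frozen=True`) helps ensure the agreed values are not changed partway through a workflow.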
Prepare your panel data
Format your panel data for the exchange:
- Extract identifiers: Get user IDs, hashed emails, or other join keys
- Normalize: Apply consistent formatting (lowercase, trimming, etc.)
- Hash: Use cryptographic hashing (SHA-256) for identifier privacy
- Encrypt: Apply the protocol-specific encryption
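The normalize and hash steps can be sketched with the Python standard library. The exact normalization rules are an assumption here (trim and lowercase only), and both parties would need to agree on them. The protocol-specific encryption step is not shown.

```python
import hashlib

def normalize_identifier(raw: str) -> str:
    """Trim surrounding whitespace and lowercase the identifier."""
    return raw.strip().lower()

def hash_identifier(raw: str) -> str:
    """Return the hex SHA-256 digest of the normalized identifier."""
    normalized = normalize_identifier(raw)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Differently formatted copies of the same email produce the same join key.
same = hash_identifier("  Alice@Example.com ") == hash_identifier("alice@example.com")
```

If two parties normalize differently (for example, one strips whitespace and the other does not), their hashes will not match even for identical users.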
Execute the exchange workflow
Follow the workflow steps to exchange encrypted identifiers:
- Each party uploads their encrypted identifiers
- The system coordinates the multi-party computation
- Intermediate results are exchanged according to the protocol
- Match results are computed and returned
The API handles the cryptographic operations. You only need to provide properly formatted input data.
Validate match results
Review the matching results.
Check:
- Match rate is within expected range
- Privacy budget consumption is acceptable
- No errors or warnings in the workflow execution
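The match-rate check can be automated with a small helper. This is a sketch under assumptions: the function names are hypothetical, and the expected range is something you would agree on with your partner, for example from a test run on known-overlap data.

```python
def match_rate(matched: int, panel_size: int) -> float:
    """Fraction of the panel that was matched."""
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    return matched / panel_size

def match_rate_in_range(matched: int, panel_size: int,
                        expected_min: float, expected_max: float) -> bool:
    """True if the observed match rate falls inside the expected range."""
    rate = match_rate(matched, panel_size)
    return expected_min <= rate <= expected_max

ok = match_rate_in_range(matched=420, panel_size=1000,
                         expected_min=0.30, expected_max=0.60)
```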
Privacy Considerations
Panel matching has important privacy implications:
Differential Privacy
Always apply differential privacy to match results:
- Add calibrated noise to match counts
- Use privacy budget management
- Monitor cumulative privacy loss
- Set appropriate epsilon and delta values
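As an illustration of calibrated noise, the sketch below applies the Laplace mechanism to a match count. It assumes each user changes the count by at most 1 (sensitivity 1), so the noise scale is 1/epsilon. The function name `noisy_count` is hypothetical, and Python's `random` module is used only for demonstration: a production system would need a vetted differential privacy library with secure noise generation.

```python
import random

def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Add Laplace(0, 1/epsilon) noise to a count with sensitivity 1."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two independent exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

rng = random.Random(0)  # seeded only to make the demonstration repeatable
released = noisy_count(true_count=1200, epsilon=0.5, rng=rng)
```

Smaller epsilon values produce larger noise: at epsilon 0.01 the noise scale is 100, which can dominate small match counts.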
Minimum Threshold Enforcement
Enforce minimum thresholds for reporting:
- Don’t report matches below a minimum count (e.g., k=10)
- Prevents identification of small groups
- Combine with differential privacy for layered protection
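A minimum-count rule can be sketched as a filter applied before results are reported. The helper names and the segment labels are hypothetical. The default k=10 is taken from the example above.

```python
from typing import Dict, Optional

def suppress_small(count: int, k: int = 10) -> Optional[int]:
    """Return the count only if it meets the minimum threshold k, else None."""
    return count if count >= k else None

def reportable(counts: Dict[str, int], k: int = 10) -> Dict[str, int]:
    """Keep only the segments whose match count is at least k."""
    return {segment: c for segment, c in counts.items() if suppress_small(c, k) is not None}

report = reportable({"18-24": 340, "25-34": 7, "35-44": 10})
```

Here the "25-34" segment is dropped because only 7 users matched.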
Rate Limiting
Limit the frequency of panel matching:
- Prevent repeated queries that could leak information
- Implement cooldown periods between matches
- Track cumulative queries per user
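A cooldown between matches can be enforced with a small in-memory limiter. This is a sketch: `CooldownLimiter` is a hypothetical helper, keyed here by a partner identifier, and a real deployment would likely persist this state rather than keep it in process memory.

```python
import time
from typing import Callable, Dict

class CooldownLimiter:
    """Allow at most one match per key within each cooldown window."""

    def __init__(self, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._last_allowed: Dict[str, float] = {}

    def allow(self, key: str) -> bool:
        """Return True and record the time if the key is outside its cooldown."""
        now = self._clock()
        last = self._last_allowed.get(key)
        if last is not None and now - last < self._cooldown:
            return False
        self._last_allowed[key] = now
        return True

limiter = CooldownLimiter(cooldown_seconds=3600)
first_attempt = limiter.allow("partner-a")
second_attempt = limiter.allow("partner-a")  # rejected: still inside the cooldown
```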
Audit Logging
Maintain comprehensive audit logs:
- Record all exchange workflow executions
- Log privacy parameters used
- Track which parties participated
- Enable privacy impact assessments
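An audit entry covering the items above could be serialized as one JSON line per workflow execution. The field names and the `audit_record` helper are assumptions for illustration, not a format defined by the API.

```python
import json
from datetime import datetime, timezone
from typing import List

def audit_record(workflow_id: str, participants: List[str],
                 epsilon: float, delta: float, status: str) -> str:
    """Build one JSON audit line for an exchange workflow execution."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "workflow_id": workflow_id,
        "participants": sorted(participants),
        "privacy_parameters": {"epsilon": epsilon, "delta": delta},
        "status": status,
    }
    return json.dumps(record, sort_keys=True)

line = audit_record("example-workflow", ["provider-b", "provider-a"],
                    epsilon=0.01, delta=1e-12, status="SUCCEEDED")
```

Logging the privacy parameters alongside each execution makes it possible to reconstruct cumulative privacy loss later.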
Example Workflow
Here’s a complete example of a two-party panel matching workflow:
Troubleshooting
Low match rates
Possible causes:
- Identifier format mismatch (e.g., different hashing)
- Different normalization rules
- Data quality issues
- Limited panel overlap
Solutions:
- Verify both parties use identical hashing and normalization
- Test with a small known-overlap dataset first
- Check for data quality issues (malformed emails, etc.)
Exchange workflow failures
Possible causes:
- Network connectivity issues
- Mismatched protocol versions
- Invalid encrypted data format
- Party timeout
Solutions:
- Verify all parties are online and responsive
- Ensure consistent protocol version across parties
- Validate encrypted data format before exchange
- Increase timeout values for large datasets
Privacy budget exceeded
Possible causes:
- Too many matching attempts
- Total epsilon/delta budget set too low for the use case
- Cumulative privacy loss from multiple measurements
Solutions:
- Implement privacy budget management
- Adjust epsilon/delta based on requirements
- Batch multiple measurements to share privacy budget
- Reset privacy budget periodically if appropriate
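Privacy budget management can be sketched as a simple accountant that refuses any measurement that would exceed the total. This sketch assumes basic sequential composition, where the epsilon and delta values of successive measurements add up. The `PrivacyBudget` class is hypothetical, and a real system may use tighter composition accounting.

```python
class PrivacyBudget:
    """Track cumulative (epsilon, delta) spend under basic composition."""

    _TOLERANCE = 1e-12  # absorbs floating-point rounding in the comparisons

    def __init__(self, total_epsilon: float, total_delta: float) -> None:
        self.total_epsilon = total_epsilon
        self.total_delta = total_delta
        self.spent_epsilon = 0.0
        self.spent_delta = 0.0

    def charge(self, epsilon: float, delta: float = 0.0) -> bool:
        """Record the spend and return True, or return False if it would exceed the budget."""
        if self.spent_epsilon + epsilon > self.total_epsilon + self._TOLERANCE:
            return False
        if self.spent_delta + delta > self.total_delta + self._TOLERANCE:
            return False
        self.spent_epsilon += epsilon
        self.spent_delta += delta
        return True

    @property
    def remaining_epsilon(self) -> float:
        return self.total_epsilon - self.spent_epsilon

budget = PrivacyBudget(total_epsilon=1.0, total_delta=1e-9)
results = [budget.charge(0.4), budget.charge(0.4), budget.charge(0.4)]
```

The third charge is refused because it would bring the cumulative epsilon to 1.2, above the total of 1.0. A refused charge leaves the recorded spend unchanged.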
Best Practices
Test with Sample Data
Always test exchange workflows with sample data before production use. Verify match rates and privacy parameters.
Coordinate with Partners
Work closely with matching partners to ensure consistent identifier handling and privacy parameters.
Monitor Privacy Budget
Track privacy budget consumption across all panel matching operations. Implement alerts for budget depletion.
Document Workflows
Maintain clear documentation of exchange workflows, including privacy parameters and expected match rates.
Next Steps
Creating Measurements
Use matched panels to create measurements
Requisitions
Understand how requisitions work with matched data
Privacy & Security
Learn more about privacy protection mechanisms
