Architecture
The tokenization system is implemented in /internal/tokenization/tokenization.go and provides:
- Field-Level Tokenization: Tokenize specific identity fields
- Format-Preserving Tokenization: Maintain original data format (e.g., email remains email-like)
- AES-256-GCM Encryption: Industry-standard authenticated encryption
- Reversible Tokens: Detokenize when authorized access is needed
Configuration
Enable Tokenization
Set a 32-byte encryption key in your blnk.json. Follow these guidelines for the key:
- Key must be exactly 32 bytes (256 bits)
- Use a cryptographically secure random generator
- Store key in a secrets management system (e.g., AWS Secrets Manager, HashiCorp Vault)
- Rotate keys periodically with a migration plan
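As a minimal sketch of the first two guidelines, the Go program below draws 32 bytes from the operating system's cryptographically secure generator. The helper name `generateKey` is hypothetical, and the hex output is only for display: this section does not show how the key is encoded in blnk.json, so check the configuration reference before pasting the result anywhere.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateKey returns 32 cryptographically secure random bytes (256 bits).
func generateKey() ([]byte, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return key, nil
}

func main() {
	key, err := generateKey()
	if err != nil {
		panic(err)
	}
	// Hex is used here only to make the bytes printable.
	fmt.Println(hex.EncodeToString(key))
}
```

In practice the generated key would go straight into the secrets manager rather than being printed to a terminal.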
Tokenizable Fields
The following Identity fields support tokenization (tokenization.go:28-36):
Tokenization Service
Service Initialization
The service is initialized on startup.
Tokenization Modes
Blnk supports two tokenization modes:
1. Standard Mode (Default)
Standard tokenization uses AES-GCM encryption with base64 encoding. It is best suited for:
- Long-term storage
- Fields not used in display/logic
- Maximum security
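The standard mode described above can be illustrated with Go's standard library. This is a sketch under assumptions, not Blnk's actual code: the function names `tokenize` and `detokenize` and the token layout (base64 of nonce, ciphertext, and tag concatenated) are choices made for this example.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM cipher; the key must be exactly 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be exactly 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// tokenize encrypts value and returns base64(nonce || ciphertext || tag).
func tokenize(key []byte, value string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Passing nonce as dst makes Seal append the ciphertext to it.
	sealed := gcm.Seal(nonce, nonce, []byte(value), nil)
	return base64.StdEncoding.EncodeToString(sealed), nil
}

// detokenize reverses tokenize and fails if the token was modified.
func detokenize(key []byte, token string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("token too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // demo key only, never use in production
	token, err := tokenize(key, "jane@example.com")
	if err != nil {
		panic(err)
	}
	value, err := detokenize(key, token)
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
	fmt.Println(value)
}
```

Because the output is opaque base64, it carries no hint of the original format, which is why this mode fits fields that are stored but never displayed or parsed.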
2. Format-Preserving Mode
Format-preserving tokenization maintains the original data format. Characters are replaced by class:
- Uppercase letters → Random uppercase letters
- Lowercase letters → Random lowercase letters
- Digits → Random digits
- Special characters → Preserved as-is
This mode is useful for:
- Email validation (the token still looks like an email)
- Phone number formatting (maintains country code structure)
- Display purposes (preserves readability patterns)
- Business logic requiring format
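The character-class rules above can be sketched in Go as follows. The function name `maskPreservingFormat` is hypothetical, and the sketch covers only the substitution step: random replacement on its own is not reversible, so a real implementation also has to keep enough information (for example an encrypted copy of the original) to detokenize. How Blnk does that is not shown in this section.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

const (
	upper  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	lower  = "abcdefghijklmnopqrstuvwxyz"
	digits = "0123456789"
)

// randomFrom picks one byte from alphabet using a secure random source.
func randomFrom(alphabet string) (byte, error) {
	n, err := rand.Int(rand.Reader, big.NewInt(int64(len(alphabet))))
	if err != nil {
		return 0, err
	}
	return alphabet[n.Int64()], nil
}

// maskPreservingFormat replaces ASCII letters and digits with random
// characters of the same class and copies every other byte unchanged.
func maskPreservingFormat(value string) (string, error) {
	out := make([]byte, 0, len(value))
	for i := 0; i < len(value); i++ {
		c := value[i]
		var alphabet string
		switch {
		case c >= 'A' && c <= 'Z':
			alphabet = upper
		case c >= 'a' && c <= 'z':
			alphabet = lower
		case c >= '0' && c <= '9':
			alphabet = digits
		default:
			out = append(out, c) // special characters are preserved as-is
			continue
		}
		r, err := randomFrom(alphabet)
		if err != nil {
			return "", err
		}
		out = append(out, r)
	}
	return string(out), nil
}

func main() {
	masked, err := maskPreservingFormat("Jane.Doe42@example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(masked) // same length, same positions for '.', '@'
}
```

Separators such as `@`, `.`, and `+` stay in place, so the output still passes simple shape checks for emails or phone numbers.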
API Usage
Tokenizing Identities
When creating or updating identities, PII fields are automatically tokenized.
Detokenizing Data
Original values can be retrieved when access is authorized.
Security Considerations
Encryption Strength
Blnk uses AES-256-GCM, which provides:
- Confidentiality: 256-bit AES encryption
- Authenticity: GCM authenticated encryption mode
- Tamper Detection: Modification attempts fail decryption
- Unique IVs: Random nonce per encryption
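Two of these properties are easy to observe directly. The Go sketch below (helper names `seal` and `open` are hypothetical) encrypts the same value twice to show that a fresh random nonce yields different output each time, then flips one bit of a sealed value to show that decryption rejects it.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal returns nonce || ciphertext || tag, using a fresh random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open verifies the authentication tag and returns the plaintext.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed value too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // demo key only
	a, err := seal(key, []byte("555-0100"))
	if err != nil {
		panic(err)
	}
	b, err := seal(key, []byte("555-0100"))
	if err != nil {
		panic(err)
	}
	fmt.Println("identical outputs:", bytes.Equal(a, b))

	tampered := append([]byte(nil), a...)
	tampered[len(tampered)-1] ^= 0x01 // flip one bit of the tag
	_, err = open(key, tampered)
	fmt.Println("tampered value rejected:", err != nil)
}
```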
Key Management
Best Practices:
- External Secrets Manager: Store keys in AWS Secrets Manager, HashiCorp Vault, or similar
- Key Rotation: Implement periodic key rotation
  - Generate new 32-byte key
  - Update configuration with new key
  - Re-tokenize existing data (migration script)
  - Decommission old key
- Environment Separation: Use different keys per environment
- Access Control: Limit who can:
  - View tokenization keys
  - Call detokenize API
  - Access raw database
Compliance
Tokenization helps meet compliance requirements:
PCI DSS:
- Tokenize cardholder data (if storing card info)
- Reduce PCI scope by removing plaintext card data
GDPR:
- Pseudonymization of personal data
- Right to be forgotten (delete keys instead of data)
- Data minimization (only detokenize when needed)
HIPAA:
- Protect Protected Health Information (PHI)
- Audit access to detokenized data
- Demonstrate encryption at rest
- Access controls for sensitive data
Audit Logging
Log all tokenization operations.
Performance Impact
Tokenization Overhead
Tokenization adds minimal latency:
- Tokenize: ~0.1ms per field
- Detokenize: ~0.1ms per field
- Format-preserving: ~0.2ms per field (additional HMAC)
Optimization Tips
- Batch Operations: Tokenize multiple fields in parallel
- Cache Detokenized Values: Useful if the same identity is accessed frequently
- Use Standard Mode: Format-preserving adds ~2x overhead
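The batch tip above could look like the following Go sketch, which tokenizes each field in its own goroutine. The names `tokenizeFields` and `tokenizeFunc` are hypothetical, and the base64 stand-in in `main` is not encryption; it only makes the example runnable without repeating the cipher code. At roughly 0.1ms per field, goroutine overhead may cancel out the gain for small identities, so it is worth measuring before adopting this.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"sync"
)

// tokenizeFunc is whatever function turns one plaintext value into a token.
type tokenizeFunc func(value string) (string, error)

// tokenizeFields tokenizes every field concurrently and returns the
// tokens keyed by field name, or the first error encountered.
func tokenizeFields(fields map[string]string, tokenize tokenizeFunc) (map[string]string, error) {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	out := make(map[string]string, len(fields))
	for name, value := range fields {
		wg.Add(1)
		go func(name, value string) {
			defer wg.Done()
			token, err := tokenize(value)
			mu.Lock() // the map and firstErr are shared between goroutines
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = fmt.Errorf("field %s: %w", name, err)
				}
				return
			}
			out[name] = token
		}(name, value)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return out, nil
}

func main() {
	// Stand-in tokenizer: base64 is NOT encryption, it only keeps the demo short.
	standIn := func(value string) (string, error) {
		return base64.StdEncoding.EncodeToString([]byte(value)), nil
	}
	fields := map[string]string{
		"first_name": "Jane",
		"email":      "jane@example.com",
	}
	tokens, err := tokenizeFields(fields, standIn)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(tokens), "fields tokenized")
}
```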
Migration Guide
Enabling Tokenization on Existing System
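- Generate Key: Create a cryptographically secure random 32-byte key
- Update Configuration: Set the key in your blnk.json
- Restart Blnk: New identities will be tokenized automatically
- Migrate Existing Data (optional): Tokenize identities that were created before tokenization was enabled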
Key Rotation
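- Generate New Key: Create a new 32-byte key
- Dual-Key Configuration: Configure both the old and the new key so that existing tokens stay readable during the migration
- Re-tokenize Data: Re-encrypt existing tokens with the new key
- Remove Old Key: Decommission the old key after all data has been migrated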
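The re-tokenization step of a rotation can be sketched in Go as below. This is a generic illustration, not Blnk's migration code: the function names and the token layout (base64 of nonce, ciphertext, and tag) are assumptions, and the dual-key configuration format itself is not shown in this section. The `rotate` helper leaves tokens that already decrypt with the new key untouched, so an interrupted migration can be re-run safely.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be exactly 32 bytes")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

func encrypt(key []byte, value string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(gcm.Seal(nonce, nonce, []byte(value), nil)), nil
}

func decrypt(key []byte, token string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return "", err
	}
	n := gcm.NonceSize()
	if len(raw) < n {
		return "", errors.New("token too short")
	}
	plain, err := gcm.Open(nil, raw[:n], raw[n:], nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

// rotate returns a token readable with newKey. Tokens that already
// decrypt with newKey are returned unchanged.
func rotate(oldKey, newKey []byte, token string) (string, error) {
	if _, err := decrypt(newKey, token); err == nil {
		return token, nil // already migrated
	}
	value, err := decrypt(oldKey, token)
	if err != nil {
		return "", fmt.Errorf("token matches neither key: %w", err)
	}
	return encrypt(newKey, value)
}

func main() {
	oldKey := []byte("0123456789abcdef0123456789abcdef") // demo keys only
	newKey := []byte("fedcba9876543210fedcba9876543210")
	token, err := encrypt(oldKey, "jane@example.com")
	if err != nil {
		panic(err)
	}
	rotated, err := rotate(oldKey, newKey, token)
	if err != nil {
		panic(err)
	}
	value, err := decrypt(newKey, rotated)
	if err != nil {
		panic(err)
	}
	fmt.Println(value)
}
```

Only once every stored token passes through `rotate` without error is it safe to remove the old key.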
Troubleshooting
Tokenization Disabled Error
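Symptom: Tokenization requests fail with an error reporting that tokenization is disabled.
Solution:
- Verify the key is exactly 32 bytes
- Check that the environment variable is set
- Restart Blnk after the configuration change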
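The first two checks can be scripted. The Go sketch below assumes the key is supplied as a raw string through an environment variable; the variable name `EXAMPLE_TOKENIZATION_KEY` is a placeholder invented for this example, so substitute whatever name your deployment actually uses. Note that `len` on a Go string counts bytes, not characters, which matters if the key contains non-ASCII characters.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// keyEnvVar is a hypothetical variable name used only in this sketch.
const keyEnvVar = "EXAMPLE_TOKENIZATION_KEY"

// validateKey reports why a key would be rejected, or nil if it is
// present and exactly 32 bytes long.
func validateKey(key string) error {
	if key == "" {
		return errors.New("tokenization key is not set")
	}
	if n := len(key); n != 32 { // len counts bytes, not characters
		return fmt.Errorf("tokenization key is %d bytes, want 32", n)
	}
	return nil
}

func main() {
	if err := validateKey(os.Getenv(keyEnvVar)); err != nil {
		fmt.Println("invalid key:", err)
		return
	}
	fmt.Println("key is present and 32 bytes long")
}
```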
Detokenization Fails
Symptom: Detokenization returns an error instead of the original value.
Possible causes:
- Wrong Key: Token encrypted with a different key
- Corrupted Token: Token was modified or truncated
- Version Mismatch: Token format changed between versions
Solution:
- Verify the correct tokenization key is configured
- Check that the token is not corrupted in the database
- Use the original key that encrypted the data
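When narrowing down which cause applies, a small diagnostic helper can separate encoding problems from authentication failures. The Go sketch below assumes tokens are base64 of nonce, ciphertext, and tag, which is an assumption of this example rather than a documented Blnk format. AES-GCM cannot tell a wrong key apart from a modified token: both surface as the same authentication failure, so that case still needs the key checks listed above.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

const (
	statusOK         = "ok"
	statusBadKey     = "invalid key: must be exactly 32 bytes"
	statusBadBase64  = "corrupted: token is not valid base64"
	statusTooShort   = "corrupted: token is truncated"
	statusAuthFailed = "authentication failed: wrong key or modified token"
)

func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, fmt.Errorf("key is %d bytes, want 32", len(key))
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt exists only so the demo in main has a valid token to inspect.
func encrypt(key []byte, value string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(gcm.Seal(nonce, nonce, []byte(value), nil)), nil
}

// diagnose classifies why a token cannot be detokenized with key.
func diagnose(key []byte, token string) string {
	gcm, err := newGCM(key)
	if err != nil {
		return statusBadKey
	}
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return statusBadBase64
	}
	// A valid token holds at least a nonce and an authentication tag.
	if len(raw) < gcm.NonceSize()+gcm.Overhead() {
		return statusTooShort
	}
	n := gcm.NonceSize()
	if _, err := gcm.Open(nil, raw[:n], raw[n:], nil); err != nil {
		return statusAuthFailed
	}
	return statusOK
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // demo keys only
	other := []byte("fedcba9876543210fedcba9876543210")
	token, err := encrypt(key, "jane@example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(diagnose(key, token))
	fmt.Println(diagnose(other, token))
	fmt.Println(diagnose(key, "AAAA"))
}
```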
Format-Preserving Not Working
Symptom: Format not preserved (getting base64 output)
Solution: Explicitly specify the format-preserving mode when tokenizing.
Best Practices
- Tokenize by Default: Enable tokenization for all new identities
- Minimize Detokenization: Only detokenize when absolutely necessary
- Audit Detokenization: Log all detokenization requests
- Use Format-Preserving Sparingly: Only when format is needed for business logic
- Regular Key Rotation: Rotate keys annually or after security incidents
- Secure Key Storage: Never commit keys to version control
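The "Audit Detokenization" practice can be sketched as a thin wrapper around whatever function performs detokenization. Everything named here (`auditedDetokenize`, the `actor` and `field` parameters, the log line layout, the sample field name) is hypothetical and not part of Blnk's API. The wrapper records who asked, for which field, and whether the call succeeded, and deliberately logs neither the token nor the plaintext.

```go
package main

import (
	"errors"
	"log"
	"time"
)

// detokenizeFunc is whatever function resolves a token to its value.
type detokenizeFunc func(token string) (string, error)

// logFunc matches log.Printf so any printf-style logger can be passed in.
type logFunc func(format string, args ...interface{})

// auditedDetokenize calls detokenize and writes one audit line per
// request. Neither the token nor the plaintext is written to the log.
func auditedDetokenize(logf logFunc, actor, field, token string, detokenize detokenizeFunc) (string, error) {
	start := time.Now()
	value, err := detokenize(token)
	outcome := "success"
	if err != nil {
		outcome = "failure"
	}
	logf("event=detokenize actor=%s field=%s outcome=%s duration=%s",
		actor, field, outcome, time.Since(start))
	return value, err
}

func main() {
	// Stand-in for a real detokenize call, used only to make the sketch runnable.
	vault := map[string]string{"tok_abc": "jane@example.com"}
	lookup := func(token string) (string, error) {
		value, ok := vault[token]
		if !ok {
			return "", errors.New("unknown token")
		}
		return value, nil
	}
	_, _ = auditedDetokenize(log.Printf, "support-agent-7", "email_address", "tok_abc", lookup)
	_, _ = auditedDetokenize(log.Printf, "support-agent-7", "email_address", "tok_missing", lookup)
}
```

Routing every detokenization through one wrapper like this also gives a single place to enforce access control before the value is returned.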