Overview
The database-building pipeline (from pkg/pipeline/) executes these steps for each package version across 11 React Native environments:
- Install the npm package into a React Native project
- Bundle with Metro (creates a JavaScript bundle)
- Compile with the Hermes compiler (produces .hbc bytecode)
- Disassemble to extract function objects
- Hash all functions (structural, content IR1, content IR2)
- Store fingerprints in MongoDB
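The step order above can be sketched as a list of named stages that run in sequence and stop at the first failure. This is only an illustration of the control flow, not the code in pkg/pipeline/: the Stage type, runStages, and the stage bodies are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Stage is one named step of the per-package pipeline (hypothetical type).
type Stage struct {
	Name string
	Run  func() error
}

// errSimulated stands in for a real tool failure in this sketch.
var errSimulated = errors.New("simulated compiler crash")

// runStages executes the stages in order and stops at the first failure.
// It returns the names of the stages that completed and the wrapped error.
func runStages(stages []Stage) ([]string, error) {
	done := []string{}
	for _, s := range stages {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s failed: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	stages := []Stage{
		{"npm install", ok},
		{"Metro bundle", ok},
		{"Hermes compile", func() error { return errSimulated }},
		{"disassemble", ok},
		{"hash", ok},
		{"store", ok},
	}
	done, err := runStages(stages)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

Because the error names the failing stage, a caller can log it against the package version and move on to the next one.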
Prerequisites
Configure environment
Generating Baselines
Baselines capture the fingerprints of an empty React Native app (framework-only functions). These are used to filter out RN internals during analysis.
Baseline generation (from pkg/cmd/maintainDatabase.go:47):
- Scans pipeline/react-natives/ for RN versions
- For each version without a baseline in baselines_v3:
  - Bundles baseline_entry.js (empty entry point)
  - Compiles with Hermes
  - Extracts function fingerprints
  - Stores in MongoDB baselines_v3 collection
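The two ideas in this section are finding RN versions that still lack a baseline, and using a baseline to drop framework functions during analysis. Both can be sketched with plain set lookups. The function names, the rnNNN directory names, and the string fingerprints below are assumptions made for illustration; the real pipeline reads these from disk and MongoDB.

```go
package main

import (
	"fmt"
	"sort"
)

// missingBaselines returns the RN versions found on disk that have no stored
// baseline yet, sorted for stable output.
func missingBaselines(onDisk []string, stored map[string]bool) []string {
	missing := []string{}
	for _, v := range onDisk {
		if !stored[v] {
			missing = append(missing, v)
		}
	}
	sort.Strings(missing)
	return missing
}

// filterBaseline keeps only the fingerprints that do not occur in the
// baseline, i.e. functions that are not React Native internals.
func filterBaseline(fingerprints []string, baseline map[string]bool) []string {
	kept := []string{}
	for _, fp := range fingerprints {
		if !baseline[fp] {
			kept = append(kept, fp)
		}
	}
	return kept
}

func main() {
	// Hypothetical directory names under pipeline/react-natives/.
	onDisk := []string{"rn073", "rn071", "rn072"}
	stored := map[string]bool{"rn072": true}
	fmt.Println(missingBaselines(onDisk, stored)) // [rn071 rn073]

	// Hypothetical fingerprints; real ones are hashes.
	baseline := map[string]bool{"h-framework-1": true}
	fmt.Println(filterBaseline([]string{"h-framework-1", "h-pkg-1"}, baseline)) // [h-pkg-1]
}
```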
Processing Packages
The main pipeline processes npm packages across all React Native environments in parallel.
Basic Pipeline Run
A basic run reads react-native-directory-packages.json and processes each package version.
Package Source Files
Create a JSON file listing the packages to process, and save it as react-native-directory-packages.json in go/hermes-decompiler/.
Processing Vulnerable Packages
To process packages from GitHub Security Advisories instead, use the -c flag, which switches to the hashes_ghsa collection and reads from security-advisories-packages.json.
Pipeline Architecture
Parallel Processing
The pipeline processes multiple React Native versions concurrently (from pkg/pipeline/rnprocessor.go:108):
- Semaphore-limited goroutines: Max 4 RN environments processed simultaneously
- Package grouping: All versions of a package are processed together to minimize npm install churn
- Database batching: Writes are batched (100 operations per flush) for efficiency
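The semaphore-limited concurrency described above is a common Go pattern: a buffered channel acts as a counting semaphore, so at most N goroutines do work at once. The sketch below assumes a limit of 4 and 11 environments as stated in this document; processAll, the environment labels, and the peak-tracking counter are hypothetical and exist only to make the limit observable.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// processAll runs work once per environment, allowing at most limit
// goroutines to execute work at the same time. It returns the highest
// number of concurrent workers observed.
func processAll(envs []string, limit int, work func(env string)) int {
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for _, env := range envs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are already active
		go func(env string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when this worker ends

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			work(env)

			mu.Lock()
			running--
			mu.Unlock()
		}(env)
	}
	wg.Wait()
	return peak
}

func main() {
	envs := make([]string, 11)
	for i := range envs {
		envs[i] = fmt.Sprintf("rn-env-%02d", i+1) // hypothetical labels
	}
	peak := processAll(envs, 4, func(env string) {
		time.Sleep(10 * time.Millisecond) // stand-in for install, bundle, compile
	})
	fmt.Println("peak concurrency:", peak)
}
```

Acquiring the slot before starting the goroutine keeps the number of live goroutines bounded as well, not just the number doing work.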
Resume Capability
The pipeline saves progress to pipeline_progress.json for crash recovery (from pkg/pipeline/progress.go).
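A resume file of this kind is typically loaded at startup, updated after each completed unit of work, and written atomically so a crash cannot leave it half-written. The sketch below shows that idea only. The Progress struct, its completed field, and the name@version key format are assumptions; the actual layout of pipeline_progress.json is not shown here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Progress records which package versions are finished.
// The field layout is an assumption made for this sketch.
type Progress struct {
	Completed map[string]bool `json:"completed"`
}

// loadProgress reads the progress file; a missing file means a fresh start.
func loadProgress(path string) (*Progress, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return &Progress{Completed: map[string]bool{}}, nil
	}
	if err != nil {
		return nil, err
	}
	p := &Progress{}
	if err := json.Unmarshal(data, p); err != nil {
		return nil, err
	}
	if p.Completed == nil {
		p.Completed = map[string]bool{}
	}
	return p, nil
}

// saveProgress writes to a temporary file and renames it into place, so an
// interrupted write does not replace the previous file with a partial one.
func saveProgress(path string, p *Progress) error {
	data, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

func main() {
	dir, err := os.MkdirTemp("", "progress-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "pipeline_progress.json")

	p, err := loadProgress(path)
	if err != nil {
		panic(err)
	}
	p.Completed["example-package@1.0.0"] = true // hypothetical key format
	if err := saveProgress(path, p); err != nil {
		panic(err)
	}

	resumed, err := loadProgress(path)
	if err != nil {
		panic(err)
	}
	fmt.Println("already done:", resumed.Completed["example-package@1.0.0"])
}
```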
Environment Cleanup
Before processing each package group, the pipeline restores a pristine node_modules backup to prevent dependency conflicts.
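One way to restore a pristine directory is to delete the live copy and re-copy the backup tree. The sketch below does that with the standard library. It is an assumption about the approach, not the pipeline's code: restoreDir, demo, and the node_modules.backup directory name are hypothetical, and the real pipeline may use a different mechanism.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// restoreDir replaces dst with a fresh copy of backup. Symlinks are
// recreated as symlinks, and regular files keep their permission bits.
func restoreDir(backup, dst string) error {
	if err := os.RemoveAll(dst); err != nil {
		return err
	}
	return filepath.WalkDir(backup, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(backup, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		switch {
		case d.IsDir():
			return os.MkdirAll(target, 0o755)
		case d.Type()&fs.ModeSymlink != 0:
			link, err := os.Readlink(path)
			if err != nil {
				return err
			}
			return os.Symlink(link, target)
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		return os.WriteFile(target, data, info.Mode().Perm())
	})
}

// demo builds a backup and a polluted live directory in a temp dir, restores
// the live directory, and returns the files it contains afterwards.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "restore-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	backup := filepath.Join(root, "node_modules.backup") // hypothetical name
	live := filepath.Join(root, "node_modules")
	for _, dir := range []string{filepath.Join(backup, "pkg-a"), filepath.Join(live, "leftover")} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(filepath.Join(dir, "index.js"), []byte("// stub\n"), 0o644); err != nil {
			return nil, err
		}
	}

	if err := restoreDir(backup, live); err != nil {
		return nil, err
	}

	files := []string{}
	err = filepath.WalkDir(live, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			rel, _ := filepath.Rel(live, path)
			files = append(files, filepath.ToSlash(rel))
		}
		return nil
	})
	return files, err
}

func main() {
	files, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(files) // [pkg-a/index.js]
}
```

After the restore, the leftover package from the previous run is gone and only the backup's contents remain.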
Database Schema
Collections
| Collection | Purpose | Document Count (typical) |
|---|---|---|
| packages | npm package metadata | ~1,000 packages |
| hashes | Function fingerprints per package per RN version | ~500,000 functions |
| hashes_ghsa | Fingerprints for vulnerable packages only | ~50,000 functions |
| baselines_v3 | Empty RN app fingerprints (11 versions) | ~11 documents |
Package Hash Model
Troubleshooting
Error: npm install failed
Cause: Package version doesn’t exist or has install errors.
Solution:
- Check package name/version on npmjs.com
- View the error in the package's error_log field in the database
- The pipeline continues with remaining packages
Error: Metro bundling failed
Cause: Package has missing peer dependencies or is incompatible with the RN version.
Solution:
- Check package.json peerDependencies
- Some packages only work with specific RN versions
- Error is logged; pipeline continues
Error: Hermes compilation failed
Cause: Invalid JavaScript bundle or Hermes compiler crash.
Solution:
- Check Metro bundle output: pipeline/react-natives/rnXXX/package_out_*.js
- Verify Hermes compiler: node_modules/react-native/sdks/hermesc/osx-bin/hermesc -version
- Some packages produce invalid bundles (logged as errors)
Pipeline is very slow
Causes:
- Each package requires: npm install → Metro bundle → Hermes compile
- Processing 100 packages × 11 RN versions = 1,100 pipeline runs
Solutions:
- Reduce RN version count (edit pipeline/react-natives/ to keep only needed versions)
- Process in batches (edit react-native-directory-packages.json)
- Increase semaphore limit (edit maintainDatabase.go:187 to allow more parallel RN versions)
MongoDB disk space issues
Cause: Each function stores 3 hashes + 3 raw IR strings.
Solution:
- Use compression: mongod --wiredTigerCollectionBlockCompressor=zstd
- Create indexes on frequently queried fields
- Consider storing only vulnerable packages in hashes_ghsa
Advanced Usage
Processing a Single Package
For testing, the pipeline can process just one package version.
Updating Security Advisories
Fetch and store GitHub Security Advisories for the known packages listed in react-native-directory-packages.json (requires GITHUB_TOKEN in .env).
Downloading All Advisories
Fetch all npm security advisories from GitHub. This produces security-advisories-packages.json with all vulnerable package versions, ready for processing with -p -c.
Optimizing Performance
Database Batcher
The pipeline uses batched writes (from pkg/pipeline/batcher.go). See also pkg/pipeline/rnprocessor.go:34.
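Write batching generally means buffering operations and handing them to the database in groups, with a final flush for the remainder. The sketch below uses a batch size of 100 to match the figure given earlier in this document. The Batcher type, its methods, and the string operations are hypothetical, and the flush callback stands in for a real bulk write.

```go
package main

import "fmt"

// Batcher buffers write operations and passes them to flush in groups.
type Batcher struct {
	size    int
	pending []string
	flush   func(ops []string) error
}

// NewBatcher creates a batcher that flushes once size operations are queued.
func NewBatcher(size int, flush func(ops []string) error) *Batcher {
	return &Batcher{size: size, flush: flush}
}

// Add queues one operation and flushes when the buffer reaches the batch size.
func (b *Batcher) Add(op string) error {
	b.pending = append(b.pending, op)
	if len(b.pending) >= b.size {
		return b.Flush()
	}
	return nil
}

// Flush writes whatever is buffered; it does nothing when the buffer is empty.
func (b *Batcher) Flush() error {
	if len(b.pending) == 0 {
		return nil
	}
	ops := b.pending
	b.pending = nil
	return b.flush(ops)
}

func main() {
	var sizes []int
	b := NewBatcher(100, func(ops []string) error {
		sizes = append(sizes, len(ops)) // stand-in for a database bulk write
		return nil
	})
	for i := 0; i < 250; i++ {
		if err := b.Add(fmt.Sprintf("op-%d", i)); err != nil {
			panic(err)
		}
	}
	if err := b.Flush(); err != nil { // write the final partial batch
		panic(err)
	}
	fmt.Println(sizes) // [100 100 50]
}
```

The explicit final Flush matters: without it, the last 50 operations in this run would never be written.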
Package Grouping
Packages are grouped by name to reduce npm operations (from pkg/pipeline/packages.go:62).
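Grouping by name can be sketched as building a map from package name to its versions, then walking the names in a fixed order so that all versions of one package are handled back to back. The PackageVersion type, groupByName, and the sample package names are hypothetical and are not taken from packages.go.

```go
package main

import (
	"fmt"
	"sort"
)

// PackageVersion is one entry from the package list (hypothetical type).
type PackageVersion struct {
	Name    string
	Version string
}

// groupByName collects the versions of each package, preserving input order
// within a group, and returns the package names sorted for a stable walk.
func groupByName(pkgs []PackageVersion) (map[string][]string, []string) {
	groups := make(map[string][]string)
	for _, p := range pkgs {
		groups[p.Name] = append(groups[p.Name], p.Version)
	}
	names := make([]string, 0, len(groups))
	for name := range groups {
		names = append(names, name)
	}
	sort.Strings(names)
	return groups, names
}

func main() {
	pkgs := []PackageVersion{
		{"example-b", "2.0.0"},
		{"example-a", "1.0.0"},
		{"example-b", "2.1.0"},
		{"example-a", "1.1.0"},
	}
	groups, names := groupByName(pkgs)
	for _, name := range names {
		fmt.Println(name, groups[name])
	}
}
```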
Next Steps
- Analyze apps using your database
- Configure environment variables
- Generate opcodes for new Hermes versions