
Overview

Envark’s scanning engine walks your project directory to detect environment variable usage in source code and .env files. The scanner supports multiple programming languages, caches results between runs, and tracks where each variable is defined and used.

Quick Start

# Scan current directory
envark scan

# Scan specific project
envark scan /path/to/project

# Filter results
envark scan --filter missing

How It Works

The scanning process follows a five-stage pipeline:

1. File Discovery

Envark walks your project directory and identifies relevant files:
  • Source Files: .js, .ts, .jsx, .tsx, .py, .rb, .go, .php, etc.
  • Environment Files: .env, .env.local, .env.development, .env.example, etc.
  • Configuration Files: config/, environment-specific configs
Envark automatically skips node_modules/, .git/, dist/, build/, and other common ignored directories for performance.
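As a rough illustration of the discovery step, the sketch below walks a directory tree, skips ignored directories, and collects source and environment files. It is not Envark's file walker: `discoverFiles`, `isEnvFile`, and the two constant sets are hypothetical names, and the lists are only a subset of those named above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical subset of the directories the page says are skipped.
const IGNORED_DIRS = new Set(["node_modules", ".git", "dist", "build"]);
// Hypothetical subset of the source extensions listed above.
const SOURCE_EXTENSIONS = new Set([".js", ".ts", ".jsx", ".tsx", ".py", ".rb", ".go", ".php"]);

// True for .env, .env.local, .env.example, and similar names.
function isEnvFile(name: string): boolean {
    return name === ".env" || name.startsWith(".env.");
}

// Depth-first walk that returns source and environment files under `root`.
function discoverFiles(root: string, maxFiles = 10000, maxDepth = 50): string[] {
    const found: string[] = [];
    const walk = (dir: string, depth: number): void => {
        if (depth > maxDepth) return;
        for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
            if (found.length >= maxFiles) return;
            const fullPath = path.join(dir, entry.name);
            if (entry.isDirectory()) {
                // Ignored directories are never entered, which keeps the walk cheap.
                if (!IGNORED_DIRS.has(entry.name)) walk(fullPath, depth + 1);
            } else if (
                entry.isFile() &&
                (isEnvFile(entry.name) || SOURCE_EXTENSIONS.has(path.extname(entry.name)))
            ) {
                found.push(fullPath);
            }
        }
    };
    walk(root, 0);
    return found.sort();
}
```

The `maxFiles` and `maxDepth` defaults mirror the documented scan options; how Envark applies them internally is an assumption here.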

2. Hash Computation

Before parsing, Envark computes a hash of all discovered files to enable intelligent caching:
// From src/core/scanner.ts:76
const hash = computeFilesHash(allFiles);
This allows Envark to skip re-scanning if nothing has changed since the last run.
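The page does not show what `computeFilesHash` does internally. One plausible approach, sketched here under that caveat, is to fingerprint each file by path, size, and modification time so that no file contents need to be read. `fingerprintFiles` is a hypothetical name, not part of Envark.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";

// Hypothetical stand-in for computeFilesHash: fingerprints each file by
// path, size, and modification time rather than reading its contents.
function fingerprintFiles(files: string[]): string {
    const hash = createHash("sha256");
    // Sort a copy so the result does not depend on discovery order.
    for (const file of [...files].sort()) {
        const stat = fs.statSync(file);
        hash.update(`${file}:${stat.size}:${stat.mtimeMs}\n`);
    }
    return hash.digest("hex");
}
```

Hashing metadata is fast but can miss an edit that preserves both size and timestamp; hashing file contents would trade speed for certainty.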

3. Cache Check

Envark maintains a cache at ~/.envark/cache/ to speed up subsequent scans:
// From src/core/scanner.ts:79-95
if (useCache) {
    const cached = readCache(normalizedPath, hash);
    if (cached.hit && cached.data) {
        return {
            // Return cached scan results
            cacheHit: true,
            duration: Date.now() - startTime,
            // ...
        };
    }
}
For a typical project:
  • First scan: 2-5 seconds
  • Cached scan: 50-200ms (10-100x faster)
Cache is automatically invalidated when files change.
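The invalidation rule can be illustrated with a small file-backed cache that stores the files hash next to the results and treats any hash mismatch as a miss. This is a sketch under assumptions: `readCacheEntry`, `writeCacheEntry`, and the one-JSON-file-per-project layout are hypothetical and may not match Envark's `readCache`.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

interface CacheEntry<T> { hash: string; data: T; }
interface CacheResult<T> { hit: boolean; data?: T; }

// One cache file per project, named after a digest of the project path (assumed layout).
function cacheFileFor(cacheDir: string, projectPath: string): string {
    const key = createHash("sha256").update(projectPath).digest("hex").slice(0, 16);
    return path.join(cacheDir, `${key}.json`);
}

function writeCacheEntry<T>(cacheDir: string, projectPath: string, hash: string, data: T): void {
    fs.mkdirSync(cacheDir, { recursive: true });
    const entry: CacheEntry<T> = { hash, data };
    fs.writeFileSync(cacheFileFor(cacheDir, projectPath), JSON.stringify(entry));
}

function readCacheEntry<T>(cacheDir: string, projectPath: string, hash: string): CacheResult<T> {
    try {
        const raw = fs.readFileSync(cacheFileFor(cacheDir, projectPath), "utf8");
        const entry = JSON.parse(raw) as CacheEntry<T>;
        // A different hash means files changed since the entry was written.
        return entry.hash === hash ? { hit: true, data: entry.data } : { hit: false };
    } catch {
        return { hit: false }; // a missing or unreadable cache file counts as a miss
    }
}
```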

4. Parse & Extract

For each source file, Envark uses regex-based parsers to detect environment variable access patterns:

JavaScript/TypeScript Patterns

// process.env.VARIABLE_NAME
process.env.API_KEY
process.env['DATABASE_URL']
process.env["SECRET_KEY"]

// import.meta.env (Vite)
import.meta.env.VITE_API_URL
import.meta.env.VITE_PUBLIC_KEY

// Destructuring
const { PORT, HOST } = process.env;
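A regex-based extractor for the patterns above might look like the following. The expressions are illustrative and are not taken from Envark's parser; `extractJsEnvVars` is a hypothetical helper.

```typescript
// Illustrative patterns only; Envark's actual regexes may differ.
const DOT_ACCESS = /\b(?:process\.env|import\.meta\.env)\.([A-Za-z_][A-Za-z0-9_]*)/g;
const BRACKET_ACCESS = /\bprocess\.env\[\s*(['"])([A-Za-z_][A-Za-z0-9_]*)\1\s*\]/g;
const DESTRUCTURING = /\{([^}]*)\}\s*=\s*process\.env\b/g;

// Returns the unique variable names referenced in a JS/TS source string, sorted.
function extractJsEnvVars(source: string): string[] {
    const names = new Set<string>();
    for (const match of source.matchAll(DOT_ACCESS)) names.add(match[1]);
    for (const match of source.matchAll(BRACKET_ACCESS)) names.add(match[2]);
    for (const match of source.matchAll(DESTRUCTURING)) {
        for (const part of match[1].split(",")) {
            // "PORT: port" and "PORT = 3000" both refer to the variable PORT.
            const name = part.trim().split(/[:=]/)[0].trim();
            if (/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)) names.add(name);
        }
    }
    return [...names].sort();
}
```

Because the matching is purely textual, a sketch like this also picks up accesses inside comments and string literals, and it cannot resolve dynamic keys such as `process.env[key]`.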

Python Patterns

# os.getenv
import os
os.getenv('DATABASE_URL')
os.getenv("API_KEY", "default")
os.environ['SECRET_KEY']
os.environ.get('PORT')

# python-dotenv
from dotenv import load_dotenv
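The Python access patterns can be matched the same way. The sketch below also records whether a call passes a second argument, which is one way a scanner could notice default values. The regexes and the `extractPythonEnvUsages` name are assumptions, not Envark's implementation.

```typescript
interface PythonEnvUsage {
    name: string;
    hasDefault: boolean; // true when a second argument follows the variable name
}

// Illustrative patterns only; Envark's actual regexes may differ.
const CALL_ACCESS = /\bos\.(?:getenv|environ\.get)\(\s*(['"])([A-Za-z_][A-Za-z0-9_]*)\1\s*(,)?/g;
const SUBSCRIPT_ACCESS = /\bos\.environ\[\s*(['"])([A-Za-z_][A-Za-z0-9_]*)\1\s*\]/g;

// Returns one entry per matched access in a Python source string.
function extractPythonEnvUsages(source: string): PythonEnvUsage[] {
    const usages: PythonEnvUsage[] = [];
    for (const match of source.matchAll(CALL_ACCESS)) {
        usages.push({ name: match[2], hasDefault: match[3] !== undefined });
    }
    for (const match of source.matchAll(SUBSCRIPT_ACCESS)) {
        // os.environ['X'] raises KeyError when X is unset, so it never has a default.
        usages.push({ name: match[2], hasDefault: false });
    }
    return usages;
}
```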

Environment File Patterns

# Standard .env format
DATABASE_URL=postgresql://localhost/db
API_KEY="sk-1234567890abcdef"
PORT=3000

# Export syntax (shell)
export NODE_ENV=production
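For the file formats above, a minimal line-based parser could be sketched as follows. `parseEnvFile` is a hypothetical helper; it handles comments, the optional `export` prefix, and matching surrounding quotes, and ignores features such as multi-line values and variable interpolation.

```typescript
// Parses KEY=value lines into a map, skipping blank lines and comments.
function parseEnvFile(content: string): Map<string, string> {
    const vars = new Map<string, string>();
    for (const rawLine of content.split(/\r?\n/)) {
        const line = rawLine.trim();
        if (line === "" || line.startsWith("#")) continue;
        // Optional "export " prefix, then NAME=VALUE.
        const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/.exec(line);
        if (!match) continue;
        let value = match[2].trim();
        // Strip one pair of matching single or double quotes.
        const quoted = /^(['"])(.*)\1$/.exec(value);
        if (quoted) value = quoted[2];
        vars.set(match[1], value);
    }
    return vars;
}
```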

5. Resolution

After extraction, Envark resolves the complete picture for each variable:
  • Where it’s defined (which .env files)
  • Where it’s used (which source files and line numbers)
  • Whether it has default values in code
  • If it’s documented (.env.example)
  • If it’s missing or unused
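The resolution step can be pictured as a join between definitions and usages. In the sketch below, `resolveVariables` and the `ResolvedVariable` shape are hypothetical, and treating .env.example as documentation rather than as a definition is an assumption about how Envark classifies it.

```typescript
type VariableStatus = "defined" | "missing" | "unused";

interface ResolvedVariable {
    name: string;
    definedIn: string[];  // .env files that assign the variable
    usedIn: string[];     // source files that read the variable
    documented: boolean;  // present in .env.example
    status: VariableStatus;
}

// Both maps go from variable name to the list of files that mention it.
function resolveVariables(
    definitions: Map<string, string[]>,
    usages: Map<string, string[]>,
): ResolvedVariable[] {
    const names = new Set([...definitions.keys(), ...usages.keys()]);
    return [...names].sort().map((name) => {
        const allDefinitions = definitions.get(name) ?? [];
        const usedIn = usages.get(name) ?? [];
        // Assumption: .env.example documents a variable but supplies no real value.
        const definedIn = allDefinitions.filter((file) => !file.endsWith(".env.example"));
        const documented = allDefinitions.length !== definedIn.length;
        let status: VariableStatus = "defined";
        if (usedIn.length === 0) status = "unused";
        else if (definedIn.length === 0) status = "missing";
        return { name, definedIn, usedIn, documented, status };
    });
}
```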

Scan Options

Configuration

// From src/core/scanner.ts:11-15
export interface ScanOptions {
    maxFiles?: number;      // Default: 10000
    maxDepth?: number;      // Default: 50
    useCache?: boolean;     // Default: true
}
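A caller-side helper that fills in the documented defaults could look like this. `withDefaults` and `DEFAULT_SCAN_OPTIONS` are hypothetical names; only the three option fields and their default values come from the interface above.

```typescript
interface ScanOptions {
    maxFiles?: number;
    maxDepth?: number;
    useCache?: boolean;
}

// Defaults as documented in the interface comments.
const DEFAULT_SCAN_OPTIONS: Required<ScanOptions> = {
    maxFiles: 10000,
    maxDepth: 50,
    useCache: true,
};

// Uses ?? per field so an explicit `undefined` does not overwrite a default.
function withDefaults(options: ScanOptions = {}): Required<ScanOptions> {
    return {
        maxFiles: options.maxFiles ?? DEFAULT_SCAN_OPTIONS.maxFiles,
        maxDepth: options.maxDepth ?? DEFAULT_SCAN_OPTIONS.maxDepth,
        useCache: options.useCache ?? DEFAULT_SCAN_OPTIONS.useCache,
    };
}
```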

CLI Usage

# Disable caching
envark scan --no-cache

# Limit file count (for huge monorepos)
envark scan --max-files 5000

# Limit directory depth
envark scan --max-depth 10

Filters

Filter scan results to focus on specific issues:
Show everything (default)
envark scan
envark scan --filter all
Returns all discovered environment variables.

Scan Output

Summary Section

┌─ SCAN SUMMARY ────────────────────────────────────────────┐
  Total: 42  Defined: 38  Missing: 4  Critical: 2
└──────────────────────────────────────────────────────────┘
The summary provides:
  • Total: All unique environment variables found
  • Defined: Variables with values in .env files
  • Missing: Variables used but not defined
  • Critical: Variables with critical risk level
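Given those definitions, the four counters reduce to simple filters over the variable list. The sketch below assumes a hypothetical `ScannedVariable` shape; the risk levels are the four that appear in the sample output on this page, and Envark may define others.

```typescript
// Risk levels seen in the sample output; the full set is an assumption.
type RiskLevel = "critical" | "high" | "low" | "info";

interface ScannedVariable {
    name: string;
    defined: boolean; // has a value in some .env file
    used: boolean;    // referenced in source code
    risk: RiskLevel;
}

interface ScanSummary {
    total: number;
    defined: number;
    missing: number;
    critical: number;
}

function summarize(variables: ScannedVariable[]): ScanSummary {
    return {
        total: variables.length,
        defined: variables.filter((v) => v.defined).length,
        // Missing means used in code but not defined anywhere.
        missing: variables.filter((v) => v.used && !v.defined).length,
        critical: variables.filter((v) => v.risk === "critical").length,
    };
}
```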

Variable Details

Variables:
  DATABASE_URL              [CRITICAL] ✓
  API_KEY                   [HIGH] ✗
  PORT                      [LOW] ✓
  NODE_ENV                  [INFO] ✓
  ...
Each variable shows:
  • Name: The environment variable identifier
  • Risk Level: Security/configuration risk assessment
  • Status: ✓ defined, ✗ missing

Detailed View

For more information, use specific commands:
# See where a variable is used
envark usage DATABASE_URL

# Analyze risks
envark risk

# Check for missing variables
envark missing

Supported Languages

Envark’s scanner detects environment variables in:

JavaScript

  • Node.js
  • React
  • Vue.js
  • Next.js
  • Express

TypeScript

  • All JS frameworks
  • Deno
  • NestJS
  • Angular

Python

  • Django
  • Flask
  • FastAPI
  • os.getenv
  • python-dotenv

Ruby

  • Rails
  • Sinatra
  • ENV[]

Go

  • os.Getenv
  • godotenv

PHP

  • Laravel
  • $_ENV
  • getenv()

Framework Detection

Envark recognizes framework-specific patterns:

Vite/Vite-based Frameworks

// Vite requires VITE_ prefix for public vars
import.meta.env.VITE_API_URL      ✓ Detected
import.meta.env.SECRET_KEY        ⚠ Warning: Not accessible (missing VITE_ prefix)

Next.js

// Next.js public variables
process.env.NEXT_PUBLIC_API_URL   ✓ Detected

// Server-side only
process.env.DATABASE_URL          ✓ Detected (server)

Create React App

// CRA requires REACT_APP_ prefix
process.env.REACT_APP_API_URL     ✓ Detected
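The prefix rules for the three JavaScript frameworks above can be captured in a small lookup. `isExposedToClient` and the `Framework` type are hypothetical; the prefixes themselves are the ones each framework uses for client-exposed variables.

```typescript
type Framework = "vite" | "next" | "cra";

// Prefix each framework requires before a variable is exposed to browser code.
const PUBLIC_PREFIX: Record<Framework, string> = {
    vite: "VITE_",
    next: "NEXT_PUBLIC_",
    cra: "REACT_APP_",
};

// True when `name` carries the framework's public prefix.
function isExposedToClient(framework: Framework, name: string): boolean {
    return name.startsWith(PUBLIC_PREFIX[framework]);
}
```

A scanner could use a check like this to warn when client-side code reads a variable that the bundler will not expose, as in the SECRET_KEY case under Vite.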

Django

# Django settings.py patterns
from decouple import config
config('DATABASE_URL')            ✓ Detected

Performance Characteristics

Small Projects

< 100 files
  • First scan: ~500ms
  • Cached: ~50ms

Medium Projects

100-1000 files
  • First scan: 1-3s
  • Cached: 100-200ms

Large Projects

1000-5000 files
  • First scan: 3-8s
  • Cached: 200-500ms

Monorepos

5000+ files
  • First scan: 8-20s
  • Cached: 500ms-1s
  • Consider --max-files

Ignored Directories

Envark automatically skips these common directories:
node_modules/
.git/
.next/
.nuxt/
dist/
build/
out/
coverage/
.cache/
__pycache__/
venv/
.venv/
vendor/
target/
To exclude additional paths, add a .envarkignore file to your project root:
# .envarkignore
legacy/
temp/
*.backup
old-configs/
This works like .gitignore for scanning.

Programmatic Usage

Use the scanner in your own tools:
import { scanProject } from 'envark';

const result = scanProject('/path/to/project', {
    maxFiles: 10000,
    maxDepth: 50,
    useCache: true
});

console.log(`Found ${result.usages.length} environment variable usages`);
console.log(`Scanned ${result.scannedFiles} files in ${result.duration}ms`);
console.log(`Cache hit: ${result.cacheHit}`);

Cache Management

# View cache location
envark cache info

# Clear cache
envark cache clear

# Disable cache for a single scan
envark scan --no-cache
Cache location: ~/.envark/cache/

Troubleshooting

If scans are slow, try these solutions:
  1. Reduce --max-files for huge monorepos
  2. Ensure cache is enabled (default)
  3. Add .envarkignore to skip unnecessary directories
  4. Check for slow disk I/O (network drives, encrypted volumes)
If a variable is not detected, possible causes include:
  1. Variable accessed using dynamic keys: process.env[key]
  2. Custom environment loading logic
  3. Variables loaded from external sources (Vault, AWS Secrets Manager)
  4. Unsupported language/framework pattern
Use envark usage <VAR_NAME> to verify detection.
If the scan reports variables that are not really used (false positives), common scenarios include:
  1. Commented-out code still detected
  2. String literals that look like env vars: "process.env.API_KEY"
  3. Documentation or example code
These are usually low-risk and can be ignored or documented.

Implementation Details

The scanner is implemented across multiple modules:
  • src/core/scanner.ts: Main scanning orchestration
  • src/core/parser.ts: Language-specific parsers
  • src/core/resolver.ts: Variable resolution logic
  • src/utils/file-walker.ts: Efficient directory traversal
  • src/utils/cache.ts: Caching layer
See src/core/scanner.ts:62-134 for the main scan implementation.
