
Pipeline Overview

Every skill published to Tank passes through a 6-stage security pipeline. Stages run sequentially, with each stage building on the output of the previous one.
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Stage 0   │───▶│   Stage 1   │───▶│   Stage 2   │
│   INGEST    │    │  STRUCTURE  │    │   STATIC    │
└─────────────┘    └─────────────┘    └─────────────┘
                                             │
                                             ▼
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Stage 5   │◀───│   Stage 4   │◀───│   Stage 3   │
│   SUPPLY    │    │   SECRETS   │    │  INJECTION  │
└─────────────┘    └─────────────┘    └─────────────┘
Total Duration: ~3-8 seconds for a typical skill (50 files, 5 dependencies)

Stage 0: Ingest & Quarantine

Purpose: Download the tarball, safely extract it to a temp directory, and validate the archive structure
Source: python-api/lib/scan/stage0_ingest.py (398 lines)

Checks Performed

What: Validates tarball download URL origin
Why: Prevents SSRF attacks where malicious URLs could access internal services
How: Whitelist of allowed domains (Supabase storage + localhost for dev)
ALLOWED_DOWNLOAD_DOMAINS = [
    "supabase.co",
    "supabase.com",
    "supabase.in",
    "localhost",
    "127.0.0.1",
]
Severity: Critical (blocks download if domain not whitelisted)
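The allowlist check can be sketched as follows. This is a minimal illustration, not the code in stage0_ingest.py: the function name `is_allowed_download_url` and the subdomain-suffix matching rule are assumptions, since the source only lists the allowed domains.

```python
from urllib.parse import urlparse

ALLOWED_DOWNLOAD_DOMAINS = [
    "supabase.co",
    "supabase.com",
    "supabase.in",
    "localhost",
    "127.0.0.1",
]


def is_allowed_download_url(url: str) -> bool:
    """Return True if the URL host is an allowed domain or a subdomain of one.

    Hypothetical helper: matching on the parsed hostname (not the raw string)
    avoids being fooled by hosts such as "supabase.co.evil.com".
    """
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in ALLOWED_DOWNLOAD_DOMAINS
    )
```

Comparing against the parsed hostname rather than searching the whole URL string is the important design choice here; a plain substring test would accept an attacker-controlled host that merely contains an allowed name.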
What: Enforces 50MB maximum tarball size
Why: Prevents resource exhaustion and DoS attacks
How: Checks the Content-Length header before streaming the download
MAX_TARBALL_SIZE = 50 * 1024 * 1024  # 50MB
Severity: Critical (rejects download if exceeded)
What: Detects excessive compression ratios
Why: Prevents zip bomb attacks (a small compressed file that expands to gigabytes)
How: Compares compressed vs. uncompressed size
MAX_COMPRESSION_RATIO = 100  # decompressed/compressed
if total_uncompressed / compressed_size > MAX_COMPRESSION_RATIO:
    ...  # flag as a critical finding
Severity: Critical
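One way to obtain the two sizes without writing anything to disk is to sum the sizes declared in the tar headers. The sketch below assumes the tarball is already in memory as bytes; `compression_ratio` and `is_zip_bomb` are hypothetical names, not functions from the Tank codebase.

```python
import io
import tarfile

MAX_COMPRESSION_RATIO = 100  # decompressed/compressed


def compression_ratio(tar_bytes: bytes) -> float:
    """Declared size of all regular files divided by the compressed size."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        # Member sizes come from the tar headers; nothing is extracted to disk.
        total_uncompressed = sum(m.size for m in tar.getmembers() if m.isfile())
    return total_uncompressed / max(len(tar_bytes), 1)


def is_zip_bomb(tar_bytes: bytes) -> bool:
    """Apply the ratio threshold described above."""
    return compression_ratio(tar_bytes) > MAX_COMPRESSION_RATIO
```

Listing members still decompresses the stream in memory, so a production check would likely also cap the number of bytes read; that safeguard is omitted here for brevity.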
What: Rejects files with .. in the path or absolute paths
Why: Prevents extraction to arbitrary filesystem locations
Examples:
  • ../../etc/passwd → BLOCKED
  • /tmp/malicious.sh → BLOCKED
  • src/utils.py → ALLOWED
if ".." in member.name or member.name.startswith("/"):
    findings.append(Finding(
        severity="critical",
        type="path_traversal",
    ))
Severity: Critical
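The substring test shown above is deliberately conservative: it would also reject a harmless name such as docs/notes..md. A component-based variant is sketched below as an alternative, under the assumption that archive member names use POSIX separators; `is_unsafe_member_path` is a hypothetical helper, not part of the documented pipeline.

```python
from pathlib import PurePosixPath


def is_unsafe_member_path(name: str) -> bool:
    """Flag absolute paths and any path containing a '..' component."""
    path = PurePosixPath(name)
    return path.is_absolute() or ".." in path.parts
```

Applied to the examples listed above, this blocks ../../etc/passwd and /tmp/malicious.sh and allows src/utils.py.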
What: Rejects binary executables and compiled code
Blocked Extensions:
  • Executables: .exe, .dll, .so, .dylib, .wasm
  • Compiled: .class, .pyc, .pyo, .jar, .war
  • Binary: .bin, .dat
Why: Skills should be source code only, not compiled binaries
Severity: Critical
What: Computes a SHA-256 hash for every extracted file
Why: Enables future integrity verification and change tracking
Output: {"src/main.py": "a3b2c1...", "README.md": "d4e5f6..."} included in scan response
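A per-file hash map of this shape can be produced with the standard library alone. The sketch below is illustrative; the function name `hash_files` and the 64 KiB chunk size are assumptions.

```python
import hashlib
from pathlib import Path


def hash_files(root: Path) -> dict[str, str]:
    """Map each file path (relative to root, POSIX style) to its SHA-256 hex digest."""
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in chunks so large files are never loaded whole.
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
            hashes[path.relative_to(root).as_posix()] = digest.hexdigest()
    return hashes
```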

Output

  • temp_dir: Path to extracted skill files (cleaned up after pipeline)
  • file_hashes: Dict of filename → SHA-256 hash
  • file_list: Array of extracted file paths
  • total_size: Total bytes extracted

Stage 1: File & Structure Validation

Purpose: Validate skill structure and detect Unicode tricks and encoding issues
Source: python-api/lib/scan/stage1_structure.py (323 lines)

Checks Performed

What: Verifies SKILL.md exists in the skill root
Why: SKILL.md is the manifest file required for all skills
Severity: High if missing
What: Detects Unicode bidirectional override characters
Why: Prevents “Trojan Source” attacks where code display order is reversed
Example:
# This looks like: access_level = "user"
# But actually is:  access_level = "admin"
access_level = "\u202eadmin\u202d"  # ← bidirectional override
Detected Characters:
  • U+202A - LEFT-TO-RIGHT EMBEDDING
  • U+202B - RIGHT-TO-LEFT EMBEDDING
  • U+202E - RIGHT-TO-LEFT OVERRIDE (most dangerous)
  • U+2066 through U+2069 - Isolates
Severity: Critical
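Detection of these characters amounts to a character-by-character scan. The following sketch covers exactly the code points listed above; `find_bidi_controls` and its (line, column, name) output shape are assumptions for illustration.

```python
# Code points taken from the list above; names are the official Unicode names.
BIDI_CONTROLS = {
    "\u202a": "LEFT-TO-RIGHT EMBEDDING",
    "\u202b": "RIGHT-TO-LEFT EMBEDDING",
    "\u202e": "RIGHT-TO-LEFT OVERRIDE",
    "\u2066": "LEFT-TO-RIGHT ISOLATE",
    "\u2067": "RIGHT-TO-LEFT ISOLATE",
    "\u2068": "FIRST STRONG ISOLATE",
    "\u2069": "POP DIRECTIONAL ISOLATE",
}


def find_bidi_controls(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, name) for every bidi control, both 1-based."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, BIDI_CONTROLS[ch]))
    return hits
```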
What: Detects invisible Unicode characters
Why: Can hide malicious code or create identifier collisions
Detected Characters:
  • U+200B - ZERO WIDTH SPACE
  • U+200C - ZERO WIDTH NON-JOINER
  • U+200D - ZERO WIDTH JOINER
  • U+FEFF - ZERO WIDTH NO-BREAK SPACE (BOM)
Severity: Medium
What: Detects Cyrillic characters that look like Latin (e.g., Cyrillic ‘а’ vs Latin ‘a’)
Why: Prevents homoglyph attacks (import requеsts where ‘е’ is Cyrillic)
Examples:
  • Cyrillic а (U+0430) looks like Latin a (U+0061)
  • Cyrillic е (U+0435) looks like Latin e (U+0065)
  • Cyrillic о (U+043E) looks like Latin o (U+006F)
Detection Logic: Only flags if the Cyrillic char is surrounded by ASCII letters
Severity: High
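The "surrounded by ASCII letters" rule maps naturally onto regex lookarounds. The sketch below interprets "surrounded" as an ASCII letter on both sides, which is an assumption, and `find_homoglyphs` is a hypothetical name.

```python
import re

# A Cyrillic code point (U+0400-U+04FF) with an ASCII letter directly
# before and after it. Ordinary Cyrillic words do not match.
MIXED_SCRIPT = re.compile(r"(?<=[A-Za-z])[\u0400-\u04FF](?=[A-Za-z])")


def find_homoglyphs(text: str) -> list[tuple[int, str]]:
    """Return (offset, code point) for each suspicious Cyrillic character."""
    return [
        (m.start(), f"U+{ord(m.group()):04X}")
        for m in MIXED_SCRIPT.finditer(text)
    ]
```

Restricting matches to mixed-script words keeps false positives low: a comment written entirely in Russian is left alone, while a single Cyrillic letter inside an otherwise Latin identifier is reported.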
What: Detects content that changes under Unicode NFKC normalization
Why: Some Unicode characters normalize to different characters, enabling tricks
Example: ﬁ (U+FB01 ligature) normalizes to fi (two separate chars)
Severity: Medium
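The normalization check itself is a one-line comparison using the standard `unicodedata` module; the wrapper name `changes_under_nfkc` is an assumption.

```python
import unicodedata


def changes_under_nfkc(text: str) -> bool:
    """True if NFKC normalization would alter the text."""
    return unicodedata.normalize("NFKC", text) != text
```

For instance, a string containing the U+FB01 ligature normalizes to its two-letter spelling, so an identifier written with the ligature would collide with the plain ASCII one after normalization.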
What: Flags files not encoded as UTF-8
Why: All source files should be UTF-8; other encodings are suspicious
Uses: charset-normalizer library for detection
Severity: Medium
What: Flags dotfiles not in the allowed list
Allowed:
  • .gitignore
  • .editorconfig
  • .prettierrc, .eslintrc, etc.
Flagged:
  • .env (should be .env.example)
  • .git/ directories
  • Arbitrary dotfiles
Severity: Low (informational)

Stage 2: Static Code Analysis

Purpose: AST analysis and pattern matching to detect dangerous code
Source: python-api/lib/scan/stage2_static.py (551 lines; the largest stage)

Checks Performed

What: Parses Python files into Abstract Syntax Trees and walks the nodes to find dangerous patterns
Tools: Custom PythonASTAnalyzer + Bandit security linter
Detected Patterns:
Shell Injection (Critical):
  • os.system()
  • os.popen()
  • subprocess.call(), .run(), .Popen()
Code Execution (Critical):
  • eval()
  • exec()
  • compile()
Insecure Deserialization (Critical):
  • pickle.loads(), pickle.load()
  • marshal.loads()
  • shelve.open()
Network Access (High):
  • requests.get(), .post(), etc.
  • httpx calls
  • urllib.request.urlopen()
  • socket.connect()
Environment Access (Medium):
  • os.environ
  • os.getenv()
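The AST walk can be illustrated with the standard `ast` module. This is a simplified sketch, not the PythonASTAnalyzer itself: the `DANGEROUS_CALLS` table holds only a few of the patterns listed above, and `dotted_name` and `find_dangerous_calls` are hypothetical helpers.

```python
import ast

# A small subset of the patterns listed above, mapped to their severities.
DANGEROUS_CALLS = {
    "os.system": "critical",
    "os.popen": "critical",
    "subprocess.run": "critical",
    "eval": "critical",
    "exec": "critical",
    "pickle.loads": "critical",
    "os.getenv": "medium",
}


def dotted_name(node: ast.AST) -> str:
    """Rebuild 'a.b.c' from nested Attribute/Name nodes; '' for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def find_dangerous_calls(source: str) -> list[tuple[int, str, str]]:
    """Return sorted (line, call name, severity) for each matching call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                findings.append((node.lineno, name, DANGEROUS_CALLS[name]))
    return sorted(findings)
```

Matching on the literal dotted name misses aliased imports such as `from os import system`; a fuller analyzer would track import bindings, which this sketch does not attempt.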
What: Runs the Bandit security linter on all Python files
Coverage: 100+ security checks including:
  • B102 (exec usage)
  • B307 (eval usage)
  • Hardcoded passwords
  • SQL injection
  • Assert usage
Severity Mapping:
  • Bandit HIGH → critical (for exec/eval)
  • Bandit MEDIUM → high
  • Bandit LOW → medium
What: Regex-based detection of dangerous JS/TS patterns
Detected (Critical):
  • eval() usage
  • Function() constructor
  • new Function()
  • child_process.exec()
  • spawn() with shell: true
Detected (High):
  • fetch() calls
  • XMLHttpRequest
  • require('http') or require('https')
  • Sensitive file reads (.ssh, .aws, .env, .config)
Detected (Medium):
  • process.env access
  • require('dotenv')
What: Regex-based detection of dangerous shell patterns
Detected (Critical):
  • curl <url> | bash
  • wget <url> | bash
  • eval usage in shell
Detected (High):
  • chmod 777
Detected (Medium):
  • chmod +x (making files executable)
  • export with variable substitution
What: Detects code obfuscation patterns
Patterns:
  • base64.b64decode() + exec() (Critical)
  • codecs.decode(..., 'rot13') (High)
Why: Obfuscated code is suspicious — legitimate skills have no reason to hide logic
What: Compares detected capabilities vs. declared permissions
Example:
  • Stage 2 finds requests.get("https://api.example.com")
  • Checks if permissions.network.outbound includes *.example.com or *
  • If NOT declared → High severity finding
Also Checks:
  • Subprocess usage vs permissions.subprocess
  • Filesystem access vs permissions.filesystem.read/write
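The network part of this cross-check reduces to matching a detected hostname against the declared outbound patterns. The sketch below assumes the patterns are shell-style globs such as *.example.com or *, as the example suggests; `is_outbound_declared` is a hypothetical helper and the real manifest handling may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def is_outbound_declared(url: str, allowed_patterns: list[str]) -> bool:
    """True if the URL's host matches any declared outbound pattern."""
    host = (urlparse(url).hostname or "").lower()
    return any(fnmatchcase(host, p.lower()) for p in allowed_patterns)
```

A call detected in Stage 2 whose host is not covered by any pattern would then be reported as an undeclared capability.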

Stage 3: Prompt Injection Detection

Purpose: Detect malicious instructions embedded in markdown files
Source: python-api/lib/scan/stage3_injection.py (401 lines)

Checks Performed

What: Detects attempts to override system instructions
Examples:
  • “ignore all previous instructions”
  • “disregard above rules”
  • “forget everything you were told”
  • “your new role is…”
  • “this is the real system prompt”
Regex Count: 6 patterns with confidence 0.95-1.0
Severity: Critical
What: Detects attempts to redefine the AI’s role
Examples:
  • “you are now a…”
  • “act as if you are…”
  • “pretend to be…”
  • “switch to developer mode”
  • “from now on, you are…”
Regex Count: 8 patterns
Severity: Critical
What: Detects attempts to confuse context or insert fake instructions
Examples:
  • “the above was just a test”
  • “never mind what I said”
  • “new instructions:”
  • “updated instructions:”
Severity: High
What: Detects instructions to send data to an attacker
Examples:
  • “send to [URL]”
  • “post this to…”
  • “include in your response the contents of…”
  • “output the system prompt”
  • “reveal your instructions”
Severity: Critical to High
What: Detects attempts to gain elevated privileges
Examples:
  • “run as root”
  • “sudo”
  • “disable safety checks”
  • “bypass security filters”
  • “enable admin mode”
Severity: Critical
What: Detects Claude API format tags in skill documentation
Examples:
  • <tool_use>
  • <function_calls>
  • <system>, <human>, <assistant>
  • <invoke>
  • [SYSTEM], [HUMAN], [ASSISTANT]
Why: Skills should not contain agent API syntax
Severity: Critical
What: Scans HTML/markdown comments for hidden instructions
Patterns:
  • <!-- hidden instruction -->
  • [//]: # (comment)
  • [comment]: # (text)
  • Base64 in comments
Severity: High
What: Computes an overall suspicion score based on pattern density
Formula:
pattern_score = sum(weight for matched patterns) / count
density_score = imperative_keywords / total_words * 50
total_score = (pattern_score * 0.7) + (density_score * 0.3)
Threshold: Score > 0.7 → flagged
Severity: Medium (0.7-0.9) or High (0.9+)
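The formula can be worked through in a few lines. In this sketch `count` is read as the number of matched patterns, which is an assumption, as are the function names and the handling of empty input.

```python
def suspicion_score(matched_weights: list[float],
                    imperative_keywords: int,
                    total_words: int) -> float:
    """Combine average pattern weight (70%) with imperative density (30%)."""
    pattern_score = (
        sum(matched_weights) / len(matched_weights) if matched_weights else 0.0
    )
    density_score = imperative_keywords / total_words * 50 if total_words else 0.0
    return pattern_score * 0.7 + density_score * 0.3


def severity_for(score: float):
    """Map a score to the severity bands above; None means not flagged."""
    if score >= 0.9:
        return "high"
    if score > 0.7:
        return "medium"
    return None
```

As a worked example, two matches with weights 0.95 and 1.0 give a pattern score of 0.975; 4 imperative keywords in 400 words give a density score of 0.5; the total is 0.975 × 0.7 + 0.5 × 0.3 = 0.8325, which falls in the Medium band. Note that the density term is not capped in the formula as documented, so a very imperative-heavy file could push the total above 1.0.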
What: Runs Cisco’s skill-scanner for behavioral analysis
Purpose: Cross-file dataflow analysis (e.g., read creds in file A, send via network in file B)
Status: Non-blocking (the pipeline continues if it fails)

Stage 4: Secrets & Credential Scanning

Purpose: Detect hardcoded API keys, credentials, and private keys
Source: python-api/lib/scan/stage4_secrets.py (305 lines)

Checks Performed

What: Runs Yelp’s detect-secrets library with 12 plugins
Plugins:
  1. AWSKeyDetector
  2. AzureStorageKeyDetector
  3. BasicAuthDetector
  4. GitHubTokenDetector
  5. Base64HighEntropyString (limit 4.5)
  6. HexHighEntropyString (limit 3.0)
  7. PrivateKeyDetector
  8. SlackDetector
  9. StripeDetector
  10. TwilioKeyDetector
  11. KeywordDetector
  12. JwtTokenDetector
Severity: Critical for all findings
What: Additional patterns not covered by detect-secrets
Patterns:
  • Google Cloud API keys: AIza[0-9A-Za-z_-]{35}
  • Generic API keys: api_key = "[a-zA-Z0-9]{16,}"
  • Database URLs: postgres://user:pass@host
  • SSH private keys: -----BEGIN RSA PRIVATE KEY-----
  • JWT tokens: eyJ...
  • Slack webhooks: https://hooks.slack.com/services/...
  • Discord webhooks: https://discord.com/api/webhooks/...
  • High-entropy strings (40+ chars)
Severity: Critical to Medium
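A line-by-line scan with a few of these patterns might look like the sketch below. The Google and RSA patterns are taken from the list above; the generic pattern is loosened to accept flexible whitespace and either quote style, which is an assumption, and `scan_for_secrets` is a hypothetical name.

```python
import re

SECRET_PATTERNS = {
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_-]{35}"),
    # Loosened variant of the generic pattern listed above (assumption).
    "generic_api_key": re.compile(r"""api_key\s*=\s*["'][a-zA-Z0-9]{16,}["']"""),
    "rsa_private_key": re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
}


def scan_for_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every match, 1-based lines."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, name))
    return hits
```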
What: Flags .env files with actual values (not .env.example)
Logic:
  • Allow .env.example, .env.template, etc.
  • Flag .env, .env.local, .env.production if they contain KEY=value pairs
Severity: High

Stage 5: Supply Chain Audit

Purpose: Check dependencies for vulnerabilities and typosquatting
Source: python-api/lib/scan/stage5_supply.py (545 lines)

Checks Performed

What: Queries the OSV.dev API for known vulnerabilities in dependencies
API: https://api.osv.dev/v1/querybatch (free, no auth, no rate limits)
Ecosystems: PyPI (Python), npm (JavaScript)
Batch Processing: Up to 100 packages per request
Severity Mapping:
  • CVSS >= 9.0 → Critical
  • CVSS >= 7.0 → High
  • CVSS >= 4.0 → Medium
  • CVSS < 4.0 → Low
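The batching and the severity mapping can be sketched without making any network call. The request body follows the public OSV querybatch format (a `queries` list of package/version objects); the function names and the (ecosystem, name, version) tuple input are assumptions.

```python
BATCH_SIZE = 100  # "Up to 100 packages per request"


def build_osv_batches(packages: list[tuple[str, str, str]]) -> list[dict]:
    """Turn (ecosystem, name, version) tuples into querybatch request bodies."""
    queries = [
        {"package": {"ecosystem": eco, "name": name}, "version": version}
        for eco, name, version in packages
    ]
    return [
        {"queries": queries[i : i + BATCH_SIZE]}
        for i in range(0, len(queries), BATCH_SIZE)
    ]


def severity_from_cvss(score: float) -> str:
    """Apply the CVSS thresholds listed above."""
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"
```

Each returned dictionary would be sent as the JSON body of one POST request to the querybatch endpoint.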
What: Detects package names similar to popular packages (Levenshtein distance)
Algorithm:
  • Compare against top 1000 popular PyPI/npm packages
  • Flag if Levenshtein distance is 1-2
  • Flag if single character differs at same position
Examples:
  • reqeusts vs requests (distance 2; adjacent letters swapped) → FLAGGED
  • numppy vs numpy (distance 1) → FLAGGED
Severity: High
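The distance check can be sketched with a standard dynamic-programming Levenshtein implementation. The `POPULAR` list is a two-item stand-in for the top-1000 list, and `typosquat_target` is a hypothetical name. Under plain Levenshtein an adjacent swap such as reqeusts costs two edits (Damerau-Levenshtein would count it as one); either way it falls inside the 1-2 window.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


POPULAR = ["requests", "numpy"]  # hypothetical stand-in for the top-1000 list


def typosquat_target(name: str):
    """Return the popular package this name imitates, or None."""
    for popular in POPULAR:
        if 1 <= levenshtein(name, popular) <= 2:
            return popular
    return None
```

Excluding distance 0 matters: an exact match is the legitimate package itself and must not be flagged.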
What: Flags dependencies without exact version pins
Flagged:
  • requests (no version)
  • requests>=2.0 (range)
  • requests==* (wildcard)
Allowed:
  • requests==2.28.0 (exact pin)
Severity: Medium
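For requirements.txt-style specifiers, the pin check can be approximated with a single regular expression. This is a simplified sketch: it accepts only purely numeric versions after `==` and ignores extras, environment markers, and pre-release suffixes; `is_exactly_pinned` is a hypothetical name.

```python
import re

# name==N(.N)* with nothing else on the line; wildcards and ranges fail.
EXACT_PIN = re.compile(r"^[A-Za-z0-9._-]+==\d+(\.\d+)*$")


def is_exactly_pinned(requirement: str) -> bool:
    """True only for specifiers of the form name==1.2.3."""
    return bool(EXACT_PIN.match(requirement.strip()))
```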
What: Detects code that runs pip install or npm install at runtime
Patterns:
  • subprocess.run(["pip", "install", ...])
  • os.system("pip install ...")
  • exec("npm install ...")
Why: Skills should declare dependencies, not install them dynamically
Severity: Critical
What: Parses dependency files to extract the package list
Supported Formats:
  • requirements.txt (Python)
  • package.json (npm)
  • pyproject.toml (Python Poetry/PDM)
Extracts: Package name, version specifier, section (dependencies vs devDependencies)

Performance

| Stage | Typical Duration | Max Duration (timeout) |
|---|---|---|
| Stage 0 | 500ms - 2s | 30s (download timeout) |
| Stage 1 | 100ms - 500ms | N/A |
| Stage 2 | 1s - 3s | N/A |
| Stage 3 | 500ms - 2s | N/A |
| Stage 4 | 500ms - 1s | N/A |
| Stage 5 | 1s - 3s | 20s (OSV API timeout) |
| Total | 3s - 8s | ~50s |
Stage failures do not block subsequent stages. If Stage 2 errors, Stages 3-5 still run. Only critical findings in Stage 0 (e.g., zip bomb, path traversal) block the entire pipeline.

Next Steps

Permissions

How permissions are declared and enforced

Audit Score

Understanding the 0-10 security score
