Scramjet proxies web content transparently by rewriting it. Its rewriters modify JavaScript, HTML, and CSS so that network requests, DOM operations, and URL references are routed through the proxy.

JavaScript rewriting

The JavaScript rewriter is the most complex component, powered by a Rust-based WASM module using the OXC parser for high-performance AST transformations.

Architecture

JavaScript rewriting occurs in two layers:
  1. TypeScript wrapper (src/shared/rewriters/js.ts) - Handles error recovery and sourcemap injection
  2. Rust/WASM core (rewriter/js/) - Performs AST-level transformations using OXC
import { rewriteJs } from "@rewriters/js";
import { URLMeta } from "@rewriters/url";

const meta: URLMeta = {
  origin: new URL("https://example.com"),
  base: new URL("https://example.com/page"),
};

// Rewrite inline script
const rewritten = rewriteJs(
  'fetch("/api/data")',
  "(inline script)",
  meta,
  false // not a module
);

// Rewrite ES module
const moduleCode = rewriteJs(
  'import { foo } from "./module.js"',
  "https://example.com/app.js",
  meta,
  true // ES module
);

Rewriter output

The WASM rewriter returns structured output:
type RewriterResult = {
  js: string | Uint8Array;  // Rewritten JavaScript
  map: Uint8Array | null;   // Source map for debugging
  tag: string;              // Unique identifier for sourcemap
  errors: string[];         // Parse errors (if any)
};
The rewriter accepts input as either a string or a Uint8Array. For large scripts, pass a Uint8Array to avoid encoding overhead.
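Because the js field may be either a string or a Uint8Array, callers usually need to normalize it before use. The sketch below shows one way to do that with the standard TextEncoder and TextDecoder APIs. The helper names toText and toBytes are illustrative and are not part of Scramjet.

```typescript
// Illustrative helpers (not Scramjet APIs) for the `string | Uint8Array` union.
const utf8Decoder = new TextDecoder("utf-8");
const utf8Encoder = new TextEncoder();

// Decode only when the value is bytes; strings pass through untouched.
function toText(js: string | Uint8Array): string {
  return typeof js === "string" ? js : utf8Decoder.decode(js);
}

// Encode only when the value is a string; bytes pass through untouched.
function toBytes(js: string | Uint8Array): Uint8Array {
  return typeof js === "string" ? utf8Encoder.encode(js) : js;
}
```

Keeping the conversion at the edges means a script fetched as bytes can stay as bytes until something actually needs text.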

Error handling

Scramjet includes graceful error recovery:
try {
  const result = rewriteJs(code, url, meta, isModule);
  // Check for parse errors
  if (result.errors.length > 0) {
    console.warn("Parse errors detected:", result.errors);
  }
} catch (err) {
  // Fallback: return the original, unrewritten code if the allowInvalidJs flag is set
  if (flagEnabled("allowInvalidJs", meta.base)) {
    return code;
  }
  throw err;
}
Enable the allowInvalidJs flag only for debugging. When rewriting fails, the flag lets malformed JavaScript through unrewritten, which can break proxy functionality.

URL transformations in JavaScript

The rewriter intercepts:
  • Function calls: fetch(), XMLHttpRequest.open(), WebSocket()
  • Property access: location.href, document.URL
  • Dynamic imports: import(), require()
  • Worker creation: new Worker(), new SharedWorker()
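To make the idea of intercepting a call concrete, here is a generic sketch that wraps a fetch-like function in a Proxy so its URL argument is rewritten before the original runs. This is not how Scramjet implements interception. The function interceptUrlArg and the rewrite callback are hypothetical names used only for illustration.

```typescript
type UrlFn<R> = (url: string) => R;

// Wrap `target` so its first argument is passed through `rewrite` first.
// Generic illustration only; not taken from Scramjet's source.
function interceptUrlArg<R>(
  target: UrlFn<R>,
  rewrite: (url: string) => string
): UrlFn<R> {
  return new Proxy(target, {
    // The apply trap fires whenever the wrapped function is called.
    apply(fn, thisArg, args: unknown[]): R {
      return Reflect.apply(fn, thisArg, [rewrite(String(args[0]))]);
    },
  });
}

// Usage with a fake fetch that records the URL it was asked for.
const seenUrls: string[] = [];
const fakeFetch: UrlFn<string> = (url) => {
  seenUrls.push(url);
  return "ok";
};
const hookedFetch = interceptUrlArg(
  fakeFetch,
  (url) => "/scramjet/" + encodeURIComponent(url) // assumed prefix and codec
);
hookedFetch("https://example.com/api");
```

The caller still sees an ordinary function, while the wrapped function only ever receives the rewritten URL.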

HTML rewriting

The HTML rewriter uses htmlparser2 for streaming DOM parsing and dom-serializer for output generation.

Implementation

import { rewriteHtml } from "@rewriters/html";
import { URLMeta } from "@rewriters/url";
import { CookieStore } from "@/shared/cookie";

const cookieStore = new CookieStore();
const meta: URLMeta = {
  origin: new URL("https://example.com"),
  base: new URL("https://example.com"),
};

const html = `
<!DOCTYPE html>
<html>
  <head>
    <script src="/app.js"></script>
    <link rel="stylesheet" href="/style.css">
  </head>
  <body>
    <a href="/page">Link</a>
  </body>
</html>
`;

// fromTop=true injects Scramjet client scripts
const rewritten = rewriteHtml(html, cookieStore, meta, true);

HTML rules

Scramjet maintains a set of HTML rewriting rules that define which attributes to rewrite:
// Simplified from src/shared/htmlRules.ts
const htmlRules = [
  {
    src: ["script", "img", "audio", "video", "iframe"],
    href: ["a", "link", "area"],
    action: ["form"],
    fn: (value, meta) => rewriteUrl(value, meta)
  }
];
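A rule table like this is typically consulted once per attribute while walking the parsed document. The sketch below shows one possible lookup, assuming a rule applies only when both the attribute name and the tag name match. The names urlAttributes and applyRules, and the rewrite callback, are hypothetical stand-ins rather than Scramjet exports.

```typescript
// Attribute name -> tags on which that attribute holds a URL.
// Mirrors the simplified table above; a Map avoids prototype-key surprises.
const urlAttributes = new Map<string, string[]>([
  ["src", ["script", "img", "audio", "video", "iframe"]],
  ["href", ["a", "link", "area"]],
  ["action", ["form"]],
]);

// Return a copy of `attribs` with URL-bearing attributes rewritten.
function applyRules(
  tag: string,
  attribs: Record<string, string>,
  rewrite: (value: string) => string // stand-in for rewriteUrl(value, meta)
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [attr, value] of Object.entries(attribs)) {
    const tags = urlAttributes.get(attr);
    const matches = tags !== undefined && tags.includes(tag);
    out[attr] = matches ? rewrite(value) : value;
  }
  return out;
}
```

For example, applyRules("a", { href: "/page", class: "nav" }, rewrite) would rewrite href and leave class alone, while the same href on a div would pass through unchanged.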

Script injection

When fromTop=true, Scramjet injects client scripts into the <head>:
<head>
  <!-- Injected by Scramjet -->
  <script src="/scramjet.wasm.js"></script>
  <script src="/scramjet.client.js"></script>
  <script>self.COOKIE = {...}; $scramjetLoadClient().loadAndHook(...);</script>
  
  <!-- Original content -->
  <script src="/app.js"></script>
</head>
Inline event handlers: Event attributes like onclick are rewritten as JavaScript:
<!-- Original -->
<button onclick="fetch('/api')">Click</button>

<!-- Scramjet preserves original via scramjet-attr-* -->
<button 
  onclick="[rewritten code]" 
  scramjet-attr-onclick="fetch('/api')">
  Click
</button>
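The attribute-preservation step can be sketched as follows. This version assumes that any attribute whose name starts with "on" is an event handler, which is a simplification. The function rewriteEventAttributes and the rewriteCode callback are hypothetical names, with rewriteCode standing in for the JavaScript rewriter.

```typescript
// Rewrite inline handlers and keep each original under scramjet-attr-<name>.
// Sketch only; treating every "on*" attribute as a handler is an assumption.
function rewriteEventAttributes(
  attribs: Record<string, string>,
  rewriteCode: (code: string) => string
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [attr, value] of Object.entries(attribs)) {
    if (attr.startsWith("on")) {
      out[`scramjet-attr-${attr}`] = value; // original source, preserved
      out[attr] = rewriteCode(value);       // rewritten handler body
    } else {
      out[attr] = value;
    }
  }
  return out;
}
```

Keeping the original alongside the rewritten value lets code that later reads the attribute be given the source the page author wrote.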
Import maps: JSON import maps have URLs rewritten:
<script type="importmap">
{
  "imports": {
    "lodash": "https://cdn.example.com/lodash.js"
  }
}
</script>
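As a rough sketch of what rewriting an import map involves, the function below parses the JSON, rewrites each value under imports, and serializes the result. It handles only the top-level imports object and ignores scopes. The name rewriteImportMap and the rewrite callback are hypothetical and not taken from Scramjet.

```typescript
type ImportMap = { imports?: Record<string, string> };

// Rewrite the URL values of an import map's top-level "imports" object.
// Specifier keys (e.g. "lodash") are left untouched.
function rewriteImportMap(
  json: string,
  rewrite: (url: string) => string // stand-in for rewriteUrl(url, meta)
): string {
  const map = JSON.parse(json) as ImportMap;
  if (map.imports) {
    const rewritten: Record<string, string> = {};
    for (const [specifier, url] of Object.entries(map.imports)) {
      rewritten[specifier] = rewrite(url);
    }
    map.imports = rewritten;
  }
  return JSON.stringify(map);
}
```

Going through JSON.parse rather than a regex keeps the rewrite confined to values, so a specifier that happens to look like a URL is not altered.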
CSP meta tags: Content Security Policy tags are commented out:
<!-- Original CSP removed by Scramjet -->
<!-- <meta http-equiv="Content-Security-Policy" content="..."> -->

Srcset rewriting

Responsive images with srcset attributes require special handling:
function rewriteSrcset(srcset: string, meta: URLMeta) {
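  // Input: "img1.jpg 1x, img2.jpg 2x, img3.jpg 3x"
  // Split on a comma followed by whitespace so each candidate keeps its descriptor
  const sources = srcset
    .split(/,\s+/)
    .map(src => src.trim())
    .filter(src => src.length > 0);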
  
  const rewritten = sources.map(source => {
    const [url, ...descriptors] = source.split(/\s+/);
    const rewrittenUrl = rewriteUrl(url.trim(), meta);
    
    return descriptors.length > 0
      ? `${rewrittenUrl} ${descriptors.join(" ")}`
      : rewrittenUrl;
  });
  
  return rewritten.join(", ");
}

CSS rewriting

The CSS rewriter targets URL references in stylesheets using regex-based transformations.

Basic usage

import { rewriteCss } from "@rewriters/css";

const css = `
.background {
  background-image: url('/images/bg.png');
}

@import "theme.css";
`;

const rewritten = rewriteCss(css, meta);
// Output:
// .background {
//   background-image: url('https://proxy.com/scramjet/[encoded]/images/bg.png');
// }
// @import "https://proxy.com/scramjet/[encoded]/theme.css";

CSS URL patterns

The rewriter handles:
  1. url() function: url('/path/to/resource')
  2. @import rules: @import url('...') or @import '...'
// Simplified from src/shared/rewriters/css.ts
function handleCss(type: "rewrite" | "unrewrite", css: string, meta?: URLMeta) {
  const urlRegex = /url\(['"]?(.+?)['"]?\)/gm;
  const atRuleRegex = /@import\s+(url\s*?\(.{0,9999}?\)|['"].{0,9999}?['"]|.{0,9999}?)($|\s|;)/gm;
  
  css = css.replace(urlRegex, (match, url) => {
    const encodedUrl = type === "rewrite"
      ? rewriteUrl(url.trim(), meta)
      : unrewriteUrl(url.trim());
    return match.replace(url, encodedUrl);
  });
  
  // Handle @import statements...
  return css;
}
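The elided @import step could look something like the sketch below, which covers only the quoted-string form (@import "x.css" or @import 'x.css'). The url(...) form is left alone on the assumption that the url() pass handles it. The names stringImport and rewriteStringImports are illustrative, and the regex is deliberately simpler than the atRuleRegex shown above.

```typescript
// Matches only @import followed by a quoted string; \1 requires the same
// closing quote as the opening one.
const stringImport = /@import\s+(['"])(.+?)\1/g;

// Rewrite quoted @import targets. Whitespace after "@import" is
// normalized to a single space.
function rewriteStringImports(
  css: string,
  rewrite: (url: string) => string // stand-in for rewriteUrl(url, meta)
): string {
  return css.replace(
    stringImport,
    (_match: string, quote: string, url: string) =>
      `@import ${quote}${rewrite(url.trim())}${quote}`
  );
}
```

Passing a function as the replacement, rather than a string, avoids surprises if a rewritten URL contains sequences such as $& that String.prototype.replace treats specially.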
CSS rewriting is less complex than JavaScript because CSS doesn’t contain executable code that can dynamically generate URLs.

URL metadata (URLMeta)

All rewriters share a common URLMeta type that provides context:
type URLMeta = {
  origin: URL;              // Original page URL
  base: URL;                // Base URL for relative resolution
  topFrameName?: string;    // Top iframe name (for nesting)
  parentFrameName?: string; // Parent iframe name
};

Base URL handling

The <base> tag affects relative URL resolution:
<base href="https://example.com/app/">
<script src="main.js"></script>
<!-- Resolves to: https://example.com/app/main.js -->
Scramjet updates meta.base when encountering <base> tags:
if (node.name === "base" && node.attribs.href !== undefined) {
  meta.base = new URL(node.attribs.href, meta.origin);
}
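The effect of updating meta.base can be seen with the standard URL constructor alone. In the sketch below, the Meta type is a trimmed-down stand-in for URLMeta, and applyBaseTag and resolveRelative are hypothetical helper names used only for illustration.

```typescript
type Meta = { origin: URL; base: URL };

// Mirror of the snippet above: a <base href> is resolved against the origin.
function applyBaseTag(meta: Meta, href: string): void {
  meta.base = new URL(href, meta.origin);
}

// Relative URLs in the document resolve against the current base.
function resolveRelative(meta: Meta, relative: string): string {
  return new URL(relative, meta.base).href;
}

const pageMeta: Meta = {
  origin: new URL("https://example.com/index.html"),
  base: new URL("https://example.com/index.html"),
};

const beforeBase = resolveRelative(pageMeta, "main.js");
// "https://example.com/main.js"

applyBaseTag(pageMeta, "/app/"); // as if <base href="/app/"> was encountered
const afterBase = resolveRelative(pageMeta, "main.js");
// "https://example.com/app/main.js"
```

Because the base is mutated as the document is walked, only elements that appear after the base tag are affected in this sketch.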

Performance considerations

WASM rewriter pooling

Scramjet maintains a pool of rewriter instances to avoid initialization overhead:
let rewriters = [];

function getRewriter(meta: URLMeta): [Rewriter, () => void] {
  let obj = rewriters.find(x => !x.inUse);
  
  if (!obj) {
    // Create new rewriter instance
    const rewriter = new Rewriter({ config, shared, flagEnabled, codec });
    obj = { rewriter, inUse: false };
    rewriters.push(obj);
  }
  
  obj.inUse = true;
  return [obj.rewriter, () => (obj.inUse = false)];
}
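A pool that hands back a release callback relies on the caller invoking it, including when the work throws. The sketch below shows the same acquire and release shape in a generic form, plus a try/finally wrapper. The names createPool and withPooled are hypothetical and do not come from Scramjet.

```typescript
type Release = () => void;
type Slot<T> = { value: T; inUse: boolean };

// Generic version of the pooling pattern: reuse a free slot or create one.
function createPool<T>(create: () => T) {
  const slots: Slot<T>[] = [];
  function acquire(): [T, Release] {
    const free = slots.find((s) => !s.inUse);
    const slot = free ?? { value: create(), inUse: false };
    if (!free) slots.push(slot);
    slot.inUse = true;
    return [slot.value, () => { slot.inUse = false; }];
  }
  return { acquire, size: () => slots.length };
}

// Run `fn` with a pooled value and always release it afterwards.
function withPooled<T, R>(
  pool: { acquire: () => [T, Release] },
  fn: (value: T) => R
): R {
  const [value, release] = pool.acquire();
  try {
    return fn(value);
  } finally {
    release(); // runs even if fn throws, so the slot is never leaked
  }
}
```

Without the finally block, an exception during rewriting would leave the instance marked as in use, and the pool would grow by one instance per failure.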

Timing and profiling

Enable rewriter logs to track performance:
import { flagEnabled } from "@/shared";

if (flagEnabled("rewriterLogs", meta.base)) {
  const before = performance.now();
  const result = rewriteJs(code, url, meta, isModule);
  console.log(`Rewrite took ${performance.now() - before}ms`);
}
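If several call sites need timing, the measurement can be factored into a small wrapper. The sketch below is one way to do that with performance.now(). The helper name timed and its log parameter are hypothetical and not part of Scramjet.

```typescript
// Time a synchronous function and report the duration through `log`.
// The duration is reported even if `fn` throws.
function timed<T>(
  label: string,
  fn: () => T,
  log: (message: string) => void = console.log
): T {
  const start = performance.now();
  try {
    return fn();
  } finally {
    const elapsed = performance.now() - start;
    log(`${label} took ${elapsed.toFixed(2)}ms`);
  }
}
```

A call such as timed("rewriteJs", () => rewriteJs(code, url, meta, isModule)) would then log one line per rewrite without repeating the timing code.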

Common patterns

Rewriting downloaded resources

// In service worker fetch handler
switch (destination) {
  case "script":
    return rewriteJs(
      new Uint8Array(await response.arrayBuffer()),
      response.finalURL,
      meta,
      scriptType === "module"
    );
    
  case "style":
    return rewriteCss(await response.text(), meta);
    
  case "document":
  case "iframe":
    if (response.headers.get("content-type")?.startsWith("text/html")) {
      return rewriteHtml(await response.text(), cookieStore, meta, true);
    }
    break;
}

Handling special URLs

Some URLs require special treatment:
function rewriteUrl(url: string, meta: URLMeta) {
  if (url.startsWith("javascript:")) {
    return "javascript:" + rewriteJs(
      url.slice("javascript:".length),
      "(javascript: url)",
      meta
    );
  }
  
  if (url.startsWith("blob:") || url.startsWith("data:")) {
    return location.origin + config.prefix + url;
  }
  
  if (url.startsWith("mailto:") || url.startsWith("about:")) {
    return url; // Don't rewrite
  }
  
  // Standard HTTP(S) URL rewriting...
}
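The standard HTTP(S) branch is elided above. As a rough illustration of its shape, the sketch below resolves the URL against a base, encodes it, and prepends a prefix, with a matching decoder for the reverse direction. The "/scramjet/" prefix and the use of encodeURIComponent as the codec are assumptions made for this sketch, and encodeProxied and decodeProxied are hypothetical names.

```typescript
const PROXY_PREFIX = "/scramjet/"; // assumed prefix, for illustration only

// Resolve against the base, then encode the absolute URL into the proxy path.
function encodeProxied(url: string, base: URL): string {
  const absolute = new URL(url, base).href;
  return PROXY_PREFIX + encodeURIComponent(absolute);
}

// Reverse of encodeProxied: strip the prefix and decode.
function decodeProxied(proxied: string): string {
  if (!proxied.startsWith(PROXY_PREFIX)) {
    throw new Error("not a proxied URL: " + proxied);
  }
  return decodeURIComponent(proxied.slice(PROXY_PREFIX.length));
}

const proxiedUrl = encodeProxied(
  "/images/bg.png",
  new URL("https://example.com/page")
);
// "/scramjet/https%3A%2F%2Fexample.com%2Fimages%2Fbg.png"
```

Resolving before encoding matters: the proxied form always carries an absolute URL, so it can be decoded later without knowing which page it came from.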
