
Architecture overview

GitRead uses a multi-service architecture to transform GitHub repositories into professional README files. The system combines Next.js for the frontend and API layer, a Python microservice for repository ingestion, and AI models for content generation.

Request flow

Here’s what happens when you generate a README:
Step 1: User submits repository URL

The user enters a GitHub repository URL in the frontend and clicks Generate.
app/page.tsx
const handleSubmit = async (e: React.FormEvent) => {
  e.preventDefault()
  // Strip a trailing slash so "…/repo/" and "…/repo" are treated the same
  const trimmedRepoUrl = repoUrl.endsWith('/') ? repoUrl.slice(0, -1) : repoUrl
  // "owner/repo" slug, used elsewhere in the component
  const repo = trimmedRepoUrl.replace('https://github.com/', '')

  // Validate and submit to API
  const response = await fetch('/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ repoUrl: trimmedRepoUrl }),
  })
}
The frontend performs basic validation and checks the user’s credit balance before proceeding.
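The trimming logic above can be isolated into small pure helpers. This is a sketch rather than the actual code; `normalizeRepoUrl` and `toOwnerRepo` are hypothetical names:

```typescript
// Hypothetical helpers mirroring the trimming logic in handleSubmit

// Drop a single trailing slash so both URL spellings map to the same repo
function normalizeRepoUrl(repoUrl: string): string {
  return repoUrl.endsWith("/") ? repoUrl.slice(0, -1) : repoUrl;
}

// Reduce a full GitHub URL to an "owner/repo" slug
function toOwnerRepo(repoUrl: string): string {
  return normalizeRepoUrl(repoUrl).replace("https://github.com/", "");
}
```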
Step 2: API validates request

The Next.js API route (/api/generate) receives the request and performs security checks:
app/api/generate/route.ts
// Authentication check
const { userId } = getAuth(req)
if (!userId) {
  return NextResponse.json({ 
    error: "Authentication required. Please sign in to generate README files."
  }, { status: 401 })
}

// URL validation
function isValidGitHubUrl(url: string): boolean {
  try {
    const parsedUrl = new URL(url)
    return (
      parsedUrl.protocol === 'https:' &&
      parsedUrl.hostname === 'github.com' &&
      parsedUrl.pathname.split('/').length >= 3
    )
  } catch {
    // new URL() throws on malformed input
    return false
  }
}
The API verifies:
  • User authentication via Clerk
  • Valid GitHub URL format
  • Sufficient credit balance
  • Rate limiting and queue position
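The checks above can be combined into a single guard. A minimal sketch, assuming a hypothetical `RequestContext` shape and `validateRequest` helper (the real route derives these values from Clerk, Supabase, and the queue):

```typescript
// Hypothetical shape of the per-request context
interface RequestContext {
  userId: string | null;
  repoUrl: string;
  credits: number;
  queueLength: number;
}

// Returns null when the request may proceed, or an HTTP-style error otherwise
function validateRequest(
  ctx: RequestContext,
  maxQueue = 20
): { status: number; error: string } | null {
  if (!ctx.userId) return { status: 401, error: "Authentication required" };
  try {
    const url = new URL(ctx.repoUrl);
    if (url.protocol !== "https:" || url.hostname !== "github.com") {
      return { status: 400, error: "Invalid GitHub URL" };
    }
  } catch {
    return { status: 400, error: "Invalid GitHub URL" };
  }
  if (ctx.credits < 1) return { status: 402, error: "Insufficient credits" };
  if (ctx.queueLength >= maxQueue) return { status: 429, error: "Server is busy" };
  return null;
}
```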
Step 3: Repository ingestion

The API calls a Python microservice hosted at https://gitread-api.onrender.com/ingest.
app/api/generate/route.ts
const pythonApiUrl = "https://gitread-api.onrender.com/ingest";
const pythonApiKey = process.env.PYTHON_API_KEY!;

const response = await fetch(pythonApiUrl, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": pythonApiKey,
  },
  body: JSON.stringify({ repo_url: repoUrl }),
});
The Python service uses the gitingest library to:
  • Clone the repository
  • Analyze the directory structure
  • Extract and summarize file contents
  • Estimate token counts
scripts/git_ingest.py
import re

from gitingest import ingest

def process_repo(repo_url):
    summary, tree, content = ingest(repo_url)

    # Extract the estimated token count from the summary,
    # e.g. a line like "Estimated tokens: 12.5k"
    estimated_tokens = None
    match = re.search(r"Estimated tokens:\s*([\d.]+)k", summary)
    if match:
        estimated_tokens = int(float(match.group(1)) * 1000)

        # Reject repositories that exceed the token limit
        if estimated_tokens > MAX_INPUT_TOKENS:
            return {"error": "Token limit exceeded"}

    return {
        "content": content,
        "summary": summary,
        "tree": tree,
        "estimated_tokens": estimated_tokens
    }
Step 4: AI generation

The processed repository data is sent to Google’s Gemini model via OpenRouter.
app/api/generate/route.ts
const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
})

const prompt = `Make a README for the following GitHub repository.

Summary:
${gitIngestOutput.summary}

Tree:
${gitIngestOutput.tree}

Content:
${gitIngestOutput.content}`

const response = await client.chat.completions.create({
  model: "google/gemini-2.5-pro-preview-03-25",
  messages: [
    { role: "system", content: "You are an expert technical writer." },
    { role: "user", content: prompt }
  ]
})
The AI analyzes the repository structure, code patterns, and documentation to generate a comprehensive README.
Step 5: Credit deduction and storage

After successful generation, the system updates the user’s credits and saves the README:
app/api/generate/route.ts
// Deduct credit
const { data, error } = await supabaseAdmin
  .from('user_credits')
  .select('credits')
  .eq('user_id', userId)
  .single()

// Use ?? rather than || so a balance of 0 is not coerced to the default of 1
const newCredits = (data?.credits ?? 1) - 1

await supabaseAdmin
  .from('user_credits')
  .upsert({
    user_id: userId,
    credits: newCredits,
    updated_at: new Date().toISOString()
  })
The README is saved to the user’s history:
app/utils/supabase.ts
export async function saveGeneratedReadme(
  userId: string, 
  repoUrl: string, 
  readmeContent: string
) {
  const { error } = await supabase
    .from('generated_readmes')
    .insert({
      user_id: userId,
      repo_url: repoUrl,
      readme_content: readmeContent
    })

  // Surface insert failures instead of silently discarding them
  if (error) throw error
}
Step 6: Response to frontend

The generated README is returned to the frontend with metadata:
return NextResponse.json({ 
  readme,
  inputTokens,
  outputTokens,
  warnings: gitIngestOutput.warnings,
  limits: gitIngestOutput.limits
})
The frontend displays the README with editing capabilities and triggers a confetti animation.

Core components

Frontend (Next.js + React)

The main application page (app/page.tsx) handles:
  • User interface and interaction
  • Form submission and validation
  • Credit management UI
  • README history display
  • Markdown editing and preview
app/page.tsx
const [viewMode, setViewMode] = useState<'markdown' | 'preview'>('preview')
const [readme, setReadme] = useState('')
const [credits, setCredits] = useState(1)

// Automatic credit refresh every 30 seconds, only while signed in
useEffect(() => {
  if (!isSignedIn) return;
  const interval = setInterval(async () => {
    const response = await fetch('/api/credits');
    const data = await response.json();
    setCredits(data.credits);
  }, 30000);
  return () => clearInterval(interval);
}, [isSignedIn, userId]);

API layer (Next.js API Routes)

Key API endpoints:
POST /api/generate — Main endpoint for README generation.

Request:
{
  "repoUrl": "https://github.com/username/repo"
}

Response:
{
  "readme": "# Repository Name\n\n...",
  "inputTokens": 15000,
  "outputTokens": 500,
  "warnings": [],
  "limits": { ... }
}

GET /api/credits — Manages the user's credit balance.

Response:
{
  "credits": 5
}

README history endpoint — Retrieves and saves README history.

GET Response:
{
  "history": [
    {
      "id": "uuid",
      "repo_url": "https://github.com/...",
      "readme_content": "...",
      "created_at": "2026-02-28T20:00:00Z"
    }
  ]
}

Python ingestion service

The ingestion service runs independently and provides:
  • Repository cloning and analysis
  • File content extraction
  • Token estimation
  • Size and complexity limits
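The token-estimation step can be sketched in TypeScript for illustration (the real logic lives in the Python service; `parseEstimatedTokens` and `withinTokenLimit` are hypothetical names, and "Estimated tokens: 12.5k" is an example of the summary line the regex expects):

```typescript
const MAX_INPUT_TOKENS = 250_000;

// Parse a summary line like "Estimated tokens: 12.5k" into 12500;
// returns null when the line is absent
function parseEstimatedTokens(summary: string): number | null {
  const match = summary.match(/Estimated tokens:\s*([\d.]+)k/);
  return match ? Math.round(parseFloat(match[1]) * 1000) : null;
}

// A repository qualifies only when its estimate is known and under the cap
function withinTokenLimit(summary: string): boolean {
  const tokens = parseEstimatedTokens(summary);
  return tokens !== null && tokens <= MAX_INPUT_TOKENS;
}
```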
MAX_INPUT_TOKENS = 250_000
# MAX_FILE_SIZE, MAX_FILES, and MAX_DIRECTORY_DEPTH come from the gitingest configuration

Database (Supabase)

GitRead uses three main tables.

user_credits
CREATE TABLE user_credits (
  user_id TEXT PRIMARY KEY,
  credits INTEGER NOT NULL DEFAULT 1,
  updated_at TIMESTAMP WITH TIME ZONE
);
generated_readmes
CREATE TABLE generated_readmes (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  user_id TEXT NOT NULL,
  repo_url TEXT NOT NULL,
  readme_content TEXT NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
processed_stripe_events
CREATE TABLE processed_stripe_events (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  event_id TEXT UNIQUE NOT NULL,
  user_id TEXT NOT NULL,
  credits INTEGER NOT NULL,
  processed_at TIMESTAMP WITH TIME ZONE
);

Queue management

GitRead implements a simple in-memory queue to handle concurrent requests:
app/api/generate/route.ts
const MAX_QUEUE_SIZE = 20;
const requestQueue: (() => Promise<void>)[] = [];
let processing = false;

async function processQueue() {
  if (processing) return;
  processing = true;
  while (requestQueue.length > 0) {
    const next = requestQueue.shift();
    if (next) await next();
  }
  processing = false;
}

// If queue is full, return 429
if (requestQueue.length >= MAX_QUEUE_SIZE) {
  return NextResponse.json({ 
    error: 'Server is busy. Please try again in a few moments.' 
  }, { status: 429 });
}

Error handling

Errors are handled at each layer. On the frontend, raw error strings are mapped to user-friendly messages:
// Display user-friendly error messages with icons
if (data.error?.includes('private')) {
  setErrorMessage('🔒 Repository not found or is private.');
} else if (data.error?.includes('rate limit')) {
  setErrorMessage('🚫 API rate limit exceeded.');
} else if (data.error?.includes('timed out')) {
  setErrorMessage('⏱️ Request timed out.');
}

Performance optimizations

  1. Token estimation - Calculate tokens before AI generation to avoid unnecessary API calls
  2. Credit caching - Refresh credits every 30 seconds instead of on every action
  3. Temporary files - Use system temp directory for repository processing
  4. Response streaming - Large README content is handled efficiently
  5. Queue system - Prevents server overload during high traffic

Security measures

Authentication

  • Clerk-based user authentication
  • Session validation on every API request
  • Secure token handling

URL validation

  • Strict GitHub URL format checking
  • Path traversal prevention
  • Query parameter blocking
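These checks can be folded into one stricter validator. A sketch under stated assumptions (`isSafeGitHubUrl` is a hypothetical name; the real route's checks may differ):

```typescript
function isSafeGitHubUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // malformed input
  }
  if (url.protocol !== "https:" || url.hostname !== "github.com") return false;
  if (url.search !== "" || url.hash !== "") return false; // block query params and fragments
  const parts = url.pathname.split("/").filter(Boolean);
  // Require at least owner/repo. WHATWG URL parsing already collapses "." and
  // ".." segments, but reject any that survive (or percent-encoded segments)
  // as defense in depth against path traversal
  if (parts.length < 2) return false;
  if (parts.some((p) => p === "." || p === ".." || p.includes("%"))) return false;
  return true;
}
```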

API keys

  • Environment variable storage
  • Service role separation
  • API key authentication for microservices

Rate limiting

  • Per-user cooldown periods
  • Queue-based request throttling
  • Maximum concurrent request limits
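A per-user cooldown can be kept in memory alongside the queue. A minimal sketch, assuming a hypothetical `checkCooldown` helper and a 30-second window (the actual window is not documented here):

```typescript
const COOLDOWN_MS = 30_000;
const lastRequestAt = new Map<string, number>();

// Returns true (and records the request) when the user is outside the
// cooldown window; returns false when they must wait
function checkCooldown(userId: string, now: number = Date.now()): boolean {
  const last = lastRequestAt.get(userId);
  if (last !== undefined && now - last < COOLDOWN_MS) return false;
  lastRequestAt.set(userId, now);
  return true;
}
```

A map keyed by user ID is enough here because the queue is already in-memory and single-instance; a multi-instance deployment would need shared state such as Redis.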

Next steps

Local development

Set up GitRead for local development and testing

API reference

Detailed API documentation and integration guides

Environment variables

Complete guide to configuration options

Deployment

Deploy your own instance of GitRead
