GitRead uses advanced AI to analyze your repository and generate comprehensive, professional README files automatically.

How it works

The AI generation process follows a multi-step pipeline that extracts repository information and generates documentation:
  1. Repository ingestion - GitIngest processes your repository and extracts content, structure, and metadata
  2. Content analysis - The system analyzes code files, directory structure, and project summary
  3. AI generation - Gemini 2.5 Pro generates a professional README based on the analysis
  4. Post-processing - The generated markdown is cleaned and formatted for immediate use

AI model

GitRead uses Google Gemini 2.5 Pro Preview (google/gemini-2.5-pro-preview-03-25) via OpenRouter:
const response = await client.chat.completions.create({
  model: "google/gemini-2.5-pro-preview-03-25",
  messages: [
    { role: "system", content: "You are an expert technical writer." },
    { role: "user", content: prompt }
  ]
})

Why Gemini 2.5 Pro?

  • Large context window - Handles repositories with up to 900,000 tokens of content
  • Code understanding - Performs well on technical and code-related tasks
  • Quality output - Generates well-structured, accurate documentation
  • Speed - Most repositories finish in 10 seconds to 3 minutes, depending on size

Prompt engineering

The AI receives a carefully crafted prompt that includes:

Repository summary

A high-level overview of the project structure and purpose extracted by GitIngest.

Directory tree

The complete file structure, showing organization and key files. The summary, tree, and content are interpolated into a single prompt:
const prompt = `Make a README for the following GitHub repository...

Summary:\n${gitIngestOutput.summary}\n\n
Tree:\n${gitIngestOutput.tree}\n\n
Content:\n${gitIngestOutput.content}`

Source code content

Full content of relevant source files, allowing the AI to understand implementation details.

Custom instructions

The prompt includes specific requirements:
  • Generate authentic content without placeholders
  • Include a DeepWiki badge for AI-powered documentation
  • Focus on clarity and professionalism
  • Output raw markdown without code blocks
The prompt explicitly instructs: “Do not generate any placeholder text or placeholder images in the readme file.”
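
The pieces described above could be assembled roughly as follows. This is a minimal sketch, not GitRead's actual prompt: the `GitIngestOutput` shape, the `buildPrompt` helper, and the wording of every instruction except the quoted placeholder rule are assumptions made for illustration.

```typescript
// Hypothetical shape: only the three fields the prompt template uses.
interface GitIngestOutput {
  summary: string;
  tree: string;
  content: string;
}

// The first entry is quoted from the documentation; the other two are
// illustrative paraphrases of the requirements list, not the real wording.
const INSTRUCTIONS: string[] = [
  "Do not generate any placeholder text or placeholder images in the readme file.",
  "Include a DeepWiki badge.",
  "Output raw markdown without wrapping it in a code block.",
];

// Hypothetical helper: joins the instructions and the three repository
// sections into one prompt string.
function buildPrompt(repo: GitIngestOutput): string {
  return [
    "Make a README for the following GitHub repository.",
    ...INSTRUCTIONS,
    "",
    `Summary:\n${repo.summary}`,
    `Tree:\n${repo.tree}`,
    `Content:\n${repo.content}`,
  ].join("\n");
}
```

Keeping the instructions in an array makes each requirement easy to review or change without touching the template itself.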

Token limits and validation

GitRead enforces strict limits to ensure quality and prevent failures:

Input token limit

if (inputTokens > (gitIngestOutput.limits?.max_input_tokens || 900_000)) {
  return NextResponse.json({ 
    error: `Repository content exceeds maximum token limit of ${gitIngestOutput.limits?.max_input_tokens?.toLocaleString() || '900,000'} tokens`
  }, { status: 400 })
}
Default limit: 900,000 tokens (~675,000 words)

Token counting

The system uses word-based token estimation:
function countWords(text: string): number {
  return text.trim().split(/\s+/).filter(word => word.length > 0).length;
}
Repositories exceeding the token limit are rejected with a 400 error that states the maximum token limit.
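
The snippet above counts words, while the limit is stated in tokens, and the conversion between the two is not shown. A minimal sketch, assuming the 0.75 words-per-token ratio implied by "900,000 tokens (~675,000 words)"; `WORDS_PER_TOKEN`, `estimateTokens`, and `exceedsLimit` are hypothetical names:

```typescript
// Assumed ratio, inferred from "900,000 tokens (~675,000 words)".
const WORDS_PER_TOKEN = 0.75;
const DEFAULT_MAX_INPUT_TOKENS = 900_000;

function countWords(text: string): number {
  return text.trim().split(/\s+/).filter(word => word.length > 0).length;
}

// Hypothetical helper: convert a word count into an estimated token count.
function estimateTokens(text: string): number {
  return Math.ceil(countWords(text) / WORDS_PER_TOKEN);
}

// Hypothetical helper: true when the estimate is above the limit.
function exceedsLimit(
  text: string,
  maxInputTokens: number = DEFAULT_MAX_INPUT_TOKENS,
): boolean {
  return estimateTokens(text) > maxInputTokens;
}
```

Under this assumption, a repository of 675,000 words estimates to exactly 900,000 tokens and still passes, because the check is a strict "greater than".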

Repository processing

URL validation

Only valid GitHub URLs are accepted:
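function isValidGitHubUrl(url: string): boolean {
  let parsedUrl: URL
  try {
    parsedUrl = new URL(url)
  } catch {
    return false // new URL() throws on malformed input
  }
  // Require at least /username/repository, ignoring empty path segments
  const segments = parsedUrl.pathname.split('/').filter(Boolean)
  return (
    parsedUrl.protocol === 'https:' &&
    parsedUrl.hostname === 'github.com' &&
    segments.length >= 2 &&
    !parsedUrl.pathname.includes('..') &&
    !parsedUrl.search &&
    !parsedUrl.hash
  )
}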
Accepted format: https://github.com/username/repository
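
Once a URL passes validation, the owner and repository name are usually needed separately. The following sketch applies similar checks and extracts both parts; `parseGitHubRepo` is a hypothetical helper, not part of GitRead.

```typescript
// Hypothetical helper: validates the URL and extracts owner and repo.
function parseGitHubRepo(url: string): { owner: string; repo: string } | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a parseable URL
  }
  if (parsed.protocol !== "https:" || parsed.hostname !== "github.com") return null;
  if (parsed.search || parsed.hash) return null;
  // Drop empty segments so a trailing slash does not count as a path part.
  const [owner, repo] = parsed.pathname.split("/").filter(Boolean);
  if (!owner || !repo) return null;
  return { owner, repo };
}
```

With this helper, `https://github.com/username/repository` yields `{ owner: "username", repo: "repository" }`, while an `http:` URL, a URL with a query string, or a URL that names only a user returns `null`.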

GitIngest integration

GitRead uses a Python microservice to process repositories:
const response = await fetch("https://gitread-api.onrender.com/ingest", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": pythonApiKey,
  },
  body: JSON.stringify({ repo_url: repoUrl }),
});
The microservice returns:
  • content - Full text of repository files
  • summary - Project overview
  • tree - Directory structure
  • estimated_tokens - Token count for the content
  • warnings - Any processing issues
  • limits - Size and complexity constraints
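
The response can be given a type and checked at runtime before it is used. In this sketch the field names come from the list above, but the field types, the `IngestResponse` name, and the `isIngestResponse` guard are assumptions.

```typescript
// Field names are from the documentation; the types are assumptions.
interface IngestResponse {
  content: string;
  summary: string;
  tree: string;
  estimated_tokens: number;
  warnings: string[];
  limits?: { max_input_tokens?: number };
}

// Hypothetical runtime check to run before trusting the parsed JSON.
function isIngestResponse(value: unknown): value is IngestResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.content === "string" &&
    typeof v.summary === "string" &&
    typeof v.tree === "string" &&
    typeof v.estimated_tokens === "number" &&
    Array.isArray(v.warnings)
  );
}
```

A guard like this turns a malformed microservice response into an explicit failure instead of an `undefined` value inside the prompt.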

Response processing

After generation, the markdown content is cleaned:
let readme = response.choices[0].message.content ?? ''
readme = readme.trim() // trim first so the $ anchor can see the closing fence
readme = readme.replace(/^```markdown\n?/, '')
readme = readme.replace(/```$/, '')
readme = readme.trim()
This removes any markdown code block wrappers that the AI might add.
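
A stricter variant is sketched below as a hypothetical `stripMarkdownFence` helper. It unwraps the response only when the whole text is a single fenced block, so a README that legitimately ends with its own code block is left alone. Accepting a bare fence with no language tag is an assumption beyond what the original snippet handles.

```typescript
// Hypothetical helper: remove one outer ```markdown ... ``` wrapper, if present.
function stripMarkdownFence(raw: string): string {
  const text = raw.trim();
  // Match only when the text starts with an opening fence (optionally tagged
  // "markdown") and ends with a closing fence on its own line.
  const match = text.match(/^```(?:markdown)?[ \t]*\n([\s\S]*?)\n```$/);
  return match ? match[1].trim() : text;
}
```

Because the `$` anchor has no multiline flag, the lazy capture can only stop at the final closing fence, so code blocks inside the README survive intact.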

Rate limiting and queuing

GitRead implements a request queue to handle high traffic:
const MAX_QUEUE_SIZE = 20;
const requestQueue: (() => Promise<void>)[] = [];

Queue behavior

  • Queue position tracking - Users see their position if the queue is not empty
  • Overload protection - Returns a 429 status if the queue exceeds 20 requests
  • Sequential processing - Requests are processed one at a time
If the server is busy, you’ll receive a queue position and your request will be processed automatically.
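
The behavior described above can be sketched as a small class. This is an illustration under stated assumptions, not GitRead's implementation: the `RequestQueue` class, its `enqueue` and `drain` methods, and the choice to reject once 20 tasks are waiting are all hypothetical.

```typescript
const MAX_QUEUE_SIZE = 20;
type Task = () => Promise<void>;

class RequestQueue {
  private queue: Task[] = [];
  private running = false;

  // Returns the task's position among waiting tasks at the moment it is
  // added, or null when the queue is full (the caller would answer 429).
  enqueue(task: Task): number | null {
    if (this.queue.length >= MAX_QUEUE_SIZE) return null;
    this.queue.push(task);
    const position = this.queue.length;
    void this.drain();
    return position;
  }

  // Run tasks one at a time, in arrival order.
  private async drain(): Promise<void> {
    if (this.running) return;
    this.running = true;
    try {
      let task: Task | undefined;
      while ((task = this.queue.shift())) {
        try {
          await task();
        } catch {
          // A failed task must not block the ones queued behind it.
        }
      }
    } finally {
      this.running = false;
    }
  }
}
```

In this sketch the task being processed no longer counts toward the limit, so one request can be running while up to 20 more wait.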

Error handling

The API provides detailed error messages for common issues:
Error                | Status | Description
Invalid URL          | 400    | GitHub URL format is incorrect
Token limit exceeded | 400    | Repository is too large
Insufficient credits | 402    | User needs to purchase credits
Rate limit           | 429    | OpenRouter API limit reached
Server busy          | 429    | Queue is full, try again later
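
The table above can be captured as a typed lookup so that every error kind maps to exactly one status code. The status codes are taken from the table; the `ErrorKind` names, `ERROR_STATUS`, and `isRetryable` are hypothetical, and treating only 429 responses as retryable is an assumption based on the "try again later" wording.

```typescript
// Status codes come from the table; the kind names are hypothetical.
type ErrorKind =
  | "invalid_url"
  | "token_limit_exceeded"
  | "insufficient_credits"
  | "rate_limit"
  | "server_busy";

const ERROR_STATUS: Record<ErrorKind, number> = {
  invalid_url: 400,
  token_limit_exceeded: 400,
  insufficient_credits: 402,
  rate_limit: 429,
  server_busy: 429,
};

// Assumption: only 429 responses can succeed on a later retry
// without any change on the user's side.
function isRetryable(kind: ErrorKind): boolean {
  return ERROR_STATUS[kind] === 429;
}
```

Because two kinds share 400 and two share 429, a client needs the error message or a kind field, not just the status code, to tell them apart.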

Rate limit errors

// Match "rate limit" rather than "rate", which also occurs in words like "generate"
if (/rate limit/i.test(error.message ?? "") || error.status === 429) {
  throw new Error("API rate limit exceeded. Please try again later.")
}

Performance

Generation time varies based on repository size:
  • Small repos (< 50 files) - 10-30 seconds
  • Medium repos (50-200 files) - 30-90 seconds
  • Large repos (200+ files) - 1-3 minutes
The system shows a loading indicator with your queue position during generation.
