POST /api/v1/repos/{owner}/{repo}/rescan
Rescan Repository
curl --request POST \
  --url https://api.example.com/api/v1/repos/{owner}/{repo}/rescan
{
  "status": "<string>",
  "repo": "<string>",
  "files_indexed": 123
}

Overview

Rescans a connected repository to rebuild its Neo4j code graph. The operation fetches the latest file tree from GitHub and updates the graph database to match the current repository structure. Unlike the initial connection scan, which runs in the background, a rescan runs synchronously and returns the indexed file count in the response.

Authentication

Requires a valid JWT token in the Authorization header.
Authorization: Bearer <token>

Path Parameters

owner (string, required): Repository owner (user or organization)
repo (string, required): Repository name

Request

curl -X POST "https://api.nectr.ai/api/v1/repos/acme/my-backend/rescan" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Response

status (string): Always returns "scan_complete" on success
repo (string): Full repository name in owner/repo format
files_indexed (integer): Number of files successfully indexed in the Neo4j graph

Example Response

{
  "status": "scan_complete",
  "repo": "acme/my-backend",
  "files_indexed": 247
}
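A client may want to validate this payload before using it. The following is a minimal sketch, assuming only the three documented fields; the `RescanResult` class and `parse_rescan_response` function are hypothetical names introduced for illustration and are not part of any official client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RescanResult:
    """Typed view of the documented rescan response (hypothetical helper)."""
    status: str
    repo: str
    files_indexed: int


def parse_rescan_response(payload: dict) -> RescanResult:
    """Read the three documented fields and reject an unexpected status."""
    result = RescanResult(
        status=str(payload["status"]),
        repo=str(payload["repo"]),
        files_indexed=int(payload["files_indexed"]),
    )
    # The docs state that status is always "scan_complete" on success.
    if result.status != "scan_complete":
        raise ValueError(f"unexpected status: {result.status!r}")
    return result


example = parse_rescan_response(
    {"status": "scan_complete", "repo": "acme/my-backend", "files_indexed": 247}
)
print(example.repo, example.files_indexed)
```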

Graph Rebuild Process

1. Fetch Repository Tree

Retrieves the complete file tree from GitHub using the recursive Git tree API:
GET https://api.github.com/repos/{owner}/{repo}/git/trees/{default_branch}?recursive=1
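For readers who want to reproduce this step, here is a rough sketch of building the URL above and pulling file entries out of the response. It assumes the standard GitHub tree response shape (a `tree` list whose entries carry `path`, `type`, and, for files, `size`); the function names `tree_url` and `file_entries` are hypothetical, and the service's actual implementation may differ.

```python
def tree_url(owner: str, repo: str, branch: str) -> str:
    """Build the recursive Git tree URL shown above."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/git/trees/{branch}?recursive=1"
    )


def file_entries(tree_response: dict) -> list[dict]:
    """Keep only blobs (files); entries of type 'tree' are directories."""
    return [
        {"path": entry["path"], "size": entry.get("size", 0)}
        for entry in tree_response.get("tree", [])
        if entry.get("type") == "blob"
    ]


# Hand-written sample in the shape of a GitHub tree response.
sample = {
    "tree": [
        {"path": "src", "type": "tree"},
        {"path": "src/app.py", "type": "blob", "size": 1204},
        {"path": "README.md", "type": "blob", "size": 310},
    ],
    "truncated": False,
}
print(tree_url("acme", "my-backend", "main"))
print(file_entries(sample))
```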

2. Filter Files

Excludes files located under dependency, build-output, cache, and version-control directories:
  • node_modules
  • .git
  • dist
  • build
  • __pycache__
  • .next
  • vendor
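The filter above could be sketched as follows. This assumes a file is skipped when any directory component of its path matches an excluded name; the service may instead match only top-level directories or use a different rule, so treat `is_indexable` and `EXCLUDED_DIRS` as hypothetical names for illustration.

```python
from pathlib import PurePosixPath

# Directory names taken from the list above.
EXCLUDED_DIRS = frozenset(
    {"node_modules", ".git", "dist", "build", "__pycache__", ".next", "vendor"}
)


def is_indexable(path: str) -> bool:
    """Return True when no directory component of the path is excluded."""
    # parts[:-1] drops the file name so a file called "build" is still kept.
    directories = PurePosixPath(path).parts[:-1]
    return EXCLUDED_DIRS.isdisjoint(directories)


paths = [
    "src/app.py",
    "node_modules/react/index.js",
    "web/.next/cache/data.json",
    "src/build.py",
]
kept = [p for p in paths if is_indexable(p)]
print(kept)
```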

3. Update Neo4j Graph

Upsert Repository Node:
MERGE (r:Repository {full_name: "owner/repo"})
SET r.scanned_at = "2026-03-10T14:32:10Z"
Upsert File Nodes (batched in chunks of 200):
UNWIND $files AS f
MERGE (file:File {repo: $repo, path: f.path})
SET file.language = f.language, file.size = f.size
WITH file
MERGE (r:Repository {full_name: $repo})
MERGE (r)-[:CONTAINS]->(file)
Remove Stale Files:
MATCH (r:Repository {full_name: $repo})-[:CONTAINS]->(f:File)
WHERE NOT f.path IN $current_paths
DETACH DELETE f
Files deleted from the repository since the last scan are removed from the graph.
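The batching and stale-file logic around these queries can be sketched in plain Python. This is only an illustration under stated assumptions: `batches` and `stale_paths` are hypothetical helpers, each yielded batch would be passed as the `$files` parameter, and the set difference mirrors what the `WHERE NOT f.path IN $current_paths` clause selects.

```python
from typing import Iterator

BATCH_SIZE = 200  # chunk size stated above


def batches(files: list[dict], size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive chunks of at most `size` file records."""
    for start in range(0, len(files), size):
        yield files[start:start + size]


def stale_paths(graph_paths: set[str], current_paths: set[str]) -> set[str]:
    """Paths present in the graph but absent from the latest tree."""
    return graph_paths - current_paths


files = [{"path": f"src/file_{i}.py"} for i in range(450)]
batch_sizes = [len(batch) for batch in batches(files)]
print(batch_sizes)

removed = stale_paths({"a.py", "old.py"}, {"a.py", "new.py"})
print(removed)
```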

4. Language Detection

File language is inferred from extension:
Extension       Language
.py             Python
.js, .jsx       JavaScript
.ts, .tsx       TypeScript
.java           Java
.go             Go
.rb             Ruby
.rs             Rust
.cpp, .c        C/C++
.cs             C#
.php            PHP
.swift          Swift
.kt             Kotlin
.md             Markdown
.yaml, .yml     YAML
.json           JSON
.sql            SQL
.tf             Terraform
(other)         Other
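The extension mapping can be expressed as a simple lookup. The sketch below assumes matching is on the final extension only and is case-insensitive; the case handling is an assumption not confirmed by the table, and `detect_language` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Mapping copied from the table above.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".js": "JavaScript", ".jsx": "JavaScript",
    ".ts": "TypeScript", ".tsx": "TypeScript",
    ".java": "Java",
    ".go": "Go",
    ".rb": "Ruby",
    ".rs": "Rust",
    ".cpp": "C/C++", ".c": "C/C++",
    ".cs": "C#",
    ".php": "PHP",
    ".swift": "Swift",
    ".kt": "Kotlin",
    ".md": "Markdown",
    ".yaml": "YAML", ".yml": "YAML",
    ".json": "JSON",
    ".sql": "SQL",
    ".tf": "Terraform",
}


def detect_language(path: str) -> str:
    """Infer a language label from the file extension, else 'Other'."""
    suffix = PurePosixPath(path).suffix.lower()  # lowercasing is an assumption
    return EXTENSION_LANGUAGES.get(suffix, "Other")


print(detect_language("src/components/App.tsx"))
print(detect_language("Makefile"))
```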

Error Responses

401 Unauthorized
JWT token is invalid, expired, or missing
{
  "detail": "Unauthorized"
}
401 Session Expired
The stored GitHub OAuth token cannot be decrypted (for example, because the server's SECRET_KEY changed)
{
  "detail": "Session expired — please log out and sign in again."
}
404 Not Found
Repository is not connected
{
  "detail": "Repo not connected"
}
500 Internal Server Error
Rescan operation failed
{
  "detail": "GitHub API error fetching file tree: {error}"
}
Or:
{
  "detail": "Neo4j write failed: {error}"
}
Or:
{
  "detail": "GitHub returned 0 files. Check that your OAuth token has 'repo' scope and that the repository is not empty."
}
503 Service Unavailable
Neo4j is not configured on the server
{
  "detail": "Neo4j is not configured on this server"
}
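A client can branch on the status code and the `detail` string to decide what to do next. The following is a sketch only: the returned action labels and the `classify_error` name are hypothetical, and matching on message prefixes assumes the `detail` texts stay as documented above.

```python
def classify_error(status_code: int, detail: str) -> str:
    """Map a documented error response to a suggested client action."""
    if status_code == 401:
        # Both 401 cases share a status code; only the detail text differs.
        if detail.startswith("Session expired"):
            return "reauthenticate_github"
        return "refresh_jwt"
    if status_code == 404:
        return "connect_repo"
    if status_code == 503:
        return "neo4j_unconfigured"
    if status_code == 500:
        if detail.startswith("GitHub returned 0 files"):
            return "check_oauth_scope"
        return "retry_later"
    return "unknown"


print(classify_error(404, "Repo not connected"))
print(classify_error(500, "Neo4j write failed: connection reset"))
```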

When to Rescan

Consider rescanning when:
  • Major refactoring changed file structure
  • Large number of files added/renamed/deleted
  • PR reviews seem to lack context on new files
  • Initial scan was truncated (large monorepo)
  • Repository was migrated or restructured

Performance Considerations

Large Repositories

For repositories larger than 500 MB or containing more than 100k files:
  • GitHub may truncate the tree API response (logged as warning)
  • Scan continues with partial file set
  • Consider using a sparse checkout or monorepo tool
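GitHub's tree response carries a `truncated` boolean that signals a partial listing. A sketch of checking it, assuming the service relies on that field (the source only says truncation is logged as a warning); `scan_coverage` is a hypothetical helper name.

```python
def scan_coverage(tree_response: dict) -> tuple[int, bool]:
    """Return (file count, truncated flag) for a GitHub tree response."""
    entries = tree_response.get("tree", [])
    file_count = sum(1 for entry in entries if entry.get("type") == "blob")
    truncated = bool(tree_response.get("truncated", False))
    return file_count, truncated


# Hand-written sample standing in for a truncated response.
partial = {
    "tree": [
        {"path": "src", "type": "tree"},
        {"path": "src/a.py", "type": "blob"},
        {"path": "src/b.py", "type": "blob"},
    ],
    "truncated": True,
}
count, truncated = scan_coverage(partial)
if truncated:
    print(f"warning: tree truncated, indexing {count} files from a partial set")
```

When the flag is set, `files_indexed` in the rescan response reflects only the partial set, so a lower-than-expected count is a hint that the tree was cut off.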

Request Timeout

Rescan is synchronous and may take 10-60 seconds for large repos. Ensure your HTTP client has an appropriate timeout:
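# Python example
import httpx

headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
response = httpx.post(
    "https://api.nectr.ai/api/v1/repos/acme/monorepo/rescan",
    headers=headers,
    timeout=120.0,  # 2 minutes; httpx defaults to 5 seconds
)
response.raise_for_status()  # raise on 4xx/5xx responses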

Graph Consistency

Preserved Data

Rescan does not affect:
  • PullRequest nodes
  • Developer nodes
  • Issue nodes
  • Historical TOUCHES, AUTHORED_BY, CLOSES edges

Updated Data

  • File nodes (added/updated/removed)
  • Repository.scanned_at timestamp
  • CONTAINS edges
