## Overview

Nectr integrates deeply with GitHub to provide automated PR reviews. The integration consists of three main components:

- **GitHub OAuth** - Secure authentication flow for user login
- **REST API Client** - Fetches PR diffs and files, and posts review comments
- **Webhook Manager** - Installs per-repo webhooks to receive PR events in real time
## Authentication

### GitHub OAuth Flow

Users authenticate via GitHub OAuth to grant Nectr access to their repositories.

1. **User initiates login** - User clicks "Login with GitHub" on the frontend
2. **OAuth redirect** - User is redirected to the GitHub authorization page with the configured scopes
3. **Callback handling** - GitHub redirects back to `/auth/github/callback` with an authorization code
4. **Token exchange** - Backend exchanges the code for an access token and stores it encrypted with Fernet (AES-128-CBC with HMAC-SHA256)
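The encrypt/decrypt step can be sketched with the `cryptography` library's Fernet recipe. This is a minimal sketch, not the actual helpers in `app/auth/token_encryption.py`; in particular, the `FERNET_KEY` environment variable and the helper names are illustrative assumptions:

```python
import os

from cryptography.fernet import Fernet

# FERNET_KEY is a hypothetical env var holding a urlsafe-base64 32-byte key;
# falling back to a fresh key here keeps the sketch runnable
_fernet = Fernet(os.environ.get("FERNET_KEY") or Fernet.generate_key())

def encrypt_token(token: str) -> str:
    """Encrypt a GitHub access token before storing it in the database."""
    return _fernet.encrypt(token.encode()).decode()

def decrypt_token(ciphertext: str) -> str:
    """Decrypt a stored GitHub access token for use in API calls."""
    return _fernet.decrypt(ciphertext.encode()).decode()
```

In production the key must be stable across restarts, otherwise previously stored tokens become undecryptable.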
### Token Management

The GitHub client supports two authentication modes:

**User OAuth Token (preferred)**

```python
# Used during webhook processing - no separate PAT needed
headers = {
    "Authorization": f"Bearer {user_oauth_token}",
    "Accept": "application/vnd.github.v3+json",
}
```

**Personal Access Token (fallback)**

```python
# Fallback when OAuth token unavailable
token = get_github_token()  # Tries gh CLI, then GITHUB_PAT env var
```
### Required Environment Variables

```bash
# GitHub OAuth credentials
GITHUB_CLIENT_ID=Ov23li...
GITHUB_CLIENT_SECRET=1a2b3c4d5e...

# Personal Access Token (optional fallback)
GITHUB_PAT=ghp_...

# Global webhook secret (optional; per-repo secrets take precedence)
GITHUB_WEBHOOK_SECRET=your-webhook-secret
```

`GITHUB_PAT` is optional in production when OAuth tokens are used. It is primarily a fallback, or used for development with the gh CLI.
## GitHub REST API Client

The `GithubClient` class (`app/integrations/github/client.py:38`) provides async methods for all GitHub operations.
### Fetching PR Data

The client exposes methods for PR details, the PR diff, changed files, and file content. Fetching PR details looks like this:

```python
async def get_pull_request(self, owner: str, repo: str, pr_number: int) -> dict:
    """Fetch PR metadata including title, description, author, and merge status."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}"
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.get(url, headers=self.headers)
        response.raise_for_status()
        return response.json()
```
### Posting Reviews

The client can post a top-level PR comment, a full PR review (summary plus inline comments), or a single inline review comment. Posting a top-level comment:

```python
async def post_pr_comment(
    self,
    owner: str,
    repo: str,
    pr_number: int,
    comment: str,
    token: str | None = None,
) -> dict:
    """Post a top-level comment on the PR (issue comment thread)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{pr_number}/comments"
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            url,
            headers=self._get_headers(token),
            json={"body": comment},
        )
        response.raise_for_status()
        return response.json()
```
### Repository Queries

Helper methods cover issues, pull requests, languages, and contributors. Fetching issues (GitHub's issues endpoint also returns PRs, so they are filtered out):

```python
async def get_repo_issues(
    self,
    owner: str,
    repo: str,
    state: str = "all",
    per_page: int = 50,
) -> list[dict]:
    """Get repository issues (excludes PRs)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.get(
            url,
            headers=self.headers,
            params={
                "state": state,
                "per_page": per_page,
                "page": 1,
                "sort": "updated",
                "direction": "desc",
            },
        )
        response.raise_for_status()
        return [item for item in response.json() if "pull_request" not in item]
```
### PR State Caching

The client implements an LRU + TTL cache for PR state checks:

```python
# Cache configuration
PR_STATUS_CACHE_TTL = 60   # seconds
PR_STATUS_CACHE_MAX = 500  # max entries

async def get_pr_state(self, owner: str, repo: str, pr_number: int) -> str:
    """Fetch current PR state with a bounded LRU + TTL cache.

    Returns: "open", "closed", or "merged"
    TTL: 60s for open PRs, 300s for closed/merged
    """
    cache_key = f"{owner}/{repo}#{pr_number}"
    cached = self._pr_status_cache.get(cache_key)
    if cached and cached[1] > time.monotonic():
        self._pr_status_cache.move_to_end(cache_key)
        return cached[0]

    pr = await self.get_pull_request(owner, repo, pr_number)
    status = "merged" if pr.get("merged") else pr.get("state", "open")
    ttl = PR_STATUS_CACHE_TTL
    if status in ("merged", "closed"):
        ttl = 300  # Longer cache for terminal states
    self._pr_status_cache[cache_key] = (status, time.monotonic() + ttl)
    # ... eviction logic ...
```
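The elided eviction step can be sketched as follows, assuming `_pr_status_cache` is a `collections.OrderedDict` (the `move_to_end` call above suggests this, but the exact eviction logic here is an assumption):

```python
from collections import OrderedDict

PR_STATUS_CACHE_MAX = 500  # max entries

def evict_oldest(cache: OrderedDict) -> None:
    """Drop least-recently-used entries once the cache exceeds its bound."""
    while len(cache) > PR_STATUS_CACHE_MAX:
        cache.popitem(last=False)  # front of the OrderedDict = least recently used
```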
## Webhook Management

The webhook manager (`app/integrations/github/webhook_manager.py:10`) handles the per-repo webhook lifecycle.
### Installing Webhooks

```python
async def install_webhook(
    owner: str,
    repo: str,
    access_token: str,
    backend_url: str = "http://localhost:8000",
) -> tuple[int, str]:
    """Install a GitHub webhook on the given repo.

    Returns:
        (webhook_id, webhook_secret) - Store these in the Installation table
    """
    webhook_secret = secrets.token_hex(32)
    payload_url = f"{backend_url.rstrip('/')}/api/v1/webhooks/github"
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"https://api.github.com/repos/{owner}/{repo}/hooks",
            headers={
                "Authorization": f"Bearer {access_token}",
                "Accept": "application/vnd.github.v3+json",
            },
            json={
                "name": "web",
                "active": True,
                "events": ["pull_request", "issues"],  # PR opened/updated, issues linked
                "config": {
                    "url": payload_url,
                    "content_type": "json",
                    "secret": webhook_secret,
                    "insecure_ssl": "0",
                },
            },
        )
        resp.raise_for_status()
        data = resp.json()
    webhook_id = data["id"]
    logger.info(f"Installed webhook {webhook_id} on {owner}/{repo}")
    return webhook_id, webhook_secret
```
### Uninstalling Webhooks

```python
async def uninstall_webhook(
    owner: str,
    repo: str,
    webhook_id: int,
    access_token: str,
) -> None:
    """Delete a GitHub webhook from the given repo."""
    async with httpx.AsyncClient() as client:
        resp = await client.delete(
            f"https://api.github.com/repos/{owner}/{repo}/hooks/{webhook_id}",
            headers={
                "Authorization": f"Bearer {access_token}",
                "Accept": "application/vnd.github.v3+json",
            },
        )
    if resp.status_code == 404:
        logger.warning(f"Webhook {webhook_id} not found on {owner}/{repo} - already deleted?")
        return
    resp.raise_for_status()
    logger.info(f"Uninstalled webhook {webhook_id} from {owner}/{repo}")
```
## Webhook Receiver

The webhook endpoint (`/api/v1/webhooks/github`) receives events from GitHub and triggers PR reviews.

### Signature Verification

```python
def verify_github_signature(payload_body: bytes, signature: str, secret: str) -> bool:
    """Verify webhook authenticity using HMAC-SHA256.

    GitHub signs every webhook with the per-repo secret.
    Signature format: "sha256=<hex_digest>"
    """
    if not secret:
        return True  # Skip verification if no secret configured
    expected = "sha256=" + hmac.new(
        secret.encode(),
        payload_body,
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(expected, signature)
```
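When exercising the endpoint by hand, a valid `X-Hub-Signature-256` value can be computed the same way GitHub computes it. A self-contained sketch (the helper name is illustrative):

```python
import hashlib
import hmac

def sign_payload(secret: str, payload_body: bytes) -> str:
    """Compute the X-Hub-Signature-256 value GitHub would send for this body."""
    digest = hmac.new(secret.encode(), payload_body, hashlib.sha256).hexdigest()
    return "sha256=" + digest

# Example: signing a payload for a manual test request
sig = sign_payload("test-secret", b'{"action": "opened"}')
assert sig.startswith("sha256=")
```

Note that signing uses the exact request body bytes; re-serializing the JSON before signing can silently change whitespace and break verification.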
### Event Processing Flow

1. **Webhook received** - GitHub POSTs to `/api/v1/webhooks/github` with the event payload
2. **Signature verification** - Verify the `X-Hub-Signature-256` header using the per-repo webhook secret
3. **Event deduplication** - Check whether an identical event was processed within the last hour (prevents duplicate reviews)
4. **Event persisted** - Create an Event row with `status="pending"`
5. **Return 200 immediately** - GitHub enforces a 10-second delivery timeout, while an AI review takes 30-60 seconds
6. **Background processing** - `process_pr_in_background()` runs asynchronously:
   - Fetch PR data via the GitHub API
   - Pull MCP context (Linear issues, Sentry errors)
   - Build Neo4j + Mem0 context
   - Run the AI review
   - Post the review comment back to GitHub
   - Index the PR in the knowledge graph
   - Extract and store memories
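The "return 200 immediately, process later" shape can be sketched with plain asyncio. This is a simplified illustration of the control flow only; the real endpoint uses the app's own framework and task machinery, and the function names here are stand-ins:

```python
import asyncio

async def process_pr_in_background(payload: dict) -> None:
    """Stand-in for the 30-60s review pipeline (fetch, review, post, index)."""
    await asyncio.sleep(0)

async def handle_webhook(payload: dict) -> dict:
    """Acknowledge fast; schedule the slow review work as a background task."""
    asyncio.create_task(process_pr_in_background(payload))
    return {"status": "accepted"}  # returned well within GitHub's 10s timeout
```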
### Handling User OAuth Tokens

```python
# Look up the user who connected this repo and use their OAuth token
repo_full_name = payload.get("repository", {}).get("full_name", "")
github_token: str | None = None
if repo_full_name:
    inst_result = await db.execute(
        select(Installation, User)
        .join(User, Installation.user_id == User.id)
        .where(
            Installation.repo_full_name == repo_full_name,
            Installation.is_active == True,
        )
    )
    row = inst_result.first()
    if row:
        _, user = row
        github_token = decrypt_token(user.github_access_token)
        logger.info(f"Using OAuth token for user @{user.github_username}")

# Pass the token to the review service - no separate PAT needed
review_result = await pr_review_service.process_pr_review(
    payload, event, db, github_token=github_token
)
```

With this approach, Nectr posts reviews as the user who connected the repo rather than as a bot account: the review appears to come from your GitHub account.
## Usage Example

Here's how the GitHub integration is used in the PR review flow:

```python
from app.integrations.github.client import github_client

# 1. Fetch PR data when a webhook arrives
owner, repo = "nectr-ai", "nectr"
pr_number = 42

pr_data = await github_client.get_pull_request(owner, repo, pr_number)
diff = await github_client.get_pr_diff(owner, repo, pr_number, token=user_oauth_token)
files = await github_client.get_pr_files(owner, repo, pr_number, token=user_oauth_token)

# 2. Get full content for critical files (not just the diff)
for file in files[:10]:  # Limit to 10 files to avoid API rate limits
    if file["filename"].endswith((".py", ".ts", ".tsx")):
        content = await github_client.get_file_content(
            owner, repo,
            path=file["filename"],
            ref=pr_data["head"]["sha"],
        )

# 3. Run the AI review (using context from Mem0, Neo4j, MCP integrations)
review_body = await ai_service.review_pr(
    diff=diff,
    files=files,
    context=review_context,
)

# 4. Post the review back to GitHub
await github_client.post_pr_comment(
    owner, repo, pr_number,
    comment=review_body,
    token=user_oauth_token,
)

# 5. Index the PR in the Neo4j knowledge graph
await graph_builder.index_pull_request(
    repo_full_name=f"{owner}/{repo}",
    pr_number=pr_number,
    pr_data=pr_data,
    files=[f["filename"] for f in files],
)
```
## API Rate Limits

GitHub's API has rate limits:

- **Authenticated requests**: 5,000/hour
- **Unauthenticated requests**: 60/hour

The client uses authenticated requests (via OAuth token or PAT) to get the higher limit.

GitHub includes rate-limit info in every response:

```
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4999
X-RateLimit-Reset: 1372700873  # Unix timestamp
```

The client currently does not implement automatic rate-limit handling. Consider adding:

```python
if int(response.headers.get("X-RateLimit-Remaining", 100)) < 10:
    logger.warning("GitHub API rate limit nearly exhausted")
    # Implement backoff or queue strategy
```
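A fuller version could wait until the reset timestamp before retrying. A sketch (the header names are GitHub's documented ones; the threshold and helper name are illustrative):

```python
import asyncio
import time

async def wait_for_rate_limit(headers: dict[str, str]) -> None:
    """Sleep until the rate-limit window resets when quota is nearly exhausted."""
    remaining = int(headers.get("X-RateLimit-Remaining", "100"))
    if remaining >= 10:
        return  # plenty of quota left; proceed immediately
    reset_at = int(headers.get("X-RateLimit-Reset", "0"))  # Unix timestamp
    delay = max(0, reset_at - int(time.time())) + 1  # +1s buffer past the reset
    await asyncio.sleep(delay)
```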
## Troubleshooting

### Webhook Not Receiving Events

1. **Check the webhook status in the GitHub repo settings**
   - Go to Settings → Webhooks
   - Click on your webhook
   - Check "Recent Deliveries" for failed attempts

2. **Verify the webhook secret matches**

   ```sql
   SELECT webhook_secret FROM installations WHERE repo_full_name = 'owner/repo';
   ```

3. **Test the webhook endpoint manually**

   ```bash
   curl -X POST https://your-app.railway.app/api/v1/webhooks/github \
     -H "Content-Type: application/json" \
     -H "X-Hub-Signature-256: sha256=..." \
     -H "X-GitHub-Event: pull_request" \
     -d @webhook-payload.json
   ```
### Reviews Not Posting

1. **Check OAuth token validity**

   ```python
   # Token might be expired or revoked
   try:
       test_resp = await github_client.get_pull_request(owner, repo, 1)
   except httpx.HTTPStatusError as e:
       if e.response.status_code == 401:
           logger.error("GitHub token invalid or expired")
   ```

2. **Verify repo permissions**
   - The OAuth token needs the `repo` scope for private repos
   - The PAT needs the `repo` scope

3. **Check the API response for errors**

   ```python
   response = await github_client.post_pr_comment(...)
   # GitHub returns 422 for validation errors (e.g., PR already closed)
   ```
## Source Files

- `app/integrations/github/client.py:38` - `GithubClient` implementation
- `app/integrations/github/webhook_manager.py:10` - Webhook install/uninstall
- `app/api/v1/webhooks.py:41` - Webhook receiver endpoint
- `app/auth/router.py` - OAuth flow
- `app/auth/token_encryption.py` - Token encryption utilities