Overview
Chat with Code enables natural language conversations about GitHub repositories. It uses LlamaIndex to index code files, Nebius AI for embeddings and generation, and provides an interactive Streamlit interface for querying codebases.
Key Features
GitHub Integration : Direct repository loading via GitHub API
Code-Aware Indexing : Indexes Python, JavaScript, TypeScript, Jupyter notebooks, and Markdown
AI-Powered Q&A : Uses Nebius DeepSeek-V3 for code understanding
Streaming Responses : Real-time answer generation
Export Functionality : Download conversation history
Branch Support : Load specific branches from repositories
Architecture
from llama_index.core import Settings, VectorStoreIndex, PromptTemplate
from llama_index.embeddings.nebius import NebiusEmbedding
from llama_index.llms.nebius import NebiusLLM
from llama_index.readers.github import GithubRepositoryReader, GithubClient
Code Indexing Pipeline
Implementation
GitHub URL Parsing
import re
def parse_github_url(url):
    """Parse a GitHub repository URL to extract owner, repo, and branch."""
    pattern = r"https?://github\.com/([^/]+)/([^/]+)(?:/tree/([^/]+))?"
    match = re.match(pattern, url)
    if not match:
        raise ValueError("Invalid GitHub repository URL")
    owner, repo, branch = match.groups()
    return owner, repo, branch if branch else "main"

# Example usage
owner, repo, branch = parse_github_url(
    "https://github.com/openai/gpt-4/tree/develop"
)
# Returns: ('openai', 'gpt-4', 'develop')
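When the URL has no `/tree/<branch>` segment, the optional third capture group is `None` and the function falls back to `main`. A self-contained check of that behavior, using the same regex as above:

```python
import re

def parse_github_url(url):
    """Parse a GitHub repository URL into (owner, repo, branch)."""
    pattern = r"https?://github\.com/([^/]+)/([^/]+)(?:/tree/([^/]+))?"
    match = re.match(pattern, url)
    if not match:
        raise ValueError("Invalid GitHub repository URL")
    owner, repo, branch = match.groups()
    return owner, repo, branch if branch else "main"

# No /tree/<branch> segment: branch defaults to "main"
print(parse_github_url("https://github.com/openai/gpt-4"))
# → ('openai', 'gpt-4', 'main')
```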
GitHub Repository Loading
from llama_index.readers.github import GithubRepositoryReader, GithubClient
import os
@st.cache_resource
def load_github_data(github_token, owner, repo, branch="main"):
    """Load repository data with file filtering."""
    # Initialize GitHub client
    github_client = GithubClient(github_token)
    # Create repository reader
    loader = GithubRepositoryReader(
        github_client,
        owner=owner,
        repo=repo,
        filter_file_extensions=(
            [".py", ".ipynb", ".js", ".ts", ".md"],  # File types to include
            GithubRepositoryReader.FilterType.INCLUDE,
        ),
        verbose=False,
        concurrent_requests=5,  # Parallel API requests
    )
    # Load data from the specified branch
    documents = loader.load_data(branch=branch)
    return documents
File Filtering :
.py - Python source code
.ipynb - Jupyter notebooks
.js, .ts - JavaScript/TypeScript
.md - Documentation
Concurrent requests speed up large repository loading.
Nebius Model Configuration
from llama_index.core import Settings
from llama_index.embeddings.nebius import NebiusEmbedding
from llama_index.llms.nebius import NebiusLLM
import os
# Initialize Nebius LLM (DeepSeek-V3 for code understanding)
llm = NebiusLLM(
    model="deepseek-ai/DeepSeek-V3",
    api_key=os.getenv("NEBIUS_API_KEY"),
)

# Initialize Nebius embeddings
embed_model = NebiusEmbedding(
    model_name="BAAI/bge-en-icl",
    api_key=os.getenv("NEBIUS_API_KEY"),
)

# Configure LlamaIndex global settings
Settings.llm = llm
Settings.embed_model = embed_model
Why DeepSeek-V3?
Exceptional code understanding capabilities
Strong performance on technical queries
Support for multiple programming languages
Efficient token usage
RAG Query Engine
from llama_index.core import VectorStoreIndex, PromptTemplate
def run_rag_completion(query_text: str, docs) -> str:
    """Run RAG completion for code queries."""
    # Configure models
    llm = NebiusLLM(
        model="deepseek-ai/DeepSeek-V3",
        api_key=os.getenv("NEBIUS_API_KEY"),
    )
    embed_model = NebiusEmbedding(
        model_name="BAAI/bge-en-icl",
        api_key=os.getenv("NEBIUS_API_KEY"),
    )
    Settings.llm = llm
    Settings.embed_model = embed_model

    # Create vector index from code documents
    index = VectorStoreIndex.from_documents(docs)

    # Create query engine with streaming
    query_engine = index.as_query_engine(
        similarity_top_k=5,  # Retrieve top 5 code chunks
        streaming=True,
    )

    # Custom prompt for code queries
    qa_prompt_tmpl = PromptTemplate(
        "Context information from the codebase is below.\n"
        "---------------------\n"
        "{context_str}\n"
        "---------------------\n"
        "Given the code context, please answer the query.\n"
        "If the answer requires code examples, include them.\n"
        "Query: {query_str}\n"
        "Answer: "
    )

    # Update query engine with custom prompt
    query_engine.update_prompts(
        {"response_synthesizer:text_qa_template": qa_prompt_tmpl}
    )

    # Execute query
    response = query_engine.query(query_text)
    return str(response)
Vector Search for Code
The RAG process for code:
Code Chunking : Files are split into semantic chunks (functions, classes)
Embedding : Each chunk is embedded using Nebius embeddings
Index Creation : Chunks stored in vector index
Query Embedding : User question is embedded
Similarity Search : Most relevant code chunks retrieved (top-k=5)
Context Augmentation : Retrieved code added to prompt
Generation : DeepSeek-V3 generates answer with code context
# Automatic in LlamaIndex:
# 1. Documents are chunked intelligently
# 2. Each chunk is embedded
index = VectorStoreIndex.from_documents(docs)

# 3. Query is embedded and searched
query_engine = index.as_query_engine(similarity_top_k=5)

# 4. Top 5 most relevant code chunks are retrieved
# 5. LLM generates answer with code context
response = query_engine.query("How does the authentication work?")
Streamlit Application
import streamlit as st
import os

st.set_page_config(page_title="Chat with Code", layout="wide")

# Header
col1, col2, col3, col4 = st.columns([3, 1, 1, 1])
with col1:
    st.title("🤖 Chat with Code")
with col3:
    st.link_button("⭐ Star Repo", "https://github.com/Arindam200/nebius-cookbook")
with col4:
    if st.button("🗑️ Clear Chat"):
        st.session_state.messages = []
        st.rerun()

st.caption("Powered by Nebius AI (DeepSeek-V3) and LlamaIndex")
# Sidebar: Repository loading
with st.sidebar:
    st.subheader("GitHub Repository URL")
    repo_url = st.text_input("", placeholder="https://github.com/owner/repo")
    if st.button("Load Repository"):
        if repo_url:
            try:
                github_token = os.getenv("GITHUB_TOKEN")
                nebius_api_key = os.getenv("NEBIUS_API_KEY")
                if not github_token or not nebius_api_key:
                    st.error("Missing API keys")
                    st.stop()
                # Parse URL
                owner, repo, branch = parse_github_url(repo_url)
                with st.spinner("Loading repository..."):
                    # Load with caching
                    st.session_state.docs = load_github_data(
                        github_token, owner, repo, branch
                    )
                st.success(f"✓ Loaded {len(st.session_state.docs)} files from {owner}/{repo}")
            except Exception as e:
                st.error(f"Error: {str(e)}")
# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []
if "docs" not in st.session_state:
    st.session_state.docs = None

# Display chat messages
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Chat input
if prompt := st.chat_input("Ask about the repository..."):
    if not st.session_state.docs:
        st.error("Please load a repository first")
        st.stop()

    # Add user message
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Generate response
    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            try:
                response = run_rag_completion(prompt, st.session_state.docs)
                st.markdown(response)
                # Add to history
                st.session_state.messages.append({
                    "role": "assistant",
                    "content": response,
                })
                # Download button
                st.download_button(
                    label="Download message",
                    type="secondary",
                    data=response,
                    file_name="chatbot_response.md",
                    mime="text/plain",
                    icon=":material/download:",
                )
            except Exception as e:
                st.error(f"Error: {str(e)}")
Example Queries
Code Understanding
# Architecture questions
"What is the overall architecture of this project?"
# Implementation details
"How does the authentication system work?"
# Function explanations
"Explain what the process_data() function does."
# Dependencies
"What external libraries does this project use?"
Code Search
# Find specific implementations
"Where is the database connection logic?"
# Locate features
"Show me the API endpoint for user registration."
# Error handling
"How are errors handled in this codebase?"
Code Analysis
# Best practices
"Are there any potential security issues in the authentication code?"
# Improvements
"How can the caching mechanism be improved?"
# Patterns
"What design patterns are used in this project?"
Installation
git clone https://github.com/Arindam200/awesome-ai-apps.git
cd awesome-ai-apps/rag_apps/chat_with_code
uv sync
Environment Setup
Create a .env file:
GITHUB_TOKEN=your_github_personal_access_token
NEBIUS_API_KEY=your_nebius_api_key
Getting GitHub Token
Go to GitHub Settings → Developer settings → Personal access tokens
Generate new token (classic)
Select scopes: repo (for private repos) or public_repo (for public only)
Copy token to .env file
Running the Application
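Assuming the Streamlit entry point is named main.py (the script name is an assumption; adjust it to the actual file in the project), the app can be launched through uv so it runs inside the synced environment:

```shell
# Launch the Streamlit app (entry-point name is an assumption)
uv run streamlit run main.py
```

The app then opens in the browser, typically at http://localhost:8501.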
Use Cases
Code Onboarding Help new developers understand unfamiliar codebases
Documentation Generate documentation from code automatically
Code Review Review code for patterns, issues, and improvements
Technical Interviews Discuss code implementations in interviews
Best Practices
Public Repositories
Start with public repositories to test functionality
Specific Questions
Ask specific questions about functions, classes, or features
Context-Aware Queries
Reference specific files or modules in your questions
Verify Code Examples
Always verify AI-generated code suggestions against the actual repository
Caching
@st.cache_resource
def load_github_data(github_token, owner, repo, branch="main"):
    """Cache loaded repository data."""
    # This function's result is cached; subsequent loads
    # with the same parameters reuse the cached value.
    pass
Streamlit’s @st.cache_resource decorator caches repository data, avoiding repeated GitHub API calls and improving response time.
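Conceptually, @st.cache_resource acts as a keyed memoizer: the first call with a given argument tuple does the expensive work, and later calls with the same arguments return the stored object. A minimal stdlib sketch of the same idea (not Streamlit's actual implementation), using a call counter to make the caching visible:

```python
from functools import lru_cache

CALLS = {"count": 0}

@lru_cache(maxsize=None)
def load_github_data(github_token, owner, repo, branch="main"):
    """Stand-in for the cached loader: counts how often real work happens."""
    CALLS["count"] += 1  # only incremented on a cache miss
    return f"docs for {owner}/{repo}@{branch}"

load_github_data("tok", "openai", "gpt-4", "main")
load_github_data("tok", "openai", "gpt-4", "main")  # served from cache
print(CALLS["count"])  # → 1 (the expensive load ran only once)
```

Unlike `lru_cache`, Streamlit's cache is shared across user sessions and reruns, which is exactly what avoids repeated GitHub API calls here.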
Concurrent Requests
loader = GithubRepositoryReader(
    github_client,
    owner=owner,
    repo=repo,
    concurrent_requests=5,  # Load 5 files in parallel
)
Configuration Options
Repository Loading
filter_file_extensions : List of file extensions to include (e.g., [".py", ".js", ".md"])
concurrent_requests : Number of parallel GitHub API requests for faster loading
Query Engine
similarity_top_k : Number of most relevant code chunks to retrieve
streaming : Enable streaming responses for real-time output
Limitations
GitHub API Rate Limits : 5,000 requests/hour for authenticated users
Large Repositories : Very large repos may take time to load
File Type Support : Limited to configured file extensions
Token Context : Very large files may exceed model context limits
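One way to mitigate the token-context limitation is to control chunking explicitly instead of relying on defaults. A sketch using LlamaIndex's SentenceSplitter (the chunk sizes are illustrative, not tuned values from this project):

```python
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Smaller chunks keep each retrieved snippet well under the model's
# context window; overlap preserves continuity across chunk boundaries.
Settings.node_parser = SentenceSplitter(
    chunk_size=1024,    # tokens per chunk (illustrative)
    chunk_overlap=128,  # shared tokens between adjacent chunks
)
```

Set this before calling VectorStoreIndex.from_documents(docs) so the index is built from the smaller chunks.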
Troubleshooting
GitHub API rate limit exceeded
Wait for rate limit reset or use a different GitHub token. Check limits at: https://api.github.com/rate_limit
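To inspect your current quota directly, query the rate-limit endpoint; an authenticated request reports the higher 5,000 requests/hour limit:

```shell
# Unauthenticated check (60 requests/hour quota)
curl -s https://api.github.com/rate_limit

# Authenticated check (5,000 requests/hour quota)
curl -s -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/rate_limit
```

The response includes `remaining` and `reset` (a Unix timestamp for when the quota refreshes).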
Repository fails to load or access is denied
Verify the GitHub token has the correct permissions. For private repos, ensure the repo scope is enabled.
Answers are vague or irrelevant
Try more specific questions. Include file names or function names in queries.
Expected files are missing from the index
Check that filter_file_extensions includes the file types you need.
Resources
LlamaIndex : LlamaIndex documentation and guides
GitHub API : GitHub REST API documentation
Nebius AI : Nebius AI model provider
DeepSeek : DeepSeek model information