
Quick start

This guide will walk you through setting up API keys, loading your first repository, and asking questions.
Before you begin: Make sure you’ve completed the installation steps.

Step 1: Get your API keys

RepoRAGX requires two API keys to function:
1. GitHub personal access token

  1. Go to github.com/settings/tokens
  2. Click “Generate new token” → “Generate new token (classic)”
  3. Give it a descriptive name like “RepoRAGX”
  4. Select the repo scope (or just public_repo if you only need public repositories)
  5. Click “Generate token” and copy the token immediately
GitHub only shows the token once. Save it in a secure location.
2. Groq API key

  1. Go to console.groq.com
  2. Sign up for a free account (no credit card required)
  3. Navigate to console.groq.com/keys
  4. Click “Create API Key”
  5. Give it a name and copy the API key
Groq’s free tier includes generous rate limits for the llama-3.3-70b-versatile model.
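Both keys have recognizable prefixes (ghp_ for classic GitHub tokens, gsk_ for Groq keys, as shown in Step 3). If you keep them in environment variables, a quick sanity check can catch copy-paste mistakes before launching. This helper is an illustration, not part of RepoRAGX:

```python
import os

def check_key(name, prefix):
    """Return the key from the environment, warning on an unexpected prefix."""
    value = os.environ.get(name, "")
    if not value:
        print(f"{name} is not set")
    elif not value.startswith(prefix):
        print(f"{name} does not start with '{prefix}' - did you paste the right value?")
    return value

github_token = check_key("GITHUB_TOKEN", "ghp_")  # classic tokens start with ghp_
groq_key = check_key("GROQ_API_KEY", "gsk_")      # Groq keys start with gsk_
```

The variable names GITHUB_TOKEN and GROQ_API_KEY are arbitrary here; RepoRAGX itself prompts for the keys interactively.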

Step 2: Run RepoRAGX

With your virtual environment activated, start RepoRAGX:
python -m src.main
You’ll see the RepoRAGX banner and be prompted for credentials:
/**
 *    __________                    __________    _____    ____________  ___
 *    \______   \ ____ ______   ____\______   \  /  _  \  /  _____/\   \/  /
 *     |       _// __ \\____ \ /  _ \|       _/ /  /_\  \/   \  ___ \     / 
 *     |    |   \  ___/|  |_> >  <_> )    |   \/    |    \    \_\  \/     \ 
 *     |____|_  /\___  >   __/ \____/|____|_  /\____|__  /\______  /___/\  \
 *            \/     \/|__|                 \/         \/        \/      \_/
 */

Chat with your github repository

GitHub Personal Access Token: 

Step 3: Configure your session

Enter your credentials and repository details when prompted:
GitHub Personal Access Token: ghp_xxxxxxxxxxxxxxxxxxxx
Groq API Key: gsk_xxxxxxxxxxxxxxxxxxxx
Model Name (default: llama-3.3-70b-versatile): [press Enter]
Repo (owner/repo): AnmolTutejaGitHub/RepoRAGX
Branch (default: main): main
You can use any model supported by Groq. Popular options:
  • llama-3.3-70b-versatile (default, best quality)
  • llama-3.1-70b-versatile (alternative)
  • mixtral-8x7b-32768 (faster, good for smaller queries)
See the full list at console.groq.com/docs/models.
Use the format owner/repo:
  • facebook/react
  • microsoft/vscode
  • AnmolTutejaGitHub/RepoRAGX
  • Don’t paste the full URL (https://github.com/facebook/react won’t work)
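A simple pattern check (illustrative only, not part of RepoRAGX) shows which repo strings the prompt accepts:

```python
import re

# owner/repo: exactly two path segments, no URL scheme, no extra slashes
REPO_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")

def is_valid_repo(spec):
    """Return True if spec looks like 'owner/repo'."""
    return bool(REPO_PATTERN.match(spec))

print(is_valid_repo("facebook/react"))                     # True
print(is_valid_repo("https://github.com/facebook/react"))  # False: full URL
```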
The default branch is main. You can specify any branch name:
  • main
  • master
  • develop
  • feature/new-auth

Step 4: Loading the repository

RepoRAGX will now fetch and process the repository:
Initilizing github loader.....
Fetching files from github....
Loaded 42 files from github!
Splitting documents into chunks...
chunking completed
handling document embedding generation using sentence transformer...
Loading embedding model: all-MiniLM-L6-v2
Model loaded successfully. Embedding dimension: 384
Generating embeddings for 187 texts...
Batches: 100%|████████████████████| 6/6 [00:03<00:00,  1.89it/s]
Generated embeddings with shape: (187, 384)
Initilizing vector database....
Vector store initialized. Collection: AnmolTutejaGitHub_RepoRAGX
Existing documents in collection: 0
Adding 187 documents to vector store...
Successfully added 187 documents to vector store
Total documents in collection: 187
Initilizing RAG Retriver pipeline
Initializing Groq LLM...
First run: The embedding model (all-MiniLM-L6-v2) will be downloaded automatically (~90MB). This only happens once.

What’s happening behind the scenes

1. File loading

GitHubCodeBaseLoader fetches all files from the repository, filtering out:
  • Binary files (.png, .jpg, .exe, etc.)
  • Dependencies (node_modules/, venv/, .git/)
  • Generated files (.pyc, .class, .min.js)
See the full exclusion list in src/rag/github_codebase_loader.py:3-24
2. Text chunking

TextSplitter uses language-aware splitting with RecursiveCharacterTextSplitter:
  • Chunk size: 1000 characters
  • Chunk overlap: 200 characters
  • Supports 25+ languages (Python, JavaScript, Java, Go, Rust, etc.)
Language detection happens automatically based on file extension. See src/rag/text_splitter.py:3-50
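The actual splitter comes from LangChain, but the core idea — try coarse separators first (blank lines), fall back to finer ones (newlines, spaces), and keep every chunk under the size limit — can be sketched in a few lines. This is a simplified illustration that ignores overlap and language-specific separators:

```python
def split_text(text, chunk_size=1000, separators=("\n\n", "\n", " ", "")):
    """Simplified recursive split: prefer coarse separators, fall back to finer ones."""
    if len(text) <= chunk_size:
        return [text]
    sep = separators[0]
    if sep == "":
        # Last resort: hard-cut into fixed-size windows
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate               # keep merging small pieces
        else:
            if current:
                chunks.append(current)        # flush the chunk built so far
            if len(piece) <= chunk_size:
                current = piece               # start a fresh chunk
            else:
                # Piece itself is too big: retry with the next finer separator
                chunks.extend(split_text(piece, chunk_size, separators[1:]))
                current = ""
    if current:
        chunks.append(current)
    return chunks

chunks = split_text("word " * 500, chunk_size=100)
print(len(chunks), max(len(c) for c in chunks))  # 25 100
```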
3. Embedding generation

EmbeddingManager generates 384-dimensional vectors using Sentence Transformers:
# From src/rag/embedding_manager.py:5
model_name = "all-MiniLM-L6-v2"
Each code chunk is converted to a vector for similarity search.
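The real model needs a ~90MB download, but its contract is easy to state: a list of n texts goes in, an n × 384 array of vectors comes out. A deterministic stand-in with the same shape (illustrative only — a hash-based "embedding" carries no semantic meaning, so it cannot replace the real model):

```python
import hashlib

EMBEDDING_DIM = 384  # matches all-MiniLM-L6-v2

def toy_embed(texts):
    """Stand-in with the same interface as model.encode(): n texts -> n x 384 vectors."""
    vectors = []
    for text in texts:
        digest = hashlib.sha384(text.encode()).digest()  # 48 bytes
        # Tile the digest bytes to reach 384 floats in [0, 1)
        values = [b / 256 for b in digest] * (EMBEDDING_DIM // len(digest))
        vectors.append(values)
    return vectors

vecs = toy_embed(["def foo(): pass", "class Bar: ..."])
print(len(vecs), len(vecs[0]))  # 2 384
```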
4. Vector storage

VectorStore persists embeddings in ChromaDB:
  • Location: ~/.RepoRAGX/vector_store/
  • Distance metric: Cosine similarity
  • Collection name: owner_repo (e.g., facebook_react)
See src/rag/vector_store.py:13-40
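ChromaDB returns cosine distances, and the retriever recovers a similarity score as 1 - distance. The conversion in pure Python (illustrative):

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

a = [1.0, 0.0]
b = [1.0, 0.0]
c = [0.0, 1.0]
print(cosine_distance(a, b))  # 0.0 -> similarity 1.0 (same direction)
print(cosine_distance(a, c))  # 1.0 -> similarity 0.0 (orthogonal)
```

Because the score is direction-based, two chunks about the same topic score close to 1.0 even if their lengths differ.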

Step 5: Ask questions

Once loading is complete, you can start asking questions:
Ask anything ('exit' to quit): How does the text splitter work?
RepoRAGX will:
  1. Convert your query to a vector embedding
  2. Search ChromaDB for the top 5 most similar code chunks
  3. Send the retrieved context to Groq’s LLM
  4. Return a context-aware answer

Example queries

Ask anything ('exit' to quit): What is the project structure?
Ask anything ('exit' to quit): How does the RAG pipeline work?
Ask anything ('exit' to quit): What libraries are used for embeddings?

Sample output

Ask anything ('exit' to quit): How are embeddings generated?

Running RAG for query: How are embeddings generated?
Retrieving documents for query: 'How are embeddings generated?'
Top K: 5, Score threshold: 0.0
Generating embeddings for 1 texts...
Retrieved 5 documents (after filtering)

Embeddings are generated using the Sentence Transformers library with the 
all-MiniLM-L6-v2 model. The EmbeddingManager class in src/rag/embedding_manager.py 
handles this process:

1. The model is loaded in the _load_model() method (line 11)
2. The generate_embeddings() method takes a list of texts
3. It uses model.encode() to convert texts to 384-dimensional vectors
4. Progress is shown with a progress bar via show_progress_bar=True
5. The method returns a numpy array of shape (num_texts, 384)

The same model is used for both document embeddings and query embeddings to ensure 
compatibility during similarity search.

Ask anything ('exit' to quit): exit

Understanding retrieval

When you ask a question, RepoRAGX performs similarity search:
def retrieve(self, query, top_k=5, score_threshold=0.0):
    # 1. Convert query to embedding
    query_embedding = self.embedding_manager.generate_embeddings([query])[0]

    # 2. Query ChromaDB
    results = self.vector_store.collection.query(
        query_embeddings=[query_embedding.tolist()],
        n_results=top_k
    )

    # 3. Convert distances to similarity scores and filter
    retrieved_docs = []
    for i, (doc_id, document, metadata, distance) in enumerate(zip(
        results['ids'][0],
        results['documents'][0],
        results['metadatas'][0],
        results['distances'][0],
    )):
        similarity_score = 1 - distance  # Convert distance to similarity

        if similarity_score >= score_threshold:
            retrieved_docs.append({
                'content': document,
                'metadata': metadata,
                'similarity_score': similarity_score,
                'rank': i + 1
            })

    return retrieved_docs
Pro tip: The more specific your question, the better the results. Instead of “How does this work?”, try “How does the EmbeddingManager generate vector embeddings?”

Advanced configuration

Adjusting retrieval parameters

You can modify retrieval settings in src/rag/rag_retriever.py:7:
def retrieve(self, query, top_k=5, score_threshold=0.0):
  • top_k: Number of chunks to retrieve (default: 5)
  • score_threshold: Minimum similarity score (0.0 = no filtering)
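The effect of the two parameters can be seen with plain data (illustrative, using made-up scores):

```python
def filter_results(scored_docs, top_k=5, score_threshold=0.0):
    """Keep the top_k highest-scoring docs at or above the threshold."""
    kept = [d for d in scored_docs if d["similarity_score"] >= score_threshold]
    kept.sort(key=lambda d: d["similarity_score"], reverse=True)
    return kept[:top_k]

docs = [
    {"content": "splitter code", "similarity_score": 0.81},
    {"content": "readme intro", "similarity_score": 0.42},
    {"content": "license text", "similarity_score": 0.05},
]
print([d["content"] for d in filter_results(docs, top_k=5, score_threshold=0.3)])
# ['splitter code', 'readme intro']
```

Raising score_threshold trades recall for precision: weak matches are dropped, but so is any answer when nothing clears the bar.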

Changing chunk size

Modify text splitting in src/rag/text_splitter.py:53:
def __init__(self, documents, chunk_size=1000, chunk_overlap=200):
  • chunk_size: Maximum characters per chunk
  • chunk_overlap: Overlapping characters between chunks
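With a 1000-character chunk and 200-character overlap, each chunk after the first advances by roughly chunk_size - chunk_overlap = 800 characters, so a larger overlap means more chunks (and more embeddings) for the same file. A rough estimate, illustrative only:

```python
import math

def estimate_chunks(text_length, chunk_size=1000, chunk_overlap=200):
    """Rough chunk count: each chunk after the first advances by (size - overlap)."""
    if text_length <= chunk_size:
        return 1
    stride = chunk_size - chunk_overlap
    return 1 + math.ceil((text_length - chunk_size) / stride)

print(estimate_chunks(5000))                     # 6 chunks at the defaults
print(estimate_chunks(5000, chunk_overlap=500))  # 9 chunks with heavier overlap
```

The real splitter breaks on separators rather than exact offsets, so actual counts vary, but the trade-off holds: bigger overlap preserves more context across chunk boundaries at the cost of index size.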

Using different models

Groq supports multiple models. Try these alternatives:
Model Name (default: llama-3.3-70b-versatile): mixtral-8x7b-32768
See src/rag/groq_llm.py:9-26 for LLM configuration.

Persistent storage

RepoRAGX stores vector embeddings locally:
# View stored repositories
ls ~/.RepoRAGX/vector_store/

# Each repository gets its own ChromaDB collection
# Named as: owner_repo (e.g., facebook_react)
Vector stores are recreated each time you load a repository. If you want persistent storage, comment out lines 19-23 in src/rag/vector_store.py:
# try:
#     self.client.delete_collection(name=self.collection_name)
#     print(f"Deleted existing collection: {self.collection_name}")
# except Exception:
#     pass

Troubleshooting

Error: Bad credentials
Your GitHub token is invalid or expired. Generate a new one at github.com/settings/tokens.
Error: API rate limit exceeded
GitHub: Wait for the rate limit to reset, or use a token with higher limits. Groq: Free tier has rate limits. Wait a moment between queries.
No relevant context found to answer the question.
Try:
  1. Rephrasing your question to be more specific
  2. Using keywords that appear in the codebase
  3. Lowering score_threshold in src/rag/rag_retriever.py:7
Error loading model all-MiniLM-L6-v2
Check your internet connection. The model downloads from Hugging Face (~90MB). If it continues to fail, try:
pip install --upgrade sentence-transformers

Exit the session

To exit RepoRAGX, type exit at the prompt:
Ask anything ('exit' to quit): exit

Next steps

  • How it works: Deep dive into RepoRAGX’s internals
  • API reference: Explore the Python API
  • Configuration: Customize API keys and models
  • Examples: See real-world usage examples
