Getting started

Contributions are welcome! We appreciate your interest in improving PAS2, a system that detects hallucinations in AI responses using a paraphrase-based approach with model-as-judge verification.

Setting up your development environment

Prerequisites

Before you begin, ensure you have:
  • Python 3.x installed
  • Git for version control
  • API keys for Mistral AI and OpenAI

Installation

  1. Clone the repository:
    git clone https://github.com/serhanylmz/pas2
    cd pas2
    
  2. Install dependencies:
    pip install -r requirements.txt
    
  3. Set up your API keys as environment variables:
    • HF_MISTRAL_API_KEY: Your Mistral AI API key
    • HF_OPENAI_API_KEY: Your OpenAI API key
    Or create a .env file:
    OPENAI_API_KEY=your_api_key_here
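
Before running anything that calls the APIs, it can help to confirm that the keys are visible to Python. The following is a minimal sketch, not part of PAS2 itself: the helper name `missing_api_keys` is hypothetical, and it checks only the two environment variable names listed in step 3. Whether the project also reads the `OPENAI_API_KEY` name shown in the .env example is not verified here.

```python
import os

# Variable names taken from the installation step; the helper itself is a
# hypothetical convenience and does not exist in the PAS2 repository.
REQUIRED_KEYS = ("HF_MISTRAL_API_KEY", "HF_OPENAI_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or empty in `env`.

    `env` defaults to the current process environment (os.environ).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_api_keys()` with no arguments returns an empty list when both variables are set, and otherwise lists the names that still need a value.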
    

How to contribute

Reporting issues

If you find a bug or have a feature request, please open an issue on GitHub with:
  • A clear description of the problem or suggestion
  • Steps to reproduce (for bugs)
  • Expected vs actual behavior
  • Your environment details

Submitting pull requests

  1. Fork the repository
  2. Create a new branch for your feature or fix
  3. Make your changes
  4. Test your changes thoroughly
  5. Submit a pull request with a clear description

Code guidelines

  • Follow existing code style and conventions
  • Write clear, descriptive commit messages
  • Include comments for complex logic
  • Update documentation as needed

Project architecture

PAS2 uses a multi-model architecture:
  • Paraphrase generation: Automatically generates semantically equivalent variations of user queries
  • Response generation: Uses Mistral Large for generating responses
  • Hallucination detection: Uses OpenAI’s o3-mini as a judge to analyze responses
  • Data persistence: SQLite database for storing feedback and statistics
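
For contributors new to the codebase, the first three stages above can be pictured as a small pipeline. This is an illustrative sketch only: the function `detect_hallucination` and its three callable parameters are hypothetical stand-ins, not the project's actual interfaces. It assumes one response is generated per query variant and that the judge sees all responses together. In PAS2 the response and judge steps would be backed by Mistral Large and o3-mini respectively.

```python
from typing import Callable, Dict, List


def detect_hallucination(
    query: str,
    paraphrase: Callable[[str], List[str]],
    respond: Callable[[str], str],
    judge: Callable[[str, List[str]], bool],
) -> Dict[str, object]:
    """Run a query through paraphrase -> response -> judge stages.

    All three callables are hypothetical placeholders for model calls.
    """
    # Stage 1: the original query plus its paraphrased variants.
    queries = [query] + paraphrase(query)
    # Stage 2: one response per query variant.
    responses = [respond(q) for q in queries]
    # Stage 3: the judge inspects all responses and returns a verdict.
    flagged = judge(query, responses)
    return {"queries": queries, "responses": responses, "hallucination": flagged}


# Stub implementations for local experimentation (no API calls are made).
def stub_paraphrase(q: str) -> List[str]:
    return [q + " (variant 1)", q + " (variant 2)"]


def stub_respond(q: str) -> str:
    return "Paris"


def stub_judge(q: str, responses: List[str]) -> bool:
    # Toy rule: flag a hallucination when the responses disagree.
    return len(set(responses)) > 1


result = detect_hallucination(
    "What is the capital of France?", stub_paraphrase, stub_respond, stub_judge
)
```

Passing the stages in as callables keeps the sketch testable without API keys; the stubs can be swapped for real model clients once the keys are configured.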

Testing

Web interface testing

Run the Gradio interface:
python pas2-gradio.py

Benchmark testing

Run the benchmark tool:
python pas2-benchmark.py --json_file your_data.json --num_samples 10

Citation requirements

This project is licensed under the MIT License with an attribution requirement. If you use PAS2 in your research or project, you must cite it as:
@software{pas2_2024,
  author = {Serhan Yilmaz},
  title = {PAS2 - Paraphrase-based AI System for Semantic Similarity},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/serhanylmz/pas2}
}

Attribution requirements

When using PAS2, you must provide appropriate attribution by:
  1. Including the copyright notice and license in any copy or substantial portion of the software
  2. Citing the project in any publications, presentations, or documentation that uses or builds upon this work
  3. Maintaining a link to the original repository in any forks or derivative works

Contact

For questions or collaboration opportunities: Serhan Yilmaz
[email protected]
Sabanci University
