Contribution Guide

Welcome to SGLang! We appreciate your interest in contributing. This guide provides an overview of how to set up your environment, run tests, build documentation, and open a Pull Request.

Getting Started

Fork and Clone

New contributors do not have write permission to the official SGLang repository. Please fork the repository under your GitHub account, then clone your fork locally.

git clone https://github.com/<your_username>/sglang.git
cd sglang

Install from Source

Build and install SGLang from source. This allows you to test your changes locally.

# Install dependencies
pip install -e ".[all]"

# Install kernel packages
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
pip install sgl-kernel

For detailed installation instructions, see the installation guide.

Development Workflow

1. Create a Feature Branch

Never commit directly to main. Always create a new branch:

git checkout -b feature/my-new-feature

2. Make Your Changes

Edit the code, add new features, or fix bugs. Follow the code style guidelines below.

3. Format Code

We use pre-commit for code formatting and linting:

# Install pre-commit
pip install pre-commit
pre-commit install

# Run checks on all files
pre-commit run --all-files

If the checks fail, fix the issues and run again. All checks must pass before creating a PR.

4. Add Tests

Add unit tests for new features or bug fixes. SGLang uses Python’s unittest framework.

# Run tests (pytest also collects unittest-style test cases)
python -m pytest test/srt/test_your_feature.py

# Run a specific test
python -m pytest test/srt/test_your_feature.py::TestClass::test_method

For more details, see test/README.md.
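
As a rough illustration of the unittest style, a new test file might look like the sketch below. The function `normalize_prompt` and the class and method names are hypothetical placeholders, not part of SGLang; a real test would import the code under test instead of defining it inline.

```python
import unittest


def normalize_prompt(text: str) -> str:
    """Hypothetical function under test: collapse runs of whitespace."""
    return " ".join(text.split())


class TestNormalizePrompt(unittest.TestCase):
    def test_collapses_whitespace(self):
        # Multiple spaces and a trailing newline become single spaces.
        self.assertEqual(normalize_prompt("hello   world\n"), "hello world")

    def test_empty_input(self):
        # An empty string stays empty.
        self.assertEqual(normalize_prompt(""), "")
```

A file like this can be run with `python -m unittest` or with the pytest commands shown earlier in this step.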

5. Write Documentation

Document your changes in the appropriate documentation files. We recommend new contributors start by writing documentation to quickly understand the codebase. For documentation guidelines, see docs/README.md.

6. Test Accuracy (if applicable)

If your changes affect model output, run accuracy tests:

# Launch server
python -m sglang.launch_server --model-path Qwen/Qwen2-7B-Instruct

# Run GSM8K benchmark (sanity check)
python -m sglang.test.few_shot_gsm8k --num-questions 200

Note: This is a sanity check, not a rigorous test. The accuracy can vary by 1-5% due to batching and non-determinism.
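
One way to apply this rule of thumb when comparing a baseline run against your branch is sketched below. The helper name and the default tolerance are assumptions for illustration (the tolerance takes the upper end of the 1-5% range and reads it as absolute percentage points); this is not an SGLang utility.

```python
def accuracy_regressed(baseline: float, candidate: float, tolerance: float = 0.05) -> bool:
    """Return True if `candidate` is below `baseline` by more than `tolerance`.

    Accuracies are fractions in [0, 1]. The default tolerance of 0.05 is an
    assumption based on the upper end of the expected run-to-run variation.
    """
    return (baseline - candidate) > tolerance


# A 2-point drop is within the expected noise; a 10-point drop is not.
print(accuracy_regressed(0.80, 0.78))
print(accuracy_regressed(0.80, 0.70))
```

A drop that exceeds the tolerance is a signal to investigate further, not proof of a bug, since the benchmark is only a sanity check.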

7. Benchmark Performance (if applicable)

For performance-critical changes, benchmark your code:

python -m sglang.bench_serving \
  --backend sglang \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --num-prompts 1000 \
  --random-input-len 256 \
  --random-output-len 32

See Benchmark and Profiling for more details.

8. Commit and Push

Commit your changes with a descriptive message:

git add .
git commit -m "Add feature X to improve Y"
git push origin feature/my-new-feature

9. Create Pull Request

Go to GitHub and create a Pull Request from your fork to the main SGLang repository.

Title: Clear and concise description of the change
Description: Explain what the PR does, why it’s needed, and how to test it
Link issues: If fixing a bug, reference the issue number

Code Review Process

Follow the process described in MAINTAINER.md:

Merge Oncall reviews the PR
Codeowner reviews if touching their area
Other reviewers may provide feedback
Address feedback and update PR
Once approved, the PR will be merged

CI Testing

Triggering CI

Only trusted contributors can trigger CI tests. If you have permission (listed in CI_PERMISSIONS.json), you can use the following commands:

/tag-run-ci-label: Add “run-ci” label to run CI on every commit
/rerun-failed-ci: Rerun failed tests
/tag-and-rerun-ci: Add label and rerun tests
/rerun-stage <stage-name>: Rerun a specific test stage

PR authors can always use /rerun-failed-ci on their own PRs. If you don’t have permission, ask a maintainer to trigger CI for you.

CI Rate Limits

To prevent abuse, CI has rate limits. The default cooldown is 120 minutes between runs. Users in CI_PERMISSIONS.json may have custom limits.
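
To make the cooldown arithmetic concrete, here is a minimal sketch of how such a check could work. It is not the actual CI implementation: the function names are hypothetical, and whether the cooldown is measured from the start or the end of the previous run is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default cooldown stated in this section.
DEFAULT_COOLDOWN = timedelta(minutes=120)


def next_allowed_run(last_run: datetime, cooldown: timedelta = DEFAULT_COOLDOWN) -> datetime:
    """Earliest time a new CI run may start (assumed: measured from last_run)."""
    return last_run + cooldown


def can_trigger(
    last_run: Optional[datetime],
    now: datetime,
    cooldown: timedelta = DEFAULT_COOLDOWN,
) -> bool:
    """Return True if a run may be triggered at `now`."""
    if last_run is None:
        # No previous run, so no cooldown applies.
        return True
    return now >= next_allowed_run(last_run, cooldown)


last = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(next_allowed_run(last))  # two hours after `last`
```

A user with a custom limit would correspond to passing a different `cooldown` value.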

Code Style Guidelines

General Principles

Avoid code duplication: Extract repeated code (>5 lines) into functions
Minimize device synchronization: Avoid tensor.item() and tensor.cpu() in hot paths
Extreme efficiency: SGLang is a runtime, so optimize everything on the critical path
Pure functions: Avoid in-place modifications of arguments
Keep files concise: Split files longer than 2000 lines into smaller modules
Fast tests: Split test files that take longer than 500 seconds to run
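
The "pure functions" principle can be illustrated with a small sketch. The function names and the `stop` parameter dictionary are hypothetical examples, not SGLang APIs.

```python
from typing import Dict, List


def add_stop_token_inplace(params: Dict[str, List[str]], token: str) -> None:
    # Discouraged: mutates the caller's dictionary as a hidden side effect.
    params["stop"].append(token)


def with_stop_token(params: Dict[str, List[str]], token: str) -> Dict[str, List[str]]:
    # Preferred: build and return a new dictionary; the argument is untouched.
    return {**params, "stop": [*params.get("stop", []), token]}


original = {"stop": ["</s>"]}
updated = with_stop_token(original, "\n")
# `original` still has one stop token; `updated` has two.
```

Copying a small dictionary is cheap; for large objects on the critical path, weigh this principle against the efficiency principle above.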

Hardware/Feature Support

When adding support for new hardware or features:

Don’t drastically change existing code
Use new files for hardware-specific components (e.g., allocator_ascend.py)
Common path first: Put the most common case (NVIDIA GPU) in the first if-branch
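
A minimal sketch of the "common path first" rule follows. The function `select_allocator`, the device strings, and the module name `allocator` are assumptions for illustration; only `allocator_ascend` is taken from the example file name above.

```python
def select_allocator(device: str) -> str:
    """Return the name of a (hypothetical) allocator module for a device."""
    if device == "cuda":
        # Most common case (NVIDIA GPU) goes in the first branch.
        return "allocator"
    elif device == "npu":
        # Hardware-specific code lives in its own file.
        return "allocator_ascend"
    else:
        raise ValueError(f"unsupported device: {device}")


print(select_allocator("cuda"))
```

Keeping the hardware-specific logic in a separate module means the existing common path stays untouched when new hardware is added.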

Example: Avoid Redundant Runtime Checks

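# Note: some_condition, op_a, and op_b are placeholders. some_condition
# stands in for any check whose result never changes at runtime.

# Bad: Re-evaluating the same check in every layer, on every forward call
class MyLayer:
    def forward(self, x):
        if self.some_condition:  # This is the same for every layer!
            return self.op_a(x)
        else:
            return self.op_b(x)

# Good: Evaluate the check once and cache the result
class MyLayer:
    def __init__(self):
        self.use_op_a = self.some_condition  # Cached once, at construction

    def forward(self, x):
        if self.use_op_a:
            return self.op_a(x)
        else:
            return self.op_b(x)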
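
The difference can be made measurable with a runnable sketch. All names here (`CountingCheck`, `UncachedLayer`, `CachedLayer`) are hypothetical, and the arithmetic stands in for real operators; the point is only to count how often the check runs.

```python
class CountingCheck:
    """Stand-in for a repeated runtime check; counts how often it is called."""

    def __init__(self, result: bool):
        self.result = result
        self.calls = 0

    def __call__(self) -> bool:
        self.calls += 1
        return self.result


class UncachedLayer:
    def __init__(self, check):
        self.check = check

    def forward(self, x):
        # Re-evaluates the check on every call.
        return x * 2 if self.check() else x + 1


class CachedLayer:
    def __init__(self, check):
        self.use_op_a = check()  # Evaluated once, at construction

    def forward(self, x):
        return x * 2 if self.use_op_a else x + 1


uncached_check = CountingCheck(True)
cached_check = CountingCheck(True)
uncached = UncachedLayer(uncached_check)
cached = CachedLayer(cached_check)
for i in range(100):
    uncached.forward(i)
    cached.forward(i)
print(uncached_check.calls, cached_check.calls)
```

Both layers return the same outputs; only the number of check evaluations differs, and that cost is multiplied by the number of layers and forward passes.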

Updating sgl-kernel

Since sglang and sgl-kernel are separate packages, you cannot update a kernel and use it immediately in the same PR. Follow these steps:

1. Submit PR to update sgl-kernel source without using it (example)
2. Bump sgl-kernel version in a new PR (example)
   - This triggers a PyPI release of the new kernel
3. Use the new kernel in a third PR:
   - Update sgl-kernel version in pyproject.toml
   - Update caller code in SGLang

Tips for Newcomers

Start small: Pick issues labeled “good first issue” or “help wanted”
Write docs: Documentation contributions help you learn the codebase
Read the code walkthrough: Check out this code walkthrough for a deeper understanding
Ask questions: Join our Slack channel for help

Resources

Thank you for contributing to SGLang! Happy coding!
