Overview
llama.cpp is a community-driven project that values high-quality contributions. This guide covers the contribution workflow, coding standards, and best practices for collaborating on the project.

The project has a strict AI usage policy. Pull requests that are fully or predominantly AI-generated are not accepted. See AI Usage Policy for details.
Contributor Levels
The project differentiates between three levels of contributors:

- Contributors: People who have contributed before (no special privileges)
- Collaborators (Triage): Contributors with significant contributions who may be responsible for some parts of the code and are expected to maintain and review contributions for the code they own
- Maintainers: Responsible for reviewing and merging PRs after approval from code owners
AI Usage Policy
Code that is initially generated by AI and subsequently edited is still considered AI-generated. AI assistance is permissible only when the majority of the code is authored by a human contributor, with AI used exclusively for corrections or for verbose modifications that the contributor has already conceptualized.

Requirements When Using AI
If AI is used to generate any portion of the code, contributors must:

Disclose AI usage
Explicitly disclose the manner in which AI was employed in your pull request description.
Be prepared to explain
Be prepared to explain every line of code you submitted when asked by a maintainer.
Pull Request Workflow
Before Submitting Your PR
Search for existing PRs
Search for existing PRs to prevent duplicating efforts. Check both open and closed pull requests.
Understand ggml
llama.cpp uses the ggml tensor library for model evaluation. If you are unfamiliar with ggml, consider reviewing the examples in the ggml repository.
Test your changes
Execute the full CI locally on your machine before publishing. Verify that perplexity and performance are not negatively affected.
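As a rough sketch, a local CI run and a perplexity spot-check might look like the following. The `ci/run.sh` invocation follows the repository's ci/README.md as I recall it; the model and dataset paths are placeholders, not files this guide names:

```shell
# Run the full CI locally (output and mount directories are placeholders):
mkdir -p tmp
bash ./ci/run.sh ./tmp/results ./tmp/mnt

# Spot-check perplexity with a model and dataset you already have locally
# (file names here are purely illustrative):
./build/bin/llama-perplexity -m ./models/model.gguf -f ./wikitext-2-raw/wiki.test.raw
```

Compare the perplexity and tokens-per-second numbers against a run on `master` before and after your change.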
Test ggml modifications
If you modified the ggml source, verify consistent results across backends. This requires access to at least two different ggml backends. If you modified a ggml operator or added a new one, add corresponding test cases to `test-backend-ops`.

Create focused PRs
- Avoid combining unrelated changes in a single PR
- For complex features, consider opening a feature request first to discuss and align expectations
- When adding support for a new model or feature, focus on CPU support only in the initial PR unless you have a good reason not to
- Add support for other backends like CUDA in follow-up PRs
After Submitting Your PR
- Expect modification requests: Maintainers will request changes to ensure code meets quality and maintainability standards
- Be available for review: Maintainers will rely on your insights when making final approval decisions
- Keep PR up to date: If your PR becomes stale, rebase it on top of the latest `master` to get maintainers' attention
- Consider adding yourself to CODEOWNERS: Indicate your availability for fixing related issues and reviewing related PRs
Coding Guidelines
General Principles
Code Style Fundamentals
- Avoid adding third-party dependencies, extra files, or extra headers
- Always consider cross-compatibility with other operating systems and architectures
- Avoid fancy-looking modern STL constructs, use basic `for` loops, avoid templates, keep it simple
- Vertical alignment makes things more readable and easier to batch edit
- Clean up trailing whitespaces
- Use 4 spaces for indentation
- Brackets on the same line
- Pointer/reference style: `void * ptr`, `int & a`
Data Types
Struct Declarations
Declare structs with `struct foo {}` instead of `typedef struct foo {} foo`:
This guideline is being applied to new code. Legacy code may not follow this convention yet.
Code Formatting
Try to follow existing patterns in the code. When in doubt, use `clang-format` (from clang-tools v15+) to format added code.
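For example, one common way to format only the code you touched (assuming clang-format 15+ is on your PATH; the file path is a placeholder):

```shell
# Format a specific file in place:
clang-format -i src/example.cpp

# Or format only the lines changed relative to master
# (uses the git-clang-format helper shipped with clang-tools):
git clang-format master
```

Prefer formatting only your added lines so the diff stays focused on your change.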
Tensor Operations
Matrix multiplication is unconventional: the dimensions in ggml are typically in the reverse order of PyTorch dimensions.

Naming Guidelines
Function and Variable Names
Use `snake_case` for function, variable, and type names.
Optimize for Longest Common Prefix
Enum Values
Enum values are always in upper case and prefixed with the enum name.

Method Naming Pattern
The general naming pattern is `<class>_<method>`, with `<method>` being `<action>_<noun>`:

- The `get` action can be omitted
- The noun can be omitted if not necessary
- The `_context` suffix of the class is optional (use it to disambiguate when needed)
- Use `init`/`free` for constructor/destructor actions
Opaque Types
Use the `_t` suffix when a type is supposed to be opaque to the user.
File Naming
- C/C++ filenames are all lowercase with dashes
- Headers use the `.h` extension
- Source files use the `.c` or `.cpp` extension
- Python filenames are all lowercase with underscores
Code Maintenance
Code Ownership
Existing code should have designated collaborators and/or maintainers specified in the CODEOWNERS file, responsible for:

- Reviewing and merging related PRs
- Fixing related bugs
- Providing developer guidance/support
When Adding Large Code Changes
Add yourself to CODEOWNERS
If you are a collaborator, add yourself to CODEOWNERS to indicate your availability for reviewing related PRs.
Find a maintainer
If you are a contributor, find an existing collaborator willing to review and maintain your code long-term.
Provide CI workflow
Provide the necessary CI workflow (and hardware) to test your changes. See ci/README.md.
New code should follow the guidelines outlined in this document. For legacy reasons, existing code is not required to follow these guidelines.
Documentation
Documentation is a community effort:

- When you need to look into source code to figure out how to use an API, consider adding a short summary to the header file for future reference
- When you notice incorrect or outdated documentation, please update it
- Document the “why” rather than the “what” when writing comments
For Maintainers
Merging Pull Requests
Format commit title
Use the following format for the squashed commit title. Example:

`utils : fix typo in utils.py (#1234)`

Optionally pick a `<module>` from: https://github.com/ggml-org/llama.cpp/wiki/Modules

Declining Pull Requests
Maintainers reserve the right to decline review or close pull requests for any reason, particularly when:

- The proposed change is already mentioned in the roadmap or an existing issue and has been assigned to someone
- The pull request duplicates an existing one
- The contributor fails to adhere to this contributing guide
Resources
The GitHub issues, PRs, and discussions contain valuable information for getting familiar with the codebase. For convenience, important information is referenced from GitHub projects: https://github.com/ggml-org/llama.cpp/projects

Next Steps
Adding Models
Learn how to add new model architectures to llama.cpp
Testing
Understand the testing procedures and how to run tests

