Source Code Structure
This document provides an overview of CPython’s source code organization and the typical layout for modules, built-in types, and built-in functions.
Source Code Layout
CPython organizes its source code into logical directories based on functionality. Understanding this structure is essential for navigating the codebase and contributing to CPython.
Python Module Layout
For a Python module, the typical layout is:
- Lib/<module>.py - Pure Python implementation
- Modules/_<module>.c - C accelerator module (optional)
- Lib/test/test_<module>.py - Test suite
- Doc/library/<module>.rst - Documentation
Extension Module Layout
For an extension module written in C:
- Modules/<module>module.c - C implementation
- Lib/test/test_<module>.py - Test suite
- Doc/library/<module>.rst - Documentation
Built-in Type Layout
For built-in types (like list, dict, str):
- Objects/<builtin>object.c - C implementation
- Lib/test/test_<builtin>.py - Test suite
- Doc/library/stdtypes.rst - Documentation
Built-in Function Layout
For built-in functions:
- Python/bltinmodule.c - Implementation
- Lib/test/test_builtin.py - Test suite
- Doc/library/functions.rst - Documentation
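The four layouts above are path templates, so they can be expanded mechanically. The following is a minimal sketch, not part of CPython: the names LAYOUTS and expected_paths are hypothetical, and the paths it returns are only candidates, since individual modules and types deviate from the templates.

```python
# Hypothetical helper: expands the layout templates listed above.
# The results are candidate paths; they are not checked against a real checkout.
LAYOUTS = {
    "python_module": [
        "Lib/{name}.py",
        "Modules/_{name}.c",  # optional C accelerator
        "Lib/test/test_{name}.py",
        "Doc/library/{name}.rst",
    ],
    "extension_module": [
        "Modules/{name}module.c",
        "Lib/test/test_{name}.py",
        "Doc/library/{name}.rst",
    ],
    "builtin_type": [
        "Objects/{name}object.c",
        "Lib/test/test_{name}.py",
        "Doc/library/stdtypes.rst",
    ],
    "builtin_function": [
        "Python/bltinmodule.c",
        "Lib/test/test_builtin.py",
        "Doc/library/functions.rst",
    ],
}


def expected_paths(kind, name=""):
    """Return the candidate paths for one layout kind and one name."""
    return [template.format(name=name) for template in LAYOUTS[kind]]


for path in expected_paths("python_module", "struct"):
    print(path)
```

Built-in functions all share the same three files, so the name argument is unused for that kind.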
Key Directories
Python/
The Python/ directory contains the core interpreter implementation:
- Python/ceval.c - The bytecode interpreter (evaluation loop)
- Python/compile.c - AST to bytecode compiler
- Python/symtable.c - Symbol table generation
- Python/bytecodes.c - Bytecode instruction definitions
- Python/gc.c - Garbage collector implementation
- Python/import.c - Import machinery
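Two of these components can be observed from Python itself. A minimal sketch, assuming a standard CPython build: the standard-library symtable module exposes the symbol tables that the interpreter computes, and the built-in compile() runs the compiler to produce a code object. The source string and the function name f are illustrative only.

```python
import symtable

source = "def f(x):\n    y = x + 1\n    return y\n"

# Symbol table pass: one top-level table with a child table for f.
table = symtable.symtable(source, "<example>", "exec")
f_table = table.get_children()[0]
names = sorted(symbol.get_name() for symbol in f_table.get_symbols())
print(f_table.get_name(), names)  # f ['x', 'y']

# Compiler: turns the same source into a module-level code object.
code = compile(source, "<example>", "exec")
print(type(code).__name__, code.co_name)  # code <module>
```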
Objects/
The Objects/ directory contains implementations of Python’s built-in types:
- Objects/longobject.c - int type implementation
- Objects/unicodeobject.c - str type implementation
- Objects/listobject.c - list type implementation
- Objects/dictobject.c - dict type implementation
- Objects/codeobject.c - Code objects
- Objects/frameobject.c - Frame objects
- Objects/genobject.c - Generator objects
Parser/
The Parser/ directory contains the PEG parser implementation:
- Parser/parser.c - Generated PEG parser (from python.gram)
- Parser/peg_api.c - High-level parser API
- Parser/tokenizer/ - Tokenizer implementation
- Parser/Python.asdl - AST definition in ASDL
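The node types described in Parser/Python.asdl surface in Python as classes of the standard-library ast module, so the parser's output can be inspected without reading the C code. A small sketch, assuming Python 3.9 or later (for the indent argument of ast.dump); the source string is illustrative only.

```python
import ast

# Parse a statement and show the resulting AST.
tree = ast.parse("x = 1 + 2")
print(ast.dump(tree, indent=2))

# The statement is an Assign node whose value is a BinOp node.
statement = tree.body[0]
print(type(statement).__name__, type(statement.value).__name__)  # Assign BinOp
```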
Include/
The Include/ directory contains C API header files:
- Include/cpython/ - Public CPython API headers
- Include/internal/ - Internal (private) API headers
- Include/opcode.h - Bytecode opcode definitions
Modules/
The Modules/ directory contains C implementations of standard library modules:
- Modules/_json.c - JSON encoder/decoder
- Modules/_struct.c - Binary data packing
- Modules/mathmodule.c - Math functions
Grammar/
The Grammar/ directory contains the Python grammar specification:
- Grammar/python.gram - PEG grammar for Python
- Grammar/Tokens - Token definitions
Notable Exceptions
Some modules and built-in types deviate from the standard layout. Special cases:
- Built-in type int is at Objects/longobject.c
- Built-in type str is at Objects/unicodeobject.c
- Built-in module sys is at Python/sysmodule.c
- Built-in module marshal is at Python/marshal.c
- Windows-only module winreg is at PC/winreg.c
Source File Naming Conventions
C Source Files
- *object.c - Object type implementations
- *module.c - Extension module implementations
- _*.c - Internal/private implementations
Header Files
- *.h - Public API headers
- pycore_*.h - Internal API headers (in Include/internal/)
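These naming patterns are shell-style globs, so a file name can be classified with the standard-library fnmatch module. The sketch below is illustrative: the function classify and the order in which the patterns are tried are assumptions, and a name that matches several patterns (for example one that starts with an underscore and ends in module.c) is reported under the first match.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# (pattern, description) pairs; more specific patterns come first.
# The ordering is an assumption made for this sketch.
PATTERNS = [
    ("pycore_*.h", "internal API header"),
    ("*.h", "public API header"),
    ("*object.c", "object type implementation"),
    ("*module.c", "extension module implementation"),
    ("_*.c", "internal/private implementation"),
]


def classify(path):
    """Classify a source path by its file name alone."""
    name = PurePosixPath(path).name
    for pattern, description in PATTERNS:
        if fnmatchcase(name, pattern):
            return description
    return "unclassified"


print(classify("Objects/listobject.c"))             # object type implementation
print(classify("Modules/mathmodule.c"))             # extension module implementation
print(classify("Modules/_json.c"))                  # internal/private implementation
print(classify("Include/internal/pycore_ceval.h"))  # internal API header
```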
Directory Structure Overview
Finding Code
Locating Object Implementations
To find the implementation of a built-in type:
- Check Objects/<type>object.c (e.g., Objects/listobject.c for list)
- Notable exceptions: int → longobject.c, str → unicodeobject.c
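The rule and its two exceptions can be written as a lookup with a fallback. A minimal sketch: the names TYPE_FILE_EXCEPTIONS and type_source are hypothetical, only the two exceptions named above are encoded, and other types may not follow the default pattern.

```python
# Exceptions named in this document; other deviations are not covered.
TYPE_FILE_EXCEPTIONS = {
    "int": "longobject.c",
    "str": "unicodeobject.c",
}


def type_source(type_name):
    """Return the likely Objects/ path for a built-in type name."""
    default = f"{type_name}object.c"
    return "Objects/" + TYPE_FILE_EXCEPTIONS.get(type_name, default)


print(type_source("list"))  # Objects/listobject.c
print(type_source("int"))   # Objects/longobject.c
```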
Locating Module Implementations
To find a module implementation:
- Pure Python: Lib/<module>.py
- C extension: Modules/<module>module.c or Modules/_<module>.c
- Built-in: May be in Python/<module>module.c (e.g., sysmodule.c)
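A running interpreter can indicate which of these cases applies to a given module. The sketch below uses sys.builtin_module_names and importlib.util.find_spec; the function module_kind is hypothetical. The result describes how this particular interpreter was built, which hints at the source directory but does not determine it: the same module may be built-in on one platform and a shared extension on another.

```python
import importlib.machinery
import importlib.util
import sys


def module_kind(name):
    """Describe how a top-level module is provided by this interpreter."""
    if name in sys.builtin_module_names:
        return "built-in"
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    if origin.endswith(".py"):
        return "pure Python"
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "C extension"
    return "other"  # e.g. frozen modules or namespace packages


for name in ("sys", "json", "math"):
    print(name, "->", module_kind(name))
```

For a "pure Python" result the spec's origin points at the installed file, whose location relative to the standard library mirrors its place under Lib/ in the source tree.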
Locating Bytecode Operations
Bytecode instruction implementations are defined in:
- Python/bytecodes.c - Source definitions
- Python/generated_cases.c.h - Generated switch cases
- Python/ceval.c - Main interpreter loop
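To know which instruction to search for in these files, the standard-library dis module can list the instructions a function compiles to. A small sketch; the function add is illustrative, and the exact instruction names depend on the Python version, so no output is shown.

```python
import dis


def add(a, b):
    return a + b


# Collect the instruction names for the function's bytecode.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
for opname in opnames:
    print(opname)
```

Each printed name can then be used as a search term in Python/bytecodes.c on versions of CPython that include that file.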
Related Topics
- Compiler Design - How source code becomes bytecode
- Bytecode Interpreter - How bytecode is executed
- Code Objects - Representation of compiled code
