The scanner module discovers Python files in a directory tree while respecting .gitignore patterns and default exclusions.
scan_python_files()
Scans a directory tree for Python files while respecting gitignore rules.
Location: docugen/core/scanner.py:94
Parameters
root_path: Path to the root directory or single Python file to scan. Accepts both string paths and Path objects. Supports ~ for home directory expansion.
Returns
Sorted list of absolute Path objects for all discovered Python files. Returns an empty list if no .py files are found.
Behavior
- Single file mode: If root_path points to a file, returns it only if it has a .py extension
- Directory mode: Recursively walks the directory tree and discovers all .py files
- Automatic filtering: Excludes directories in DEFAULT_IGNORED_DIRS and patterns from .gitignore
- Path resolution: Expands ~ and resolves to absolute paths
Raises
FileNotFoundError: If the specified path does not exist
Example
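The sketch below approximates the behavior described above so that it can be run without docugen installed. It is not the code in scanner.py: scan_python_files_sketch, SKETCH_IGNORED_DIRS, and build_demo_tree are hypothetical names, the ignored-directory set holds placeholder values, and .gitignore handling is left out. A call to the real function would presumably have the same shape, taking a path and returning a sorted list of Path objects.

```python
import os
import tempfile
from pathlib import Path

# Placeholder values for illustration; the real DEFAULT_IGNORED_DIRS may differ.
SKETCH_IGNORED_DIRS = {".git", "__pycache__"}


def scan_python_files_sketch(root_path):
    """Stand-in for scan_python_files(); .gitignore handling is omitted."""
    root = Path(root_path).expanduser().resolve()
    if not root.exists():
        raise FileNotFoundError(f"Path does not exist: {root}")
    if root.is_file():
        # Single file mode: keep the file only if it has a .py extension.
        return [root] if root.suffix == ".py" else []
    found = []
    for current, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in SKETCH_IGNORED_DIRS]
        found.extend(Path(current) / name for name in filenames if name.endswith(".py"))
    return sorted(found)


def build_demo_tree(base):
    """Create a tiny project layout to scan."""
    (base / "pkg").mkdir()
    (base / "__pycache__").mkdir()
    (base / "main.py").write_text("print('hi')\n")
    (base / "pkg" / "util.py").write_text("")
    (base / "pkg" / "notes.txt").write_text("")
    (base / "__pycache__" / "cached.py").write_text("")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_tree(Path(tmp))
    for path in scan_python_files_sketch(tmp):
        print(path)  # absolute paths ending in main.py and pkg/util.py
```

Pruning dirnames in place is what keeps the walk from descending into excluded directories at all, which is cheaper than filtering the results afterwards.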
GitIgnoreRule
Represents a parsed rule from a .gitignore file.
Location: docugen/core/scanner.py:20
Attributes
- The gitignore pattern without special prefix/suffix characters (e.g., "*.pyc", "build", "docs/temp")
- True if the rule starts with !, meaning it negates (un-ignores) matching paths
- True if the rule ends with /, meaning it only matches directories
- True if the rule starts with /, meaning the pattern is relative to the repository root
Usage
This class is immutable (a frozen dataclass) and is primarily used internally by the scanner to evaluate whether paths should be ignored.
Example
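The following sketch shows how a raw .gitignore line could map onto the four attributes described above, and what immutability means in practice. RuleSketch, its field names (pattern, negated, directory_only, anchored), and parse_rule_sketch are all hypothetical stand-ins chosen for illustration; the actual attribute names of GitIgnoreRule may differ.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class RuleSketch:
    """Hypothetical stand-in for GitIgnoreRule; field names are assumptions."""
    pattern: str          # pattern with the !, leading / and trailing / removed
    negated: bool         # rule started with !
    directory_only: bool  # rule ended with /
    anchored: bool        # rule started with /


def parse_rule_sketch(line: str) -> RuleSketch:
    """Split one .gitignore line into a pattern plus three flags."""
    text = line.strip()
    negated = text.startswith("!")
    if negated:
        text = text[1:]
    anchored = text.startswith("/")
    if anchored:
        text = text[1:]
    directory_only = text.endswith("/")
    if directory_only:
        text = text.rstrip("/")
    return RuleSketch(text, negated, directory_only, anchored)


rule = parse_rule_sketch("!/build/")
print(rule.pattern, rule.negated, rule.directory_only, rule.anchored)

try:
    rule.pattern = "dist"  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("rules cannot be modified after creation")
```

Freezing the dataclass makes rules hashable and safe to share between calls, which suits values that are parsed once and then only read.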
Default Ignored Directories
Location: docugen/core/scanner.py:8
The scanner automatically excludes the directories listed in DEFAULT_IGNORED_DIRS, regardless of .gitignore rules.
Helper Functions
The module includes internal helper functions that are not part of the public API:
- _load_gitignore_rules(root: Path) -> list[GitIgnoreRule] - Parses the .gitignore file (scanner.py:28)
- _match_rule(relative_path: str, is_dir: bool, rule: GitIgnoreRule) -> bool - Tests if a path matches a rule (scanner.py:64)
- _is_ignored(relative_path: str, is_dir: bool, rules: list[GitIgnoreRule]) -> bool - Determines if a path should be ignored (scanner.py:86)
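To make the division of labor between the matching helpers concrete, here is a minimal sketch of how rule evaluation might work. It assumes git's usual convention that the last matching rule wins and that a negated rule un-ignores a path; whether scanner.py follows exactly this logic is not confirmed here. RuleSketch, match_rule_sketch, and is_ignored_sketch are hypothetical names, and fnmatch is a simplification (its * also matches /, unlike real gitignore globs).

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class RuleSketch:
    """Hypothetical stand-in for GitIgnoreRule; field names are assumptions."""
    pattern: str
    negated: bool = False
    directory_only: bool = False
    anchored: bool = False


def match_rule_sketch(relative_path: str, is_dir: bool, rule: RuleSketch) -> bool:
    """Return True if a single rule applies to the given path."""
    if rule.directory_only and not is_dir:
        return False
    if rule.anchored or "/" in rule.pattern:
        # Anchored or multi-segment patterns are compared to the full relative path.
        return fnmatch(relative_path, rule.pattern)
    # Other patterns may match the final path component at any depth.
    return fnmatch(relative_path.rsplit("/", 1)[-1], rule.pattern)


def is_ignored_sketch(relative_path: str, is_dir: bool, rules: list) -> bool:
    """Apply rules in order; the last matching rule decides the outcome."""
    ignored = False
    for rule in rules:
        if match_rule_sketch(relative_path, is_dir, rule):
            ignored = not rule.negated
    return ignored


rules = [
    RuleSketch("*.pyc"),
    RuleSketch("keep.pyc", negated=True),
    RuleSketch("build", directory_only=True, anchored=True),
]
print(is_ignored_sketch("src/a.pyc", False, rules))     # True
print(is_ignored_sketch("src/keep.pyc", False, rules))  # False
print(is_ignored_sketch("build", True, rules))          # True
print(is_ignored_sketch("src/build", True, rules))      # False
```

Files inside an ignored directory are not tested individually in this sketch; a walker that prunes ignored directories before descending never visits them.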
