Package building is the most complex phase of the Azure Linux build system. It involves parsing dependencies, creating dependency graphs, resolving unmet dependencies, and orchestrating parallel builds across multiple workers.

Build Process Overview

The package building process consists of five stages:
  1. Dependency Extraction (specreader) - Parse all spec files and extract dependency information
  2. Graph Generation (grapher) - Convert dependencies into a directed acyclic graph (DAG)
  3. Dependency Resolution (graphpkgfetcher) - Resolve unmet dependencies from local cache or remote repos
  4. Build Scheduling (scheduler) - Orchestrate parallel builds respecting dependency order
  5. Package Building (pkgworker) - Build individual packages in isolated chroot environments

Complete Package Build Flow

Stage 1: Dependency Extraction

The specreader tool scans intermediate spec files and extracts dependency information using rpmspec -q inside the chroot worker.

What Gets Extracted

For each package in a spec file:
  • Provides - Package name(s), version, and virtual packages
  • BuildRequires - Packages needed to build (shared across all subpackages)
  • Requires - Packages needed at runtime (per subpackage)

Example Output

A simple spec might produce:
{
  "Provides": {
    "Name": "example",
    "Version": "1.0.0-1.cm1",
    "Condition": "="
  },
  "SrpmPath": "build/INTERMEDIATE_SRPMS/x86_64/example-1.0.0-1.cm1.src.rpm",
  "RpmPath": "out/RPMS/x86_64/example-1.0.0-1.x86_64.cm1.rpm",
  "SpecPath": "build/INTERMEDIATE_SPECS/example-1.0.0-1.cm1/example.spec",
  "Architecture": "x86_64",
  "Requires": [
    {
      "Name": "nano",
      "Version": "",
      "Condition": ""
    }
  ],
  "BuildRequires": null
}
Output location: ./../build/pkg_artifacts/specs.json
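
As a rough illustration of how such a record can be consumed, the Python sketch below parses one entry shaped like the example above and lists its dependencies. The record is inlined as a string because the overall layout of specs.json (how individual records are wrapped) is not shown here, and the helper name summarize is hypothetical.

```python
import json

# One record shaped like the example above, inlined for illustration.
RECORD = """
{
  "Provides": {"Name": "example", "Version": "1.0.0-1.cm1", "Condition": "="},
  "Architecture": "x86_64",
  "Requires": [{"Name": "nano", "Version": "", "Condition": ""}],
  "BuildRequires": null
}
"""

def summarize(record):
    """Return (name, version, runtime deps, build deps) for one record."""
    provides = record["Provides"]
    # A JSON null (None in Python) is treated as "no dependencies of this kind".
    requires = [dep["Name"] for dep in record["Requires"] or []]
    build_requires = [dep["Name"] for dep in record["BuildRequires"] or []]
    return provides["Name"], provides["Version"], requires, build_requires

summary = summarize(json.loads(RECORD))
print(summary)
```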

Rich Dependencies

Spec files can use complex requirement expressions:
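  • and, or, with - Every package named in the expression is recorded as a dependency, which leaves the final choice to install time. For example: Requires: (foo or bar)
Because every alternative is recorded, all optional RPMs must be available, even if a specific configuration never uses them.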
The build system prints warnings for rich dependencies. If the build fails, ensure all conditional packages follow the guidelines or remove unavailable packages from the spec.
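
To make the "record every option" idea concrete, here is a minimal Python sketch that collects every package name mentioned in a rich dependency expression. It is not how specreader parses specs; the function name and the tokenizing approach are assumptions made for illustration, and only the and, or, and with operators are handled.

```python
import re

# Boolean operators named in the text; any other word is treated as a name.
OPERATORS = {"and", "or", "with"}
COMPARISONS = {"=", "<", ">", "<=", ">="}

def rich_dependency_names(expression):
    """Collect every package name mentioned in a rich dependency expression."""
    tokens = re.findall(r"[^\s()]+", expression)   # split on spaces and parens
    names = []
    skip_next = False
    for token in tokens:
        if skip_next:              # this token is the version after a comparison
            skip_next = False
        elif token in COMPARISONS:
            skip_next = True
        elif token not in OPERATORS:
            names.append(token)
    return names

print(rich_dependency_names("(foo or bar)"))
```

Both foo and bar end up in the result, which is why both RPMs must be available to the build even though only one is needed at install time.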

Stage 2: Dependency Graphing

The grapher tool converts specs.json into a directed acyclic graph (DAG) representing package dependencies.

Graph Node Types

Build node - Represents a local package that can be built. States:
  • StateBuild - Should be built
  • StateBuildError - Dependencies satisfied but build failed
  • StateUpToDate - Package already available locally
Run node - Represents a package that can be installed or used as a dependency. States:
  • StateMeta - Organizational node for imposing ordering
Remote node - An unknown package needed as a dependency, which must be resolved remotely. States:
  • StateUnresolved - No source found yet
  • StateCached - Remote source found and package cached locally
Goal node - A special node that groups packages together. The grapher automatically adds an “ALL” goal node linking to every package. States:
  • StateMeta - If satisfied, all grouped packages are available
Pure meta node (TypePureMeta) - Purely organizational, used to resolve intra-package cycles. States:
  • StateMeta - Organizational node

Graph Generation Process

  1. Create Nodes - Each package gets two nodes: a BUILD node and a RUN node
  2. Link Build to Run - Add an edge from RUN → BUILD (a package can’t be installed until it is built)
  3. Add BuildRequires - Add edges from the current BUILD node → required RUN nodes
  4. Add Requires - Add edges from the current RUN node → required RUN nodes
  5. Resolve Cycles - Fix intra-package cycles using meta nodes
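
Steps 1 to 4 above can be sketched in a few lines of Python. This is a simplified model rather than the grapher's implementation: nodes are plain (kind, name) tuples, the input dictionary shape is an assumption, and cycle resolution and unresolved nodes are left out.

```python
def build_graph(packages):
    """Build a dependency graph from {name: {"BuildRequires": [...], "Requires": [...]}}.

    Returns an adjacency map where an edge A -> B means "A depends on B".
    Nodes are (kind, name) tuples with kind "BUILD" or "RUN".
    """
    edges = {}
    for name, deps in packages.items():
        build, run = ("BUILD", name), ("RUN", name)
        edges.setdefault(build, set())           # step 1: create both nodes
        edges.setdefault(run, set()).add(build)  # step 2: RUN -> BUILD
        for required in deps.get("BuildRequires", []):
            edges[build].add(("RUN", required))  # step 3: BUILD -> required RUN
        for required in deps.get("Requires", []):
            edges[run].add(("RUN", required))    # step 4: RUN -> required RUN
    return edges

graph = build_graph({
    "app": {"BuildRequires": ["lib"], "Requires": ["lib"]},
    "lib": {},
})
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```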

Package Lookup and Versioning

Dependencies specify requirements with varying detail:
  • Simple name: Requires: example
  • Version constraint: Requires: example >= 1.0.0
  • Double constraint: Requires: example >= 1.0.0, Requires: example < 2.0.0
  • Exact version: Requires: example = 1.0.0
The grapher maintains sorted lookup lists and selects the highest version package satisfying requirements. If no match is found, an unresolved node is added.
Versions are split into version and release components. Release numbers are only considered if both versions explicitly contain one.
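
The selection rule can be illustrated with the Python sketch below. It deliberately uses a simplified comparison (dot-separated integers for the version, the leading integer of the release) instead of RPM's real version comparison algorithm, and all function names are hypothetical.

```python
import operator
from functools import cmp_to_key

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}

def split(version):
    """Split 'V-R' into (tuple of ints, release int or None)."""
    ver, _, rel = version.partition("-")
    numbers = tuple(int(part) for part in ver.split("."))
    release = int(rel.split(".")[0]) if rel else None
    return numbers, release

def compare(a, b):
    """Return -1, 0 or 1; the release counts only if both sides have one."""
    (ver_a, rel_a), (ver_b, rel_b) = split(a), split(b)
    if ver_a != ver_b:
        return -1 if ver_a < ver_b else 1
    if rel_a is None or rel_b is None or rel_a == rel_b:
        return 0
    return -1 if rel_a < rel_b else 1

def satisfies(candidate, constraints):
    """True if the candidate meets every (operator, version) constraint."""
    return all(OPS[op](compare(candidate, wanted), 0) for op, wanted in constraints)

def pick(available, constraints):
    """Highest satisfying version, or None (an unresolved node would be added)."""
    matches = [v for v in available if satisfies(v, constraints)]
    return max(matches, key=cmp_to_key(compare)) if matches else None

available = ["0.9.0-1", "1.2.0-1", "1.2.0-3", "2.0.0-1"]
best = pick(available, [(">=", "1.0.0"), ("<", "2.0.0")])
print(best)
```

Note that the constraint < 2.0.0 excludes 2.0.0-1 because the constraint carries no release, so only the version components are compared.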

Cycle Resolution

Circular dependencies are generally fatal errors, except for one special case.
Solvable Cycles:
  • All nodes in cycle from the same SPEC file
  • All dependencies are runtime dependencies
Solution: Insert a TypePureMeta node consolidating all cycle dependencies. Cycle nodes then depend on the meta node instead of each other.
Example: nodes A-a and A-b are from the same spec and require each other.
Before:
A-a → B
A-a → A-b
A-b → C
A-b → A-a
After (meta node ID=8):
A-a → meta(8)
A-b → meta(8)
meta(8) → B
meta(8) → C
The meta node consolidates requirements, breaking the cycle.
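
The transformation in this example can be reproduced with a short Python sketch. The graph is modeled as a plain dictionary of sets and the function name break_cycle is hypothetical; the real grapher works on its own node types.

```python
def break_cycle(edges, cycle, meta):
    """Route the dependencies of all `cycle` nodes through one meta node.

    edges maps node -> set of nodes it depends on; it is modified in place.
    """
    members = set(cycle)
    external = set()
    for node in cycle:
        # Collect the dependencies that point outside the cycle ...
        external |= edges[node] - members
        # ... and make the node depend on the meta node only.
        edges[node] = {meta}
    # The meta node carries the consolidated requirements.
    edges[meta] = external
    return edges

edges = {"A-a": {"B", "A-b"}, "A-b": {"C", "A-a"}}
break_cycle(edges, ["A-a", "A-b"], "meta(8)")
for node in sorted(edges):
    print(node, "->", sorted(edges[node]))
```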

Dynamic Dependencies

Some provides are only known after building. For example, bar may provide pkgconfig(bar), but this is only known after building bar. These implicit provides create dynamic dependencies, so a graph may not be solvable until some packages are built. The scheduler handles this by analyzing built RPMs and updating the graph.
Output location: ./../build/pkg_artifacts/graph.dot

Stage 3: Dependency Resolution

The graphpkgfetcher tool resolves unresolved remote nodes by finding or downloading missing packages.

Search Order

The tool searches five locations using tdnf in a chroot:
  1. Local Chroot - Already installed in the worker environment
  2. Built RPMs - Previously built packages in ./../out/RPMS/
  3. Upstream Base - Official Azure Linux base repository
  4. Upstream Preview - Preview repository (if USE_PREVIEW_REPO=y)
  5. Custom Repos - Any repos listed in REPO_LIST
Set DISABLE_UPSTREAM_REPOS=y to disable all network-accessed repositories.

How It Works

  • tdnf prioritizes local packages over remote downloads
  • Local packages are accessed via mounted overlay in chroot
  • Cached RPMs are written through a writable mount
  • Local folder is converted to a repo (changes not persisted outside chroot)
  • Satisfied nodes are marked as StateCached
Output location: ./../build/pkg_artifacts/cached_graph.dot
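
The real lookup is delegated to tdnf, but the priority order can be modeled with a small Python sketch. Everything here is illustrative: the Source class, its network flag, and the package sets are assumptions, and whether a custom repo counts as network-accessed depends on where it actually lives.

```python
from dataclasses import dataclass, field

@dataclass
class Source:
    name: str
    network: bool                       # hypothetical flag: needs network access
    packages: set = field(default_factory=set)

def find_package(package, sources, disable_upstream_repos=False):
    """Return the name of the first source providing `package`, else None."""
    for source in sources:              # list order is the search order
        if disable_upstream_repos and source.network:
            continue                    # models DISABLE_UPSTREAM_REPOS=y
        if package in source.packages:
            return source.name
    return None

# Made-up contents; the preview repo is omitted, as if USE_PREVIEW_REPO were unset.
SOURCES = [
    Source("local chroot", False, {"bash"}),
    Source("built RPMs", False, {"example"}),
    Source("upstream base", True, {"bash", "nano"}),
    Source("custom repo", True, {"nano", "tool"}),
]

print(find_package("bash", SOURCES))
print(find_package("nano", SOURCES))
print(find_package("nano", SOURCES, disable_upstream_repos=True))
```

A package found in an earlier source is never looked up in a later one, which is how local packages take priority over remote downloads.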

Stage 4: Build Scheduling

The scheduler tool orchestrates parallel package builds using a pool of pkgworker agents.

Scheduling Algorithm

  1. Start at Leaf Nodes - Find packages with all dependencies satisfied
  2. Schedule Builds - Spawn pkgworker for each ready package
  3. Update Graph - Mark built nodes complete, analyze for implicit provides
  4. Find New Leaves - Identify packages newly ready to build
  5. Repeat - Continue until all required packages are built
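
The loop above is essentially a leaf-first traversal of the dependency graph. The Python sketch below groups packages into "waves" that could build in parallel; it is a simplification, since a real scheduler can start a package as soon as its own dependencies finish rather than waiting for a whole wave, and the function name is hypothetical.

```python
def schedule(dependencies):
    """Group packages into waves; packages in one wave could build in parallel.

    dependencies maps package -> set of packages that must be built first.
    """
    remaining = {pkg: set(deps) for pkg, deps in dependencies.items()}
    done, waves = set(), []
    while remaining:
        # Leaves: packages whose dependencies are all satisfied.
        ready = sorted(pkg for pkg, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"cycle or unresolved dependency among {sorted(remaining)}")
        waves.append(ready)     # a real scheduler would hand these to workers
        done.update(ready)      # mark built nodes complete
        for pkg in ready:
            del remaining[pkg]
    return waves

waves = schedule({
    "app": {"libfoo", "libbar"},
    "libfoo": {"base"},
    "libbar": {"base"},
    "base": set(),
})
print(waves)
```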

Build Optimization

The scheduler avoids unnecessary rebuilds:
  • Skips building if package already exists
  • But rebuilds if any build dependency was rebuilt
  • This ensures consistency across the build
The scheduler continuously optimizes the graph during the build, creating subgraphs that contain only needed packages when no dynamic dependencies are present.
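
The skip-or-rebuild rule can be expressed as a short Python sketch. The inputs (a dependency-ordered list, a map of build dependencies, and the set of packages whose RPMs already exist) and the function name are assumptions made for illustration.

```python
def plan_rebuilds(build_order, build_requires, existing):
    """Return the set of packages that need to be built or rebuilt.

    build_order: packages in dependency order (dependencies first).
    build_requires: package -> set of its build dependencies.
    existing: packages whose RPMs are already available.
    """
    rebuilt = set()
    for pkg in build_order:
        missing = pkg not in existing
        # An existing package is stale if any build dependency was rebuilt.
        stale = bool(build_requires.get(pkg, set()) & rebuilt)
        if missing or stale:
            rebuilt.add(pkg)
    return rebuilt

to_build = plan_rebuilds(
    build_order=["base", "libfoo", "app", "tool"],
    build_requires={"libfoo": {"base"}, "app": {"libfoo"}, "tool": {"base"}},
    existing={"base", "app", "tool"},
)
print(sorted(to_build))
```

Here app already exists but is rebuilt anyway because libfoo, one of its build dependencies, had to be built; tool is skipped because base was not rebuilt.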

Dynamic Dependency Handling

When a package is built:
  1. Analyze RPM - Extract implicit provides from built package
  2. Match Dependencies - Find if any graph node needs this implicit provide
  3. Update Graph - Modify nodes and edges to reflect new information
  4. Attempt Subgraph - Try to create solvable subgraph with complete info
If dynamic dependencies exist, the full graph is maintained until enough packages are built to resolve all dynamic dependencies.
Output location: ./../build/pkg_artifacts/built_graph.dot
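
A minimal Python sketch of the matching and update steps follows. It assumes unresolved requirements are tracked as a map from requirement name to the packages waiting on it; that structure, the function names, and the sample provide libbar.so.1 are all hypothetical.

```python
def apply_implicit_provides(unresolved, provider, implicit_provides):
    """Point waiting packages at the package that turned out to provide a requirement.

    unresolved: requirement name -> set of packages waiting on it (modified in place).
    Returns {waiting package: set of providers it now depends on}.
    """
    new_edges = {}
    for provide in implicit_provides:
        # Remove the requirement from the unresolved set if anyone needed it.
        for waiter in unresolved.pop(provide, set()):
            new_edges.setdefault(waiter, set()).add(provider)
    return new_edges

def can_use_subgraph(unresolved):
    """A solvable subgraph is only attempted once nothing is left unresolved."""
    return not unresolved

unresolved = {"pkgconfig(bar)": {"foo"}, "pkgconfig(baz)": {"qux"}}
new_edges = apply_implicit_provides(unresolved, "bar", ["pkgconfig(bar)", "libbar.so.1"])
print(new_edges, can_use_subgraph(unresolved))
```

After bar is built, foo gains an edge to bar, but pkgconfig(baz) is still unresolved, so the full graph would be kept for now.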

Stage 5: Package Building

The pkgworker tool (invoked by scheduler) builds individual packages in isolated chroot environments.

Build Process

  1. Create Build Directory - Fresh empty folder for this package
  2. Extract Chroot - Unpack chroot worker archive with toolchain
  3. Mount Local RPMs - Make built packages accessible to worker
  4. Install Dependencies - Use tdnf to install build dependencies
  5. Build Package - Run rpmbuild to compile the package
  6. Copy Output - Place built RPMs in ./../out/RPMS/
  7. Cleanup - Safely unmount and remove chroot (even on error)
Each package builds in complete isolation. The pkgworker ensures safe cleanup even if the build fails, preventing chroot mount issues.
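
The "cleanup even on error" guarantee follows a familiar pattern, sketched below in Python with a temporary directory standing in for the chroot. This is not the pkgworker implementation: the function names are hypothetical, and the chroot extraction, mounts, tdnf, and rpmbuild steps are represented by a single callback.

```python
import os
import shutil
import tempfile

def build_in_isolation(package, build_step):
    """Run build_step(workdir) in a fresh directory that is always removed."""
    workdir = tempfile.mkdtemp(prefix=f"{package}-")   # fresh empty folder
    try:
        # Extracting the chroot, mounting RPMs, installing dependencies and
        # running rpmbuild would happen here; build_step stands in for them.
        return build_step(workdir)
    finally:
        # Runs on success and on failure, mirroring the cleanup guarantee.
        shutil.rmtree(workdir, ignore_errors=True)

seen = []

def failing_build(workdir):
    seen.append(workdir)
    raise RuntimeError("simulated build failure")

try:
    build_in_isolation("example", failing_build)
except RuntimeError as err:
    print("build failed:", err)

print("workdir removed:", not os.path.exists(seen[0]))
```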

Visualizing Graphs

All graph files use GraphViz dot format and include human-readable information for debugging.

View Graphs Visually

# Generate PNG visualization
dot -Tpng -o visualized.png < graph.dot

# Or use other GraphViz tools
fdp -Tsvg graph.dot > graph.svg
neato -Tpdf graph.dot > graph.pdf
Note: Graphs for large builds are often impractically large to visualize. They’re still useful for programmatic analysis or grepping for specific packages.

Graph Files

  • graph.dot - Initial graph from grapher
  • cached_graph.dot - After dependency resolution
  • built_graph.dot - Final state after all builds

Next Steps

  • Image Generation - Learn how packages are composed into bootable images
  • Local Packages - Return to local package handling
