This guide covers building the Apache Arrow C++ library from source using CMake. Arrow is built out of source: build artifacts go into a separate directory, so a single source tree can hold several differently configured builds.

Prerequisites

System Requirements

  • C++20-enabled compiler: GCC 12+, Clang 14+, or MSVC 2019+
  • CMake: Version 3.25 or higher
  • Build system: Ninja (recommended) or Make
  • Memory: At least 1GB RAM for a minimal build (4GB for debug builds, 8GB for full builds)
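
The CMake minimum listed above can be checked before configuring. The following is a minimal sketch, not part of Arrow's tooling: `version_ge` and `check_cmake` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does, as do recent BSD and macOS versions).

```shell
# version_ge A B: succeed when dotted version A is >= B.
# Assumption: `sort -V` (version sort) is available on this system.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_cmake [MIN]: report whether the cmake on PATH meets the minimum.
# The 3.25 default mirrors the requirement listed in this guide.
check_cmake() {
    required="${1:-3.25}"
    if ! command -v cmake >/dev/null 2>&1; then
        echo "cmake not found on PATH" >&2
        return 1
    fi
    # The first line of `cmake --version` looks like: "cmake version 3.28.3"
    found="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$found" "$required"; then
        echo "cmake $found is new enough (need >= $required)"
    else
        echo "cmake $found is too old (need >= $required)" >&2
        return 1
    fi
}
```

Calling `check_cmake` with no argument uses the 3.25 minimum; pass another version to test against a different floor.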

Installing Build Tools

# Debian/Ubuntu (package names differ on other distributions)
sudo apt-get install \
    build-essential \
    cmake \
    ninja-build

Getting the Source

git clone https://github.com/apache/arrow.git
cd arrow/cpp

Build Configuration

Using CMake Presets

Arrow provides convenient CMake presets for common configurations. List available presets:
cmake --list-presets
Available presets include:
  • ninja-debug-minimal - Debug build without optional components
  • ninja-debug-basic - Debug build with tests and reduced dependencies
  • ninja-debug - Full debug build with tests
  • ninja-release-minimal - Minimal release build
  • ninja-release - Full release build
Inspect a preset’s configuration:
cmake -N --preset ninja-debug-minimal
Build using a preset:
mkdir build && cd build
cmake .. --preset ninja-debug-minimal
cmake --build .

Manual Configuration

For more control, configure CMake manually:
mkdir build-release && cd build-release

cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DARROW_BUILD_TESTS=OFF \
    -DARROW_COMPUTE=ON \
    -DARROW_CSV=ON \
    -DARROW_DATASET=ON \
    -DARROW_FILESYSTEM=ON \
    -DARROW_PARQUET=ON

cmake --build . --parallel $(nproc)
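
`nproc` ships with GNU coreutils and may be missing on macOS or BSD systems. One possible fallback chain is sketched below; `detect_jobs` is a hypothetical helper name, not something Arrow provides.

```shell
# detect_jobs: print the number of online CPUs, falling back to 1.
detect_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        # GNU coreutils (most Linux distributions)
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        # Widely available via getconf, including on macOS
        getconf _NPROCESSORS_ONLN
    else
        # Unknown platform: build serially rather than guess
        echo 1
    fi
}
```

It can then replace the `$(nproc)` call, for example `cmake --build . --parallel "$(detect_jobs)"`.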

Key Build Options

Core Options

Option                  Default     Description
CMAKE_BUILD_TYPE        Release     Build type: Debug, Release, RelWithDebInfo
CMAKE_INSTALL_PREFIX    /usr/local  Installation directory
ARROW_BUILD_STATIC      ON          Build static libraries
ARROW_BUILD_SHARED      ON          Build shared libraries
ARROW_BUILD_TESTS       OFF         Build unit tests
ARROW_BUILD_BENCHMARKS  OFF         Build benchmarks

Component Options

Option            Description
ARROW_COMPUTE     Compute functions and kernels
ARROW_CSV         CSV reader/writer
ARROW_DATASET     Dataset API for reading partitioned data
ARROW_FILESYSTEM  Filesystem abstraction (S3, GCS, HDFS)
ARROW_FLIGHT      Arrow Flight RPC framework
ARROW_FLIGHT_SQL  Flight SQL protocol
ARROW_GANDIVA     LLVM-based expression compiler
ARROW_IPC         Inter-process communication
ARROW_JSON        JSON reader
ARROW_ORC         ORC file format support
ARROW_PARQUET     Parquet file format support
ARROW_ACERO       Acero streaming execution engine

Advanced Options

Option                    Description
ARROW_JEMALLOC            Use jemalloc for memory allocation
ARROW_MIMALLOC            Use mimalloc for memory allocation
ARROW_USE_CCACHE          Use ccache for faster rebuilds
ARROW_SIMD_LEVEL          SIMD optimization level (NONE, SSE4_2, AVX2, AVX512)
ARROW_RUNTIME_SIMD_LEVEL  Runtime SIMD dispatch level
Enable ARROW_JEMALLOC or ARROW_MIMALLOC for significantly better memory allocation performance in production builds.

Building with Dependencies

Bundled vs. System Dependencies

Arrow can either bundle dependencies or use system-installed versions:
# Use system dependencies (recommended for package maintainers)
cmake .. \
    -DARROW_DEPENDENCY_SOURCE=SYSTEM \
    -DARROW_PARQUET=ON

# Bundle dependencies (recommended for development)
cmake .. \
    -DARROW_DEPENDENCY_SOURCE=BUNDLED \
    -DARROW_PARQUET=ON

# AUTO (default): use system packages where found, build the rest from source
cmake .. \
    -DARROW_DEPENDENCY_SOURCE=AUTO \
    -DARROW_PARQUET=ON

Common Dependencies

  • boost - Required by some components
  • brotli, lz4, snappy, zstd - Compression libraries
  • gflags, glog, gtest - Development utilities
  • protobuf, grpc - Required for Arrow Flight
  • thrift - Required for Parquet
  • re2, utf8proc - String processing

Installation

Install Arrow after building:
# Install to default location (/usr/local)
sudo cmake --install .

# Install to custom location
cmake --install . --prefix=/opt/arrow

Setting Installation Path

Specify installation prefix during configuration:
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/.local
cmake --build .
cmake --install .
Update your environment:
export PATH=$HOME/.local/bin:$PATH
export LD_LIBRARY_PATH=$HOME/.local/lib:$LD_LIBRARY_PATH
export CMAKE_PREFIX_PATH=$HOME/.local:$CMAKE_PREFIX_PATH
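
When a variable such as `LD_LIBRARY_PATH` starts out empty, the exports above leave a trailing colon, and the dynamic loader treats that empty entry as the current directory. Sourcing the lines twice also duplicates entries. A minimal sketch of a helper that avoids both is shown here; `prepend_path` is a hypothetical name and the function assumes a POSIX shell.

```shell
# prepend_path VAR DIR: put DIR at the front of the colon-separated
# list in VAR, unless it is already there. Never leaves an empty entry.
prepend_path() {
    # Read the current value of the variable named by $1 ("" if unset).
    eval "_pp_current=\${$1:-}"
    case ":$_pp_current:" in
        *":$2:"*)
            # DIR is already listed: leave the variable unchanged.
            ;;
        *)
            if [ -n "$_pp_current" ]; then
                eval "export $1=\"\$2:\$_pp_current\""
            else
                eval "export $1=\"\$2\""
            fi
            ;;
    esac
}
```

With this in a shell profile, the three exports become `prepend_path PATH "$HOME/.local/bin"`, `prepend_path LD_LIBRARY_PATH "$HOME/.local/lib"`, and `prepend_path CMAKE_PREFIX_PATH "$HOME/.local"`.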

Using Arrow in Your Project

CMake Integration

Create a CMakeLists.txt file:
cmake_minimum_required(VERSION 3.25)
project(MyArrowApp)

# Find Arrow
find_package(Arrow REQUIRED)

# Create executable
add_executable(my_app main.cpp)

# Link Arrow libraries
target_link_libraries(my_app PRIVATE Arrow::arrow_shared)

# Optional: Link additional components
find_package(ArrowCompute REQUIRED)
find_package(Parquet REQUIRED)
target_link_libraries(my_app PRIVATE 
    ArrowCompute::arrow_compute_shared
    Parquet::parquet_shared)
Use Arrow::arrow_shared for shared libraries (recommended) or Arrow::arrow_static for static linking.

Available Packages

Arrow provides separate packages for each component:
  • Arrow - Core library
  • ArrowCompute - Compute functions
  • ArrowDataset - Dataset API
  • ArrowAcero - Acero execution engine
  • ArrowFlight - Flight RPC
  • ArrowFlightSql - Flight SQL
  • Parquet - Parquet format
  • Gandiva - Expression compiler
Each follows the naming pattern:
  • find_package: find_package(PackageName REQUIRED)
  • Shared target: PackageName::package_name_shared
  • Static target: PackageName::package_name_static
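
To make the pattern concrete, the sketch below derives both target names from a package name by converting CamelCase to snake_case. `arrow_targets` is a hypothetical helper written for illustration; it only encodes the naming rule described above and does not query CMake.

```shell
# arrow_targets PACKAGE: print the shared and static target names
# that follow the PackageName::package_name_{shared,static} pattern.
arrow_targets() {
    # Insert "_" at each lower-to-upper boundary, then lowercase everything:
    # ArrowFlightSql -> Arrow_Flight_Sql -> arrow_flight_sql
    snake="$(printf '%s' "$1" \
        | sed 's/\([a-z0-9]\)\([A-Z]\)/\1_\2/g' \
        | tr '[:upper:]' '[:lower:]')"
    printf '%s::%s_shared\n' "$1" "$snake"
    printf '%s::%s_static\n' "$1" "$snake"
}
```

For example, `arrow_targets ArrowCompute` prints `ArrowCompute::arrow_compute_shared` followed by `ArrowCompute::arrow_compute_static`, matching the target used in the CMakeLists.txt example.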

pkg-config

Alternatively, use pkg-config:
# Get compiler and linker flags
pkg-config --cflags --libs arrow

# For static linking
pkg-config --cflags --libs --static arrow
Makefile example:
my_app: main.cpp
	$(CXX) -o $@ $(CXXFLAGS) $< $$(pkg-config --cflags --libs arrow)

Testing

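Tests are built only when Arrow is configured with -DARROW_BUILD_TESTS=ON (the default is OFF). Run them from the build directory: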
# Run all tests
ctest

# Run tests in parallel
ctest -j$(nproc)

# Run specific test
ctest -R arrow-array-test

# Verbose output
ctest -V
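Or run test executables directly. They are written to a subdirectory of the build directory named after the build type (debug/ for a Debug build, release/ for a Release build):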
# Run specific test executable
./debug/arrow-array-test

# Run with Google Test filters
./debug/arrow-array-test --gtest_filter=TestInt64Builder*

Troubleshooting

Out of Memory

Reduce parallel jobs:
cmake --build . --parallel 2
Or build specific targets:
cmake --build . --target arrow
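
Instead of hard-coding the job count, it can be derived from the memory available. The sketch below is a rough heuristic, not an Arrow recommendation: `jobs_for_memory` is a hypothetical helper, and the 2048 MiB per compile job default is an assumption that should be tuned for the build type and enabled components.

```shell
# jobs_for_memory MEM_MIB CORES [PER_JOB_MIB]
# Print a parallel job count limited by both memory and CPU cores.
jobs_for_memory() {
    mem_mib="$1"
    cores="$2"
    per_job_mib="${3:-2048}"   # assumed memory budget per job, tune as needed

    jobs=$(( mem_mib / per_job_mib ))
    # Never exceed the number of cores.
    if [ "$jobs" -gt "$cores" ]; then
        jobs="$cores"
    fi
    # Always allow at least one job.
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}
```

For instance, `jobs_for_memory 4096 8` prints 2, which can be passed on as `cmake --build . --parallel 2`. On Linux, the memory figure could be read from `/proc/meminfo`; other platforms need a different source.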

Missing Dependencies

Use bundled dependencies:
cmake .. -DARROW_DEPENDENCY_SOURCE=BUNDLED

CMake Can’t Find Arrow

Set CMAKE_PREFIX_PATH:
cmake .. -DCMAKE_PREFIX_PATH=/path/to/arrow/install
For faster incremental builds, enable ccache: -DARROW_USE_CCACHE=ON
