This guide covers building the Arrow R package from source. The R package can either use system-installed Arrow C++ libraries or bundle its own.

System Requirements

  • R 4.0 or later
  • C++20 compiler (R 4.3+ on Windows)
  • CMake 3.25 or higher (when building with bundled C++)
  • Arrow C++ libraries (unless bundling)
As of Arrow version 23.0.0, a C++20 compiler is required. This means:
  • Windows: Requires R 4.3 or later (R 4.2 has incomplete C++20 support)
  • CentOS 7: Default compiler is too old; use a newer gcc/clang
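
The CMake requirement above can be checked before starting a build. The following is a minimal sketch, not part of the Arrow tooling: `version_ge` is a hypothetical helper written for this example, and it assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes sort(1) supports -V (version sort), as GNU coreutils does.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed CMake against the 3.25 minimum listed above.
if command -v cmake >/dev/null 2>&1; then
  cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
  if version_ge "$cmake_version" 3.25; then
    echo "CMake $cmake_version: OK"
  else
    echo "CMake $cmake_version: older than 3.25"
  fi
else
  echo "CMake not found (only needed when building bundled C++)"
fi
```

The same helper could be pointed at the output of `gcc -dumpversion` or similar, but which compiler version is sufficient for C++20 is not something this sketch tries to decide.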

Installation Methods

There are several ways to install the Arrow R package. The simplest method for most users is to install from CRAN:
install.packages("arrow")
This typically works without additional dependencies on Windows and macOS.

From R-universe

R-universe provides pre-compiled binaries:
install.packages("arrow", 
  repos = c("https://apache.r-universe.dev", "https://cloud.r-project.org"))

From Conda

conda install -c conda-forge --strict-channel-priority r-arrow

Building from Source

For developers contributing to Arrow R, you’ll want to build from source.

Method 1: Using System Arrow C++ Libraries

To build the R package against Arrow C++ libraries installed on your system:

1. Install Arrow C++ libraries

Ubuntu/Debian
# Add Apache Arrow repository
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt install -y ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt update

# Install Arrow C++ libraries
sudo apt install -y libarrow-dev libparquet-dev

Homebrew (macOS)
brew install apache-arrow

Conda
conda install -c conda-forge arrow-cpp

2. Set environment variables

# Point to Arrow installation
export ARROW_HOME=/usr/local  # or path to Arrow installation

3. Install R package

# From CRAN, in an R session (with system libraries)
install.packages("arrow")

# Or from a source checkout, in a shell
R CMD INSTALL arrow/r
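
Before installing the R package, it can help to confirm that the system Arrow C++ libraries are discoverable. The sketch below assumes the libraries ship a pkg-config module named `arrow` and that `ARROW_HOME` uses a `lib/pkgconfig` layout (the same path used in the Troubleshooting section). The function name `check_arrow_pkgconfig` is made up for this example.

```shell
#!/bin/sh
# Sketch: report whether pkg-config can locate Arrow C++.
# Assumes a pkg-config module named "arrow" and, when ARROW_HOME is
# set, a lib/pkgconfig directory underneath it.
check_arrow_pkgconfig() {
  if ! command -v pkg-config >/dev/null 2>&1; then
    echo "pkg-config not installed"
    return 2
  fi
  if [ -n "${ARROW_HOME:-}" ]; then
    # Prepend the Arrow location, keeping any existing search path.
    PKG_CONFIG_PATH="$ARROW_HOME/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
    export PKG_CONFIG_PATH
  fi
  if pkg-config --exists arrow; then
    echo "Arrow C++ $(pkg-config --modversion arrow) found"
  else
    echo "Arrow C++ not found via pkg-config"
    return 1
  fi
}

check_arrow_pkgconfig || true
```

If the check reports that Arrow C++ was not found, the usual suspects are an unset `ARROW_HOME` or a library directory other than `lib` (some distributions use `lib64` or a multiarch path).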

Method 2: Bundling Arrow C++ (Development Builds)

For development, the R package can bundle Arrow C++ libraries:
1. Clone Arrow repository

git clone https://github.com/apache/arrow.git
cd arrow

2. Install R package dependencies

# In R console
install.packages(c("devtools", "roxygen2", "pkgdown"))
devtools::install_dev_deps("r")

3. Build with bundled C++ libraries

From the r/ subdirectory:

cd r

# Set options for bundled build
# LIBARROW_BINARY=true permits pre-built Arrow C++ binaries (see Environment
# Variables); set it to false to force a build from the bundled source
export LIBARROW_BINARY=true
export LIBARROW_MINIMAL=false

# Install with bundled libraries
R CMD INSTALL .

Or use the Makefile targets:

cd r
make sync-cpp  # Copy C++ source into R package
make build     # Build the package
make install   # Install the package

The make build command:
  1. Copies Arrow C++ source into r/tools/cpp
  2. Prunes unnecessary components
  3. Runs R CMD build to create the source tarball

Development Workflow

Quick Development Cycle

For rapid iteration during development:
1. Load package in development mode

# From arrow/r directory
devtools::load_all()

2. Make changes to R code

Edit files in r/R/

3. Reload

devtools::load_all()

4. Run tests

devtools::test()

Rebuilding After C++ Changes

If you modify Arrow C++ code:
1. Rebuild Arrow C++

# From arrow/cpp directory
cmake --build build --target install

2. Reinstall R package

# If using system libraries, just reinstall (in R, from the repository root)
devtools::install("r")

# If bundling, resync C++ first (in a shell)
cd r
make sync-cpp
make install

Build Configuration

Environment Variables

Variable          Description                            Default
LIBARROW_BINARY   Use pre-built Arrow C++ binaries       true
LIBARROW_MINIMAL  Build with minimal features            false
LIBARROW_BUILD    Build Arrow C++ from source if needed  true
ARROW_HOME        Path to Arrow C++ installation         -
ARROW_R_DEV       Enable development mode                false
NOT_CRAN          Bypass CRAN-specific restrictions      false
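
When a build behaves unexpectedly, it is often useful to see which of these variables are set in the current shell. The following sketch prints each one; `show_var` is a hypothetical helper, and the fallback values are simply copied from the table above for display (they are not read from the package).

```shell
#!/bin/sh
# Print a build variable with the value the build would see.
# $1 = variable name, $2 = default to display when it is unset or empty.
show_var() {
  name=$1
  default=$2
  eval "value=\${$name:-}"
  if [ -n "$value" ]; then
    echo "$name=$value (set)"
  else
    echo "$name=$default (default)"
  fi
}

# Defaults below mirror the table above.
show_var LIBARROW_BINARY true
show_var LIBARROW_MINIMAL false
show_var LIBARROW_BUILD true
show_var ARROW_HOME "<unset>"
show_var ARROW_R_DEV false
show_var NOT_CRAN false
```

Running this in the same shell that will launch `R CMD INSTALL` matters, because variables exported in a different terminal or inside an R session are not visible to each other.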

Feature Flags

Control which Arrow features are enabled:
# Enable specific features
export LIBARROW_MINIMAL=false
export ARROW_S3=ON
export ARROW_GCS=ON
export ARROW_PARQUET=ON
export ARROW_DATASET=ON

R CMD INSTALL r/
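
After installing with feature flags, the result can be inspected from the shell. This sketch calls `arrow::arrow_info()`, which reports the capabilities of the installed build; the wrapper function name is made up for this example, and it assumes `Rscript` is on the PATH and the arrow package is already installed.

```shell
#!/bin/sh
# Sketch: print which optional features the installed arrow package
# was built with. Requires Rscript and an installed arrow package.
show_arrow_capabilities() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH"
    return 1
  fi
  Rscript -e 'print(arrow::arrow_info()$capabilities)'
}

show_arrow_capabilities || true
```

If a feature that was switched on still shows as unavailable, a likely explanation is that a pre-built binary was used instead of a source build, so the flags never reached the C++ build.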

Testing

Run All Tests

devtools::test()

Run Specific Test File

devtools::test(filter = "parquet")

Run Tests with Coverage

covr::package_coverage()

Integration Tests

# Set test data location
export ARROW_TEST_DATA=/path/to/arrow/testing/data

# Run integration tests
cd r/tests
Rscript testthat.R
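
Tests that depend on external data tend to fail with unhelpful messages when the path is wrong, so a quick check before running them can save time. A minimal sketch (the function name `check_test_data` is hypothetical):

```shell
#!/bin/sh
# Sketch: confirm ARROW_TEST_DATA points at an existing directory
# before launching the tests.
check_test_data() {
  if [ -z "${ARROW_TEST_DATA:-}" ]; then
    echo "ARROW_TEST_DATA is not set"
    return 1
  fi
  if [ ! -d "$ARROW_TEST_DATA" ]; then
    echo "ARROW_TEST_DATA is not a directory: $ARROW_TEST_DATA"
    return 1
  fi
  echo "Using test data in $ARROW_TEST_DATA"
}

check_test_data || true
```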

Platform-Specific Notes

macOS Architecture

Critical: On Apple Silicon (M1/M2/M3) Macs, use ARM-compiled R. On Intel Macs, use x86-compiled R. Mixing architectures will cause segfaults and crashes.
Check your R architecture:
sessionInfo()$platform
# Should show "aarch64" for ARM or "x86_64" for Intel
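
The same comparison can be made from a terminal. The sketch below is an illustration only: `normalize_arch` is a hypothetical helper that maps the different spellings (`uname -m` prints `arm64` on Apple Silicon, while R reports `aarch64`) to one name. Note that a terminal running under Rosetta may report `x86_64` even on an ARM Mac, so treat a mismatch as a hint to investigate rather than a definitive diagnosis.

```shell
#!/bin/sh
# Map the different spellings of each architecture to one name.
normalize_arch() {
  case "$1" in
    arm64|aarch64) echo aarch64 ;;
    x86_64|amd64)  echo x86_64 ;;
    *)             echo "$1" ;;
  esac
}

machine_arch=$(normalize_arch "$(uname -m)")
echo "Machine: $machine_arch"

# Ask R for the architecture it was compiled for, if R is available.
if command -v Rscript >/dev/null 2>&1; then
  r_arch=$(normalize_arch "$(Rscript -e 'cat(R.version$arch)')")
  echo "R:       $r_arch"
  [ "$machine_arch" = "$r_arch" ] || echo "Warning: R and machine architectures differ"
fi
```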

Linux: Building from Source

On Linux, CRAN doesn’t provide binaries, so the package often builds from source:
# Install build dependencies
sudo apt-get install -y \
    libcurl4-openssl-dev \
    libssl-dev \
    libxml2-dev

# Then, in R: install with auto-download of Arrow C++
install.packages("arrow")
The package will automatically download and build Arrow C++ if needed.

Windows: CRAN vs. Development Builds

CRAN binaries on Windows include pre-built Arrow C++. For development:
# Install build tools
install.packages("pkgbuild")
pkgbuild::check_build_tools()

# Build with Rtools (from the repository root)
devtools::install("r")

CRAN Release Process

For maintainers preparing a CRAN release:
1. Create CRAN release branch

git checkout -b maint-X.Y.Z-r apache/maint-X.Y.Z
git push upstream maint-X.Y.Z-r

2. Build release tarball

cd r
make build

This creates arrow_X.Y.Z.tar.gz.

3. Download checksums for binaries

Rscript tools/update-checksums.R <version>
git add -f tools/checksums/
git commit -m "[CRAN] Add checksums"

4. Test on multiple platforms

# Upload to win-builder
# https://win-builder.r-project.org/upload.aspx

# Upload to macOS builder
# https://mac.r-project.org/macbuilder/submit.html

5. Submit to CRAN
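
Before uploading to the builders, it is worth confirming that the tarball name matches the version being released. `R CMD build` names its output `<Package>_<Version>.tar.gz` from the DESCRIPTION file, so the expected name can be derived as in the sketch below. The `tarball_name` helper is hypothetical, and the sketch assumes the tarball is written to the directory where the build was run.

```shell
#!/bin/sh
# Sketch: derive the tarball name R CMD build produces from the
# Package and Version fields of a DESCRIPTION file.
# Usage: tarball_name path/to/package_dir
tarball_name() {
  desc="$1/DESCRIPTION"
  pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$desc")
  ver=$(sed -n 's/^Version:[[:space:]]*//p' "$desc")
  echo "${pkg}_${ver}.tar.gz"
}

# Run from the package directory after building.
if [ -f DESCRIPTION ]; then
  tarball=$(tarball_name .)
  if [ -f "$tarball" ]; then
    echo "Found $tarball"
  else
    echo "Expected $tarball, not found"
  fi
fi
```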

Troubleshooting

Error: Compiler doesn’t support C++20
Solution:
  • Windows: Upgrade to R 4.3 or later
  • Linux: Install gcc 12+ or clang 14+
  • macOS: Update Xcode Command Line Tools

Error: Arrow C++ library not found
Solution: Set ARROW_HOME:
export ARROW_HOME=/path/to/arrow/installation
export PKG_CONFIG_PATH=$ARROW_HOME/lib/pkgconfig

Error: Immediate crash or segfault
Solution: Ensure R architecture matches Mac architecture:
# Check architecture
sessionInfo()$platform

# On ARM Mac, should be "aarch64-apple-darwin"
# On Intel Mac, should be "x86_64-apple-darwin"
Reinstall R if architecture doesn’t match.

Error: Test data not found
Solution: Set test data paths:
export ARROW_TEST_DATA=/path/to/arrow/testing/data
export PARQUET_TEST_DATA=/path/to/arrow/cpp/submodules/parquet-testing/data

Issue: Cloud filesystem features missing
Solution: CRAN builds include S3 but not GCS. For GCS, build from source:
export LIBARROW_MINIMAL=false
export ARROW_GCS=ON
R CMD INSTALL r/

Cleaning Build Artifacts

# Clean R package build artifacts
cd r
make clean

# Remove bundled C++ source
rm -rf tools/cpp

# Clean installed package
R -e 'remove.packages("arrow")'

Development Resources

  • Developer Setup: Detailed environment setup guide
  • Workflow Tasks: Common development workflows
  • R Package Docs: R package documentation
  • CRAN Packaging: Release preparation checklist
