Quick Install
For most users on Windows or macOS, installation is simple: install the released arrow package from CRAN with install.packages().
On macOS with Apple Silicon (M1, M2, etc.), ensure you’re using R compiled for arm64. Running Intel-compiled R on ARM processors can cause segfaults and crashes.
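The architecture note above can be checked from within R before installing. The following is a minimal sketch, not an official check: it only reports what the running R build was compiled for, and it assumes that an arm64 build of R reports "aarch64" in R.version$arch.

```r
# Report the architecture this R build was compiled for.
r_arch  <- R.version$arch            # "aarch64" for arm64 builds, "x86_64" for Intel builds
os_name <- Sys.info()[["sysname"]]   # "Darwin" on macOS

if (os_name == "Darwin" && r_arch != "aarch64") {
  # Only a problem when the hardware is Apple Silicon.
  message("R is built for ", r_arch,
          "; on an Apple Silicon Mac, switch to the arm64 build of R first.")
} else {
  message("R architecture: ", r_arch)
}
```

If the message reports x86_64 on an M-series Mac, reinstall R from the arm64 installer before installing arrow.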
Installation Methods
There are several ways to install arrow, depending on your operating system and requirements.
Method 1: CRAN Binary (Windows/macOS)
The simplest method, suitable for most users:
- Parquet support
- Feather/Arrow IPC support
- CSV reading/writing
- Amazon S3 support
- Dataset functionality
- Google Cloud Storage (GCS) support
- Some optional compression algorithms
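Method 1 amounts to a standard CRAN install. A minimal sketch, assuming your session already has a CRAN mirror configured:

```r
# Install the released binary from CRAN, then load it to confirm it attaches.
install.packages("arrow")
library(arrow)
```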
Method 2: R-Universe Binary (All Platforms)
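Installing from R-Universe might look like the sketch below. The repository URL is the Apache R-Universe as listed in the upstream arrow documentation. Listing a CRAN mirror second is an assumption made here so that dependencies still resolve.

```r
# Prefer the R-Universe build of arrow; fall back to CRAN for dependencies.
install.packages(
  "arrow",
  repos = c("https://apache.r-universe.dev", "https://cloud.r-project.org")
)
```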
R-Universe also provides pre-compiled binaries, including for Linux.
Method 3: Conda
If you use conda to manage your R environment, install the r-arrow package from the conda-forge channel.
Linux Installation
Linux installation requires more attention because CRAN doesn’t host R package binaries for Linux.
Option A: RSPM Binary (Recommended)
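A sketch of an RSPM-based install is shown here. Both the repository URL and the distribution codename (jammy) are assumptions for illustration, so substitute the URL your package manager instance shows for your distribution. Setting the HTTPUserAgent option follows the pattern in the package manager's setup instructions, which binaries are served against.

```r
# Identify the R version and platform so the repository can serve a matching binary.
options(
  HTTPUserAgent = sprintf(
    "R/%s R (%s)",
    getRversion(),
    paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"])
  )
)

# Assumed repository URL; "jammy" stands in for your distribution codename.
install.packages(
  "arrow",
  repos = "https://packagemanager.posit.co/cran/__linux__/jammy/latest"
)
```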
RStudio Package Manager (RSPM) hosts Linux binaries.
Option B: Auto-Download C++ Binary
Let the installer download a prebuilt libarrow binary. Advantages:
- Much faster installation than building from source
- Full features including S3 and GCS support (if system dependencies are met)
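Option B can be sketched with the NOT_CRAN variable described in the Environment Variables section. Setting it before a source install allows the installer to fetch a prebuilt libarrow:

```r
# Allow the R package's configure step to download a prebuilt C++ library.
Sys.setenv(NOT_CRAN = "true")
install.packages("arrow")
```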
System Dependencies for S3/GCS
To use S3 and GCS features, install the libcurl and OpenSSL development libraries. On Ubuntu/Debian these are the libcurl4-openssl-dev and libssl-dev packages.
Option C: Build from Source
Building both the R package and the C++ library from source gives full control but takes longer.
As of arrow 23.0.0, building from source requires a C++20 compiler. For gcc, this means version 10 or newer. CentOS 7 ships with gcc 4.8 and is not supported.
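A full source build, as in Option C, can be requested by disabling the binary download. This is a sketch using the LIBARROW_BINARY and LIBARROW_MINIMAL variables from the Environment Variables section. Asking for a full-featured (non-minimal) build is a choice made here, not a requirement.

```r
# Skip prebuilt libarrow binaries and compile the C++ library locally.
Sys.setenv(
  LIBARROW_BINARY  = "false",  # do not download a prebuilt C++ library
  LIBARROW_MINIMAL = "false"   # build the full-featured library
)
install.packages("arrow", type = "source")
```

Expect this to take considerably longer than a binary install, and make sure the compiler meets the C++20 requirement first.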
Using install_arrow()
The install_arrow() function provides a convenient interface for installation:
- Automatically handles C++ dependencies
- No need to set environment variables manually
- Useful for upgrading or fixing installation issues
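A short sketch of install_arrow() usage follows. It assumes some version of arrow is already installed so that the function is available. The nightly and verbose arguments are part of the function's signature and are shown commented out as optional variations.

```r
library(arrow)

# Reinstall or upgrade to the latest release, handling the C++ library automatically.
install_arrow()

# Optional variations:
# install_arrow(nightly = TRUE)   # install a development build
# install_arrow(verbose = TRUE)   # print detailed build output
```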
Environment Variables
Fine-tune the installation with environment variables.
Key Variables
| Variable | Description | Default |
|---|---|---|
| NOT_CRAN | Enable full-featured builds and binary downloads | FALSE |
| LIBARROW_BINARY | Try to download pre-built C++ library | (unset) |
| LIBARROW_MINIMAL | Build minimal or full-featured version | (unset) |
| LIBARROW_BUILD | Allow building C++ from source | TRUE |
| ARROW_R_DEV | Verbose output for debugging | FALSE |
| ARROW_S3 | Enable S3 support when building from source | OFF |
| ARROW_GCS | Enable GCS support when building from source | OFF |
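The variables in the table are ordinary environment variables, so they can be inspected and set from the R session that will run the install. The sketch below prints the current values and then sets a few of them. The particular combination is only an illustration, not a recommended configuration.

```r
build_vars <- c("NOT_CRAN", "LIBARROW_BINARY", "LIBARROW_MINIMAL",
                "LIBARROW_BUILD", "ARROW_R_DEV", "ARROW_S3", "ARROW_GCS")

# Show current values; NA marks variables that are unset.
print(Sys.getenv(build_vars, unset = NA))

# Illustrative settings for a full-featured build with cloud storage support.
Sys.setenv(
  NOT_CRAN         = "true",
  LIBARROW_MINIMAL = "false",
  ARROW_S3         = "ON",
  ARROW_GCS        = "ON"
)
print(Sys.getenv(build_vars, unset = NA))
```

Values set with Sys.setenv() last only for the current session, so run the install from the same session afterwards.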
Example Configurations
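Development setup (maximum features): for example, set NOT_CRAN to true, LIBARROW_MINIMAL to false, and ARROW_R_DEV to true before installing.
Offline Installation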
For systems without internet access during installation, prepare the package on a connected machine first.
Step 1: On a Connected Machine
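On the connected machine, the arrow package provides create_package_with_all_dependencies() to bundle the source package together with the third-party C++ dependencies. A sketch, assuming arrow is already installed there. The output file name is hypothetical.

```r
library(arrow)

# Download the arrow source package plus its bundled C++ dependencies
# into a single archive ("my_arrow_pkg.tar.gz" is a placeholder name).
create_package_with_all_dependencies("my_arrow_pkg.tar.gz")
```

Copy the resulting archive to the offline machine by whatever means is available.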
Step 2: On the Offline Machine
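On the offline machine, install from the copied archive. This sketch assumes the archive sits in the working directory under the placeholder name my_arrow_pkg.tar.gz, and that the R package dependencies of arrow are already installed locally.

```r
# Install arrow from the local archive; the C++ library is built from the bundled sources.
install.packages(
  "my_arrow_pkg.tar.gz",
  dependencies = c("Depends", "Imports", "LinkingTo")
)
```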
The offline machine needs cmake to build from source. Binary packages from RSPM or R-Universe ship a prebuilt libarrow and don’t require this offline procedure.
Verifying Installation
After installation, verify that arrow is working correctly. Key capabilities to check:
- parquet: Reading/writing Parquet files
- dataset: Multi-file dataset support
- s3: Amazon S3 connectivity
- gcs: Google Cloud Storage connectivity
- json: JSON file reading
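One way to run this check is with arrow_info(), which reports the version and a named logical vector of capabilities. The Parquet round trip at the end is an extra smoke test added here for illustration. It uses the built-in mtcars data and a temporary file.

```r
library(arrow)

# Print version details and everything the build supports.
arrow_info()

# Look at just the capabilities listed above (TRUE means available).
caps <- arrow_info()$capabilities
print(caps[c("parquet", "dataset", "s3", "gcs", "json")])

# Smoke test: write a Parquet file and read it back.
tmp <- tempfile(fileext = ".parquet")
write_parquet(mtcars, tmp)
stopifnot(nrow(read_parquet(tmp)) == nrow(mtcars))
unlink(tmp)
```

A FALSE entry for s3 or gcs usually means the build lacked the system libraries described under System Dependencies for S3/GCS.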
Troubleshooting
Binary Installation Failed
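If you see “undefined symbols” errors with binaries, try building libarrow from source instead of using the prebuilt binary (see Option C: Build from Source).
Version Mismatch with System Libraries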
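When a system-wide Arrow C++ installation is picked up unintentionally, the upstream arrow documentation describes the ARROW_USE_PKG_CONFIG variable for telling the configure step not to look for it. A sketch:

```r
# Ignore Arrow libraries found through pkg-config and use the bundled or downloaded libarrow.
Sys.setenv(ARROW_USE_PKG_CONFIG = "false")
install.packages("arrow")
```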
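If you have system-installed Arrow libraries with different versions, tell the installer to ignore them by setting ARROW_USE_PKG_CONFIG to false.
Build Failed
For detailed error messages, set ARROW_R_DEV to true and reinstall.
Save Build Logs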
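Build output can be kept for later inspection. The sketch below combines verbose output (ARROW_R_DEV) with two logging mechanisms: the keep_outputs argument of install.packages(), and the LIBARROW_DEBUG_DIR variable from the upstream developer documentation. That variable expects an absolute path and may not apply to every arrow version. The log directory name is hypothetical.

```r
# Hypothetical log location; normalizePath() yields the absolute path that is required.
log_dir <- file.path(path.expand("~"), "arrow-build-logs")
dir.create(log_dir, showWarnings = FALSE, recursive = TRUE)
log_dir <- normalizePath(log_dir)

Sys.setenv(
  ARROW_R_DEV        = "true",   # verbose configure and build output
  LIBARROW_DEBUG_DIR = log_dir   # copy C++ build logs here if the build fails
)

# keep_outputs saves the R-level install output as arrow.out in log_dir.
install.packages("arrow", keep_outputs = log_dir)
```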
Platform-Specific Notes
Windows
- Requires R 4.3+ for source builds (due to C++20 requirement)
- R 4.2 has partial C++20 support and might work with special configuration
- Binary installation is recommended
macOS
- Apple Silicon users: Use R built for arm64
- Intel Mac users: Use R built for x86_64
- Mixing architectures causes crashes
Linux
- Ubuntu/Debian: Well supported with RSPM binaries
- RHEL/CentOS 8+: Supported with gcc 10+
- CentOS 7: Not supported (gcc 4.8 is too old)
- Arch/Fedora: Supported with source builds
Next Steps
Once arrow is installed, explore these features:
- Read and Write Files: Work with Parquet, CSV, and Arrow files
- dplyr Integration: Analyze data with familiar dplyr syntax