This guide will get you up and running with Apache Arrow C++ quickly. You’ll learn how to create arrays, build tables, and read/write data files.

Prerequisites

You’ll need:
  • C++17 compatible compiler (GCC 7+, Clang 6+, MSVC 2017+)
  • CMake 3.16 or higher
  • Basic familiarity with C++ and CMake

1. Install Arrow C++

Using Package Managers

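On Debian or Ubuntu (libarrow-dev is usually provided by the Apache Arrow APT repository, so add that repository first if apt cannot find the package):
sudo apt update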
sudo apt install -y libarrow-dev

Building from Source

For a minimal build:
git clone https://github.com/apache/arrow.git
cd arrow/cpp
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release \
         -DARROW_BUILD_TESTS=OFF \
         -DARROW_COMPUTE=ON \
         -DARROW_CSV=ON \
         -DARROW_FILESYSTEM=ON
make -j$(nproc)
sudo make install

2. Create Your First Arrays

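Arrow uses builders to create arrays. Each data type has its own builder class.
Create a file called arrow_basics.cc:
#include <arrow/api.h>
#include <iostream>

// The ARROW_* macros return an arrow::Status on failure, so they must be used
// inside a function that returns arrow::Status (or arrow::Result), not in main().
arrow::Status CreateArray() {
  // Create an Int8 builder
  arrow::Int8Builder int8builder;

  // Add data to the builder
  int8_t days_raw[5] = {1, 12, 17, 23, 28};
  ARROW_RETURN_NOT_OK(int8builder.AppendValues(days_raw, 5));

  // Finish building the array
  std::shared_ptr<arrow::Array> days;
  ARROW_ASSIGN_OR_RAISE(days, int8builder.Finish());

  std::cout << "Created array with " << days->length() << " elements" << std::endl;
  std::cout << days->ToString() << std::endl;

  return arrow::Status::OK();
}

int main() {
  arrow::Status st = CreateArray();
  if (!st.ok()) {
    std::cerr << st << std::endl;
    return 1;
  }
  return 0;
}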
Key concepts:
  • Int8Builder - Creates arrays of 8-bit integers
  • AppendValues() - Adds multiple values at once
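  • Finish() - Completes the array, returns it, and resets the builder so it can be reused
  • ARROW_RETURN_NOT_OK - Macro that returns early from the enclosing function if a Status is an error
  • ARROW_ASSIGN_OR_RAISE - Macro that unwraps an arrow::Result into a variable, or returns its error from the enclosing function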

3. Build Tables from Arrays

Tables organize multiple arrays into named columns with a schema.
#include <arrow/api.h>
#include <iostream>

arrow::Status BuildTable() {
  // Create arrays for each column
  arrow::Int8Builder int8builder;
  int8_t days_raw[5] = {1, 12, 17, 23, 28};
  ARROW_RETURN_NOT_OK(int8builder.AppendValues(days_raw, 5));
  std::shared_ptr<arrow::Array> days;
  ARROW_ASSIGN_OR_RAISE(days, int8builder.Finish());
  
  // Finish() reset int8builder, so it can be reused for the next column
  int8_t months_raw[5] = {1, 3, 5, 7, 1};
  ARROW_RETURN_NOT_OK(int8builder.AppendValues(months_raw, 5));
  std::shared_ptr<arrow::Array> months;
  ARROW_ASSIGN_OR_RAISE(months, int8builder.Finish());
  
  arrow::Int16Builder int16builder;
  int16_t years_raw[5] = {1990, 2000, 1995, 2000, 1995};
  ARROW_RETURN_NOT_OK(int16builder.AppendValues(years_raw, 5));
  std::shared_ptr<arrow::Array> years;
  ARROW_ASSIGN_OR_RAISE(years, int16builder.Finish());
  
  // Define schema with field names and types
  auto schema = arrow::schema({
    arrow::field("day", arrow::int8()),
    arrow::field("month", arrow::int8()),
    arrow::field("year", arrow::int16())
  });
  
  // Create table from schema and arrays
  auto table = arrow::Table::Make(schema, {days, months, years});
  
  std::cout << table->ToString() << std::endl;
  return arrow::Status::OK();
}

int main() {
  arrow::Status st = BuildTable();
  if (!st.ok()) {
    std::cerr << st << std::endl;
    return 1;
  }
  return 0;
}
Output:
day: int8
month: int8
year: int16
----
day: [[1,12,17,23,28]]
month: [[1,3,5,7,1]]
year: [[1990,2000,1995,2000,1995]]

4. Read and Write CSV Files

Arrow provides fast CSV reading and writing capabilities.
Create test.csv:
name,age,city
Alice,30,NYC
Bob,25,SF
Carol,35,LA
Now read and process it:
#include <arrow/api.h>
#include <arrow/csv/api.h>
#include <arrow/io/api.h>
#include <arrow/ipc/api.h>
#include <iostream>

arrow::Status ProcessCSV() {
  // Open CSV file for reading
  ARROW_ASSIGN_OR_RAISE(
    auto input_file,
    arrow::io::ReadableFile::Open("test.csv")
  );
  
  // Create CSV reader
  ARROW_ASSIGN_OR_RAISE(
    auto csv_reader,
    arrow::csv::TableReader::Make(
      arrow::io::default_io_context(),
      input_file,
      arrow::csv::ReadOptions::Defaults(),
      arrow::csv::ParseOptions::Defaults(),
      arrow::csv::ConvertOptions::Defaults()
    )
  );
  
  // Read entire CSV into table
  ARROW_ASSIGN_OR_RAISE(auto table, csv_reader->Read());
  
  std::cout << "Read " << table->num_rows() << " rows" << std::endl;
  std::cout << table->ToString() << std::endl;
  
  return arrow::Status::OK();
}

int main() {
  arrow::Status st = ProcessCSV();
  if (!st.ok()) {
    std::cerr << st << std::endl;
    return 1;
  }
  return 0;
}

5. Write Arrow IPC Files

The Arrow IPC format is optimized for fast reading and zero-copy data access.
#include <arrow/api.h>
#include <arrow/io/api.h>
#include <arrow/ipc/api.h>
#include <iostream>

arrow::Status WriteArrowFile(std::shared_ptr<arrow::Table> table) {
  // Open output file
  ARROW_ASSIGN_OR_RAISE(
    auto output_file,
    arrow::io::FileOutputStream::Open("data.arrow")
  );
  
  // Create IPC writer
  ARROW_ASSIGN_OR_RAISE(
    auto writer,
    arrow::ipc::MakeFileWriter(output_file, table->schema())
  );
  
  // Write table
  ARROW_RETURN_NOT_OK(writer->WriteTable(*table));
  ARROW_RETURN_NOT_OK(writer->Close());
  
  std::cout << "Wrote table to data.arrow" << std::endl;
  return arrow::Status::OK();
}

arrow::Status ReadArrowFile() {
  // Open input file
  ARROW_ASSIGN_OR_RAISE(
    auto input_file,
    arrow::io::ReadableFile::Open("data.arrow")
  );
  
  // Create IPC reader
  ARROW_ASSIGN_OR_RAISE(
    auto reader,
    arrow::ipc::RecordBatchFileReader::Open(input_file)
  );
  
  // Read all record batches and concatenate them into a table
  ARROW_ASSIGN_OR_RAISE(auto table, reader->ToTable());
  
  std::cout << "Read table with " << table->num_rows() << " rows" << std::endl;
  return arrow::Status::OK();
}

6. Build with CMake

Create CMakeLists.txt to compile your project:
cmake_minimum_required(VERSION 3.16)
project(ArrowQuickstart)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

find_package(Arrow REQUIRED)

add_executable(arrow_example arrow_basics.cc)
target_link_libraries(arrow_example PRIVATE Arrow::arrow_shared)
Build and run:
mkdir build && cd build
cmake ..
make
./arrow_example

Complete Example

Here’s a complete working example that ties everything together:
#include <arrow/api.h>
#include <arrow/csv/api.h>
#include <arrow/io/api.h>
#include <arrow/ipc/api.h>
#include <iostream>

arrow::Status RunExample() {
  // 1. Create arrays
  arrow::Int32Builder builder;
  ARROW_RETURN_NOT_OK(builder.AppendValues({1, 2, 3, 4, 5}));
  ARROW_ASSIGN_OR_RAISE(auto array, builder.Finish());
  
  // 2. Build a table
  auto schema = arrow::schema({arrow::field("numbers", arrow::int32())});
  auto table = arrow::Table::Make(schema, {array});
  
  std::cout << "Created table:\n" << table->ToString() << std::endl;
  
  // 3. Write to Arrow IPC file
  ARROW_ASSIGN_OR_RAISE(
    auto output,
    arrow::io::FileOutputStream::Open("example.arrow")
  );
  ARROW_ASSIGN_OR_RAISE(
    auto writer,
    arrow::ipc::MakeFileWriter(output, schema)
  );
  ARROW_RETURN_NOT_OK(writer->WriteTable(*table));
  ARROW_RETURN_NOT_OK(writer->Close());
  
  std::cout << "\nWrote table to example.arrow" << std::endl;
  
  // 4. Read it back
  ARROW_ASSIGN_OR_RAISE(
    auto input,
    arrow::io::ReadableFile::Open("example.arrow")
  );
  ARROW_ASSIGN_OR_RAISE(
    auto reader,
    arrow::ipc::RecordBatchFileReader::Open(input)
  );
  ARROW_ASSIGN_OR_RAISE(auto read_table, reader->ToTable());
  
  std::cout << "\nRead table back:\n" << read_table->ToString() << std::endl;
  
  return arrow::Status::OK();
}

int main() {
  arrow::Status st = RunExample();
  if (!st.ok()) {
    std::cerr << "Error: " << st << std::endl;
    return 1;
  }
  return 0;
}

Next Steps

Compute Functions

Learn about Arrow’s compute functions for data processing

Parquet Files

Read and write Parquet files with Arrow

Datasets

Work with multi-file datasets and partitioning

API Reference

Explore the complete C++ API documentation

Common Patterns

Error Handling

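Arrow uses macros for consistent error handling. Both macros return from the enclosing function on failure, so that function must itself return arrow::Status or arrow::Result: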
// Return on error
ARROW_RETURN_NOT_OK(some_operation());

// Assign result or return error
ARROW_ASSIGN_OR_RAISE(auto result, some_operation());

Memory Management

Arrow uses std::shared_ptr for automatic memory management:
std::shared_ptr<arrow::Array> array;
std::shared_ptr<arrow::Table> table;
// Memory is freed automatically when the last reference goes out of scope

Working with Nulls

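// Inside a function that returns arrow::Status; Append calls can fail
// (for example on allocation), so their Status should not be ignored.
arrow::Int32Builder builder;
ARROW_RETURN_NOT_OK(builder.Append(1));
ARROW_RETURN_NOT_OK(builder.AppendNull());  // Add a null value
ARROW_RETURN_NOT_OK(builder.Append(3));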

Troubleshooting

If CMake cannot find Arrow, make sure Arrow is installed and set CMAKE_PREFIX_PATH:
cmake -DCMAKE_PREFIX_PATH=/path/to/arrow/install ..
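If the build fails with undefined-reference linker errors, ensure you’re linking against the correct Arrow libraries:
target_link_libraries(your_target PRIVATE 
  Arrow::arrow_shared
  Parquet::parquet_shared  # If using Parquet; also requires find_package(Parquet REQUIRED)
)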
If the program fails at run time because the Arrow shared library cannot be loaded, set LD_LIBRARY_PATH (Linux) or DYLD_LIBRARY_PATH (macOS):
export LD_LIBRARY_PATH=/path/to/arrow/lib:$LD_LIBRARY_PATH
