Overview

An Array is an immutable data array with some logical type and some length. Most logical types are contained in the base Array class; there are also subclasses for DictionaryArray, ListArray, StructArray, and other specialized types.

Array Class

Factory Method

Array$create()

Instantiates an Array and returns the appropriate subclass.
x (vector | list | data.frame): An R vector, list, or data.frame.
type (DataType, default NULL): An optional data type for x. If omitted, the type is inferred from the data.
my_array <- Array$create(1:10)
my_array$type
# int32

my_array$cast(int8())
# <int8>
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Methods

$IsNull()

Return TRUE if the value at index i is null. Does not perform bounds checking.
i (integer): Zero-based index position.
na_array <- Array$create(c(1:5, NA))
na_array$IsNull(0)  # FALSE
na_array$IsNull(5)  # TRUE

$IsValid()

Return TRUE if the value at index i is valid (not null). Does not perform bounds checking.
i (integer): Zero-based index position.
na_array <- Array$create(c(1:5, NA))
na_array$IsValid(5)  # FALSE

$length()

The number of elements in the array.
my_array <- Array$create(1:10)
my_array$length()  # 10

$nbytes()

Total number of bytes consumed by the elements of the array.
my_array <- Array$create(1:10)
my_array$nbytes()

$Equals()

Check if this array is equal to another.
other (Array): Another Array to compare with.
na_array <- Array$create(c(1:5, NA))
na_array2 <- na_array
na_array$Equals(na_array2)  # TRUE

$ApproxEquals()

Check if this array is approximately equal to another.
other (Array): Another Array to compare with.
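
To illustrate how this differs from $Equals(), the sketch below compares two double arrays whose last elements differ by a tiny amount. The names a and b are illustrative, and the example assumes the default tolerance applied by the underlying Arrow library is much larger than 1e-10.

```r
library(arrow)

a <- Array$create(c(1, 2, 3))
b <- Array$create(c(1, 2, 3 + 1e-10))

exact  <- a$Equals(b)        # exact, element-by-element comparison
approx <- a$ApproxEquals(b)  # comparison within a floating-point tolerance

exact
approx
```

Under that assumption, the exact comparison reports a difference while the approximate one does not.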

$Diff()

Return a string expressing the difference between two arrays.
other (Array): Another Array to compare with.
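
A minimal sketch of how $Diff() might be used when two arrays are expected to match but do not. The arrays are made up for illustration, and the exact layout of the returned text is not specified here, so the example only prints it.

```r
library(arrow)

a <- Array$create(c(1L, 2L, 3L))
b <- Array$create(c(1L, 2L, 4L))

# Diff() returns a character string describing where the arrays differ
diff_text <- a$Diff(b)
cat(diff_text)
```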

$data()

Return the underlying ArrayData.

$as_vector()

Convert to an R vector.
my_array <- Array$create(1:10)
my_array$as_vector()
# [1]  1  2  3  4  5  6  7  8  9 10

$ToString()

String representation of the array.

$Slice()

Construct a zero-copy slice of the array with the indicated offset and length.
offset (integer): Starting position (zero-based).
length (integer, default NULL): Number of elements in the slice. If NULL, the slice extends to the end of the array.
na_array <- Array$create(c(1:5, NA))
new_array <- na_array$Slice(5)
new_array$offset  # 5
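
The example above omits length. As a complementary sketch, the following passes both arguments to take three elements starting at zero-based position 2. The variable names are illustrative.

```r
library(arrow)

my_array <- Array$create(1:10)

# Start at zero-based position 2 and keep 3 elements
middle <- my_array$Slice(2, 3)

middle_values <- middle$as_vector()  # 3 4 5
middle_offset <- middle$offset       # 2
```

Because the slice is zero-copy, middle shares its data with my_array and only records a different offset and length.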

$Take()

Return an Array with values at positions given by integers.
i (integer vector | Array): Positions to take (R vector or Arrow Array).
my_array <- Array$create(1:10)
my_array$Take(c(0, 2, 4))

$Filter()

Return an Array with the values at positions where the logical vector i is TRUE.
i (logical vector | Array): Logical vector or Arrow boolean Array.
keep_na (logical, default TRUE): Whether an NA in i produces a null in the result (TRUE) or drops that element (FALSE).
my_array <- Array$create(1:10)
my_array$Filter(c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE))
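
To show the effect of keep_na, here is a small sketch that filters with a mask containing an NA. The names mask, kept, and dropped are illustrative.

```r
library(arrow)

my_array <- Array$create(1:3)
mask <- c(TRUE, NA, FALSE)

# Default keep_na = TRUE: the NA in the mask yields a null in the result
kept <- my_array$Filter(mask)

# keep_na = FALSE: the element under the NA is dropped instead
dropped <- my_array$Filter(mask, keep_na = FALSE)

kept$length()     # 2
kept$null_count   # 1
dropped$length()  # 1
```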

$SortIndices()

Return an Array of integer positions that can be used to rearrange the Array in ascending or descending order.
descending (logical, default FALSE): Whether to sort in descending order.
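
One common use of the returned positions is to pass them to $Take() to obtain a sorted copy. The sketch below shows that pattern; the input values are made up for illustration.

```r
library(arrow)

values <- Array$create(c(30L, 10L, 20L))

# Positions that would put the values in ascending order
idx <- values$SortIndices()
sorted <- values$Take(idx)

# The same pattern for descending order
sorted_desc <- values$Take(values$SortIndices(descending = TRUE))

sorted$as_vector()       # 10 20 30
sorted_desc$as_vector()  # 30 20 10
```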

$RangeEquals()

Check if a range of values is equal to another array.
other (Array): Another Array to compare with.
start_idx (integer): Starting index in this array.
end_idx (integer): Ending index in this array.
other_start_idx (integer, default 0): Starting index in the other array.
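
A hedged sketch of a range comparison follows. It assumes the indices are zero-based and that end_idx is exclusive, as in the Arrow C++ library; neither is stated above, so treat both as assumptions. The arrays are illustrative.

```r
library(arrow)

a <- Array$create(1:5)  # 1, 2, 3, 4, 5
b <- Array$create(2:4)  # 2, 3, 4

# Assumption: compares a[1], a[2], a[3] (end_idx 4 excluded)
# against b starting at position 0
same_range <- a$RangeEquals(b, 1, 4, 0)
same_range
```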

$cast()

Alter the data in the array to change its type.
target_type (DataType): The target data type.
safe (logical, default TRUE): Whether to check for overflows or other unsafe conversions.
options (CastOptions, default cast_options(safe)): Casting options.
my_array <- Array$create(1:10)
my_array$cast(int8())
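
To show what the safe argument guards against, this sketch casts fractional doubles to int32 both ways. The "cast refused" string is just a placeholder chosen for this example; the actual error message comes from Arrow.

```r
library(arrow)

dbl_array <- Array$create(c(1.5, 2.5))

# With safe = TRUE (the default), a lossy cast raises an error
safe_result <- tryCatch(
  dbl_array$cast(int32()),
  error = function(e) "cast refused"
)

# With safe = FALSE, the fractional part is discarded
truncated <- dbl_array$cast(int32(), safe = FALSE)
truncated$as_vector()  # 1 2
```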

$View()

Construct a zero-copy view of this array with the given type.
type (DataType): The data type for the view.

$Validate()

Perform validation checks to detect obvious inconsistencies within the array’s internal data. This can be an expensive check, potentially O(length).

Active Bindings

$null_count

The number of null entries in the array.
na_array <- Array$create(c(1:5, NA))
na_array$null_count  # 1

$offset

A relative position into another array’s data, to enable zero-copy slicing.

$type

Logical type of data.
my_array <- Array$create(1:10)
my_array$type
# Int32
# int32

DictionaryArray Class

DictionaryArray is a subclass of Array for dictionary-encoded data, similar to R factors.

Factory Method

DictionaryArray$create()

Create a DictionaryArray from an R factor, or from integer indices plus dictionary values.
x (vector | Array): An R vector or Array of integers for the dictionary indices, or an R factor.
dict (vector | Array, default NULL): An R vector or Array of dictionary values (like R factor levels). Not needed if x is a factor.
# From a factor
factor_array <- DictionaryArray$create(factor(c("a", "b", "a", "c")))

# From indices and dictionary
indices <- c(0L, 1L, 0L, 2L)
dict <- c("a", "b", "c")
dict_array <- DictionaryArray$create(indices, dict)

Methods

$indices()

Return the indices array.

$dictionary()

Return the dictionary array.

Active Bindings

$ordered

Whether the dictionary is ordered.
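
The sketch below pulls the pieces of a DictionaryArray apart to show how they correspond to an R factor: the dictionary holds the levels and the indices are zero-based positions into it. Variable names are illustrative.

```r
library(arrow)

factor_array <- DictionaryArray$create(factor(c("a", "b", "a", "c")))

# Zero-based positions into the dictionary
idx <- factor_array$indices()$as_vector()

# The distinct values, analogous to factor levels
dict_values <- factor_array$dictionary()$as_vector()

# FALSE here, because the input was an unordered factor
is_ordered <- factor_array$ordered
```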

StructArray Class

StructArray is a subclass of Array for struct (nested) data.

Factory Method

StructArray$create()

Create a StructArray from named arrays or vectors.
struct_array <- StructArray$create(
  x = c(1, 2, 3),
  y = c("a", "b", "c")
)

Methods

$field()

Extract a field by integer position.
i (integer): Zero-based field index.

$GetFieldByName()

Extract a field by name.
name (character): Field name.
struct_array <- StructArray$create(x = c(1, 2, 3), y = c("a", "b", "c"))
struct_array$GetFieldByName("x")

$Flatten()

Flatten the struct array into a list of arrays.
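
A short sketch tying the three accessors together. For this example the StructArray is built from a data.frame (an assumption made here for portability; the named-argument form shown earlier should behave the same way), and the variable names are illustrative.

```r
library(arrow)

df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"), stringsAsFactors = FALSE)
struct_array <- StructArray$create(df)

# By zero-based position: the first field is x
first_field <- struct_array$field(0)

# By name
y_field <- struct_array$GetFieldByName("y")

# All fields at once, as an R list of Arrays
fields <- struct_array$Flatten()
length(fields)  # 2
```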

ListArray Class

ListArray is a subclass of Array for list data.

Methods

$values()

Return the values array (all list elements flattened).

$value_length()

Return the length of a specific list element.
i (integer): Zero-based index.

$value_offset()

Return the offset of a specific list element.
i (integer): Zero-based index.

$raw_value_offsets()

Return the raw offsets array.

Active Bindings

$value_type

The data type of the list values.
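
No factory method is listed for ListArray, so this sketch assumes that passing an R list of integer vectors to Array$create() yields one. It then shows how the accessors above relate to the flattened values. Names are illustrative.

```r
library(arrow)

# Three list elements of lengths 2, 1, and 3
list_array <- Array$create(list(c(1L, 2L), 3L, c(4L, 5L, 6L)))

# All elements laid end to end: 1 2 3 4 5 6
flat <- list_array$values()

# The last element (zero-based index 2) has 3 values,
# starting at position 3 of the flattened values
len_last <- list_array$value_length(2)
off_last <- list_array$value_offset(2)

elem_type <- list_array$value_type  # int32
```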

Helper Functions

as_arrow_array()

Convert an object to an Arrow Array. This is an S3 generic that allows methods to be defined in other packages.
x (object): An object to convert to an Arrow Array.
type (DataType, default NULL): A data type for the final Array. If NULL, the type is inferred.
as_arrow_array(1:5)
as_arrow_array(c("a", "b", "c"))
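
The calls above rely on type inference. As a brief sketch, supplying type overrides the inferred type; the variable names are illustrative.

```r
library(arrow)

ints <- as_arrow_array(1:5)                    # type inferred from the data
dbls <- as_arrow_array(1:5, type = float64())  # type requested explicitly

ints$type$ToString()  # "int32"
dbls$type$ToString()  # "double"
```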

concat_arrays()

Concatenate zero or more Arrays into a single array. This operation will make a copy of its input.
... (Array): Zero or more Array objects to concatenate.
type (DataType, default NULL): An optional type describing the desired type for the final Array.
concat_arrays(Array$create(1:3), Array$create(4:5))
# <int32>
# [1, 2, 3, 4, 5]

arrow_array()

Alias for Array$create().
x (vector | list | data.frame): An R object representable as an Arrow array.
type (DataType, default NULL): An optional data type. If omitted, the type is inferred from the data.
my_array <- arrow_array(1:10)
