Array functions provide tools for working with array columns in Fenic DataFrames. All functions are available via fc.arr.*.

Basic Operations

size

Returns the number of elements in an array column.
fc.arr.size(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays whose length to compute.
return
Column
A Column expression representing the array length. Returns null when the array itself is null.

Example

df.select(fc.arr.size("tags"))
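
The per-row behavior described above can be sketched in plain Python. This is only a model of the documented semantics, not Fenic's implementation; size_model is a hypothetical helper name, and Python's None stands in for a null array.

```python
from typing import Optional, Sequence


def size_model(arr: Optional[Sequence]) -> Optional[int]:
    """Model of the documented size semantics for a single row (hypothetical helper)."""
    if arr is None:
        # A null array yields a null length, as stated in the return description.
        return None
    return len(arr)


# One entry per row of a "tags"-like column.
rows = [["a", "b"], [], None]
sizes = [size_model(r) for r in rows]
```

An empty array has length 0, which is distinct from the null result for a null array.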

contains

Checks whether each array in the column contains a specific value.
fc.arr.contains(
    column: ColumnOrName,
    value: Union[str, int, float, bool, Column]
) -> Column
column
ColumnOrName
required
Column or column name containing the arrays to check.
value
Union[str, int, float, bool, Column]
required
Value to search for in the arrays. Can be a literal value or a Column expression.
return
Column
A boolean Column expression (True if value is found, False otherwise).

Example

df.select(fc.arr.contains("tags", "python"))
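
As a rough illustration, the membership check can be modeled per row in plain Python. The helper contains_model is hypothetical and not part of Fenic; the sketch covers only non-null arrays and a literal value, since the page does not say how null arrays are handled.

```python
from typing import Any, Sequence


def contains_model(arr: Sequence[Any], value: Any) -> bool:
    """Model of contains for one non-null array (hypothetical helper)."""
    # Python's `in` compares with ==, which is an approximation of
    # whatever equality the engine applies.
    return value in arr


# One entry per row of a "tags"-like column.
rows = [["python", "sql"], ["rust"], []]
flags = [contains_model(r, "python") for r in rows]
```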

element_at

Returns the element at the given index in an array using 1-based indexing.
fc.arr.element_at(
    column: ColumnOrName,
    index: Union[int, ColumnOrName]
) -> Column
column
ColumnOrName
required
Column or column name containing arrays.
index
Union[int, ColumnOrName]
required
Index of the element (1-based). Positive indices count from the start (1 = first element), negative indices count from the end (-1 = last element).
return
Column
A Column containing the element at the specified index.

Example

df.select(
    fc.arr.element_at("numbers", 1).alias("first"),
    fc.arr.element_at("numbers", -1).alias("last")
)
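
The 1-based indexing rule can be checked against a small plain-Python model. The helper element_at_model is hypothetical, not Fenic's implementation. Two behaviors are assumptions the page does not state: an index of 0 raises an error, and an out-of-range index yields None.

```python
from typing import Any, Optional, Sequence


def element_at_model(arr: Sequence[Any], index: int) -> Optional[Any]:
    """Model of 1-based element access for one array (hypothetical helper)."""
    if index == 0:
        # Assumption: 0 is not a valid position under 1-based indexing.
        raise ValueError("index is 1-based; 0 is not a valid position")
    # Positive indices count from the start, negative indices from the end.
    pos = index - 1 if index > 0 else len(arr) + index
    if 0 <= pos < len(arr):
        return arr[pos]
    # Assumption: out-of-range positions produce None rather than an error.
    return None


numbers = [10, 20, 30]
first = element_at_model(numbers, 1)
last = element_at_model(numbers, -1)
missing = element_at_model(numbers, 5)
```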

Transformations

distinct

Removes duplicate values from an array column.
fc.arr.distinct(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays.
return
Column
A Column in which each array contains only the unique values of the corresponding input array.

Example

df.select(fc.arr.distinct("array_col").alias("distinct_array"))
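
A plain-Python sketch of duplicate removal for one array follows. The helper distinct_model is hypothetical. It keeps the first occurrence of each value in its original position, which is an assumption: the page does not say what order the result has.

```python
from typing import Hashable, List, Sequence


def distinct_model(arr: Sequence[Hashable]) -> List[Hashable]:
    """Model of distinct for one array (hypothetical helper)."""
    # dict preserves insertion order, so this keeps first occurrences in order.
    # Assumption: first-occurrence order; the real output order is not documented here.
    return list(dict.fromkeys(arr))


deduped = distinct_model([3, 1, 3, 2, 1])
```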

sort

Sorts the array in ascending order.
fc.arr.sort(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays of comparable types (numeric, string, date, boolean).
return
Column
A Column with sorted arrays in ascending order. Null values are placed at the end.

Example

df.select(fc.arr.sort("numbers").alias("sorted"))
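
The ascending order with nulls placed last can be modeled in plain Python as below. The helper sort_model is hypothetical and uses None to stand in for null elements.

```python
from typing import Any, List, Sequence


def sort_model(arr: Sequence[Any]) -> List[Any]:
    """Model of ascending sort with nulls last for one array (hypothetical helper)."""
    non_null = sorted(x for x in arr if x is not None)
    nulls = [x for x in arr if x is None]
    # Null values go to the end, as the return description states.
    return non_null + nulls


sorted_numbers = sort_model([3, None, 1, 2])
```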

reverse

Reverses the elements of an array.
fc.arr.reverse(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays.
return
Column
A Column with reversed arrays.

Example

df.select(fc.arr.reverse("numbers").alias("reversed_nums"))

compact

Removes null values from an array.
fc.arr.compact(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays.
return
Column
A Column with arrays having null values removed.

Example

df.select(fc.arr.compact("values").alias("compact"))
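
For one array, null removal amounts to the following plain-Python sketch. The helper compact_model is hypothetical, with None standing in for null.

```python
from typing import Any, List, Sequence


def compact_model(arr: Sequence[Any]) -> List[Any]:
    """Model of compact for one array (hypothetical helper)."""
    # Only nulls are dropped; falsy values such as 0 or "" are kept.
    return [x for x in arr if x is not None]


compacted = compact_model([1, None, 0, None, 2])
```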

slice

Extracts a subarray from an array using 1-based indexing.
fc.arr.slice(
    column: ColumnOrName,
    start: Union[int, ColumnOrName],
    length: Union[int, ColumnOrName]
) -> Column
column
ColumnOrName
required
Column or column name containing arrays.
start
Union[int, ColumnOrName]
required
Starting position (1-based index). Positive indices count from the start, negative indices count from the end.
length
Union[int, ColumnOrName]
required
Number of elements to extract. Must be positive.
return
Column
A Column with subarrays extracted.

Example

df.select(
    fc.arr.slice("numbers", 1, 3).alias("first_three"),
    fc.arr.slice("numbers", -3, 3).alias("last_three")
)
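
The start and length rules can be modeled in plain Python as below. The helper slice_model is hypothetical, not Fenic's implementation. Three behaviors are assumptions: a start of 0 raises an error, a non-positive length raises an error, and a negative start that reaches before the first element yields an empty list.

```python
from typing import Any, List, Sequence


def slice_model(arr: Sequence[Any], start: int, length: int) -> List[Any]:
    """Model of 1-based slicing for one array (hypothetical helper)."""
    if start == 0:
        # Assumption: 0 is not a valid 1-based start position.
        raise ValueError("start is 1-based; 0 is not a valid position")
    if length <= 0:
        # The parameter description says length must be positive.
        raise ValueError("length must be positive")
    begin = start - 1 if start > 0 else len(arr) + start
    if begin < 0:
        # Assumption: a negative start before the first element gives an empty result.
        return []
    # Python slicing truncates at the end of the array when fewer elements remain.
    return list(arr[begin:begin + length])


numbers = [1, 2, 3, 4, 5]
first_three = slice_model(numbers, 1, 3)
last_three = slice_model(numbers, -3, 3)
```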

Set Operations

union

Returns the union of two arrays without duplicates.
fc.arr.union(col1: ColumnOrName, col2: ColumnOrName) -> Column
col1
ColumnOrName
required
First array column or column name.
col2
ColumnOrName
required
Second array column or column name.
return
Column
A Column containing the distinct union of both arrays.

Example

df.select(fc.arr.union("tags1", "tags2").alias("all_tags"))

intersect

Returns the intersection of two arrays.
fc.arr.intersect(col1: ColumnOrName, col2: ColumnOrName) -> Column
col1
ColumnOrName
required
First array column or column name.
col2
ColumnOrName
required
Second array column or column name.
return
Column
A Column containing distinct elements present in both arrays.

Example

df.select(fc.arr.intersect("arr1", "arr2").alias("common"))

except_

Returns elements in the first array but not in the second.
fc.arr.except_(col1: ColumnOrName, col2: ColumnOrName) -> Column
col1
ColumnOrName
required
First array column or column name.
col2
ColumnOrName
required
Second array column or column name.
return
Column
A Column containing distinct elements in col1 but not in col2.

Example

df.select(fc.arr.except_("all_tags", "deprecated").alias("active"))

overlap

Checks if two arrays have at least one common element.
fc.arr.overlap(col1: ColumnOrName, col2: ColumnOrName) -> Column
col1
ColumnOrName
required
First array column or column name.
col2
ColumnOrName
required
Second array column or column name.
return
Column
A boolean Column (True if arrays have common elements, False otherwise).

Example

df.select(fc.arr.overlap("arr1", "arr2").alias("has_overlap"))
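
The four set operations in this section can be compared side by side with a plain-Python model. All helper names are hypothetical. The sketch keeps elements in first-occurrence order, which is an assumption, since the page does not state the order of the results.

```python
from typing import Hashable, List, Sequence


def union_model(a: Sequence[Hashable], b: Sequence[Hashable]) -> List[Hashable]:
    """Distinct elements found in either array."""
    return list(dict.fromkeys([*a, *b]))


def intersect_model(a: Sequence[Hashable], b: Sequence[Hashable]) -> List[Hashable]:
    """Distinct elements found in both arrays."""
    b_set = set(b)
    return [x for x in dict.fromkeys(a) if x in b_set]


def except_model(a: Sequence[Hashable], b: Sequence[Hashable]) -> List[Hashable]:
    """Distinct elements of the first array that are absent from the second."""
    b_set = set(b)
    return [x for x in dict.fromkeys(a) if x not in b_set]


def overlap_model(a: Sequence[Hashable], b: Sequence[Hashable]) -> bool:
    """True when the arrays share at least one element."""
    return not set(a).isdisjoint(b)


tags1 = ["a", "b", "b", "c"]
tags2 = ["c", "d"]
all_tags = union_model(tags1, tags2)
common = intersect_model(tags1, tags2)
only_first = except_model(tags1, tags2)
has_overlap = overlap_model(tags1, tags2)
```

Note that overlap is equivalent to asking whether the intersection is non-empty, but it returns a boolean instead of an array.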

Aggregation

max

Returns the maximum value in an array.
fc.arr.max(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays of comparable types (numeric, string, date, boolean).
return
Column
A Column containing the maximum value from each array. Returns null if the array is null or empty.

Example

df.select(fc.arr.max("numbers").alias("max_value"))

min

Returns the minimum value in an array.
fc.arr.min(column: ColumnOrName) -> Column
column
ColumnOrName
required
Column or column name containing arrays of comparable types (numeric, string, date, boolean).
return
Column
A Column containing the minimum value from each array. Returns null if the array is null or empty.

Example

df.select(fc.arr.min("numbers").alias("min_value"))
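
The null and empty cases for max and min can be modeled in plain Python as follows. The helpers max_model and min_model are hypothetical. The sketch assumes the arrays contain no null elements, because the page does not describe how nulls inside an array are treated.

```python
from typing import Any, Optional, Sequence


def max_model(arr: Optional[Sequence[Any]]) -> Optional[Any]:
    """Model of max for one array (hypothetical helper)."""
    if not arr:
        # Covers both a null array (None) and an empty array.
        return None
    return max(arr)


def min_model(arr: Optional[Sequence[Any]]) -> Optional[Any]:
    """Model of min for one array (hypothetical helper)."""
    if not arr:
        return None
    return min(arr)


numbers = [3, 1, 2]
max_value = max_model(numbers)
min_value = min_model(numbers)
```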

Utility

remove

Removes all occurrences of an element from an array.
fc.arr.remove(
    column: ColumnOrName,
    element: Union[str, int, float, bool, Column]
) -> Column
column
ColumnOrName
required
Column or column name containing arrays.
element
Union[str, int, float, bool, Column]
required
Element to remove from the arrays. Can be a literal value or a Column expression.
return
Column
A Column with arrays having all occurrences of the element removed.

Example

df.select(fc.arr.remove("tags", "a").alias("no_a"))
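
For a single array, removing every occurrence of an element looks like this in plain Python. The helper remove_model is hypothetical and uses Python's != as an approximation of the engine's equality.

```python
from typing import Any, List, Sequence


def remove_model(arr: Sequence[Any], element: Any) -> List[Any]:
    """Model of remove for one array (hypothetical helper)."""
    # All occurrences are dropped, not only the first one.
    return [x for x in arr if x != element]


no_a = remove_model(["a", "b", "a", "c"], "a")
```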

repeat

Creates an array containing the element repeated count times.
fc.arr.repeat(
    col: ColumnOrName,
    count: Union[int, ColumnOrName]
) -> Column
col
ColumnOrName
required
Column, column name, or literal value to repeat.
count
Union[int, ColumnOrName]
required
Number of times to repeat the element. Can be an integer literal or a Column expression.
return
Column
A Column containing an array with the element repeated count times.

Example

df.select(
    fc.arr.repeat(fc.lit("x"), 3).alias("repeated"),
    fc.arr.repeat(fc.col("value"), fc.col("count")).alias("dynamic")
)
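
The literal and per-row forms shown above can be modeled in plain Python. The helper repeat_model and the sample rows are hypothetical; the sketch covers only non-negative counts.

```python
from typing import Any, List


def repeat_model(element: Any, count: int) -> List[Any]:
    """Model of repeat for one row (hypothetical helper)."""
    return [element] * count


# Literal form: the same element and count for every row.
repeated = repeat_model("x", 3)

# Dynamic form: element and count both come from columns of the row.
rows = [{"value": "x", "count": 3}, {"value": "y", "count": 1}]
dynamic = [repeat_model(r["value"], r["count"]) for r in rows]
```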
