PDAL Python seamlessly integrates with NumPy, allowing you to read point cloud data into arrays, perform operations using NumPy, and pass the results back to PDAL for further processing.

Complete workflow example

This example demonstrates the full cycle between PDAL and Python:
  1. Read a point cloud file into a NumPy array
  2. Filter the array using NumPy operations
  3. Pass the filtered array back to PDAL for additional filtering
  4. Write the final result to output files
import pdal

data = "https://github.com/PDAL/PDAL/blob/master/test/data/las/1.2-with-color.las?raw=true"

pipeline = pdal.Reader.las(filename=data).pipeline()
print(pipeline.execute())  # 1065 points

# Get the data from the first array
# [array([(637012.24, 849028.31, 431.66, 143, 1,
# 1, 1, 0, 1,  -9., 132, 7326, 245380.78254963,  68,  77,  88),
# dtype=[('X', '<f8'), ('Y', '<f8'), ('Z', '<f8'), ('Intensity', '<u2'),
# ('ReturnNumber', 'u1'), ('NumberOfReturns', 'u1'), ('ScanDirectionFlag', 'u1'),
# ('EdgeOfFlightLine', 'u1'), ('Classification', 'u1'), ('ScanAngleRank', '<f4'),
# ('UserData', 'u1'), ('PointSourceId', '<u2'),
# ('GpsTime', '<f8'), ('Red', '<u2'), ('Green', '<u2'), ('Blue', '<u2')])]
arr = pipeline.arrays[0]

# Keep only points with intensity greater than 30
intensity = arr[arr["Intensity"] > 30]
print(len(intensity))  # 704 points

# Now use PDAL to keep only points whose intensity satisfies 100 <= v < 300
pipeline = pdal.Filter.expression(expression="Intensity >= 100 && Intensity < 300").pipeline(intensity)
print(pipeline.execute())  # 387 points
clamped = pipeline.arrays[0]

# Write our intensity data to a LAS file and a TileDB array. For TileDB it is
# recommended to use Hilbert ordering by default with geospatial point cloud data,
# which requires specifying a domain extent. This can be determined automatically
# from a stats filter that computes statistics about each dimension (min, max, etc.).
pipeline = pdal.Writer.las(
    filename="clamped.las",
    offset_x="auto",
    offset_y="auto",
    offset_z="auto",
    scale_x=0.01,
    scale_y=0.01,
    scale_z=0.01,
).pipeline(clamped)
pipeline |= pdal.Filter.stats() | pdal.Writer.tiledb(array_name="clamped")
print(pipeline.execute())  # 387 points

# Dump the TileDB array schema
import tiledb
with tiledb.open("clamped") as a:
    print(a.schema)

Reading data into NumPy arrays

1. Execute the pipeline

First, create and execute a pipeline to read your data:
pipeline = pdal.Reader.las(filename=data).pipeline()
count = pipeline.execute()
2. Access the arrays

After execution, access the NumPy arrays via the arrays property:
arr = pipeline.arrays[0]
The array contains structured data with fields for each dimension (X, Y, Z, Intensity, etc.).
pipeline.arrays returns a list of NumPy arrays, one for each PointView in the pipeline output. Most pipelines produce a single array.
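The structured layout can be explored without PDAL at all. The following is a minimal NumPy-only sketch using synthetic values (not taken from the example file) with a subset of the LAS dimensions shown earlier:

```python
import numpy as np

# Two synthetic points mimicking part of the structured dtype that
# pipeline.arrays[0] returns for a LAS file.
points = np.array(
    [
        (637012.24, 849028.31, 431.66, 143),
        (636896.33, 849087.70, 446.39, 18),
    ],
    dtype=[("X", "<f8"), ("Y", "<f8"), ("Z", "<f8"), ("Intensity", "<u2")],
)

# Indexing by field name returns a plain ndarray of that dimension
print(points["Z"].mean())
print(points.dtype.names)  # ('X', 'Y', 'Z', 'Intensity')
```

Because each field is an ordinary ndarray, the full NumPy toolkit (statistics, masking, vectorized math) applies directly to any dimension.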

Filtering arrays with NumPy

Once you have data in a NumPy array, you can use standard NumPy operations to filter it:
# Filter points by intensity
intensity = arr[arr["Intensity"] > 30]
print(len(intensity))  # Shows number of filtered points
NumPy’s boolean indexing makes it easy to create complex filters:
# Multiple conditions
filtered = arr[(arr["Intensity"] > 30) & (arr["Classification"] == 2)]

# Filter by spatial bounds
subset = arr[(arr["X"] > xmin) & (arr["X"] < xmax)]

Passing filtered arrays back to PDAL

You can pass NumPy arrays back to PDAL for further processing:
# Create a pipeline from a NumPy array
pipeline = pdal.Filter.expression(
    expression="Intensity >= 100 && Intensity < 300"
).pipeline(intensity)

count = pipeline.execute()
clamped = pipeline.arrays[0]
The .pipeline() method on a stage accepts a NumPy array as input, allowing you to chain Python processing with PDAL operations.
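This works for any structured array whose field names match PDAL dimension names, not just arrays that came out of a pipeline. A sketch with synthetic coordinates (the PDAL call at the end is left as a comment so the snippet stands alone without the library):

```python
import numpy as np

# A structured array using PDAL-recognized dimension names
synthetic = np.zeros(
    5,
    dtype=[("X", "<f8"), ("Y", "<f8"), ("Z", "<f8"), ("Intensity", "<u2")],
)
synthetic["X"] = np.linspace(0.0, 4.0, 5)
synthetic["Y"] = np.linspace(0.0, 4.0, 5)
synthetic["Z"] = [10.0, 11.0, 12.0, 13.0, 14.0]
synthetic["Intensity"] = [10, 50, 150, 250, 350]

# The array can now be handed to a stage, e.g.:
# pipeline = pdal.Filter.expression(expression="Intensity >= 100").pipeline(synthetic)
# pipeline.execute()  # would keep the three points with intensity >= 100
```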

Writing filtered data

Once you’ve filtered your data, write it to various formats:
pipeline = pdal.Writer.las(
    filename="output.las",
    offset_x="auto",
    offset_y="auto",
    offset_z="auto",
    scale_x=0.01,
    scale_y=0.01,
    scale_z=0.01,
).pipeline(clamped)

count = pipeline.execute()
You can chain multiple writers to output the same data in different formats simultaneously using the pipe operator.
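A pipe-chained pipeline also has an equivalent JSON pipeline specification, which `pdal.Pipeline` accepts directly. A sketch of the JSON form of a read-then-write-twice pipeline (the filenames here are placeholders, not files from this example):

```python
import json

# JSON pipeline: read a LAS file, compute stats, then write two outputs.
# readers.las, filters.stats, writers.las, and writers.text are standard
# PDAL stage names.
spec = [
    {"type": "readers.las", "filename": "input.las"},
    {"type": "filters.stats"},
    {"type": "writers.las", "filename": "output.las", "scale_x": 0.01},
    {"type": "writers.text", "filename": "output.csv"},
]
pipeline_json = json.dumps(spec)
# pdal.Pipeline(pipeline_json).execute()  # requires pdal and input.las
```

Both forms describe the same stage graph; the pipe operator is simply a programmatic way to build it.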
