Welcome to the absolute beginner’s guide to NumPy! NumPy (Numerical Python) is an open source Python library that’s widely used in science and engineering. The NumPy library contains multidimensional array data structures, such as the homogeneous, N-dimensional ndarray, and a large library of functions that operate efficiently on these data structures.
If you have comments or suggestions, please reach out to the NumPy community!

How to Import NumPy

After installing NumPy (see Installation), it should be imported into Python code like:
import numpy as np
This widespread convention allows access to NumPy features with a short, recognizable prefix (np.) while distinguishing NumPy features from others that have the same name.

Reading the Example Code

Throughout the NumPy documentation, you will find blocks that look like:
>>> a = np.array([[1, 2, 3],
...               [4, 5, 6]])
>>> a.shape
(2, 3)
Text preceded by >>> or ... is input, the code that you would enter in a script or at a Python prompt. Everything else is output, the results of running your code. Note that >>> and ... are not part of the code and may cause an error if entered at a Python prompt.

Why Use NumPy?

Python lists are excellent, general-purpose containers. They can be “heterogeneous”, meaning that they can contain elements of a variety of types, and they are quite fast when used to perform individual operations on a handful of elements. Depending on the characteristics of the data and the types of operations that need to be performed, other containers may be more appropriate. By exploiting these characteristics, we can improve speed, reduce memory consumption, and offer a high-level syntax for performing a variety of common processing tasks. NumPy shines when there are large quantities of “homogeneous” (same-type) data to be processed on the CPU.

Speed

Operations on NumPy arrays are executed in compiled C code, which typically makes them much faster than equivalent pure-Python loops.

Memory Efficiency

NumPy arrays store their elements in a compact, typed buffer, so they typically use less memory than Python lists for large datasets.

Convenient Syntax

Express complex operations concisely, without explicit loops.

Ecosystem Integration

NumPy is the foundation for pandas, scikit-learn, and many other scientific Python libraries.
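As a rough illustration of the convenient-syntax point, the same elementwise calculation can be written with a Python list and an explicit loop, or as a single array expression. The data and variable names below are made up for this sketch.

```python
import numpy as np

# Hypothetical data: temperatures in degrees Celsius
celsius_list = [0.0, 12.5, 25.0, 100.0]

# Pure Python: an explicit loop (written here as a list comprehension)
fahrenheit_list = [c * 9 / 5 + 32 for c in celsius_list]

# NumPy: one expression that is applied to every element
celsius_arr = np.array(celsius_list)
fahrenheit_arr = celsius_arr * 9 / 5 + 32

print(fahrenheit_list)  # [32.0, 54.5, 77.0, 212.0]
print(fahrenheit_arr)   # the same values, held in an ndarray
```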

What is an Array?

In computer programming, an array is a structure for storing and retrieving data. We often talk about an array as if it were a grid in space, with each cell storing one element of the data.

One-dimensional Array

A one-dimensional array is like a list:
import numpy as np

# Create a 1D array
a = np.array([1, 5, 2, 0])
print(a)
# Output: [1 5 2 0]

Two-dimensional Array

A two-dimensional array is like a table:
# Create a 2D array
b = np.array([[1, 5, 2, 0],
              [8, 3, 6, 1],
              [1, 7, 2, 9]])
print(b)
# Output:
# [[1 5 2 0]
#  [8 3 6 1]
#  [1 7 2 9]]

Three-dimensional and Higher

A three-dimensional array would be like a set of tables, perhaps stacked as though they were printed on separate pages. In NumPy, this idea is generalized to an arbitrary number of dimensions, and so the fundamental array class is called ndarray: it represents an “N-dimensional array”.
Most NumPy arrays have some restrictions:
  • All elements of the array must be of the same type of data
  • Once created, the total size of the array can’t change
  • The shape must be “rectangular”, not “jagged”; e.g., each row of a two-dimensional array must have the same number of columns
When these conditions are met, NumPy exploits these characteristics to make the array faster, more memory efficient, and more convenient to use than less restrictive data structures.
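The restrictions above can be observed directly. The following is a small sketch, with illustrative variable names, of how NumPy enforces a single element type and a fixed total size.

```python
import numpy as np

# Mixed ints and floats are upcast to one common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Assigning a float into an integer array truncates it to fit the dtype
ints = np.array([1, 2, 3], dtype=np.int64)
ints[0] = 7.9
print(ints)  # [7 2 3]

# The total size is fixed: reshaping to a different element count fails
try:
    ints.reshape(2, 2)
except ValueError as exc:
    print("reshape failed:", exc)
```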

Array Fundamentals

Creating an Array

One way to initialize an array is using a Python sequence, such as a list:
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
print(a)
# Output: [1 2 3 4 5 6]

Accessing Elements

Elements of an array can be accessed using integer indices within square brackets:
print(a[0])  # First element: 1
print(a[2])  # Third element: 3
As with built-in Python sequences, NumPy arrays are “0-indexed”: the first element of the array is accessed using index 0, not 1.

Modifying Arrays

Like Python lists, arrays are mutable:
a[0] = 10
print(a)
# Output: [10  2  3  4  5  6]

Slicing

Python slice notation can be used for indexing:
print(a[:3])    # First three elements: [10  2  3]
print(a[2:5])   # Elements 2, 3, 4: [3 4 5]
print(a[-2:])   # Last two elements: [5 6]
Important Difference: Slice indexing of a list copies the elements into a new list, but slicing an array returns a view: an object that refers to the data in the original array. The original array can be mutated using the view:
b = a[3:]
print(b)     # [4 5 6]
b[0] = 40
print(a)     # [10  2  3 40  5  6] - original array changed!
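If you need an independent slice, one common approach is to call copy() on it. The sketch below contrasts a view with a copy and uses np.shares_memory to check whether two arrays refer to the same data.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

view = a[3:]          # shares memory with a
copy = a[3:].copy()   # independent data

copy[0] = 99          # does not touch a
print(a)              # [1 2 3 4 5 6]

view[0] = 40          # writes through to a
print(a)              # [ 1  2  3 40  5  6]

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False
```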

Two-dimensional Arrays

Two- and higher-dimensional arrays can be initialized from nested Python sequences:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(a)
# Output:
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
An element of a 2D array can be accessed by specifying the index along each axis within a single set of square brackets, separated by commas:
print(a[1, 3])  # Row 1, Column 3: 8
In NumPy, a dimension of an array is sometimes referred to as an “axis”. This terminology helps disambiguate between the dimensionality of an array and the dimensionality of the data represented by the array.
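To make the axis terminology concrete, here is a short sketch that selects along each axis of the same 3x4 array. Axis 0 runs down the rows and axis 1 runs across the columns.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# One index selects along axis 0 only, giving a whole row
print(a[1])         # [5 6 7 8]

# A colon keeps all of axis 0; the 3 picks one position on axis 1
print(a[:, 3])      # [ 4  8 12]

# Slices on both axes give a smaller 2D block
print(a[0:2, 1:3])
# [[2 3]
#  [6 7]]
```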

Array Attributes

ndim - Number of Dimensions

The number of dimensions of an array is contained in the ndim attribute:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(a.ndim)  # 2

shape - Array Dimensions

The shape of an array is a tuple of non-negative integers that specify the number of elements along each dimension:
print(a.shape)  # (3, 4) - 3 rows, 4 columns
print(len(a.shape) == a.ndim)  # True

size - Total Number of Elements

The fixed, total number of elements in an array is contained in the size attribute:
print(a.size)  # 12

import math
print(a.size == math.prod(a.shape))  # True

dtype - Data Type

Arrays are typically “homogeneous”, meaning that they contain elements of only one “data type”. The data type is recorded in the dtype attribute:
print(a.dtype)  # int64 (the default integer type can be int32 on some platforms)
a = np.array([1, 2, 3])
print(a.dtype)  # int64

How to Create a Basic Array

Besides creating an array from a sequence of elements, you can easily create arrays filled with specific values:

Array of Zeros

np.zeros(5)
# array([0., 0., 0., 0., 0.])

np.zeros((3, 4))  # 2D array
# array([[0., 0., 0., 0.],
#        [0., 0., 0., 0.],
#        [0., 0., 0., 0.]])

Array of Ones

np.ones(5)
# array([1., 1., 1., 1., 1.])

np.ones((2, 3), dtype=np.int64)
# array([[1, 1, 1],
#        [1, 1, 1]])

Empty Array

# Create an empty array (contents are uninitialized)
np.empty(5)
# array([3.14, 42., 1.5, 2.8, 9.1])  # values will vary
The function empty creates an array without initializing its contents, so the initial values are arbitrary and depend on the state of the memory. Use it only when you plan to fill every element afterwards.

Range of Elements

# Create an array with a range of elements
np.arange(5)
# array([0, 1, 2, 3, 4])

np.arange(2, 9, 2)  # start, stop, step
# array([2, 4, 6, 8])

Linearly Spaced Values

# Array with values spaced linearly
np.linspace(0, 10, num=5)
# array([ 0. ,  2.5,  5. ,  7.5, 10. ])

Specifying Data Type

While the default data type for zeros, ones, and empty is floating point (np.float64), you can explicitly specify which data type you want using the dtype keyword:
x = np.ones(2, dtype=np.int64)
print(x)
# [1 1]
print(x.dtype)
# int64
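An existing array can also be converted to another data type. The sketch below uses astype, which returns a new array; note that converting floats to integers truncates the fractional part rather than rounding.

```python
import numpy as np

x = np.array([1.7, 2.2, 3.9])
print(x.dtype)          # float64

# astype returns a new array; x itself is unchanged
y = x.astype(np.int64)
print(y)                # [1 2 3]
print(y.dtype)          # int64

# A smaller dtype uses fewer bytes per element
z = np.array([1, 2, 3], dtype=np.float32)
print(z.itemsize)       # 4
```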

Adding, Removing, and Sorting Elements

Sorting

Sorting an array is simple with np.sort():
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
np.sort(arr)
# array([1, 2, 3, 4, 5, 6, 7, 8])
Related functions:
  • argsort - indirect sort along a specified axis
  • lexsort - indirect stable sort on multiple keys
  • searchsorted - find elements in a sorted array
  • partition - partial sort
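As a brief sketch of two of the related functions, argsort returns the indices that would sort an array, and searchsorted finds where a value belongs in an already sorted array.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

# Indices that would put arr in ascending order
order = np.argsort(arr)
print(order)        # [1 0 3 5 2 6 4 7]
print(arr[order])   # [1 2 3 4 5 6 7 8]

# Position at which 5 would be inserted to keep the array sorted
sorted_arr = np.sort(arr)
print(np.searchsorted(sorted_arr, 5))  # 4
```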

Concatenating

You can concatenate arrays with np.concatenate():
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

np.concatenate((a, b))
# array([1, 2, 3, 4, 5, 6, 7, 8])
For 2D arrays, specify the axis:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

np.concatenate((x, y), axis=0)
# array([[1, 2],
#        [3, 4],
#        [5, 6]])
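Joining along axis=1 adds columns instead of rows. In this sketch the array z is a hypothetical extra column; the shapes must agree on every axis except the one being joined.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
z = np.array([[5], [6]])   # shape (2, 1): one extra column

print(np.concatenate((x, z), axis=1))
# [[1 2 5]
#  [3 4 6]]

# Along axis 0 the column counts differ (2 vs 1), so this fails
try:
    np.concatenate((x, z), axis=0)
except ValueError as exc:
    print("cannot concatenate:", exc)
```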

Array Shape and Size

Checking Shape and Size

array_example = np.array([[[0, 1, 2, 3],
                           [4, 5, 6, 7]],
                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]],
                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]]])

print(array_example.ndim)   # 3 dimensions
print(array_example.size)   # 24 elements
print(array_example.shape)  # (3, 2, 4)

Reshaping

Using reshape() gives a new shape to an array without changing the data:
a = np.arange(6)
print(a)
# [0 1 2 3 4 5]

b = a.reshape(3, 2)
print(b)
# [[0 1]
#  [2 3]
#  [4 5]]
You can use -1 to automatically calculate one dimension:
np.reshape(a, (1, 6))   # 1 row, 6 columns
np.reshape(a, (3, -1))  # 3 rows, auto-calculate columns: shape (3, 2)
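The -1 placeholder works because the total number of elements is fixed, so NumPy can infer the one missing dimension. A small sketch, including a case where no valid size exists:

```python
import numpy as np

a = np.arange(6)

print(a.reshape(3, -1).shape)        # (3, 2)
print(a.reshape(-1, 3).shape)        # (2, 3)
print(a.reshape(2, 3).reshape(-1))   # back to 1D: [0 1 2 3 4 5]

# 6 elements cannot be split into 4 equal rows
try:
    a.reshape(4, -1)
except ValueError as exc:
    print("reshape failed:", exc)
```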

Indexing and Slicing

You can index and slice NumPy arrays in the same ways you can slice Python lists:
data = np.array([1, 2, 3])

print(data[1])      # 2
print(data[0:2])    # [1 2]
print(data[1:])     # [2 3]
print(data[-2:])    # [2 3]

Conditional Selection

You can select values from your array that fulfill certain conditions:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Values less than 5
print(a[a < 5])
# [1 2 3 4]

# Values greater than or equal to 5
five_up = (a >= 5)
print(a[five_up])
# [ 5  6  7  8  9 10 11 12]

# Divisible by 2
divisible_by_2 = a[a % 2 == 0]
print(divisible_by_2)
# [ 2  4  6  8 10 12]

# Multiple conditions
c = a[(a > 2) & (a < 11)]
print(c)
# [ 3  4  5  6  7  8  9 10]
Use & for AND and | for OR when combining conditions.
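A short sketch of the other combinations follows. Each comparison needs its own parentheses because & and | bind more tightly than comparison operators. The ~ operator inverts a boolean mask, and np.nonzero returns the indices where a condition holds.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# OR: values below 3 or above 10
print(a[(a < 3) | (a > 10)])   # [ 1  2 11 12]

# NOT: invert a boolean mask with ~ (here, the odd values)
print(a[~(a % 2 == 0)])        # [ 1  3  5  7  9 11]

# Row and column indices of the elements less than 5
rows, cols = np.nonzero(a < 5)
print(rows)  # [0 0 0 0]
print(cols)  # [0 1 2 3]
```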

Basic Array Operations

Arithmetic Operations

Once you’ve created your arrays, you can perform arithmetic operations:
data = np.array([1, 2])
ones = np.ones(2, dtype=np.int_)

print(data + ones)    # [2 3]
print(data - ones)    # [0 1]
print(data * data)    # [1 4]
print(data / data)    # [1. 1.]

Aggregation Functions

NumPy provides aggregation functions:
a = np.array([1, 2, 3, 4])

print(a.sum())   # 10
print(a.min())   # 1
print(a.max())   # 4
print(a.mean())  # 2.5
print(a.std())   # Standard deviation

Operations on 2D Arrays

For 2D arrays, you can specify the axis:
b = np.array([[1, 1], [2, 2]])

print(b.sum(axis=0))  # Sum down the rows (one total per column): [3 3]
print(b.sum(axis=1))  # Sum across the columns (one total per row): [2 4]
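The axis argument names the axis that is collapsed by the aggregation. A non-square array makes this easier to see; the sketch below also uses keepdims, which keeps the collapsed axis with length 1.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)

# axis=0 collapses the rows, leaving one value per column
print(b.sum(axis=0))         # [5 7 9]
print(b.sum(axis=0).shape)   # (3,)

# axis=1 collapses the columns, leaving one value per row
print(b.sum(axis=1))         # [ 6 15]
print(b.sum(axis=1).shape)   # (2,)

# keepdims=True keeps the collapsed axis with length 1
print(b.sum(axis=1, keepdims=True).shape)  # (2, 1)
```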

Broadcasting

Broadcasting allows you to perform operations between an array and a scalar, or between arrays of different sizes:
data = np.array([1.0, 2.0])
print(data * 1.6)  # [1.6 3.2]

# Broadcasting with 2D arrays
data = np.array([[1, 2], [3, 4], [5, 6]])
ones_row = np.array([[1, 1]])
print(data + ones_row)
# [[2 3]
#  [4 5]
#  [6 7]]
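NumPy understands that the operation should happen with each cell, stretching the smaller operand across the larger one. This concept is called broadcasting and is a powerful feature for working with arrays of different shapes. Two shapes are compatible when, compared from the last dimension backwards, each pair of sizes is either equal or one of them is 1.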
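As a sketch of when broadcasting does and does not apply, the example below adds a hypothetical column array to the same 3x2 data, then tries an incompatible shape.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
col = np.array([[10], [20], [30]])          # shape (3, 1)

# (3, 2) with (3, 1): the length-1 axis is stretched to length 2
print(data + col)
# [[11 12]
#  [23 24]
#  [35 36]]

# (3, 2) with (3,): the trailing sizes 2 and 3 differ, so this fails
try:
    data + np.array([1, 2, 3])
except ValueError as exc:
    print("cannot broadcast:", exc)
```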

Creating Matrices

You can pass Python lists of lists to create a 2-D array (or “matrix”):
data = np.array([[1, 2], [3, 4], [5, 6]])
print(data)
# [[1 2]
#  [3 4]
#  [5 6]]

print(data[0, 1])  # Element at row 0, column 1: 2
print(data[1:3])   # Rows 1 and 2
# [[3 4]
#  [5 6]]
print(data[0:2, 0])  # First two rows, first column: [1 3]

Matrix Aggregation

data = np.array([[1, 2], [5, 3], [4, 6]])

print(data.max())        # 6 (max of all elements)
print(data.max(axis=0))  # [5 6] (max of each column)
print(data.max(axis=1))  # [2 5 6] (max of each row)

Generating Random Numbers

Random number generation is important for many numerical and machine learning algorithms:
rng = np.random.default_rng()  # Create a random number generator (unseeded, so the outputs below will vary)

# Random floats in [0, 1)
print(rng.random(3))
# [0.63696169 0.26978671 0.04097352]

# 2D array of random floats
print(rng.random((3, 2)))
# [[0.01652764 0.81327024]
#  [0.91275558 0.60663578]
#  [0.72949656 0.54362499]]

# Random integers
print(rng.integers(5, size=(2, 4)))  # Integers from 0 to 4
# [[2 1 1 0]
#  [0 0 0 4]]
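When you need repeatable results, for example in tests, one option is to pass a seed to default_rng. The seed value below is arbitrary. This sketch shows that two generators created with the same seed produce the same numbers.

```python
import numpy as np

# Two generators with the same (arbitrary) seed
rng_a = np.random.default_rng(seed=12345)
rng_b = np.random.default_rng(seed=12345)

first = rng_a.random(3)
second = rng_b.random(3)
print(np.array_equal(first, second))  # True

# Integers in a half-open range: low is included, high is excluded
ints = rng_a.integers(low=10, high=20, size=5)
print(ints.min() >= 10 and ints.max() < 20)  # True
```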

How to Get Unique Items and Counts

You can find the unique elements in an array easily with np.unique:
a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20])

unique_values = np.unique(a)
print(unique_values)
# [11 12 13 14 15 16 17 18 19 20]

# Get indices and counts
unique_values, indices_list = np.unique(a, return_index=True)
print(indices_list)
# [ 0  2  3  4  5  6  7 12 13 14]

unique_values, occurrence_count = np.unique(a, return_counts=True)
print(occurrence_count)
# [3 2 2 2 1 1 1 1 1 1]
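The values and counts line up position by position, so they can be paired. The following sketch builds a plain dictionary of frequencies and picks out the most common value; the names frequency and most_common are illustrative.

```python
import numpy as np

a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20])

values, counts = np.unique(a, return_counts=True)

# Pair each unique value with its count
frequency = {int(v): int(c) for v, c in zip(values, counts)}
print(frequency[11])  # 3

# The value whose count is largest
most_common = values[np.argmax(counts)]
print(most_common)    # 11
```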

Transposing and Reshaping

NumPy arrays have the attribute T, which returns the transpose of a matrix:
data = np.array([[1, 2], [3, 4], [5, 6]])
print(data.T)
# [[1 3 5]
#  [2 4 6]]
You can also use transpose():
arr = np.arange(6).reshape((2, 3))
print(arr)
# [[0 1 2]
#  [3 4 5]]

print(arr.transpose())
# [[0 3]
#  [1 4]
#  [2 5]]

How to Reverse an Array

NumPy’s flip() function allows you to flip, or reverse, the contents of an array:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
reversed_arr = np.flip(arr)
print(reversed_arr)
# [8 7 6 5 4 3 2 1]
For 2D arrays:
arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Reverse all
print(np.flip(arr_2d))
# [[12 11 10  9]
#  [ 8  7  6  5]
#  [ 4  3  2  1]]

# Reverse rows only
print(np.flip(arr_2d, axis=0))
# [[ 9 10 11 12]
#  [ 5  6  7  8]
#  [ 1  2  3  4]]

# Reverse columns only
print(np.flip(arr_2d, axis=1))
# [[ 4  3  2  1]
#  [ 8  7  6  5]
#  [12 11 10  9]]

Flattening Arrays

There are two ways to flatten an array: flatten() and ravel(). The example below uses flatten():
# flatten() creates a copy
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(x.flatten())
# [ 1  2  3  4  5  6  7  8  9 10 11 12]

a1 = x.flatten()
a1[0] = 99
print(x[0, 0])  # Still 1 (original unchanged)
ravel() returns a view of the parent array whenever possible, so changes made through it can affect the original. flatten() always creates a copy.
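For comparison, here is a minimal sketch of ravel() on the same array. Because this array is stored contiguously, ravel() can return a view, and writing to it changes the original.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# ravel() returns a view here, so the write reaches x
r = x.ravel()
r[0] = 98
print(x[0, 0])                  # 98
print(np.shares_memory(x, r))   # True

# flatten() returns a copy, so x is left alone
f = x.flatten()
f[1] = -1
print(x[0, 1])                  # 2
```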

How to Save and Load NumPy Objects

You can save and load NumPy arrays to/from disk:

Binary Format (.npy)

# Save
a = np.array([1, 2, 3, 4, 5, 6])
np.save('filename.npy', a)

# Load
b = np.load('filename.npy')
print(b)  # [1 2 3 4 5 6]

Text Format (.csv, .txt)

# Save as CSV
csv_arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
np.savetxt('new_file.csv', csv_arr)

# Load from CSV
loaded = np.loadtxt('new_file.csv')
print(loaded)
# [1. 2. 3. 4. 5. 6. 7. 8.]
The .npy and .npz files are smaller and faster to read than text files. Use text files when you need human-readable output or interoperability with other tools.
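The sketch below shows two variations on the examples above: np.savez, which stores several arrays in one .npz file, and savetxt with a comma delimiter for a 2D array. The file names and the keyword names first and second are made up for illustration, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1.5, 2.5], [3.5, 4.5]])

with tempfile.TemporaryDirectory() as tmp:
    # Several arrays in one .npz file, retrieved by keyword name
    npz_path = os.path.join(tmp, "arrays.npz")
    np.savez(npz_path, first=a, second=b)
    with np.load(npz_path) as archive:
        loaded_a = archive["first"]
        loaded_b = archive["second"]

    # A 2D array written as comma-separated text and read back
    csv_path = os.path.join(tmp, "table.csv")
    np.savetxt(csv_path, b, delimiter=",")
    from_csv = np.loadtxt(csv_path, delimiter=",")

print(loaded_a)                      # [1 2 3]
print(np.array_equal(from_csv, b))   # True
```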

Importing and Exporting CSV

A convenient way to read CSV files with mixed data types is to use pandas:
import pandas as pd
import numpy as np

# Read CSV file
data = pd.read_csv('music.csv', header=0).values
print(data)

# Select specific columns
data = pd.read_csv('music.csv', usecols=['Artist', 'Plays']).values
print(data)

Working with Mathematical Formulas

The ease of implementing mathematical formulas is one of the things that make NumPy widely used in the scientific Python community. For example, the mean square error formula:
import numpy as np

predictions = np.array([1.2, 2.5, 3.1])
labels = np.array([1.0, 2.0, 3.0])
n = len(predictions)

# Mean Square Error
error = (1/n) * np.sum(np.square(predictions - labels))
print(error)
What makes this work so well is that predictions and labels can contain one or a thousand values; they only need to be the same size.
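The same formula can be wrapped in a small reusable function. This is only a sketch: the function name mean_squared_error and the shape check are choices made for this example, and np.mean replaces the explicit (1/n) * np.sum(...).

```python
import numpy as np

def mean_squared_error(predictions, labels):
    """Return the mean of the squared differences (hypothetical helper)."""
    predictions = np.asarray(predictions, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Reject mismatched inputs instead of letting them broadcast silently
    if predictions.shape != labels.shape:
        raise ValueError("predictions and labels must have the same shape")
    return np.mean((predictions - labels) ** 2)

error = mean_squared_error([1.2, 2.5, 3.1], [1.0, 2.0, 3.0])
print(round(float(error), 6))  # 0.1
```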

Next Steps

Congratulations on completing the absolute beginner’s guide! You now have a solid foundation in NumPy.

Quickstart Tutorial

More advanced array operations and techniques

API Reference

Explore the complete NumPy API

Advanced Topics

Broadcasting, indexing, and advanced operations

Community

Join the NumPy community
