Understanding NumPy Arrays
The ndarray (N-dimensional array) is the fundamental data structure in NumPy. It provides a powerful, efficient way to work with homogeneous multidimensional data.
What is an ndarray?
An ndarray is a multidimensional container for items of the same type and size. The number of dimensions and items in an array is defined by its shape, which is a tuple of N non-negative integers that specify the size of each dimension.
import numpy as np
# Create a 1-D array
a = np.array([1, 2, 3, 4, 5, 6])
print(a)
# Output: [1 2 3 4 5 6]
# Create a 2-D array
b = np.array([[1, 2, 3],
              [4, 5, 6]])
print(b.shape)
# Output: (2, 3)
Key Characteristics
NumPy arrays have several important restrictions:
All elements must be of the same data type
Once created, the total size cannot change
The shape must be “rectangular”, not “jagged”
These constraints allow NumPy to optimize memory usage and computation speed significantly compared to Python lists.
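The restrictions above can be observed directly. The following is a minimal sketch, not an exhaustive test; the variable names (`mixed`, `ints`, `reshape_failed`) are illustrative only.

```python
import numpy as np

# Mixed int/float input is upcast to a single common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Assigning a float into an integer array truncates it to fit the dtype.
ints = np.array([1, 2, 3], dtype=np.int64)
ints[0] = 9.7
print(ints[0])  # 9

# The element count is fixed: 3 elements cannot fill a 2x2 shape.
try:
    ints.reshape(2, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```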
Array Attributes
Every ndarray has several important attributes that describe its structure:
Attribute | Description
ndarray.ndim | Number of dimensions (axes)
ndarray.shape | Tuple indicating size of each dimension
ndarray.size | Total number of elements
ndarray.dtype | Data type of elements
ndarray.itemsize | Size in bytes of each element
ndarray.data | Buffer containing actual array elements
import numpy as np
x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
print(f"Dimensions: {x.ndim}")
# Output: Dimensions: 2
print(f"Shape: {x.shape}")
# Output: Shape: (3, 4)
print(f"Size: {x.size}")
# Output: Size: 12
print(f"Data type: {x.dtype}")
# Output: Data type: int64 (on most platforms; the default integer type is platform-dependent)
print(f"Item size: {x.itemsize} bytes")
# Output: Item size: 8 bytes (for int64)
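The attributes are related to one another, which the sketch below checks. It sets the dtype explicitly (an assumption made here so the byte counts do not depend on the platform) and also uses `nbytes`, an ndarray attribute not listed above that gives the total bytes consumed by the elements.

```python
import numpy as np

# dtype is set explicitly so the byte counts are the same on every platform.
x = np.arange(1, 13, dtype=np.int32).reshape(3, 4)

print(x.ndim == len(x.shape))             # True
print(x.size == x.shape[0] * x.shape[1])  # True
print(x.itemsize)                         # 4
print(x.nbytes == x.size * x.itemsize)    # True (48 bytes)
```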
Creating Arrays
From Python Sequences
The most straightforward way to create an array is from a Python list or tuple:
import numpy as np
# From a list
a = np.array([1, 2, 3, 4, 5])
# From nested lists (2-D)
b = np.array([[1, 2], [3, 4], [5, 6]])
# Explicitly specify the data type
c = np.array([1, 2, 3], dtype=np.float64)
print(c)
# Output: [1. 2. 3.]
Using Built-in Functions
NumPy provides many functions for creating arrays:
import numpy as np
# Create array of zeros
zeros = np.zeros((3, 4))
# Create array of ones
ones = np.ones((2, 3, 4))
# Create uninitialized array (faster, but contains garbage values)
empty = np.empty((2, 2))
# Create array with a range of values
arange = np.arange(0, 10, 2) # Start, stop (excluded), step
# Output: array([0, 2, 4, 6, 8])
# Create array with evenly spaced values
linspace = np.linspace(0, 1, 5) # Start, stop (included), num_points
# Output: array([0. , 0.25, 0.5 , 0.75, 1. ])
# Create identity matrix
identity = np.eye(3)
# Output:
# array([[1., 0., 0.],
#        [0., 1., 0.],
#        [0., 0., 1.]])
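One detail worth seeing side by side is how `arange` and `linspace` treat the stop value. This is a small sketch with illustrative variable names (`by_step`, `by_count`):

```python
import numpy as np

# arange takes a step and excludes the stop value.
by_step = np.arange(0, 1, 0.25)
print(by_step)  # [0.   0.25 0.5  0.75]

# linspace takes a point count and includes the stop value by default.
by_count = np.linspace(0, 1, 5)
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]

# zeros, ones, and empty produce float64 unless a dtype is given.
print(np.zeros(3).dtype)  # float64
```

With a floating-point step, `linspace` is usually the safer choice because the number of points is stated explicitly instead of being derived from the step.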
Array Dimensions Explained
Think of array dimensions as nested containers:
1-D array : A list of values
2-D array : A table (rows and columns)
3-D array : A stack of tables
N-D array : Even more nested structures
Visualizing Dimensions
1-D Array (vector):
a = np.array([1, 5, 2, 0])
# Shape: (4,)
2-D Array (matrix):
b = np.array([[1, 5, 2, 0],
              [8, 3, 6, 1],
              [1, 7, 2, 9]])
# Shape: (3, 4)
┌─────────────┐
│ 1 5 2 0 │
│ 8 3 6 1 │
│ 1 7 2 9 │
└─────────────┘
3-D Array (stack of matrices):
c = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])
# Shape: (2, 2, 2)
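The "nested containers" picture also describes indexing: each index supplied peels off one level. A short sketch using the 3-D array from above (the names `table`, `row`, and `value` are illustrative):

```python
import numpy as np

c = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]]])

table = c[1]        # second 2-D table in the stack
row = c[1, 0]       # first row of that table
value = c[1, 0, 1]  # second element of that row

print(table.shape)  # (2, 2)
print(row)          # [5 6]
print(value)        # 6
```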
Reshaping Arrays
You can change the shape of an array without changing its data:
import numpy as np
a = np.arange(12)
print(a)
# Output: [ 0  1  2  3  4  5  6  7  8  9 10 11]
# Reshape to 2-D
b = a.reshape(3, 4)
print(b)
# Output:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
# Reshape to 3-D
c = a.reshape(2, 3, 2)
print(c.shape)
# Output: (2, 3, 2)
The new shape must be compatible with the original shape. The total number of elements must remain the same. For example, you cannot reshape an array of 12 elements into shape (3, 5) because 3 × 5 = 15 ≠ 12.
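The incompatible case can be tried out directly. In this sketch the exact error text is left to NumPy and not reproduced; the variable name `ok` is illustrative.

```python
import numpy as np

a = np.arange(12)

# 3 * 5 = 15 does not match the 12 available elements.
try:
    a.reshape(3, 5)
except ValueError as err:
    print(f"reshape failed: {err}")

# Any factorization of 12 works, and the data is reused where possible.
ok = a.reshape(4, 3)
print(ok.shape)                 # (4, 3)
print(np.shares_memory(a, ok))  # True
```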
Using -1 for Automatic Dimension Calculation
import numpy as np
a = np.arange(12)
# NumPy automatically calculates the missing dimension
b = a.reshape(3, -1) # Becomes (3, 4)
c = a.reshape(-1, 2) # Becomes (6, 2)
print(b.shape) # Output: (3, 4)
print(c.shape) # Output: (6, 2)
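The missing dimension is simply the element count divided by the sizes that were given. A rough sketch of that arithmetic, with `known`, `inferred`, and `divisible` as illustrative names:

```python
import numpy as np

a = np.arange(12)

# The inferred size is the total element count divided by the known sizes.
known = 3
inferred = a.size // known
print(a.reshape(known, -1).shape == (known, inferred))  # True

# A lone -1 flattens to 1-D.
print(a.reshape(-1).shape)  # (12,)

# The known sizes must divide the element count evenly.
try:
    a.reshape(5, -1)
    divisible = True
except ValueError:
    divisible = False
print(divisible)  # False
```

Only one dimension may be given as -1; with two unknowns the shape would be ambiguous, and NumPy raises a ValueError.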
Views vs Copies
Understanding the difference between views and copies is crucial for efficient NumPy programming.
Views (Shallow Copy)
A view is a new array object that looks at the same data. Modifying the view modifies the original array:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = a[1:4] # Slicing creates a view
print(b) # Output: [2 3 4]
b[0] = 99
print(a) # Output: [ 1 99  3  4  5]
Basic slicing always creates a view, not a copy. This differs from Python lists, where slicing returns a new list.
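Whether two arrays share data can be checked instead of guessed. The sketch below uses `ndarray.base` and `np.shares_memory`, and contrasts basic slicing with indexing by a list of positions, which returns a copy; the names `view` and `picked` are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

view = a[1:4]          # basic slicing: a view
print(view.base is a)               # True
print(np.shares_memory(a, view))    # True

picked = a[[1, 2, 3]]  # indexing with a list of positions: a copy
print(np.shares_memory(a, picked))  # False

picked[0] = 99
print(a)  # [1 2 3 4 5]
```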
Copies (Deep Copy)
A copy is a new array with a copy of the data:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = a.copy() # Explicit copy
b[0] = 99
print(a) # Output: [1 2 3 4 5] - unchanged!
print(b) # Output: [99  2  3  4  5]
When to Use Copy
Always use .copy() when:
Extracting a small portion from a large array that you no longer need (a view would keep the entire original buffer alive in memory)
You want to modify data without affecting the original
Working with array subsets that need independent lifecycles
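The first case in the list is easy to overlook, because a small view looks small. A minimal sketch, assuming a one-million-element array stands in for "large" (`big`, `small_view`, and `small_copy` are illustrative names):

```python
import numpy as np

big = np.zeros(1_000_000)   # about 8 MB of float64 values

small_view = big[:10]
small_copy = big[:10].copy()

# The view still references the full buffer through its base.
print(small_view.base is big)   # True
print(small_view.base.nbytes)   # 8000000

# The copy owns only its own 10 elements.
print(small_copy.base is None)  # True
print(small_copy.nbytes)        # 80
```

As long as `small_view` exists, the full buffer cannot be freed, even if `big` itself goes out of scope.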
Memory Layout
NumPy arrays are stored in C order (row-major) by default:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print(a.flags['C_CONTIGUOUS']) # Output: True
# Elements are stored in memory as: [1, 2, 3, 4, 5, 6]
C-order vs Fortran-order:
C-order (row-major) : Last index changes fastest. Default in NumPy.
Fortran-order (column-major) : First index changes fastest. Used in Fortran and MATLAB.
You can specify the order when creating arrays:
a = np.array([[1, 2], [3, 4]], order='F') # Fortran order
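The difference between the two orders shows up in an array's `strides`, the number of bytes to step along each axis. The sketch below fixes the dtype to int64 (an assumption, so that the byte counts are predictable) and uses `np.asfortranarray` to produce a column-major copy.

```python
import numpy as np

c_order = np.array([[1, 2, 3],
                    [4, 5, 6]], dtype=np.int64)
f_order = np.asfortranarray(c_order)

# strides: bytes to step to reach the next element along each axis.
print(c_order.strides)  # (24, 8)
print(f_order.strides)  # (8, 16)

print(f_order.flags['F_CONTIGUOUS'])     # True
print(np.array_equal(c_order, f_order))  # True
```

Both arrays hold the same values at the same indices; only the order of the elements in memory differs.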
Flattening Arrays
Convert multidimensional arrays to 1-D:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6]])
# Flatten (always returns a copy)
flat = a.flatten()
print(flat)
# Output: [1 2 3 4 5 6]
# Ravel (returns a view if possible)
ravel = a.ravel()
print(ravel)
# Output: [1 2 3 4 5 6]
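The "if possible" qualifier on `ravel` depends on memory layout. This sketch checks data sharing with `np.shares_memory` and uses a transpose as an example of an array that `ravel` cannot flatten without copying:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.shares_memory(a, a.flatten()))  # False
print(np.shares_memory(a, a.ravel()))    # True

# The transpose is not C-contiguous, so ravel has to copy here.
t = a.T
print(t.flags['C_CONTIGUOUS'])           # False
print(np.shares_memory(a, t.ravel()))    # False
print(t.ravel())                         # [1 4 2 5 3 6]
```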
Practical Example: Image Data
Arrays are commonly used to represent images:
import numpy as np
# Create a simple 4x4 grayscale image (values 0-255)
image = np.array([[0, 64, 128, 192],
                  [32, 96, 160, 224],
                  [64, 128, 192, 255],
                  [96, 160, 224, 255]], dtype=np.uint8)
print(f"Image shape: {image.shape}") # (4, 4)
print(f"Data type: {image.dtype}") # uint8
# RGB image would be 3-D: height × width × channels
rgb_image = np.zeros((100, 100, 3), dtype=np.uint8)
print(f"RGB image shape: {rgb_image.shape}") # (100, 100, 3)
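Once an image is an array, common operations become slicing and arithmetic. The sketch below is illustrative only: it assumes the last axis is ordered red, green, blue, and it uses a plain channel average as a stand-in for grayscale conversion, not a perceptually weighted formula.

```python
import numpy as np

image = np.array([[0, 64, 128, 192],
                  [32, 96, 160, 224],
                  [64, 128, 192, 255],
                  [96, 160, 224, 255]], dtype=np.uint8)

top_left = image[:2, :2]   # crop: a 2x2 view of the upper-left corner
inverted = 255 - image     # invert every pixel in one vectorized step
print(top_left.shape)                  # (2, 2)
print(inverted[0, 0], inverted[2, 3])  # 255 0
print(inverted.dtype)                  # uint8

rgb_image = np.zeros((100, 100, 3), dtype=np.uint8)
rgb_image[:, :, 0] = 255       # fill the first channel (red, assuming RGB order)
gray = rgb_image.mean(axis=2)  # average the channels: a simple grayscale
print(gray.shape, gray[0, 0])  # (100, 100) 85.0
```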
Next Steps
Now that you understand arrays, explore:
Data Types : Learn about NumPy’s rich type system
Indexing : Master array selection and slicing
Broadcasting : Understand how operations work on different shapes
Best Practices:
Pre-allocate arrays when possible (use zeros, ones, or empty)
Use vectorized operations instead of loops
Be mindful of views vs copies to avoid unnecessary memory usage
Use appropriate data types (don’t use float64 when float32 suffices)
Keep memory layout in mind for cache efficiency
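A few of these practices can be shown in a handful of lines. This is a sketch under simple assumptions (a 1,000-element array, squaring as the example operation), not a benchmark:

```python
import numpy as np

n = 1_000

# Pre-allocate the result, then fill it, instead of growing a list.
squares_loop = np.empty(n, dtype=np.float64)
for i in range(n):
    squares_loop[i] = i * i

# The vectorized form does the same work without a Python-level loop.
squares_vec = np.arange(n, dtype=np.float64) ** 2
print(np.array_equal(squares_loop, squares_vec))  # True

# float32 uses half the memory of float64 for the same element count.
print(np.zeros(n, dtype=np.float32).nbytes)  # 4000
print(np.zeros(n, dtype=np.float64).nbytes)  # 8000
```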