This guide will help you get started with MLX by creating arrays, performing operations, and using function transformations.

Creating arrays

Import mlx.core and create your first array:
import mlx.core as mx

# Create arrays
a = mx.array([1, 2, 3, 4])
print(a.shape)  # [4]
print(a.dtype)  # int32

b = mx.array([1.0, 2.0, 3.0, 4.0])
print(b.dtype)  # float32

Lazy evaluation

Operations in MLX are lazy: building an expression only records the computation, and the result is materialized when it is needed.
You can force evaluation with mx.eval():
c = a + b  # Not evaluated yet
mx.eval(c)  # Now evaluated
Arrays are automatically evaluated when you:
  • Print them
  • Access scalar values with .item()
  • Convert to NumPy arrays
c = a + b
print(c)  # Automatically evaluates
# array([2, 4, 6, 8], dtype=float32)

# Convert to NumPy
import numpy as np
np_array = np.array(c)  # Also evaluates
Learn more in the Lazy Evaluation guide.

Basic operations

MLX supports standard array operations:
import mlx.core as mx

a = mx.array([1, 2, 3])
b = mx.array([4, 5, 6])

# Element-wise operations
c = a + b  # [5, 7, 9]
d = a * b  # [4, 10, 18]
e = mx.exp(a)  # Exponential

Function transformations

MLX provides composable function transformations for automatic differentiation and vectorization.

Automatic differentiation

Compute gradients with grad():
import mlx.core as mx

x = mx.array(0.0)

# Gradient of sin(x)
grad_sin = mx.grad(mx.sin)
print(grad_sin(x))  # 1.0 (cos(0))

# Second derivative
grad2_sin = mx.grad(mx.grad(mx.sin))
print(grad2_sin(x))  # -0.0 (-sin(0))

Value and gradient

Compute both function value and gradient efficiently:
import mlx.core as mx

def loss_fn(x):
    return mx.sum(x ** 2)

x = mx.array([1.0, 2.0, 3.0])

# Get both value and gradient
value, grad = mx.value_and_grad(loss_fn)(x)
print(value)  # 14.0
print(grad)   # [2.0, 4.0, 6.0]

Vectorization

Vectorize functions with vmap():
import mlx.core as mx

def normalize(x):
    return x / mx.sum(x)

# Batch of vectors
batch = mx.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Vectorize across batch dimension
batch_normalize = mx.vmap(normalize)
result = batch_normalize(batch)
# [[0.333, 0.667],
#  [0.429, 0.571],
#  [0.455, 0.545]]

Composable transformations

Transformations can be composed arbitrarily (some_function and fn below stand in for any function with a compatible signature):
import mlx.core as mx

# Gradient of vectorized function
grad_vmap_fn = mx.grad(mx.vmap(some_function))

# Vectorized gradient
vmap_grad_fn = mx.vmap(mx.grad(some_function))

# Any combination works
mx.grad(mx.vmap(mx.grad(fn)))

Multi-device execution

Operations can run on CPU or GPU without copying data:
import mlx.core as mx

# Arrays live in unified memory
a = mx.array([1, 2, 3])

# Run on GPU
b = mx.exp(a, stream=mx.gpu)

# Run on CPU
c = mx.sin(a, stream=mx.cpu)

# No data transfer needed - unified memory!
Learn more in the Unified Memory guide.

Next steps

Core Concepts

Understand lazy evaluation, unified memory, and more

Examples

See complete examples including neural networks

Python API

Explore the full API reference

Guides

Learn about indexing, NumPy comparison, and more
