Read-Copy Update (RCU) is a synchronization mechanism optimized for read-mostly data. The key insight is that readers need not acquire any lock, perform atomic instructions, or execute memory barriers (on most architectures). Updates proceed by:
  1. Copying the data to be changed.
  2. Modifying the copy.
  3. Publishing the new version atomically.
  4. Waiting for a grace period — until all pre-existing readers have finished.
  5. Freeing the old version.
#include <linux/rcupdate.h>
#include <linux/rculist.h>  /* for list_for_each_entry_rcu */

  • Zero-cost readers: rcu_read_lock / rcu_read_unlock are typically no-ops on non-preemptible kernels.
  • Safe pointer publication: rcu_assign_pointer inserts the necessary memory barriers to safely publish a new pointer.
  • Safe pointer dereference: rcu_dereference enforces dependency ordering so the compiler cannot speculate loads.
  • Deferred free: call_rcu and kfree_rcu free memory after a grace period with no explicit waiting.

Read-side critical sections

rcu_read_lock / rcu_read_unlock

void rcu_read_lock(void);
void rcu_read_unlock(void);
Delimit an RCU read-side critical section. On non-preemptible kernels these are typically no-ops (preemption is already disabled by other means). On CONFIG_PREEMPT_RCU kernels they manipulate a per-task nesting counter. Rules for read-side critical sections:
  • Must not block or sleep (GFP_ATOMIC allocations are acceptable).
  • Must not call schedule().
  • Must not exit to user space.
  • May be nested.
rcu_read_lock();
struct my_node *p = rcu_dereference(my_global_ptr);
if (p)
    do_something(p->value);
rcu_read_unlock();
RCU read-side critical sections do not provide mutual exclusion between readers and updaters. They only guarantee that the memory backing p is not freed while you hold the rcu_read_lock. If you need to modify data, use a separate lock (e.g., a spinlock or mutex) that protects the write side.

Pointer access

rcu_dereference

typeof(*p) *rcu_dereference(p);
Fetch an RCU-protected pointer. Inserts a data-dependency barrier (on architectures that need it, such as Alpha) to prevent the compiler or CPU from speculating loads that depend on the pointer value. Must be called from within an RCU read-side critical section.
Parameters:
  • p (__rcu *, required): An __rcu-annotated pointer. Sparse will warn if you dereference an __rcu pointer without this macro.
rcu_read_lock();
struct config *cfg = rcu_dereference(global_cfg);
/* safe to read cfg->field here */
rcu_read_unlock();

rcu_assign_pointer

rcu_assign_pointer(p, v);
Publish a new pointer value. Inserts a store-release barrier so that all prior writes to the pointed-to structure are visible to any reader that loads the pointer. Must be called with whatever lock protects the write side.
Parameters:
  • p (__rcu *, required): The __rcu-annotated pointer to update.
  • v (typeof(*p) *, required): The new value to publish. All fields of the new structure must be initialized before calling rcu_assign_pointer.
struct config *new_cfg = kmalloc(sizeof(*new_cfg), GFP_KERNEL);
new_cfg->threshold = 42;        /* initialize before publishing */

spin_lock(&update_lock);
old_cfg = rcu_dereference_protected(global_cfg,
                                    lockdep_is_held(&update_lock));
rcu_assign_pointer(global_cfg, new_cfg);
spin_unlock(&update_lock);

/* Wait for all readers of old_cfg to finish */
synchronize_rcu();
kfree(old_cfg);

Grace period management

A grace period is a time interval after a pointer update during which every CPU must have passed through at least one quiescent state (context switch, user-mode execution, or idle loop). After a grace period, no reader can hold a reference to the old version.

synchronize_rcu

void synchronize_rcu(void);
Block the calling thread until a full RCU grace period has elapsed. After this returns, all pre-existing RCU read-side critical sections have completed. May sleep; cannot be called from interrupt context.
/* After publishing new_ptr, wait for old readers to finish */
synchronize_rcu();
kfree(old_ptr);

call_rcu — asynchronous grace period

void call_rcu(struct rcu_head *head, rcu_callback_t func);
Register a callback to be invoked after a grace period, without blocking. The callback is called from softirq context.
Parameters:
  • head (struct rcu_head *, required): An rcu_head embedded in the object to be freed. Must not be modified until the callback fires.
  • func (rcu_callback_t, required): Callback invoked after the grace period. Signature: void cb(struct rcu_head *head). Retrieve the enclosing object with container_of.
struct my_node {
    struct rcu_head rcu;
    int value;
};

static void my_node_free_rcu(struct rcu_head *head)
{
    struct my_node *node = container_of(head, struct my_node, rcu);
    kfree(node);
}

/* Remove node from list and schedule deferred free */
list_del_rcu(&node->list);
call_rcu(&node->rcu, my_node_free_rcu);

kfree_rcu — RCU-deferred kfree

kfree_rcu(ptr, rcu_head_field);
kfree_rcu_mightsleep(ptr);   /* single-argument form: no rcu_head needed, but may sleep */
Convenience macro that calls call_rcu with a kfree callback. Eliminates the need to write a custom callback just to free an object.
struct my_node {
    struct rcu_head rcu;
    int value;
};

/* Remove and defer free in one call */
list_del_rcu(&node->list);
kfree_rcu(node, rcu);

RCU-protected lists

The rculist.h API wraps list_head operations with the correct barriers for RCU traversal.
#include <linux/rculist.h>

/* Adding to a list (must hold update-side lock) */
list_add_rcu(&node->list, &my_list);
list_add_tail_rcu(&node->list, &my_list);

/* Removing from a list (must hold update-side lock) */
list_del_rcu(&node->list);

/* Traversal (inside rcu_read_lock) */
list_for_each_entry_rcu(pos, head, member) {
    /* pos is valid for the duration of this iteration */
}
/* Full traversal example */
rcu_read_lock();
list_for_each_entry_rcu(node, &my_list, list) {
    if (node->key == target_key) {
        do_work(node);
        break;
    }
}
rcu_read_unlock();
Use list_del_rcu (not list_del) when removing a node that may still be visible to RCU readers. list_del_rcu leaves the node's next pointer intact, so a concurrent list_for_each_entry_rcu traversal that has already reached the node can still step past it to the rest of the list; only the prev pointer is poisoned (set to LIST_POISON2).

RCU pointer replacement — full example

1

Allocate and initialize the new object

struct config *new = kmalloc(sizeof(*new), GFP_KERNEL);
new->threshold = new_value;
/* All fields fully initialized before publication */
2

Acquire the update-side lock and publish

spin_lock(&cfg_lock);
old = rcu_dereference_protected(global_cfg,
                                lockdep_is_held(&cfg_lock));
rcu_assign_pointer(global_cfg, new);
spin_unlock(&cfg_lock);
3

Wait for grace period and free old object

synchronize_rcu();
kfree(old);   /* or replace both calls with kfree_rcu() if the struct embeds an rcu_head */

SRCU — Sleepable RCU

Standard RCU read-side critical sections may not sleep. SRCU (Sleepable RCU) lifts this restriction at the cost of a small per-subsystem overhead and slower grace periods.
#include <linux/srcu.h>

/* Static definition at file scope */
DEFINE_SRCU(my_srcu);            /* use DEFINE_STATIC_SRCU for static linkage */

/* Dynamic initialization */
struct srcu_struct my_srcu;
init_srcu_struct(&my_srcu);
cleanup_srcu_struct(&my_srcu); /* on cleanup */
int  srcu_read_lock(struct srcu_struct *sp);
void srcu_read_unlock(struct srcu_struct *sp, int idx);
void synchronize_srcu(struct srcu_struct *sp);
void call_srcu(struct srcu_struct *sp, struct rcu_head *head, rcu_callback_t func);
Parameters:
  • sp (struct srcu_struct *, required): The SRCU domain. Each subsystem that needs SRCU must have its own srcu_struct.
int idx = srcu_read_lock(&my_srcu);
/* may sleep here — blocking is allowed */
struct data *p = srcu_dereference(my_ptr, &my_srcu);
process(p);              /* may block */
srcu_read_unlock(&my_srcu, idx);
SRCU grace periods are slower than regular RCU grace periods because the kernel must wait for all SRCU read-side critical sections to complete rather than simply waiting for all CPUs to pass through a quiescent state. Use SRCU only when blocking inside the read-side critical section is genuinely required.

When to use RCU versus locks

RCU is a good fit when:
  • Reads vastly outnumber writes (read-mostly data).
  • Read-side latency is critical (e.g., network packet processing, process scheduling).
  • Readers must not be delayed by writers under any circumstance.
  • You need to traverse a list from interrupt context without locking.
Classic examples: routing tables, task_struct lookups, module list, VFS dentry cache.
Criterion            RCU                 mutex                  spinlock
Read-side cost       Near zero           Expensive (sleep)      Moderate (spin)
Write-side cost      Grace-period wait   Sleep until available  Spin until available
Readers can sleep    No (SRCU: yes)      Yes                    No
Usable from IRQ      Yes                 No                     Yes
Scales with readers  Perfectly           No                     No

RCU usage checklist

Dereferencing an __rcu pointer outside a read-side critical section is undefined behavior. Sparse (make C=1) will flag violations.
/* WRONG */
struct node *p = rcu_dereference(global_ptr); /* outside rcu_read_lock */

/* CORRECT */
rcu_read_lock();
struct node *p = rcu_dereference(global_ptr);
/* use p only inside the critical section */
rcu_read_unlock();
All fields of the new object must be fully initialized and visible to other CPUs before calling rcu_assign_pointer. The macro provides the necessary store-release barrier.
new->field = value;        /* initialize first */
rcu_assign_pointer(ptr, new);   /* then publish */
Never free an RCU-protected object immediately after removing it from a data structure. Always use synchronize_rcu(), call_rcu(), or kfree_rcu() first.
RCU does not serialize concurrent writers. If multiple threads may update the same RCU-protected data structure, they must hold a conventional lock (e.g., a spinlock or mutex) with respect to each other.
