The workqueue (wq) API provides asynchronous process execution contexts. Instead of creating a dedicated thread per subsystem, drivers queue work items — simple structs holding a function pointer — onto a workqueue. An independent kernel thread (a worker) picks up and executes the work. The current implementation is the Concurrency Managed Workqueue (cmwq), which automatically manages the number of worker threads to keep concurrency at a minimal but sufficient level.
#include <linux/workqueue.h>

Work item types

struct work_struct {
    atomic_long_t data;
    struct list_head entry;
    work_func_t func;
    /* ... lockdep, debug fields ... */
};

struct delayed_work {
    struct work_struct work;
    struct timer_list timer;
    struct workqueue_struct *wq;
    int cpu;
};

struct rcu_work {
    struct work_struct work;
    struct rcu_head rcu;
    struct workqueue_struct *wq;
};
A work_func_t has the signature:
typedef void (*work_func_t)(struct work_struct *work);
Retrieve the enclosing structure from the work_struct pointer using container_of:
struct my_device {
    struct work_struct work;
    int value;
};

static void my_work_fn(struct work_struct *work)
{
    struct my_device *dev = container_of(work, struct my_device, work);
    /* use dev->value */
}

Declaring and initializing work items

Static declaration

/* Declare and initialize a work_struct at compile time */
DECLARE_WORK(my_work, my_work_fn);

/* Declare and initialize a delayed_work at compile time */
DECLARE_DELAYED_WORK(my_delayed_work, my_delayed_fn);

Dynamic initialization

void INIT_WORK(struct work_struct *work, work_func_t func);
void INIT_DELAYED_WORK(struct delayed_work *dwork, work_func_t func);
void INIT_RCU_WORK(struct rcu_work *rwork, work_func_t func);
work: struct work_struct *, required. Pointer to the work item to initialize.
func: work_func_t, required. The function to execute. Signature: void fn(struct work_struct *work).
struct my_device {
    struct work_struct work;
};

static int my_probe(struct platform_device *pdev)
{
    struct my_device *dev = devm_kzalloc(&pdev->dev, sizeof(*dev), GFP_KERNEL);
    if (!dev)
        return -ENOMEM;

    INIT_WORK(&dev->work, my_work_fn);
    return 0;
}

Scheduling work

schedule_work

bool schedule_work(struct work_struct *work);
Queue work on the global system_wq workqueue, preferring the local CPU for execution. Returns true if the work was queued, false if it was already pending.
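Because schedule_work() may be called from atomic context, a common pattern is deferring processing from an interrupt handler. A minimal sketch, assuming a hypothetical my_device with an embedded work_struct (only schedule_work() itself is the API under discussion):

```c
/* Sketch: deferring work from a hard IRQ handler. my_device and
 * my_irq_handler are hypothetical names. */
static irqreturn_t my_irq_handler(int irq, void *data)
{
    struct my_device *dev = data;

    /* Cannot sleep in IRQ context; defer processing to a kworker. */
    schedule_work(&dev->work);
    return IRQ_HANDLED;
}
```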

schedule_delayed_work

bool schedule_delayed_work(struct delayed_work *dwork, unsigned long delay);
Queue dwork for execution after delay jiffies. If delay is 0, the work is queued for immediate execution. Returns true if queued, false if already pending.
delay: unsigned long, required. Delay in jiffies before the work is executed. Use msecs_to_jiffies(ms) or usecs_to_jiffies(us) for time-based delays.
/* Execute after 500 ms */
schedule_delayed_work(&dev->dwork, msecs_to_jiffies(500));

queue_work

bool queue_work(struct workqueue_struct *wq, struct work_struct *work);
Queue work on a specific workqueue wq rather than the global system_wq. Returns true if queued, false if already pending.
wq: struct workqueue_struct *, required. Target workqueue. Use a system workqueue or one created with alloc_workqueue.
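A sketch of the return value in use, assuming my_wq was created earlier with alloc_workqueue() and dev->work was initialized with INIT_WORK() (both names hypothetical):

```c
/* Queue onto a dedicated workqueue; a false return means the item was
 * already pending and was not queued a second time. */
if (!queue_work(my_wq, &dev->work))
    pr_debug("my_driver: work already pending\n");
```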

queue_delayed_work

bool queue_delayed_work(struct workqueue_struct *wq,
                        struct delayed_work *dwork,
                        unsigned long delay);
Queue a delayed work item on a specific workqueue.
queue_delayed_work(my_wq, &dev->dwork, msecs_to_jiffies(100));

Flushing and canceling work

flush_work

bool flush_work(struct work_struct *work);
Wait until work has finished executing. If work is currently pending (not yet started), it is allowed to start and then waited upon. Returns true if the work was pending, false if it had already completed.
Do not call flush_work with any lock held that the work function also acquires — this is a deadlock.
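As an illustration of that rule, a minimal sketch (all names hypothetical) of the deadlock and the safe ordering:

```c
static void my_work_fn(struct work_struct *work)
{
    struct my_device *dev = container_of(work, struct my_device, work);

    mutex_lock(&dev->lock);     /* the work function takes dev->lock */
    /* ... update device state ... */
    mutex_unlock(&dev->lock);
}

static void my_teardown(struct my_device *dev)
{
    /*
     * BAD: taking dev->lock and then calling flush_work(&dev->work)
     * deadlocks if my_work_fn is blocked waiting for the same lock.
     *
     * OK: flush first, then take the lock for final cleanup.
     */
    flush_work(&dev->work);

    mutex_lock(&dev->lock);
    /* ... final cleanup ... */
    mutex_unlock(&dev->lock);
}
```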

cancel_work_sync

bool cancel_work_sync(struct work_struct *work);
Cancel work and wait for any currently executing instance to finish. After this returns, work is guaranteed not to be running and will not run unless re-queued. Returns true if the work was pending.

cancel_delayed_work_sync

bool cancel_delayed_work_sync(struct delayed_work *dwork);
Cancel a pending delayed work item and wait for any in-progress execution to complete.
/* In device remove / cleanup */
cancel_delayed_work_sync(&dev->dwork);

flush_workqueue

void flush_workqueue(struct workqueue_struct *wq);
Wait until all work items currently queued on wq have finished. Work items queued after this call begins are not waited on.

Creating custom workqueues

alloc_workqueue

struct workqueue_struct *alloc_workqueue(const char *fmt,
                                         unsigned int flags,
                                         int max_active, ...);
fmt: const char *, required. printf-style name for the workqueue, also used for the rescuer thread.
flags: unsigned int, required. Workqueue behavior flags (see below).
max_active: int, required. Maximum number of concurrently executing work items per CPU. Specify 0 to use the default (1024). Use alloc_ordered_workqueue when strict serial execution is required.
/* Create a high-priority, unbound workqueue */
static struct workqueue_struct *my_wq;

my_wq = alloc_workqueue("my_driver", WQ_HIGHPRI | WQ_UNBOUND, 0);
if (!my_wq)
    return -ENOMEM;

Workqueue flags

WQ_UNBOUND
Work items are not bound to any specific CPU. Workers are managed by special unbound worker pools. Useful when work is long-running or CPU-intensive and should be scheduled freely by the system scheduler.

WQ_HIGHPRI
Work items are queued to the high-priority worker pool. Workers run with elevated scheduling priority (lower nice value). Use for latency-sensitive work.

WQ_FREEZABLE
The workqueue participates in system suspend freeze operations. Work items are drained and no new items execute until the system thaws. Use for work associated with user-facing activity that should not run during suspend.

WQ_MEM_RECLAIM
Required for any workqueue that may be used in memory reclaim paths. Guarantees at least one execution context is available regardless of memory pressure, via a reserved rescuer worker.
my_wq = alloc_workqueue("my_reclaim_wq", WQ_MEM_RECLAIM, 0);

WQ_CPU_INTENSIVE
Work items do not contribute to the concurrency level tracked by the worker pool. CPU-intensive work items do not prevent other items in the same pool from starting. Use for work that is expected to consume many CPU cycles.

WQ_BH
BH workqueues execute in softirq context on the queueing CPU. All BH work items execute in queueing order. BH work items cannot sleep. Must use max_active = 0. Only WQ_HIGHPRI is permitted as an additional flag.
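A sketch of BH workqueue creation, assuming a kernel recent enough to provide WQ_BH (the names my_bh_wq and dev->bh_work are hypothetical):

```c
static struct workqueue_struct *my_bh_wq;

/* max_active must be 0 for WQ_BH; WQ_HIGHPRI is the only extra flag allowed. */
my_bh_wq = alloc_workqueue("my_bh", WQ_BH, 0);
if (!my_bh_wq)
    return -ENOMEM;

/* Runs in softirq context on the queueing CPU; the work function must not sleep. */
queue_work(my_bh_wq, &dev->bh_work);
```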

destroy_workqueue

void destroy_workqueue(struct workqueue_struct *wq);
Drain wq and free all associated resources. Implicitly calls flush_workqueue before freeing. Call this from module exit or device remove.
static void __exit my_exit(void)
{
    destroy_workqueue(my_wq);
}

System workqueues

For work items that do not require special isolation, use one of the pre-allocated system workqueues. There is no performance difference between a dedicated workqueue and a system workqueue.
Workqueue                   Description
system_wq                   Bound, normal priority. General-purpose. Equivalent to schedule_work().
system_highpri_wq           Bound, high priority. For latency-sensitive work.
system_long_wq              Bound, normal priority. For work that may run for a long time.
system_unbound_wq           Unbound. For work items that benefit from scheduler freedom.
system_freezable_wq         Bound, freezable. Participates in system suspend.
system_power_efficient_wq   Bound or unbound depending on workqueue.power_efficient.
If a subsystem may queue more than max_active work items simultaneously, it risks saturating a system workqueue and causing potential deadlocks. In that case, allocate a dedicated workqueue.

Worker thread lifetime

The cmwq manages worker pools automatically:
1. Work queued: A work item is appended to the shared worklist of the target worker pool (determined by CPU affinity and priority).
2. Worker awakened: If no worker is currently running on the CPU, a sleeping worker is woken. If all workers are busy and a new context is needed, a new worker thread is created.
3. Work executes: The worker calls the work function. While executing, the worker is considered active and counts toward the pool's concurrency level.
4. Worker idles: When the worklist is empty, workers become idle. cmwq keeps idle workers around for a short time before terminating them to avoid the cost of constant creation and destruction.
Worker threads appear in ps output as [kworker/CPU:ID] for bound workers or [kworker/uN:ID] for unbound workers.

Non-reentrance guarantee

A work item is guaranteed not to be re-entrant (executing on multiple CPUs simultaneously) provided:
  1. The work function pointer has not been changed.
  2. The work item has not been queued to a different workqueue.
  3. The work item has not been re-initialized with INIT_WORK.
Re-queuing a work item to the same queue from within its own work function is safe.
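The self-requeue pattern is common for periodic polling. A sketch, assuming a hypothetical my_device with an embedded delayed_work, a keep_polling flag, and a my_device_poll() helper:

```c
static void my_poll_fn(struct work_struct *work)
{
    struct delayed_work *dwork = to_delayed_work(work);
    struct my_device *dev = container_of(dwork, struct my_device, dwork);

    my_device_poll(dev);                /* hypothetical helper */

    /* Re-queuing to the same queue from within the work function is safe. */
    if (dev->keep_polling)
        schedule_delayed_work(&dev->dwork, msecs_to_jiffies(100));
}
```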

Debugging

# Trace work item queuing to find a hot work producer
echo workqueue:workqueue_queue_work > /sys/kernel/tracing/set_event
cat /sys/kernel/tracing/trace_pipe > out.txt

# Show stack trace of a misbehaving kworker
cat /proc/THE_OFFENDING_KWORKER/stack

# Dump workqueue configuration and pool assignments
tools/workqueue/wq_dump.py

# Monitor workqueue statistics
tools/workqueue/wq_monitor.py events
