GGUF API

GGUF is the binary file format used by ggml to store models. A GGUF file contains a header, an arbitrary set of typed key-value metadata pairs, tensor metadata, and optionally the raw tensor data blob.

File structure

A GGUF file is laid out as follows:

File magic "GGUF" (4 bytes)
File version (uint32_t)
Number of tensors (int64_t)
Number of key-value pairs (int64_t)
Key-value pairs (keys are length-prefixed strings; values are typed)
Tensor metadata (name, shape, type, data offset)
Tensor data blob (optional, alignment-padded)
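
To make the fixed-size start of this layout concrete, the sketch below decodes the four leading fields from an in-memory buffer. The names `example_gguf_header`, `EXAMPLE_HEADER_SIZE`, and `parse_gguf_header` are hypothetical and not part of the ggml API, and the code assumes that both the file and the host are little-endian. In normal use gguf_init_from_file performs this parsing for you.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical mirror of the fixed-size fields at the start of a GGUF file.
struct example_gguf_header {
    uint32_t version;
    int64_t  n_tensors;
    int64_t  n_kv;
};

// magic (4) + version (4) + tensor count (8) + key-value count (8)
#define EXAMPLE_HEADER_SIZE 24u

// Decodes the header from buf. Returns false if the buffer is too short or
// the magic bytes do not match. Assumes a little-endian file and host.
static bool parse_gguf_header(const uint8_t * buf, size_t len, struct example_gguf_header * out) {
    if (len < EXAMPLE_HEADER_SIZE || memcmp(buf, "GGUF", 4) != 0) {
        return false;
    }
    // memcpy avoids unaligned reads from the byte buffer.
    memcpy(&out->version,   buf + 4,  sizeof(out->version));
    memcpy(&out->n_tensors, buf + 8,  sizeof(out->n_tensors));
    memcpy(&out->n_kv,      buf + 16, sizeof(out->n_kv));
    return true;
}
```

The key-value pairs and tensor metadata that follow are variable-length, so they cannot be read with fixed offsets like this.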

Constants

#define GGUF_MAGIC             "GGUF"
#define GGUF_VERSION           3
#define GGUF_DEFAULT_ALIGNMENT 32
#define GGUF_KEY_GENERAL_ALIGNMENT "general.alignment"

Constant	Value	Description
`GGUF_MAGIC`	`"GGUF"`	Magic bytes at the start of every GGUF file
`GGUF_VERSION`	`3`	Current format version
`GGUF_DEFAULT_ALIGNMENT`	`32`	Default byte alignment for tensor data
`GGUF_KEY_GENERAL_ALIGNMENT`	`"general.alignment"`	Optional metadata key that overrides the default alignment
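
As a rough illustration of what the alignment value means in practice, rounding a byte offset up to the next multiple of the alignment can be written as below. `align_up` is a hypothetical helper, not a ggml function, and it assumes the alignment is a nonzero power of two.

```c
#include <assert.h>
#include <stddef.h>

// Rounds offset up to the next multiple of alignment.
// Assumes alignment is a nonzero power of two (for example the default, 32).
static size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

With the default alignment, `align_up(33, 32)` yields 64 and `align_up(32, 32)` stays at 32.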

gguf_type enum

Types that can be stored as GGUF key-value data.

enum gguf_type {
    GGUF_TYPE_UINT8   = 0,
    GGUF_TYPE_INT8    = 1,
    GGUF_TYPE_UINT16  = 2,
    GGUF_TYPE_INT16   = 3,
    GGUF_TYPE_UINT32  = 4,
    GGUF_TYPE_INT32   = 5,
    GGUF_TYPE_FLOAT32 = 6,
    GGUF_TYPE_BOOL    = 7,
    GGUF_TYPE_STRING  = 8,
    GGUF_TYPE_ARRAY   = 9,
    GGUF_TYPE_UINT64  = 10,
    GGUF_TYPE_INT64   = 11,
    GGUF_TYPE_FLOAT64 = 12,
    GGUF_TYPE_COUNT,
};

All enum values are stored as int32_t in the binary format. Booleans are stored as int8_t.
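
When reading the raw bytes yourself, the type ID read from the file determines how many bytes the value occupies. The sketch below maps the numeric IDs from the enum above to sizes. `scalar_size_for_type_id` is a hypothetical helper written for this page, and it takes the ID as a plain int32_t so that it compiles without the ggml headers.

```c
#include <stddef.h>
#include <stdint.h>

// Returns the on-disk size in bytes of a scalar value whose type ID was read
// from a file as int32_t, or 0 for variable-length or unknown types.
static size_t scalar_size_for_type_id(int32_t type_id) {
    switch (type_id) {
        case 0:  // GGUF_TYPE_UINT8
        case 1:  // GGUF_TYPE_INT8
        case 7:  // GGUF_TYPE_BOOL (stored as int8_t)
            return 1;
        case 2:  // GGUF_TYPE_UINT16
        case 3:  // GGUF_TYPE_INT16
            return 2;
        case 4:  // GGUF_TYPE_UINT32
        case 5:  // GGUF_TYPE_INT32
        case 6:  // GGUF_TYPE_FLOAT32
            return 4;
        case 10: // GGUF_TYPE_UINT64
        case 11: // GGUF_TYPE_INT64
        case 12: // GGUF_TYPE_FLOAT64
            return 8;
        default: // GGUF_TYPE_STRING (8), GGUF_TYPE_ARRAY (9), or out of range
            return 0;
    }
}
```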

Context lifecycle

gguf_init_empty

Creates an empty GGUF context with no keys or tensors.

struct gguf_context * gguf_init_empty(void);

Use this when building a new GGUF file from scratch. Free with gguf_free.

gguf_init_from_file

Reads a GGUF file and populates a context with its metadata and (optionally) tensor data.

struct gguf_context * gguf_init_from_file(
    const char               * fname,
    struct gguf_init_params   params);

fname

const char *

required

Path to the GGUF file to open.

params

struct gguf_init_params

required

Initialization parameters:

no_alloc (bool) — when true, tensor data is not loaded into memory; only metadata is read.
ctx (struct ggml_context **) — when non-NULL, a new ggml_context is created and tensor data is allocated into it.

Returns a context on success, or NULL on failure. Free with gguf_free.

// Example: load metadata only
struct gguf_init_params meta_params = { .no_alloc = true, .ctx = NULL };
struct gguf_context * meta_ctx = gguf_init_from_file("model.gguf", meta_params);

// Example: load metadata and tensor data into a ggml context
struct ggml_context * ggml_ctx = NULL;
struct gguf_init_params params = { .no_alloc = false, .ctx = &ggml_ctx };
struct gguf_context * ctx = gguf_init_from_file("model.gguf", params);

gguf_free

Frees a GGUF context and all memory it owns.

void gguf_free(struct gguf_context * ctx);

ctx

struct gguf_context *

required

The context to free.

Key-value getters

gguf_get_n_kv

Returns the total number of key-value pairs in the context.

int64_t gguf_get_n_kv(const struct gguf_context * ctx);

gguf_find_key

Looks up a key by name and returns its integer ID.

int64_t gguf_find_key(
    const struct gguf_context * ctx,
    const char                * key);

ctx

const struct gguf_context *

required

The GGUF context to search.

key

const char *

required

The key name to look up.

Returns the key ID (>= 0) if found, or -1 if the key does not exist.

gguf_get_key

Returns the key name for a given key ID.

const char * gguf_get_key(
    const struct gguf_context * ctx,
    int64_t                     key_id);

ctx

const struct gguf_context *

required

The GGUF context.

key_id

int64_t

required

A valid key ID in [0, gguf_get_n_kv(ctx)).

gguf_get_kv_type

Returns the type of the value stored at the given key ID.

enum gguf_type gguf_get_kv_type(
    const struct gguf_context * ctx,
    int64_t                     key_id);

gguf_get_arr_type

For array-typed keys, returns the element type of the array.

enum gguf_type gguf_get_arr_type(
    const struct gguf_context * ctx,
    int64_t                     key_id);

gguf_get_arr_n

Returns the number of elements in an array-typed key.

size_t gguf_get_arr_n(
    const struct gguf_context * ctx,
    int64_t                     key_id);

Typed value getters

Each getter returns the value stored under the given key ID as the corresponding C type. Calling a getter whose type does not match the stored value aborts the program.

uint8_t      gguf_get_val_u8  (const struct gguf_context * ctx, int64_t key_id);
int8_t       gguf_get_val_i8  (const struct gguf_context * ctx, int64_t key_id);
uint16_t     gguf_get_val_u16 (const struct gguf_context * ctx, int64_t key_id);
int16_t      gguf_get_val_i16 (const struct gguf_context * ctx, int64_t key_id);
uint32_t     gguf_get_val_u32 (const struct gguf_context * ctx, int64_t key_id);
int32_t      gguf_get_val_i32 (const struct gguf_context * ctx, int64_t key_id);
float        gguf_get_val_f32 (const struct gguf_context * ctx, int64_t key_id);
uint64_t     gguf_get_val_u64 (const struct gguf_context * ctx, int64_t key_id);
int64_t      gguf_get_val_i64 (const struct gguf_context * ctx, int64_t key_id);
double       gguf_get_val_f64 (const struct gguf_context * ctx, int64_t key_id);
bool         gguf_get_val_bool(const struct gguf_context * ctx, int64_t key_id);
const char * gguf_get_val_str (const struct gguf_context * ctx, int64_t key_id);

Always call gguf_get_kv_type and verify the type before calling a typed getter, since a mismatch aborts the program rather than returning an error.

Common usage pattern

int64_t key_id = gguf_find_key(ctx, "general.architecture");
if (key_id >= 0 && gguf_get_kv_type(ctx, key_id) == GGUF_TYPE_STRING) {
    const char * arch = gguf_get_val_str(ctx, key_id);
    printf("Architecture: %s\n", arch);
}

KV setters

Setters add a new key-value pair or overwrite an existing one. The new or updated pair is always placed at the end of the list.

void gguf_set_val_u8  (struct gguf_context * ctx, const char * key, uint8_t      val);
void gguf_set_val_i8  (struct gguf_context * ctx, const char * key, int8_t       val);
void gguf_set_val_u16 (struct gguf_context * ctx, const char * key, uint16_t     val);
void gguf_set_val_i16 (struct gguf_context * ctx, const char * key, int16_t      val);
void gguf_set_val_u32 (struct gguf_context * ctx, const char * key, uint32_t     val);
void gguf_set_val_i32 (struct gguf_context * ctx, const char * key, int32_t      val);
void gguf_set_val_f32 (struct gguf_context * ctx, const char * key, float        val);
void gguf_set_val_u64 (struct gguf_context * ctx, const char * key, uint64_t     val);
void gguf_set_val_i64 (struct gguf_context * ctx, const char * key, int64_t      val);
void gguf_set_val_f64 (struct gguf_context * ctx, const char * key, double       val);
void gguf_set_val_bool(struct gguf_context * ctx, const char * key, bool         val);
void gguf_set_val_str (struct gguf_context * ctx, const char * key, const char * val);

gguf_set_arr_data

Creates or replaces an array key with n elements of a primitive type.

void gguf_set_arr_data(
    struct gguf_context * ctx,
    const char          * key,
    enum gguf_type        type,
    const void          * data,
    size_t                n);

ctx

struct gguf_context *

required

The GGUF context to modify.

key

const char *

required

The key name.

type

enum gguf_type

required

Element type. Must not be GGUF_TYPE_ARRAY or GGUF_TYPE_STRING.

data

const void *

required

Raw data. The function copies n * sizeof(element) bytes.

n

size_t

required

Number of elements in the array.

gguf_set_arr_str

Creates or replaces an array key with n string elements.

void gguf_set_arr_str(
    struct gguf_context * ctx,
    const char          * key,
    const char         ** data,
    size_t                n);

ctx

struct gguf_context *

required

The GGUF context to modify.

key

const char *

required

The key name.

data

const char **

required

Array of n null-terminated C strings. The function copies all strings.

n

size_t

required

Number of strings in the array.

Tensor operations

gguf_get_n_tensors

Returns the total number of tensors registered in the context.

int64_t gguf_get_n_tensors(const struct gguf_context * ctx);

gguf_find_tensor

Looks up a tensor by name and returns its integer ID.

int64_t gguf_find_tensor(
    const struct gguf_context * ctx,
    const char                * name);

ctx

const struct gguf_context *

required

The GGUF context to search.

name

const char *

required

The tensor name to look up.

Returns the tensor ID (>= 0) if found, or -1 if not found.

gguf_get_tensor_name

Returns the name of the tensor at the given index.

const char * gguf_get_tensor_name(
    const struct gguf_context * ctx,
    int64_t                     tensor_id);

gguf_get_tensor_type

Returns the ggml_type of the tensor at the given index.

enum ggml_type gguf_get_tensor_type(
    const struct gguf_context * ctx,
    int64_t                     tensor_id);

gguf_get_tensor_offset

Returns the byte offset of the tensor’s data within the tensor data blob.

size_t gguf_get_tensor_offset(
    const struct gguf_context * ctx,
    int64_t                     tensor_id);

Add gguf_get_data_offset(ctx) to convert this to an offset from the start of the file.
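
As a sketch of how the two offsets combine, the hypothetical helper below reads raw bytes for one tensor straight from an open file. The two offset arguments stand in for the values returned by gguf_get_data_offset and gguf_get_tensor_offset. The helper itself is not part of ggml, and it assumes the absolute offset fits in a long.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

// Reads n bytes of tensor data into dst. data_offset is where the tensor data
// blob starts in the file; tensor_offset is the tensor's position inside it.
// Returns false if the seek fails or fewer than n bytes are available.
static bool read_tensor_bytes(FILE * f, size_t data_offset, size_t tensor_offset, void * dst, size_t n) {
    // Absolute file position = start of blob + offset within blob.
    const size_t file_offset = data_offset + tensor_offset;
    if (fseek(f, (long) file_offset, SEEK_SET) != 0) {
        return false;
    }
    return fread(dst, 1, n, f) == n;
}
```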

gguf_set_tensor_type

Changes the stored type of a tensor. All tensor offsets following this tensor are recalculated immediately to keep the data contiguous.

void gguf_set_tensor_type(
    struct gguf_context * ctx,
    const char          * name,
    enum ggml_type        type);

ctx

struct gguf_context *

required

The GGUF context.

name

const char *

required

Name of the tensor to update.

type

enum ggml_type

required

New data type for the tensor.
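
To see why later offsets shift when one tensor changes type, the sketch below lays tensors out back to back and starts each one at a multiple of the alignment. It is a simplified model rather than ggml's implementation: `layout_offsets` is a hypothetical function, and the per-tensor padding rule is an assumption.

```c
#include <assert.h>
#include <stddef.h>

// Writes the blob-relative offset of each tensor into offsets[] and returns
// the total blob size. Each tensor starts at a multiple of alignment, which
// is assumed to be a nonzero power of two.
static size_t layout_offsets(const size_t * sizes, size_t n, size_t alignment, size_t * offsets) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    size_t cursor = 0;
    for (size_t i = 0; i < n; i++) {
        offsets[i] = cursor;
        // Advance past this tensor, rounded up to the next aligned position.
        cursor = (cursor + sizes[i] + alignment - 1) & ~(alignment - 1);
    }
    return cursor;
}
```

With sizes of 100, 64, and 10 bytes and an alignment of 32, the offsets come out as 0, 128, and 192. If the first tensor shrinks to 50 bytes, as it might after a change to a smaller type, the later two move to 64 and 128.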

gguf_set_tensor_data

Sets the data of the named tensor from the provided pointer. The source must contain at least gguf_get_tensor_size(ctx, tensor_id) bytes, where tensor_id is the ID that gguf_find_tensor returns for that name.

void gguf_set_tensor_data(
    struct gguf_context * ctx,
    const char          * name,
    const void          * data);

ctx

struct gguf_context *

required

The GGUF context.

name

const char *

required

Name of the tensor to update.

data

const void *

required

Source data. At least gguf_get_tensor_size(ctx, tensor_id) bytes must be readable from this pointer.

Writing GGUF files

gguf_write_to_file

Writes the entire context (metadata and optionally tensor data) to a binary file.

bool gguf_write_to_file(
    const struct gguf_context * ctx,
    const char                * fname,
    bool                        only_meta);

ctx

const struct gguf_context *

required

The GGUF context to serialize.

fname

const char *

required

Output file path. The file is created or overwritten.

only_meta

bool

required

When true, only the header, KV pairs, and tensor metadata are written — tensor data is omitted. Use this for the two-pass write patterns shown below.

Returns true on success.

Write patterns

There are three supported ways to write a GGUF file:

Single-pass write

Write everything in one call.

gguf_write_to_file(ctx, "output.gguf", /*only_meta=*/ false);

Two-pass: metadata then data

Write metadata first, then append tensor data separately.

gguf_write_to_file(ctx, "output.gguf", /*only_meta=*/ true);

// Append the tensor data after the metadata
FILE * f = fopen("output.gguf", "ab");
if (f != NULL) {
    fwrite(tensor_data, 1, tensor_data_size, f);
    fclose(f);
}

Pre-allocated header

Reserve space for metadata at the front, write tensor data, then write metadata. Useful when tensor data is produced incrementally.

FILE * f = fopen("output.gguf", "wb");

// Reserve space for metadata
const size_t size_meta = gguf_get_meta_size(ctx);
fseek(f, size_meta, SEEK_SET);

// Write tensor data
fwrite(tensor_data, 1, tensor_data_size, f);

// Write metadata at the front
void * meta = malloc(size_meta);
gguf_get_meta_data(ctx, meta);
rewind(f);
fwrite(meta, 1, size_meta, f);
free(meta);

fclose(f);

Metadata helpers

gguf_get_data_offset

Returns the byte offset from the start of the file at which tensor data begins.

size_t gguf_get_data_offset(const struct gguf_context * ctx);

Use this to seek to tensor data in the file: fseek(f, gguf_get_data_offset(ctx), SEEK_SET).

gguf_get_meta_size

Returns the total size in bytes of the metadata section (header + KV pairs + tensor info + padding).

size_t gguf_get_meta_size(const struct gguf_context * ctx);

This value equals gguf_get_data_offset(ctx) for a fully populated context.

gguf_get_meta_data

Serializes the metadata into a caller-provided buffer.

void gguf_get_meta_data(
    const struct gguf_context * ctx,
    void                      * data);

ctx

const struct gguf_context *

required

The GGUF context to serialize.

data

void *

required

Output buffer. Must be at least gguf_get_meta_size(ctx) bytes.
