GGUF is the binary file format used by ggml to store models. A GGUF file contains a header, an arbitrary set of typed key-value metadata pairs, tensor metadata, and optionally the raw tensor data blob.

File structure

A GGUF file is laid out as follows:
  1. File magic "GGUF" (4 bytes)
  2. File version (uint32_t)
  3. Number of tensors (int64_t)
  4. Number of key-value pairs (int64_t)
  5. Key-value pairs (keys are length-prefixed strings; values are typed)
  6. Tensor metadata (name, shape, type, data offset)
  7. Tensor data blob (optional, alignment-padded)
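The fixed-size part of this layout (items 1 to 4) can be decoded by hand. The following is a minimal sketch, not the ggml loader: `example_gguf_header`, `example_read_le`, and `example_parse_header` are hypothetical names, and the sketch assumes the file is little-endian.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical struct for this sketch; not part of the ggml API.
struct example_gguf_header {
    uint32_t version;
    int64_t  n_tensors;
    int64_t  n_kv;
};

// Decode an unsigned little-endian integer of n bytes (n <= 8).
// Assumption: the file was written in little-endian byte order.
uint64_t example_read_le(const uint8_t * p, size_t n) {
    assert(n <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) {
        v |= (uint64_t) p[i] << (8 * i);
    }
    return v;
}

// Parse items 1-4 of the layout from an in-memory buffer.
// Returns false if the buffer is too short or the magic does not match.
bool example_parse_header(const uint8_t * buf, size_t len, struct example_gguf_header * out) {
    const size_t header_size = 4 + 4 + 8 + 8; // magic + version + n_tensors + n_kv
    if (len < header_size || memcmp(buf, "GGUF", 4) != 0) {
        return false;
    }
    out->version   = (uint32_t) example_read_le(buf + 4,  4);
    out->n_tensors = (int64_t)  example_read_le(buf + 8,  8);
    out->n_kv      = (int64_t)  example_read_le(buf + 16, 8);
    return true;
}
```

In practice gguf_init_from_file performs this parsing (plus validation); the sketch is only meant to make the byte layout concrete.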

Constants

#define GGUF_MAGIC             "GGUF"
#define GGUF_VERSION           3
#define GGUF_DEFAULT_ALIGNMENT 32
#define GGUF_KEY_GENERAL_ALIGNMENT "general.alignment"
Constant                    Value                Description
GGUF_MAGIC                  "GGUF"               Magic bytes at the start of every GGUF file
GGUF_VERSION                3                    Current format version
GGUF_DEFAULT_ALIGNMENT      32                   Default byte alignment for tensor data
GGUF_KEY_GENERAL_ALIGNMENT  "general.alignment"  Optional metadata key that overrides the default alignment
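To illustrate what byte alignment means for offsets, here is a small sketch that rounds an offset up to the next multiple of the alignment. `example_align_offset` and `EXAMPLE_DEFAULT_ALIGNMENT` are hypothetical names introduced for this example; ggml's own implementation may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_DEFAULT_ALIGNMENT 32 // mirrors GGUF_DEFAULT_ALIGNMENT above

// Round offset up to the next multiple of alignment (hypothetical helper).
size_t example_align_offset(size_t offset, size_t alignment) {
    assert(alignment > 0);
    const size_t rem = offset % alignment;
    return rem == 0 ? offset : offset + (alignment - rem);
}
```

With the default alignment of 32, an offset of 33 is rounded up to 64, while an offset that is already a multiple of 32 is left unchanged.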

gguf_type enum

Types that can be stored as GGUF key-value data.
enum gguf_type {
    GGUF_TYPE_UINT8   = 0,
    GGUF_TYPE_INT8    = 1,
    GGUF_TYPE_UINT16  = 2,
    GGUF_TYPE_INT16   = 3,
    GGUF_TYPE_UINT32  = 4,
    GGUF_TYPE_INT32   = 5,
    GGUF_TYPE_FLOAT32 = 6,
    GGUF_TYPE_BOOL    = 7,
    GGUF_TYPE_STRING  = 8,
    GGUF_TYPE_ARRAY   = 9,
    GGUF_TYPE_UINT64  = 10,
    GGUF_TYPE_INT64   = 11,
    GGUF_TYPE_FLOAT64 = 12,
    GGUF_TYPE_COUNT,
};
All enum values are stored as int32_t in the binary format. Booleans are stored as int8_t.
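As a rough illustration of how these type IDs map to serialized sizes, the sketch below returns the byte width of each fixed-size scalar type. The enum is a renamed local copy of the values above so that the example compiles without the ggml headers; `example_scalar_size` is a hypothetical helper, not a ggml function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Local copy of the enum values listed above, renamed so the sketch
// compiles without the ggml headers.
enum example_type {
    EX_UINT8 = 0, EX_INT8 = 1, EX_UINT16 = 2, EX_INT16 = 3,
    EX_UINT32 = 4, EX_INT32 = 5, EX_FLOAT32 = 6, EX_BOOL = 7,
    EX_STRING = 8, EX_ARRAY = 9, EX_UINT64 = 10, EX_INT64 = 11,
    EX_FLOAT64 = 12, EX_COUNT,
};

// Serialized size in bytes of one scalar value, or 0 for variable-length
// types (strings, arrays) and for unknown type IDs.
// type_id is int32_t because enums are stored as int32_t in the file.
size_t example_scalar_size(int32_t type_id) {
    switch (type_id) {
        case EX_UINT8:  case EX_INT8:  case EX_BOOL:    return 1; // bool is stored as int8_t
        case EX_UINT16: case EX_INT16:                  return 2;
        case EX_UINT32: case EX_INT32: case EX_FLOAT32: return 4;
        case EX_UINT64: case EX_INT64: case EX_FLOAT64: return 8;
        default:                                        return 0;
    }
}
```

A reader that walks the KV section by hand could use such a table to skip over scalar values, handling strings and arrays separately.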

Context lifecycle

Creates an empty GGUF context with no keys or tensors.
struct gguf_context * gguf_init_empty(void);
Use this when building a new GGUF file from scratch. Free with gguf_free.
Reads a GGUF file and populates a context with its metadata and (optionally) tensor data.
struct gguf_context * gguf_init_from_file(
    const char               * fname,
    struct gguf_init_params   params);
fname (const char *, required): Path to the GGUF file to open.
params (struct gguf_init_params, required): Initialization parameters:
  • no_alloc (bool): when true, tensor data is not loaded into memory; only metadata is read.
  • ctx (struct ggml_context **): when non-NULL, a new ggml_context is created, its pointer is stored in *ctx, and tensor data is allocated into it.
Returns a context on success, or NULL on failure. Free with gguf_free.
// Example: load metadata only
struct gguf_init_params meta_params = { .no_alloc = true, .ctx = NULL };
struct gguf_context * meta_ctx = gguf_init_from_file("model.gguf", meta_params);

// Example: load metadata and tensor data into a ggml context
// (distinct variable names so both examples can share one scope)
struct ggml_context * ggml_ctx = NULL;
struct gguf_init_params full_params = { .no_alloc = false, .ctx = &ggml_ctx };
struct gguf_context * full_ctx = gguf_init_from_file("model.gguf", full_params);
Frees a GGUF context and all memory it owns.
void gguf_free(struct gguf_context * ctx);
ctx (struct gguf_context *, required): The context to free.

Key-value getters

Returns the total number of key-value pairs in the context.
int64_t gguf_get_n_kv(const struct gguf_context * ctx);
Looks up a key by name and returns its integer ID.
int64_t gguf_find_key(
    const struct gguf_context * ctx,
    const char                * key);
ctx (const struct gguf_context *, required): The GGUF context to search.
key (const char *, required): The key name to look up.
Returns the key ID (>= 0) if found, or -1 if the key does not exist.
Returns the key name for a given key ID.
const char * gguf_get_key(
    const struct gguf_context * ctx,
    int64_t                     key_id);
ctx (const struct gguf_context *, required): The GGUF context.
key_id (int64_t, required): A valid key ID in [0, gguf_get_n_kv(ctx)).
Returns the type of the value stored at the given key ID.
enum gguf_type gguf_get_kv_type(
    const struct gguf_context * ctx,
    int64_t                     key_id);
For array-typed keys, returns the element type of the array.
enum gguf_type gguf_get_arr_type(
    const struct gguf_context * ctx,
    int64_t                     key_id);
Returns the number of elements in an array-typed key.
size_t gguf_get_arr_n(
    const struct gguf_context * ctx,
    int64_t                     key_id);

Typed value getters

Each getter reads a scalar value of the corresponding type. Calling a getter with the wrong type will abort the program.
uint8_t      gguf_get_val_u8  (const struct gguf_context * ctx, int64_t key_id);
int8_t       gguf_get_val_i8  (const struct gguf_context * ctx, int64_t key_id);
uint16_t     gguf_get_val_u16 (const struct gguf_context * ctx, int64_t key_id);
int16_t      gguf_get_val_i16 (const struct gguf_context * ctx, int64_t key_id);
uint32_t     gguf_get_val_u32 (const struct gguf_context * ctx, int64_t key_id);
int32_t      gguf_get_val_i32 (const struct gguf_context * ctx, int64_t key_id);
float        gguf_get_val_f32 (const struct gguf_context * ctx, int64_t key_id);
uint64_t     gguf_get_val_u64 (const struct gguf_context * ctx, int64_t key_id);
int64_t      gguf_get_val_i64 (const struct gguf_context * ctx, int64_t key_id);
double       gguf_get_val_f64 (const struct gguf_context * ctx, int64_t key_id);
bool         gguf_get_val_bool(const struct gguf_context * ctx, int64_t key_id);
const char * gguf_get_val_str (const struct gguf_context * ctx, int64_t key_id);
Always call gguf_get_kv_type first and verify the type before calling a typed getter. Calling with a mismatched type aborts the program.

Common usage pattern

int64_t key_id = gguf_find_key(ctx, "general.architecture");
if (key_id >= 0 && gguf_get_kv_type(ctx, key_id) == GGUF_TYPE_STRING) {
    const char * arch = gguf_get_val_str(ctx, key_id);
    printf("Architecture: %s\n", arch);
}

KV setters

Setters add a new key-value pair or overwrite an existing one. The new or updated pair is always placed at the end of the list.
void gguf_set_val_u8  (struct gguf_context * ctx, const char * key, uint8_t      val);
void gguf_set_val_i8  (struct gguf_context * ctx, const char * key, int8_t       val);
void gguf_set_val_u16 (struct gguf_context * ctx, const char * key, uint16_t     val);
void gguf_set_val_i16 (struct gguf_context * ctx, const char * key, int16_t      val);
void gguf_set_val_u32 (struct gguf_context * ctx, const char * key, uint32_t     val);
void gguf_set_val_i32 (struct gguf_context * ctx, const char * key, int32_t      val);
void gguf_set_val_f32 (struct gguf_context * ctx, const char * key, float        val);
void gguf_set_val_u64 (struct gguf_context * ctx, const char * key, uint64_t     val);
void gguf_set_val_i64 (struct gguf_context * ctx, const char * key, int64_t      val);
void gguf_set_val_f64 (struct gguf_context * ctx, const char * key, double       val);
void gguf_set_val_bool(struct gguf_context * ctx, const char * key, bool         val);
void gguf_set_val_str (struct gguf_context * ctx, const char * key, const char * val);
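The "always placed at the end" rule can be pictured with a toy key list. This is only a sketch of the behavior described above, not ggml's data structure: `example_kv_list`, `example_find`, and `example_set` are hypothetical, and values are omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_KEYS 16

// Toy stand-in for the KV list: keys only, fixed capacity.
struct example_kv_list {
    const char * keys[EXAMPLE_MAX_KEYS];
    size_t       n;
};

// Returns the index of key, or -1 if absent (same convention as gguf_find_key).
int example_find(const struct example_kv_list * l, const char * key) {
    for (size_t i = 0; i < l->n; i++) {
        if (strcmp(l->keys[i], key) == 0) {
            return (int) i;
        }
    }
    return -1;
}

// Remove any existing entry for key, then append it at the back.
void example_set(struct example_kv_list * l, const char * key) {
    const int idx = example_find(l, key);
    if (idx >= 0) {
        // shift later entries down by one to close the gap
        memmove(&l->keys[idx], &l->keys[idx + 1],
                (l->n - (size_t) idx - 1) * sizeof(l->keys[0]));
        l->n--;
    }
    assert(l->n < EXAMPLE_MAX_KEYS);
    l->keys[l->n++] = key;
}
```

In this toy model, overwriting a key shifts the indices of the entries that followed it. If the real context behaves the same way, key IDs obtained before a setter call should be looked up again with gguf_find_key afterwards.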
Creates or replaces an array key with n elements of a primitive type.
void gguf_set_arr_data(
    struct gguf_context * ctx,
    const char          * key,
    enum gguf_type        type,
    const void          * data,
    size_t                n);
ctx (struct gguf_context *, required): The GGUF context to modify.
key (const char *, required): The key name.
type (enum gguf_type, required): Element type. Must not be GGUF_TYPE_ARRAY or GGUF_TYPE_STRING.
data (const void *, required): Raw data. The function copies n * sizeof(element) bytes.
n (size_t, required): Number of elements in the array.
Creates or replaces an array key with n string elements.
void gguf_set_arr_str(
    struct gguf_context * ctx,
    const char          * key,
    const char         ** data,
    size_t                n);
ctx (struct gguf_context *, required): The GGUF context to modify.
key (const char *, required): The key name.
data (const char **, required): Array of n null-terminated C strings. The function copies all strings.
n (size_t, required): Number of strings in the array.

Tensor operations

Returns the total number of tensors registered in the context.
int64_t gguf_get_n_tensors(const struct gguf_context * ctx);
Looks up a tensor by name and returns its integer ID.
int64_t gguf_find_tensor(
    const struct gguf_context * ctx,
    const char                * name);
ctx (const struct gguf_context *, required): The GGUF context to search.
name (const char *, required): The tensor name to look up.
Returns the tensor ID (>= 0) if found, or -1 if not found.
Returns the name of the tensor at the given index.
const char * gguf_get_tensor_name(
    const struct gguf_context * ctx,
    int64_t                     tensor_id);
Returns the ggml_type of the tensor at the given index.
enum ggml_type gguf_get_tensor_type(
    const struct gguf_context * ctx,
    int64_t                     tensor_id);
Returns the byte offset of the tensor’s data within the tensor data blob.
size_t gguf_get_tensor_offset(
    const struct gguf_context * ctx,
    int64_t                     tensor_id);
Add gguf_get_data_offset(ctx) to convert this to an offset from the start of the file.
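The offset arithmetic can be sketched as follows. To keep the example compilable without ggml, the two offsets are passed in as plain numbers: `data_offset` stands in for gguf_get_data_offset(ctx) and `tensor_offset` for gguf_get_tensor_offset(ctx, tensor_id). `example_read_tensor_bytes` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

// Read n bytes of one tensor from an open GGUF file into dst.
// Returns false if seeking fails or fewer than n bytes are available.
bool example_read_tensor_bytes(FILE * f, size_t data_offset, size_t tensor_offset,
                               void * dst, size_t n) {
    assert(f != NULL && dst != NULL);
    // blob-relative offset -> file-relative offset
    const size_t file_offset = data_offset + tensor_offset;
    // Note: fseek takes a long; files larger than LONG_MAX bytes would need
    // a platform-specific 64-bit seek instead.
    if (fseek(f, (long) file_offset, SEEK_SET) != 0) {
        return false;
    }
    return fread(dst, 1, n, f) == n;
}
```

A caller would typically pass gguf_get_tensor_size(ctx, tensor_id) as n to read the whole tensor.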
Changes the stored type of a tensor. All tensor offsets following this tensor are recalculated immediately to keep the data contiguous.
void gguf_set_tensor_type(
    struct gguf_context * ctx,
    const char          * name,
    enum ggml_type        type);
ctx (struct gguf_context *, required): The GGUF context.
name (const char *, required): Name of the tensor to update.
type (enum ggml_type, required): New data type for the tensor.
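The offset recalculation after a type change can be pictured as a running sum over padded tensor sizes. This is a sketch under the assumption that each tensor's data is followed by padding up to the alignment; `example_pad` and `example_recalc_offsets` are hypothetical helpers, and tensor sizes are passed in as plain numbers instead of being derived from ggml types.

```c
#include <assert.h>
#include <stddef.h>

// Round n up to a multiple of alignment (hypothetical helper).
size_t example_pad(size_t n, size_t alignment) {
    assert(alignment > 0);
    return (n + alignment - 1) / alignment * alignment;
}

// Recompute offsets for tensors [first + 1, count) after the size of tensor
// first changed. sizes[i] is the byte size of tensor i; offsets[i] is its
// offset within the tensor data blob.
void example_recalc_offsets(const size_t * sizes, size_t * offsets,
                            size_t count, size_t first, size_t alignment) {
    for (size_t i = first + 1; i < count; i++) {
        offsets[i] = offsets[i - 1] + example_pad(sizes[i - 1], alignment);
    }
}
```

For example, with an alignment of 32 and tensor sizes of 100, 40, and 8 bytes, the offsets come out as 0, 128, and 192. Shrinking the first tensor to 20 bytes moves the later two to 32 and 96.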
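Sets the data for the named tensor from the provided pointer. The source must contain at least gguf_get_tensor_size(ctx, id) bytes, where id is the tensor ID returned by gguf_find_tensor(ctx, name). In current ggml sources the pointer itself is stored and the bytes are read when the file is written, so keep the buffer valid until then.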
void gguf_set_tensor_data(
    struct gguf_context * ctx,
    const char          * name,
    const void          * data);
ctx (struct gguf_context *, required): The GGUF context.
name (const char *, required): Name of the tensor to update.
data (const void *, required): Source data. Must be at least gguf_get_tensor_size bytes.

Writing GGUF files

Writes the entire context (metadata and optionally tensor data) to a binary file.
bool gguf_write_to_file(
    const struct gguf_context * ctx,
    const char                * fname,
    bool                        only_meta);
ctx (const struct gguf_context *, required): The GGUF context to serialize.
fname (const char *, required): Output file path. The file is created or overwritten.
only_meta (bool, required): When true, only the header, KV pairs, and tensor metadata are written; tensor data is omitted. Use this for the metadata-first write pattern shown below, where tensor data is appended afterwards.
Returns true on success.

Write patterns

There are three supported ways to write a GGUF file:
Single pass: write everything in one call.
gguf_write_to_file(ctx, "output.gguf", /*only_meta=*/ false);
Metadata first: write the metadata, then reopen the file in append mode and write the tensor data separately.
gguf_write_to_file(ctx, "output.gguf", /*only_meta=*/ true);

FILE * f = fopen("output.gguf", "ab");
fwrite(tensor_data, 1, tensor_data_size, f);
fclose(f);
Placeholder first: reserve space for metadata at the front, write tensor data, then write metadata. Useful when tensor data is produced incrementally.
FILE * f = fopen("output.gguf", "wb");

// Reserve space for metadata
const size_t size_meta = gguf_get_meta_size(ctx);
fseek(f, size_meta, SEEK_SET);

// Write tensor data
fwrite(tensor_data, 1, tensor_data_size, f);

// Write metadata at the front
void * meta = malloc(size_meta);
gguf_get_meta_data(ctx, meta);
rewind(f);
fwrite(meta, 1, size_meta, f);
free(meta);

fclose(f);

Metadata helpers

Returns the byte offset from the start of the file at which tensor data begins.
size_t gguf_get_data_offset(const struct gguf_context * ctx);
Use this to seek to tensor data in the file: fseek(f, gguf_get_data_offset(ctx), SEEK_SET).
Returns the total size in bytes of the metadata section (header + KV pairs + tensor info + padding).
size_t gguf_get_meta_size(const struct gguf_context * ctx);
This value equals gguf_get_data_offset(ctx) for a fully populated context.
Serializes the metadata into a caller-provided buffer.
void gguf_get_meta_data(
    const struct gguf_context * ctx,
    void                      * data);
ctx (const struct gguf_context *, required): The GGUF context to serialize.
data (void *, required): Output buffer. Must be at least gguf_get_meta_size(ctx) bytes.
