Binary encoding format

Protocol Buffers serializes messages to a compact binary format. Understanding the encoding helps you choose the right field types, debug wire-level issues, and reason about forward/backward compatibility.

Wire types

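The binary format is a sequence of tag–value pairs. Each pair begins with a tag that encodes both the field number and the wire type. The wire type tells the decoder how to find the end of the value: a fixed size (4 or 8 bytes), a varint that ends at the first byte without a continuation bit, or an explicit length prefix. This is what lets a decoder skip fields it does not recognize.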

Wire type	ID	Used for
Varint	0	`int32`, `int64`, `uint32`, `uint64`, `sint32`, `sint64`, `bool`, `enum`
64-bit	1	`fixed64`, `sfixed64`, `double`
Length-delimited	2	`string`, `bytes`, embedded messages, packed repeated fields
Start group	3	Groups (deprecated)
End group	4	Groups (deprecated)
32-bit	5	`fixed32`, `sfixed32`, `float`

Wire types 3 and 4 (groups) are deprecated. You will not encounter them in proto3 files, but you may see them in older proto2 data.

Field tags

The tag is a varint formed by combining the field number and wire type:

tag = (field_number << 3) | wire_type

For example, field number 1 with wire type 0 (varint) produces:

tag = (1 << 3) | 0 = 0x08

Field number 2 with wire type 2 (length-delimited) produces:

tag = (2 << 3) | 2 = 0x12
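
The tag arithmetic above can be sketched in a few lines of Python. The helper names `make_tag` and `split_tag` are hypothetical and chosen for illustration; they are not part of any protobuf library.

```python
def make_tag(field_number: int, wire_type: int) -> int:
    """Combine a field number and a wire type into a tag value."""
    if field_number < 1:
        raise ValueError("field numbers start at 1")
    if not 0 <= wire_type <= 5:
        raise ValueError("wire type must be in 0..5")
    return (field_number << 3) | wire_type


def split_tag(tag: int) -> tuple[int, int]:
    """Recover (field_number, wire_type) from a tag value."""
    return tag >> 3, tag & 0x07


print(hex(make_tag(1, 0)))  # 0x8
print(hex(make_tag(2, 2)))  # 0x12
print(split_tag(0x1A))      # (3, 2)
```

Because the tag is itself written as a varint, field numbers 1 through 15 fit in a single tag byte, while larger field numbers need two or more.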

Varint encoding

Varints use one or more bytes to encode an integer. The most significant bit (MSB) of each byte is a continuation bit: 1 means more bytes follow, 0 means this is the last byte. The remaining 7 bits of each byte carry the data. The 7-bit groups are written in little-endian order, least significant group first.

Example: encoding the integer 300

The value 300 in binary is 100101100. Splitting into 7-bit groups (little-endian):

300 = 0b100101100
    → groups: 0101100  0000010
    → with continuation bits:
       1_0101100  0_0000010
       = 0xAC     0x02

Hex dump:

AC 02
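
As a minimal sketch of the procedure just described, the following Python function peels off 7 bits at a time and sets the continuation bit on every byte except the last. The name `encode_varint` is an illustrative choice, and the sketch only handles non-negative integers.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a varint."""
    if value < 0:
        raise ValueError("this sketch expects a non-negative integer")
    out = bytearray()
    while True:
        low7 = value & 0x7F   # least significant 7-bit group
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more groups follow: set continuation bit
        else:
            out.append(low7)         # last group: continuation bit clear
            return bytes(out)


print(encode_varint(300).hex(" "))  # ac 02
print(encode_varint(1).hex(" "))    # 01
```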

Example: encoding the integer 1

Single byte, no continuation:

1 = 0b0000001
    → with continuation bit clear:
       0_0000001
       = 0x01

Negative numbers and ZigZag encoding

A negative int32 or int64 always takes 10 bytes as a plain varint, because the value is sign-extended to 64 bits before it is encoded. Use sint32 / sint64 for fields that frequently carry negative values. These types use ZigZag encoding, which maps signed integers to unsigned integers so that values of small magnitude stay short:

ZigZag(n) = (n << 1) ^ (n >> 31)   // for sint32; >> is an arithmetic shift
ZigZag(n) = (n << 1) ^ (n >> 63)   // for sint64; >> is an arithmetic shift

Signed	ZigZag encoded
`0`	`0`
`-1`	`1`
`1`	`2`
`-2`	`3`
`2147483647`	`4294967294`
`-2147483648`	`4294967295`
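
The mapping in the table can be reproduced with a short Python sketch. The function names are hypothetical. Python integers are unbounded, so the sketch checks the int32 range explicitly and relies on Python's `>>` being an arithmetic shift for negative values.

```python
def zigzag32(n: int) -> int:
    """Map a signed 32-bit integer to an unsigned one (ZigZag)."""
    if not -2**31 <= n < 2**31:
        raise ValueError("value out of int32 range")
    # n >> 31 is 0 for non-negative n and -1 (all ones) for negative n.
    return (n << 1) ^ (n >> 31)


def unzigzag(z: int) -> int:
    """Invert the ZigZag mapping."""
    return (z >> 1) ^ -(z & 1)


def varint_length(value: int) -> int:
    """Number of bytes a non-negative integer occupies as a varint."""
    return max(1, (value.bit_length() + 6) // 7)


for n in (0, -1, 1, -2, 2147483647, -2147483648):
    print(n, "->", zigzag32(n))

# A plain int32 holding -1 is sign-extended to 64 bits before encoding.
print(varint_length(-1 & 0xFFFFFFFFFFFFFFFF))  # 10
print(varint_length(zigzag32(-1)))             # 1
```

The last two lines show the size difference for -1: ten bytes as a plain varint, one byte after ZigZag.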

Encoding a simple message

Consider this message:

message Test {
  int32 a = 1;
}

With a = 150, the encoded bytes are:

08 96 01

Breaking it down:

Bytes	Meaning
`08`	Tag: field 1, wire type 0 (varint)
`96 01`	Value 150 as varint (`0x96` = `10010110`, continuation set; `0x01` = `00000001`)

Decoding 96 01:

96 → 0b10010110 → continuation bit set, data = 0010110
01 → 0b00000001 → continuation bit clear, data = 0000001
Concatenate little-endian: 0000001_0010110 = 10010110 = 150
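
The decoding steps above translate into a small loop. This is a sketch, not a library API: `decode_varint` is a hypothetical name, and it returns both the value and the position of the next unread byte so that callers can continue parsing.

```python
def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # place this 7-bit group
        if not byte & 0x80:               # continuation bit clear: done
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("varint longer than 10 bytes")


message = bytes.fromhex("089601")
tag, pos = decode_varint(message)
value, pos = decode_varint(message, pos)
print(tag >> 3, tag & 0x07, value)  # 1 0 150
```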

Strings and bytes

Strings and bytes use wire type 2 (length-delimited). The value is a byte count (as a varint) followed by that many bytes: the UTF-8 encoding of the text for a string field, or the raw bytes for a bytes field. For a field declared as string name = 2 with the value "testing":

12 07 74 65 73 74 69 6E 67

Bytes	Meaning
`12`	Tag: field 2, wire type 2
`07`	Length: 7 bytes
`74 65 73 74 69 6E 67`	UTF-8 bytes for `"testing"`
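
A minimal Python sketch of the same layout follows. The helper names are hypothetical; note that the length prefix counts UTF-8 bytes, not characters.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode a string field: tag, byte length, then the UTF-8 bytes."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2: length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


print(encode_string_field(2, "testing").hex(" "))
# 12 07 74 65 73 74 69 6e 67
```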

Embedded messages

Embedded messages also use wire type 2. The serialized sub-message bytes follow a length varint, exactly like a string or bytes field. For this schema:

message Inner {
  int32 x = 1;
}

message Outer {
  Inner inner = 3;
}

With inner.x = 150, the encoding of Outer is:

1A 03 08 96 01

Bytes	Meaning
`1A`	Tag: field 3, wire type 2
`03`	Length: 3 bytes
`08 96 01`	Encoded `Inner` with `x = 150`
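
The nesting can be sketched by encoding the inner message first and then wrapping its bytes in a length-delimited record. The helper names below are illustrative assumptions, not a real protobuf API.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """Encode a field tag as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Wrap payload in a wire type 2 record."""
    return tag(field_number, 2) + encode_varint(len(payload)) + payload


inner = tag(1, 0) + encode_varint(150)  # Inner { x = 150 }
outer = length_delimited(3, inner)      # Outer { inner = ... }
print(inner.hex(" "))  # 08 96 01
print(outer.hex(" "))  # 1a 03 08 96 01
```

One consequence of this layout is that the encoder has to know the full size of the inner message before it can write the outer record's length prefix.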

Repeated field encoding

Unpacked (non-scalar or legacy)

Each element of a repeated field is encoded as a separate tag–value pair using the same field number:

// repeated int32 ids = 6 (unpacked) with values [1, 2, 3]:
30 01    // tag 0x30 (field 6, wire type 0), value 1
30 02    // tag 0x30 (field 6, wire type 0), value 2
30 03    // tag 0x30 (field 6, wire type 0), value 3

Packed encoding (default for scalar types in proto3)

In proto3, repeated fields of scalar numeric types (every scalar type except string and bytes) are packed by default. All values are concatenated into a single length-delimited record, which avoids repeating the tag for each element.

message Test {
  repeated int32 d = 4;
}

With d = [3, 270, 86942], the packed encoding is:

22 06 03 8E 02 9E A7 05

Bytes	Meaning
`22`	Tag: field 4, wire type 2
`06`	Payload length: 6 bytes
`03`	Value 3
`8E 02`	Value 270
`9E A7 05`	Value 86942

A parser must accept both the packed and the unpacked encoding of a packable field, in proto2 and proto3 alike, so that switching a field between the two stays backward compatible. In proto2, packing is opt-in through the [packed = true] field option.
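
To illustrate reading both forms, here is a sketch of a reader for a single repeated varint field. It is deliberately narrow: the function names are hypothetical, and it assumes the input contains only the one field being read. The unpacked bytes use tag 0x20, which is field 4 with wire type 0.

```python
def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_repeated_varints(data: bytes, field_number: int) -> list[int]:
    """Collect a repeated varint field, packed or unpacked."""
    values = []
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        if tag >> 3 != field_number:
            raise ValueError("this sketch handles a single field only")
        wire_type = tag & 0x07
        if wire_type == 0:            # unpacked: one value per tag
            value, pos = decode_varint(data, pos)
            values.append(value)
        elif wire_type == 2:          # packed: length, then back-to-back values
            length, pos = decode_varint(data, pos)
            end = pos + length
            while pos < end:
                value, pos = decode_varint(data, pos)
                values.append(value)
        else:
            raise ValueError("unexpected wire type for a varint field")
    return values


packed = bytes.fromhex("2206038e029ea705")
unpacked = bytes.fromhex("2003208e02209ea705")
print(read_repeated_varints(packed, 4))    # [3, 270, 86942]
print(read_repeated_varints(unpacked, 4))  # [3, 270, 86942]
```

For these three values the packed form takes 8 bytes and the unpacked form 9. The saving grows with the element count, since each unpacked element carries its own tag.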

Map encoding

Map fields are encoded as repeated entries of an auto-generated message type:

// map<string, int32> scores = 1;
// is equivalent to:
message ScoresEntry {
  string key = 1;
  int32 value = 2;
}
repeated ScoresEntry scores = 1;

Each map entry is written as one length-delimited record (wire type 2) under the map's field number. Inside the record, the key is encoded as field 1 and the value as field 2 of the synthetic entry message.
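
As a sketch, one entry of the `scores` map could be encoded as follows. The key "math" and value 90 are made-up sample data, the helper names are hypothetical, and the sketch only handles non-negative values.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_map_entry(field_number: int, key: str, value: int) -> bytes:
    """Encode one map<string, int32> entry as a nested entry message."""
    key_bytes = key.encode("utf-8")
    entry = (
        encode_varint((1 << 3) | 2)      # key: field 1, length-delimited
        + encode_varint(len(key_bytes))
        + key_bytes
        + encode_varint((2 << 3) | 0)    # value: field 2, varint
        + encode_varint(value)
    )
    # The entry itself is a length-delimited record under the map's field number.
    return encode_varint((field_number << 3) | 2) + encode_varint(len(entry)) + entry


print(encode_map_entry(1, "math", 90).hex(" "))
# 0a 08 0a 04 6d 61 74 68 10 5a
```

A map with several entries is the concatenation of one such record per entry.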

Field ordering and unknown fields

Fields are not required to appear in field-number order in the wire format. Parsers must handle any ordering. When a parser encounters a field number that is not defined in the current schema, it stores the raw tag–value bytes as unknown fields. This is the mechanism that enables forward compatibility: old code reading a message written by new code preserves any fields it does not understand.
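
The following sketch shows how a decoder can step over fields it does not know, using only the wire type to find where each value ends. The function names and the `known_fields` parameter are hypothetical. Groups (wire types 3 and 4) are not handled.

```python
def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def skip_value(data: bytes, pos: int, wire_type: int) -> int:
    """Return the position just past the value that starts at pos."""
    if wire_type == 0:
        _, pos = decode_varint(data, pos)
    elif wire_type == 1:
        pos += 8
    elif wire_type == 2:
        length, pos = decode_varint(data, pos)
        pos += length
    elif wire_type == 5:
        pos += 4
    else:
        raise ValueError("wire type not handled in this sketch")
    if pos > len(data):
        raise ValueError("truncated value")
    return pos


def split_known_unknown(data: bytes, known_fields: set[int]):
    """Split a message into known (field, raw record) pairs and unknown raw bytes."""
    known, unknown = [], bytearray()
    pos = 0
    while pos < len(data):
        start = pos
        tag, pos = decode_varint(data, pos)
        pos = skip_value(data, pos, tag & 0x07)
        record = data[start:pos]
        if (tag >> 3) in known_fields:
            known.append((tag >> 3, record))
        else:
            unknown += record  # kept verbatim so it can be written back out
    return known, bytes(unknown)


# Field 1 (varint 150) followed by field 2 (string "testing").
message = bytes.fromhex("089601120774657374696e67")
known, unknown = split_known_unknown(message, known_fields={1})
print(known)               # [(1, b'\x08\x96\x01')]
print(unknown.hex(" "))    # 12 07 74 65 73 74 69 6e 67
```

Keeping the unknown bytes verbatim is what allows an older program to re-serialize a message without dropping fields added by a newer schema.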

Wire compatibility

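Certain field type changes are wire-compatible. Sharing a wire type is necessary but not sufficient: sint32 and sint64 also use wire type 0, but their ZigZag encoding makes them incompatible with the other varint types listed here.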

Compatible types (wire type 0)
`int32`, `int64`, `uint32`, `uint64`, `bool`, `enum`

Compatible types (wire type 2)
`string`, `bytes`, embedded messages, packed repeated fields

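Changing a field to a type with a different wire type means data written in the old format can no longer be read as the new type. Depending on the implementation, the parser typically either reports an error or treats the old value as an unknown field. Always check wire type compatibility before changing a field type in a production schema.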
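
The sketch below shows how one decoded varint reads under several wire type 0 field types. The conversion functions are hypothetical helpers that follow the rules described on this page: sign extension for int32 and int64, ZigZag for sint32.

```python
def as_uint64(raw: int) -> int:
    """Interpret a decoded varint as uint64."""
    return raw & 0xFFFFFFFFFFFFFFFF


def as_int64(raw: int) -> int:
    """Interpret a decoded varint as a two's-complement int64."""
    raw &= 0xFFFFFFFFFFFFFFFF
    return raw - 2**64 if raw >= 2**63 else raw


def as_int32(raw: int) -> int:
    """Interpret a decoded varint as int32, keeping the low 32 bits."""
    low = raw & 0xFFFFFFFF
    return low - 2**32 if low >= 2**31 else low


def as_bool(raw: int) -> bool:
    """Interpret a decoded varint as bool."""
    return raw != 0


def as_sint32(raw: int) -> int:
    """Interpret a decoded varint as ZigZag-encoded sint32."""
    z = raw & 0xFFFFFFFF
    return (z >> 1) ^ -(z & 1)


raw = 1  # the varint byte 01
print(as_int32(raw), as_uint64(raw), as_bool(raw))  # 1 1 True
print(as_sint32(raw))                               # -1

raw = 2**64 - 1  # the 10-byte varint written for int64 -1
print(as_int64(raw), as_int32(raw))  # -1 -1
print(as_uint64(raw))                # 18446744073709551615
```

The byte `01` reads as 1 or true under the compatible types but as -1 under sint32. Even among the compatible types, values can change meaning at the boundaries: a negative int64 becomes a very large uint64.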
