The `kafka` source reads data from Apache Kafka topics using consumer groups. It supports automatic offset management, end-to-end acknowledgements, and configurable decoding.
Configuration
Parameters
Comma-separated list of Kafka bootstrap servers in `host:port` format.
Kafka topic names to consume from. Supports regex patterns starting with `^`.
Kafka consumer group ID.
Where to start reading if no offset exists for the consumer group. Options: `smallest`, `earliest`, `beginning`, `largest`, `latest`, `end`, `error`.
Kafka session timeout in milliseconds.
Timeout for network requests in milliseconds.
Maximum time the broker may wait to fill the response in milliseconds.
Frequency that consumer offsets are committed in milliseconds.
Field name for the Kafka message key.
Field name for the Kafka topic name.
Field name for the Kafka partition.
Field name for the Kafka message offset.
Field name for Kafka message headers.
Advanced librdkafka configuration options.
Framing configuration for splitting byte streams into messages.
Decoding configuration for parsing messages.
Enable end-to-end acknowledgements.
SASL authentication configuration.
TLS configuration for encrypted connections.
Output Schema
The Kafka source produces log events with the following fields:

| Field | Type | Description |
|---|---|---|
| message | string | The decoded message payload |
| timestamp | timestamp | Kafka message timestamp |
| topic | string | Kafka topic name |
| partition | integer | Kafka partition |
| offset | integer | Kafka message offset |
| message_key | string | Kafka message key (if present) |
| headers | object | Kafka message headers (if present) |
| source_type | string | Always “kafka” |
Examples
Basic Consumer
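A minimal sketch of a basic consumer, assuming Vector's YAML configuration layout. The component ID `kafka_in`, the broker addresses, and the topic name are placeholders. The option names `bootstrap_servers` and `topics` are assumptions inferred from the parameter descriptions; only `group_id` is named elsewhere on this page.

```yaml
sources:
  kafka_in:                      # hypothetical component ID
    type: kafka
    # Comma-separated host:port pairs (placeholder addresses)
    bootstrap_servers: "kafka-1.example.com:9092,kafka-2.example.com:9092"
    # Consumer group shared by every instance of this deployment
    group_id: vector-logs
    topics:
      - app-logs                 # placeholder topic name
```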
Multiple Topics with Pattern Matching
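A sketch of consuming several topics at once, mixing a literal name with a regex pattern. The topic names are placeholders and the `topics` and `bootstrap_servers` option names are assumed, as is the component ID.

```yaml
sources:
  kafka_patterns:                # hypothetical component ID
    type: kafka
    bootstrap_servers: "localhost:9092"
    group_id: vector-multi
    topics:
      - audit-events             # literal topic name
      - '^app-logs-.+'           # leading ^ marks the entry as a regex pattern
```

Quoting the pattern keeps YAML from interpreting any special characters in the regex.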
JSON Decoding
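A sketch of decoding each message payload as JSON. It assumes the decoding configuration exposes a `codec` key that accepts `json`; the component ID, broker address, and topic name are placeholders.

```yaml
sources:
  kafka_json:                    # hypothetical component ID
    type: kafka
    bootstrap_servers: "localhost:9092"
    group_id: vector-json
    topics:
      - orders                   # placeholder topic name
    decoding:
      codec: json                # assumed key: parse each payload as a JSON object
```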
SASL Authentication
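A sketch of SASL authentication over TLS. The keys under `sasl` and `tls` (`enabled`, `mechanism`, `username`, `password`) are assumptions not spelled out on this page, as is reading the password from an environment variable; check them against the reference for your version before use.

```yaml
sources:
  kafka_secure:                  # hypothetical component ID
    type: kafka
    bootstrap_servers: "broker.example.com:9093"
    group_id: vector-secure
    topics:
      - secure-events            # placeholder topic name
    sasl:
      enabled: true              # assumed key
      mechanism: SCRAM-SHA-256   # assumed key; PLAIN is another common mechanism
      username: vector
      password: "${KAFKA_PASSWORD}"  # assumed: interpolated from the environment
    tls:
      enabled: true              # assumed key: encrypt the broker connection
```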
Custom Field Mapping
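A sketch of renaming the metadata fields that the source attaches to each event. The option names `key_field`, `topic_key`, `partition_key`, `offset_key`, and `headers_key` are assumptions inferred from the parameter descriptions, and the target field names are arbitrary.

```yaml
sources:
  kafka_mapped:                  # hypothetical component ID
    type: kafka
    bootstrap_servers: "localhost:9092"
    group_id: vector-mapped
    topics:
      - app-logs
    key_field: kafka_key             # assumed option: field for the message key
    topic_key: kafka_topic           # assumed option: field for the topic name
    partition_key: kafka_partition   # assumed option: field for the partition
    offset_key: kafka_offset         # assumed option: field for the message offset
    headers_key: kafka_headers       # assumed option: field for message headers
```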
Advanced librdkafka Configuration
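A sketch of passing settings straight through to librdkafka via `librdkafka_options`. The three properties shown are standard librdkafka configuration names, but the values are illustrative rather than recommendations, and writing them as strings is an assumption about how the option map is typed.

```yaml
sources:
  kafka_tuned:                   # hypothetical component ID
    type: kafka
    bootstrap_servers: "localhost:9092"
    group_id: vector-tuned
    topics:
      - app-logs
    librdkafka_options:
      client.id: "vector-tuned"          # identifies this client in broker logs
      fetch.min.bytes: "65536"           # broker waits for at least this much data
      queued.min.messages: "200000"      # minimum messages kept in the local queue
```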
How It Works
Consumer Groups
The Kafka source uses consumer groups for scalability and fault tolerance. Multiple Vector instances with the same `group_id` automatically coordinate to distribute partition consumption.
Offset Management
Offsets are committed automatically at the interval set by `commit_interval_ms`. When acknowledgements are enabled, an offset is committed only after the corresponding events have been delivered successfully to all sinks.
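The interplay of commit interval and acknowledgements can be sketched as follows. `commit_interval_ms` is named above; the `acknowledgements.enabled` key is an assumption, and the interval value is illustrative.

```yaml
sources:
  kafka_acked:                   # hypothetical component ID
    type: kafka
    bootstrap_servers: "localhost:9092"
    group_id: vector-acked
    topics:
      - payments                 # placeholder topic name
    commit_interval_ms: 5000     # commit consumer offsets every 5 seconds
    acknowledgements:
      enabled: true              # assumed key: hold commits until sinks confirm delivery
```

With this setup, a crash before delivery should lead to the affected messages being re-read from the last committed offset rather than skipped, so downstream systems need to tolerate duplicates.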
Rebalancing
When consumers join or leave the group, Kafka automatically rebalances partition assignments. The source handles rebalancing gracefully with configurable drain timeouts.
Message Ordering
Messages within a partition are processed in order. Across partitions, ordering is not guaranteed.
Performance
- Highly scalable with parallel partition consumption
- Performance depends on message size, network latency, and broker configuration
- Tune `fetch_wait_max_ms` and `librdkafka_options` for optimal throughput
- Consumer lag metrics available via `metrics.topic_lag_metric`
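A sketch of the two tuning knobs named in the list above. The value for `fetch_wait_max_ms` is illustrative, and treating `metrics.topic_lag_metric` as a boolean switch is an assumption.

```yaml
sources:
  kafka_perf:                    # hypothetical component ID
    type: kafka
    bootstrap_servers: "localhost:9092"
    group_id: vector-perf
    topics:
      - app-logs
    fetch_wait_max_ms: 500       # let the broker wait longer to fill larger responses
    metrics:
      topic_lag_metric: true     # assumed boolean: expose per-topic consumer lag
```

A longer fetch wait generally trades latency for fewer, larger fetches, so measure both before settling on a value.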
Best Practices
- Use unique `group_id` values for different Vector deployments
- Set `auto_offset_reset` to `earliest` to avoid data loss on new topics
- Enable acknowledgements for mission-critical data
- Monitor consumer lag using Vector’s internal metrics
- Use topic regex patterns carefully to avoid unexpected consumption
- Configure appropriate session and socket timeouts for your network
- Test rebalance behavior under failure scenarios