Sinks are Vector components that deliver events to external systems. They are the final stage in your Vector pipeline, sending logs, metrics, and traces to databases, object storage, SaaS platforms, and other destinations. Vector includes 40+ built-in sinks for common integrations.
How Sinks Work
Sinks receive events from sources and transforms, then:
Buffer events in memory or on disk
Batch events for efficient delivery
Encode events in the destination’s required format
Compress data to reduce network usage (optional)
Deliver events to the destination via network protocols
Retry failed deliveries with exponential backoff
Acknowledge successful delivery (when acknowledgements are enabled)
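These stages map directly onto sink options. A minimal sketch of one sink exercising each stage (the sink name, URI, and values are illustrative):

```yaml
sinks:
  example_sink:
    type: http                 # Deliver via a network protocol (HTTP here)
    inputs:
      - my_transform
    uri: https://logs.example.com/ingest
    buffer:
      type: disk               # 1. Buffer events (memory or disk)
      max_size: 268435456
    batch:
      max_events: 500          # 2. Batch events for efficient delivery
      timeout_secs: 5
    encoding:
      codec: json              # 3. Encode in the destination's required format
    compression: gzip          # 4. Compress to reduce network usage
    request:
      retry_attempts: 5        # 5. Retry failures with exponential backoff
    acknowledgements:
      enabled: true            # 6. Acknowledge successful delivery
```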
Sink Architecture
Sink Categories
AWS: S3, CloudWatch Logs/Metrics, Kinesis, SQS, SNS
Google Cloud: Cloud Storage, Cloud Logging, Pub/Sub, BigQuery
Azure: Blob Storage, Monitor Logs, Event Hubs
Example: AWS S3 sink
sinks:
  s3_archive:
    type: aws_s3
    inputs:
      - parsed_logs
    region: us-east-1
    bucket: my-log-archive
    key_prefix: "logs/date=%Y-%m-%d/"
    compression: gzip
    encoding:
      codec: json
    batch:
      max_bytes: 10485760  # 10MB per file
      timeout_secs: 300    # Or 5 minutes, whichever comes first
Example: AWS CloudWatch Logs
sinks:
  cloudwatch:
    type: aws_cloudwatch_logs
    inputs:
      - application_logs
    region: us-east-1
    group_name: /aws/application/prod
    stream_name: "{{ host }}-{{ application }}"  # Dynamic stream names
    encoding:
      codec: json
Datadog: Unified observability platform for logs, metrics, and traces
New Relic: Full-stack observability and APM
Splunk: Enterprise log management and SIEM
Elastic: Elasticsearch for search and analytics
Example: Datadog logs and metrics
sinks:
  datadog_logs:
    type: datadog_logs
    inputs:
      - application_logs
    default_api_key: "${DD_API_KEY}"
    endpoint: https://http-intake.logs.datadoghq.com
    compression: gzip
  datadog_metrics:
    type: datadog_metrics
    inputs:
      - host_metrics
    default_api_key: "${DD_API_KEY}"
    endpoint: https://api.datadoghq.com
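New Relic and Splunk have dedicated sinks as well. A hedged sketch of a Splunk HEC logs sink (the endpoint, token variable, and index are placeholders; consult the `splunk_hec_logs` sink reference for the full option set):

```yaml
sinks:
  splunk:
    type: splunk_hec_logs
    inputs:
      - application_logs
    endpoint: https://splunk.example.com:8088   # HEC endpoint
    default_token: "${SPLUNK_HEC_TOKEN}"        # HEC token from the environment
    index: main
    encoding:
      codec: json
```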
Example: Elasticsearch
sinks:
  elasticsearch:
    type: elasticsearch
    inputs:
      - structured_logs
    endpoint: https://elasticsearch.example.com:9200
    auth:
      strategy: basic
      user: elastic
      password: "${ES_PASSWORD}"
    bulk:
      index: "logs-%Y.%m.%d"  # Daily indices
      action: create
    encoding:
      codec: json
    buffer:
      type: disk
      max_size: 268435456  # 256MB disk buffer
Metrics Systems
Prometheus: Prometheus remote write and exporter
InfluxDB: Time-series database for metrics
Graphite: Classic metrics storage system
StatsD: StatsD protocol for metrics aggregation
Example: Prometheus Remote Write
sinks:
  prometheus:
    type: prometheus_remote_write
    inputs:
      - application_metrics
    endpoint: https://prometheus.example.com/api/v1/write
    default_namespace: app
    auth:
      strategy: bearer
      token: "${PROM_TOKEN}"
Example: Prometheus Exporter
sinks:
  prometheus_exporter:
    type: prometheus_exporter
    inputs:
      - vector_metrics
    address: 0.0.0.0:9598
    default_namespace: vector
    # Metrics available at http://localhost:9598/metrics
Databases
ClickHouse: Columnar database for analytics
PostgreSQL: Relational database (via COPY protocol)
Example: ClickHouse
sinks:
  clickhouse:
    type: clickhouse
    inputs:
      - structured_logs
    endpoint: http://clickhouse:8123
    database: logs
    table: application_logs
    skip_unknown_fields: false
    encoding:
      codec: json
    batch:
      max_events: 10000
      timeout_secs: 10
Message Queues
Kafka: Distributed streaming platform
NATS: Lightweight messaging system
AMQP: RabbitMQ and AMQP protocol
Redis: In-memory data store with pub/sub
Example: Kafka
sinks:
  kafka:
    type: kafka
    inputs:
      - events
    bootstrap_servers: kafka1:9092,kafka2:9092,kafka3:9092
    topic: "logs-{{ environment }}"
    key_field: request_id
    encoding:
      codec: json
    compression: snappy
    batch:
      timeout_secs: 1
    acknowledgements:
      enabled: true
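Redis works similarly. A sketch of a Redis sink pushing encoded events onto a list (the URL and key are placeholders; see the `redis` sink reference for the exact option set):

```yaml
sinks:
  redis:
    type: redis
    inputs:
      - events
    url: redis://redis.example.com:6379/0
    key: vector-logs   # List key (or channel name when using pub/sub)
    data_type: list    # list (RPUSH) or channel (PUBLISH)
    encoding:
      codec: json
```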
HTTP and Webhooks
Example: Generic HTTP sink
sinks:
  http_endpoint:
    type: http
    inputs:
      - logs
    uri: https://api.example.com/logs
    method: post
    encoding:
      codec: json
    headers:
      Authorization: "Bearer ${API_TOKEN}"
      Content-Type: "application/json"
    batch:
      max_events: 100
    request:
      retry_attempts: 5
      retry_initial_backoff_secs: 1
      retry_max_duration_secs: 300
Development and Testing
console: Print events to stdout (debugging)
blackhole: Discard events (testing throughput)
Example: Console output
sinks:
  debug:
    type: console
    inputs:
      - logs
    encoding:
      codec: json
      json:
        pretty: true
Sink Configuration
Common Options
Most sinks support these common configuration options (exact availability varies by sink type):
sinks:
  my_sink:
    type: <sink_type>
    inputs:
      - source_or_transform

    # Buffering configuration
    buffer:
      type: memory         # memory or disk
      max_events: 500      # For memory buffers
      max_size: 268435456  # For disk buffers (256MB)
      when_full: block     # block or drop_newest

    # Batching configuration
    batch:
      max_events: 100    # Max events per batch
      max_bytes: 1048576 # Max bytes per batch (1MB)
      timeout_secs: 5    # Max time to wait for batch

    # Encoding
    encoding:
      codec: json  # json, text, protobuf, etc.
      except_fields:
        - sensitive_field  # Exclude these fields
      only_fields:
        - field1           # Only include these fields
        - field2

    # Compression
    compression: gzip  # gzip, zstd, snappy, none

    # Request/delivery settings
    request:
      concurrency: 5  # Parallel requests
      retry_attempts: 5
      retry_initial_backoff_secs: 1
      retry_max_duration_secs: 300
      timeout_secs: 60

    # Acknowledgements
    acknowledgements:
      enabled: false  # Wait for delivery confirmation

    # Health checks
    healthcheck:
      enabled: true  # Check connectivity at startup
Buffering Strategies
Memory Buffers (Default)
Fast, low-latency buffering in RAM. Suitable for most use cases.
sinks:
  my_sink:
    buffer:
      type: memory
      max_events: 500  # Buffer up to 500 events
      when_full: block # Backpressure when full
Pros: Fast, low overhead
Cons: Data loss on crash, limited capacity
Disk Buffers
Persistent buffering for reliability and larger capacity.
sinks:
  critical_sink:
    buffer:
      type: disk
      max_size: 1073741824  # 1GB on disk
      when_full: block
Pros: No data loss on crash, large capacity, survives restarts
Cons: Slower, more I/O overhead
See Buffering for details.
Batching
Batching improves throughput by sending multiple events together:
sinks:
  elasticsearch:
    batch:
      max_events: 100    # Send when 100 events collected
      max_bytes: 1048576 # Or when 1MB of data
      timeout_secs: 5    # Or after 5 seconds
      # First condition met triggers the batch
Tuning guidelines:
Higher max_events: Better throughput, higher latency
Lower timeout_secs: Lower latency, more network requests
Larger max_bytes: More compression efficiency
Encoding
Sinks support multiple encoding formats:
JSON:
sinks:
  json_sink:
    encoding:
      codec: json
      json:
        pretty: false  # Compact JSON
Text/Template:
sinks:
  text_sink:
    encoding:
      codec: text
      text:
        template: '{{ timestamp }} {{ level }} {{ message }}'
Logfmt:
sinks:
  logfmt_sink:
    encoding:
      codec: logfmt
CSV:
sinks:
  csv_sink:
    encoding:
      codec: csv
      csv:
        fields:
          - timestamp
          - level
          - message
Compression
Compress data before transmission to reduce bandwidth:
sinks:
  compressed_sink:
    compression: gzip     # gzip (most compatible)
    # compression: zstd   # zstd (better compression)
    # compression: snappy # snappy (fastest)
    # compression: none   # no compression
Compression comparison:
gzip: Best compatibility, good compression
zstd: Best compression ratio, fast decompression
snappy: Fastest, moderate compression
none: No CPU overhead, more bandwidth
Health Checks
Sinks run health checks at startup to verify connectivity:
sinks:
  checked_sink:
    healthcheck:
      enabled: true  # Default: true

# If the healthcheck fails:
# - Vector logs an error
# - Vector may refuse to start (configurable globally)
Disable globally:
# vector.yaml
healthchecks:
  enabled: false          # Skip all health checks
  require_healthy: false  # Don't block startup on failures
Reliability and Delivery Guarantees
Acknowledgements
Acknowledgements provide end-to-end delivery confirmation:
sources:
  kafka_source:
    type: kafka
    acknowledgements:
      enabled: true  # Don't commit offsets until ack

transforms:
  process:
    type: remap
    inputs: [kafka_source]
    source: |
      . = parse_json!(.message)

sinks:
  elasticsearch:
    type: elasticsearch
    inputs: [process]
    acknowledgements:
      enabled: true  # Ack only after ES confirms write
When to use:
Financial data
Audit logs
Compliance-required data
Trade-offs:
Higher latency (wait for confirmation)
More memory (track in-flight events)
Better reliability (no data loss on sink failure)
Both the source and sink must have acknowledgements enabled for end-to-end delivery guarantees.
Retry Logic
Sinks automatically retry failed deliveries:
sinks:
  resilient_sink:
    request:
      retry_attempts: 5              # Try up to 5 times
      retry_initial_backoff_secs: 1  # Start with 1s delay
      retry_max_duration_secs: 300   # Give up after 5 minutes
      # Exponential backoff: 1s, 2s, 4s, 8s, 16s
Retry triggers:
Network errors (connection refused, timeouts)
HTTP 429 (rate limiting)
HTTP 5xx (server errors)
Not retried:
HTTP 4xx (except 429) - client errors
Authentication failures
Malformed requests
Error Handling
When all retries are exhausted:
Event is dropped
Error is logged
component_errors_total metric is incremented
Downstream acknowledgements fail (if enabled)
To prevent data loss:
Use disk buffers for critical sinks
Enable acknowledgements
Configure dead letter queues (DLQ):
sinks:
  primary:
    type: elasticsearch
    # ... config ...
  # Send failures to S3 for later reprocessing
  dlq:
    type: aws_s3
    inputs:
      - primary._undeliverable  # Special output (future feature)
Performance Tuning
Concurrency
Increase parallel requests for better throughput:
sinks:
  high_throughput:
    type: http
    request:
      concurrency: 20  # 20 parallel requests
    batch:
      max_events: 100
      timeout_secs: 1
Guidelines:
Start with low concurrency (5-10)
Increase until destination is saturated
Monitor destination metrics (CPU, response time)
Consider destination rate limits
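Rather than tuning a fixed number by hand, Vector can also adjust concurrency automatically via Adaptive Request Concurrency. A minimal sketch (sink name and URI are illustrative):

```yaml
sinks:
  auto_tuned:
    type: http
    inputs:
      - logs
    uri: https://api.example.com/logs
    request:
      concurrency: adaptive  # Vector tunes parallelism from observed latency and errors
```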
Batching
Larger batches improve throughput:
sinks:
  optimized:
    batch:
      max_events: 1000    # Large batches for high throughput
      max_bytes: 10485760 # 10MB
      timeout_secs: 10    # Accept higher latency
Compression
For remote sinks, compression usually helps:
sinks:
  remote:
    compression: gzip
    # Reduces network usage by 70-90%
    # Adds ~5-10ms CPU overhead
Network Tuning
sinks:
  tuned:
    request:
      timeout_secs: 60            # Match destination latency
      rate_limit_num: 100         # Max 100 requests...
      rate_limit_duration_secs: 1 # ...per second
Best Practices
Choose buffer types by data criticality:
Memory: Fast path for non-critical data
Disk: Critical data, large buffers, unstable networks
sinks:
  audit_logs:
    buffer:
      type: disk  # Critical data
      max_size: 1073741824
  debug_logs:
    buffer:
      type: memory  # Best effort
      max_events: 500
Watch these metrics:
component_sent_events_total: Events delivered
component_errors_total: Delivery failures
buffer_events: Buffer utilization
request_duration_seconds: Delivery latency
sources:
  metrics:
    type: internal_metrics

sinks:
  prometheus:
    type: prometheus_exporter
    inputs: [metrics]
    address: 0.0.0.0:9598
Configure appropriate timeouts
Use templates for dynamic routing
sinks:
  s3_dynamic:
    type: aws_s3
    bucket: logs
    key_prefix: "{{ environment }}/{{ application }}/date=%Y-%m-%d/"
Run health checks: vector validate --config vector.yaml
Test with console sink first
Use blackhole sink for performance testing
Monitor metrics during initial rollout
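For throughput testing, the blackhole sink discards everything while reporting how much it received. A minimal sketch (the reporting interval is illustrative; check the `blackhole` sink reference for current options):

```yaml
sinks:
  throughput_test:
    type: blackhole
    inputs:
      - logs
    print_interval_secs: 10  # Periodically log the count of discarded events
```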
Troubleshooting
Events Not Reaching Destination
Check sink health:
curl http://localhost:8686/health
Review logs:
VECTOR_LOG=debug vector --config vector.yaml
Verify network connectivity:
telnet api.example.com 443
Check metrics:
curl http://localhost:8686/metrics | grep component_errors
High Latency
Reduce batch sizes
Increase concurrency
Check destination performance
Consider regional endpoints (reduce network distance)
Buffer Full / Backpressure
Increase buffer size
Add more sink instances (horizontal scaling)
Sample high-volume data
Optimize destination performance
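Sampling can be done in-pipeline with the `sample` transform. A sketch that keeps roughly one in ten events before a sink (names and URI are illustrative):

```yaml
transforms:
  sampled:
    type: sample
    inputs:
      - high_volume_logs
    rate: 10  # Keep roughly 1 in 10 events

sinks:
  relieved_sink:
    type: http
    inputs:
      - sampled
    uri: https://api.example.com/logs
```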
Authentication Errors
Verify credentials in environment variables
Check IAM roles/permissions (for cloud sinks)
Ensure tokens haven’t expired
Review destination access logs