Sinks are Vector components that deliver events to external systems. They are the final stage in your Vector pipeline, sending logs, metrics, and traces to databases, object storage, SaaS platforms, and other destinations. Vector includes 40+ built-in sinks for common integrations.

How Sinks Work

Sinks receive events from sources and transforms, then:
  1. Buffer events in memory or on disk
  2. Batch events for efficient delivery
  3. Encode events in the destination’s required format
  4. Compress data to reduce network usage (optional)
  5. Deliver events to the destination via network protocols
  6. Retry failed deliveries with exponential backoff
  7. Acknowledge successful delivery (when acknowledgements are enabled)
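These stages correspond directly to sink configuration sections. A minimal sketch, with illustrative names and values:

```yaml
sinks:
  example_sink:
    type: http                 # Delivery protocol (stage 5)
    inputs:
      - my_transform
    uri: https://logs.example.com/ingest
    buffer:                    # Stage 1: buffering
      type: memory
      max_events: 500
    batch:                     # Stage 2: batching
      max_events: 100
      timeout_secs: 5
    encoding:                  # Stage 3: encoding
      codec: json
    compression: gzip          # Stage 4: compression
    request:                   # Stage 6: retries with backoff
      retry_attempts: 5
    acknowledgements:          # Stage 7: delivery confirmation
      enabled: true
```

Each of these options is covered in detail in the sections below.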

Sink Categories

Cloud Platform Sinks

AWS

S3, CloudWatch Logs/Metrics, Kinesis, SQS, SNS

Google Cloud

Cloud Storage, Cloud Logging, Pub/Sub, BigQuery

Azure

Blob Storage, Monitor Logs, Event Hubs
Example: AWS S3 sink
sinks:
  s3_archive:
    type: aws_s3
    inputs:
      - parsed_logs
    region: us-east-1
    bucket: my-log-archive
    key_prefix: "logs/date=%Y-%m-%d/"
    compression: gzip
    encoding:
      codec: json
    batch:
      max_bytes: 10485760    # 10MB per file
      timeout_secs: 300       # Or 5 minutes, whichever comes first
Example: AWS CloudWatch Logs
sinks:
  cloudwatch:
    type: aws_cloudwatch_logs
    inputs:
      - application_logs
    region: us-east-1
    group_name: /aws/application/prod
    stream_name: "{{ host }}-{{ application }}"  # Dynamic stream names
    encoding:
      codec: json

Observability Platforms

Datadog

Unified observability platform for logs, metrics, and traces

New Relic

Full-stack observability and APM

Splunk

Enterprise log management and SIEM

Elastic

Elasticsearch for search and analytics
Example: Datadog logs and metrics
sinks:
  datadog_logs:
    type: datadog_logs
    inputs:
      - application_logs
    default_api_key: "${DD_API_KEY}"
    endpoint: https://http-intake.logs.datadoghq.com
    compression: gzip
  
  datadog_metrics:
    type: datadog_metrics
    inputs:
      - host_metrics
    default_api_key: "${DD_API_KEY}"
    endpoint: https://api.datadoghq.com
Example: Elasticsearch
sinks:
  elasticsearch:
    type: elasticsearch
    inputs:
      - structured_logs
    endpoint: https://elasticsearch.example.com:9200
    auth:
      strategy: basic
      user: elastic
      password: "${ES_PASSWORD}"
    bulk:
      index: "logs-%Y.%m.%d"  # Daily indices
      action: create
    encoding:
      codec: json
    buffer:
      type: disk
      max_size: 268435488     # 256MB disk buffer

Metrics Systems

Prometheus

Prometheus remote write and exporter

InfluxDB

Time-series database for metrics

Graphite

Classic metrics storage system

StatsD

StatsD protocol for metrics aggregation
Example: Prometheus Remote Write
sinks:
  prometheus:
    type: prometheus_remote_write
    inputs:
      - application_metrics
    endpoint: https://prometheus.example.com/api/v1/write
    default_namespace: app
    auth:
      strategy: bearer
      token: "${PROM_TOKEN}"
Example: Prometheus Exporter
sinks:
  prometheus_exporter:
    type: prometheus_exporter
    inputs:
      - vector_metrics
    address: 0.0.0.0:9598
    default_namespace: vector
    # Metrics available at http://localhost:9598/metrics

Databases

ClickHouse

Columnar database for analytics

PostgreSQL

Relational database (via COPY protocol)
Example: ClickHouse
sinks:
  clickhouse:
    type: clickhouse
    inputs:
      - structured_logs
    endpoint: http://clickhouse:8123
    database: logs
    table: application_logs
    skip_unknown_fields: false
    encoding:
      codec: json
    batch:
      max_events: 10000
      timeout_secs: 10

Message Queues

Kafka

Distributed streaming platform

NATS

Lightweight messaging system

AMQP

RabbitMQ and AMQP protocol

Redis

In-memory data store with pub/sub
Example: Kafka
sinks:
  kafka:
    type: kafka
    inputs:
      - events
    bootstrap_servers: kafka1:9092,kafka2:9092,kafka3:9092
    topic: "logs-{{ environment }}"
    key_field: request_id
    encoding:
      codec: json
    compression: snappy
    batch:
      timeout_secs: 1
    acknowledgements:
      enabled: true

HTTP and Webhooks

Example: Generic HTTP sink
sinks:
  http_endpoint:
    type: http
    inputs:
      - logs
    uri: https://api.example.com/logs
    method: post
    encoding:
      codec: json
    headers:
      Authorization: "Bearer ${API_TOKEN}"
      Content-Type: "application/json"
    batch:
      max_events: 100
    request:
      retry_attempts: 5
      retry_initial_backoff_secs: 1
      retry_max_duration_secs: 300

Development and Testing

console

Print events to stdout (debugging)

blackhole

Discard events (testing throughput)

file

Write to local files
Example: Console output
sinks:
  debug:
    type: console
    inputs:
      - logs
    encoding:
      codec: json
      json:
        pretty: true

Sink Configuration

Common Options

All sinks support these configuration options:
sinks:
  my_sink:
    type: <sink_type>
    inputs:
      - source_or_transform
    
    # Buffering configuration
    buffer:
      type: memory              # memory or disk
      max_events: 500           # For memory buffers
      max_size: 268435488       # For disk buffers (256MB)
      when_full: block          # block or drop_newest
    
    # Batching configuration
    batch:
      max_events: 100           # Max events per batch
      max_bytes: 1048576        # Max bytes per batch (1MB)
      timeout_secs: 5           # Max time to wait for batch
    
    # Encoding
    encoding:
      codec: json               # json, text, protobuf, etc.
      except_fields:
        - sensitive_field       # Exclude these fields
      only_fields:
        - field1                # Only include these fields
        - field2
    
    # Compression
    compression: gzip           # gzip, zstd, snappy, none
    
    # Request/delivery settings
    request:
      concurrency: 5            # Parallel requests
      retry_attempts: 5
      retry_initial_backoff_secs: 1
      retry_max_duration_secs: 300
      timeout_secs: 60
    
    # Acknowledgements
    acknowledgements:
      enabled: false            # Wait for delivery confirmation
    
    # Health checks
    healthcheck:
      enabled: true             # Check connectivity at startup

Buffering Strategies

Memory Buffers

Memory buffers hold events in RAM for fast, low-latency delivery. Suitable for most use cases.
sinks:
  my_sink:
    buffer:
      type: memory
      max_events: 500       # Buffer up to 500 events
      when_full: block      # Backpressure when full
Pros: Fast, low overhead
Cons: Data loss on crash, limited capacity
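Disk Buffers

Disk buffers persist queued events to disk so they survive crashes and restarts, at the cost of throughput. A sketch (sink name illustrative; the size matches the 256MB value used elsewhere in this document):

```yaml
sinks:
  my_sink:
    buffer:
      type: disk
      max_size: 268435488   # ~256MB on disk
      when_full: block      # Backpressure when full
```

Pros: Survives restarts and crashes, larger capacity
Cons: Slower than memory, consumes disk I/O and space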

Batching

Batching improves throughput by sending multiple events together:
sinks:
  elasticsearch:
    batch:
      max_events: 100           # Send when 100 events collected
      max_bytes: 1048576        # Or when 1MB of data
      timeout_secs: 5           # Or after 5 seconds
    # First condition met triggers the batch
Tuning guidelines:
  • Higher max_events: Better throughput, higher latency
  • Lower timeout_secs: Lower latency, more network requests
  • Larger max_bytes: More compression efficiency

Encoding Formats

Sinks support multiple encoding formats:
sinks:
  json_sink:
    encoding:
      codec: json
      json:
        pretty: false       # Compact JSON
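Other codecs follow the same pattern. For example, the text codec emits each event's message as a raw line, which suits line-oriented destinations (sink and input names illustrative):

```yaml
sinks:
  text_sink:
    type: file
    inputs:
      - logs
    path: /var/log/vector/output.log
    encoding:
      codec: text         # Raw message field, one event per line
```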

Compression

Compress data before transmission to reduce bandwidth:
sinks:
  compressed_sink:
    compression: gzip           # gzip (most compatible)
    # compression: zstd         # zstd (better compression)
    # compression: snappy       # snappy (fastest)
    # compression: none         # no compression
Compression comparison:
  • gzip: Best compatibility, good compression
  • zstd: Best compression ratio, fast decompression
  • snappy: Fastest, moderate compression
  • none: No CPU overhead, more bandwidth

Health Checks

Sinks run health checks at startup to verify connectivity:
sinks:
  checked_sink:
    healthcheck:
      enabled: true             # Default: true
    # If healthcheck fails:
    # - Vector logs an error
    # - Vector may refuse to start (configurable globally)
Disable globally:
# vector.yaml
healthchecks:
  enabled: false                # Skip all health checks
  require_healthy: false        # Don't block startup on failures

Reliability and Delivery Guarantees

Acknowledgements

Acknowledgements provide end-to-end delivery confirmation:
sources:
  kafka_source:
    type: kafka
    acknowledgements:
      enabled: true             # Don't commit offsets until ack

transforms:
  process:
    type: remap
    inputs: [kafka_source]
    source: |
      . = parse_json!(.message)

sinks:
  elasticsearch:
    type: elasticsearch
    inputs: [process]
    acknowledgements:
      enabled: true             # Ack only after ES confirms write
When to use:
  • Financial data
  • Audit logs
  • Compliance-required data
Trade-offs:
  • Higher latency (wait for confirmation)
  • More memory (track in-flight events)
  • Better reliability (no data loss on sink failure)
Both the source and sink must have acknowledgements enabled for end-to-end delivery guarantees.

Retry Logic

Sinks automatically retry failed deliveries:
sinks:
  resilient_sink:
    request:
      retry_attempts: 5                   # Try up to 5 times
      retry_initial_backoff_secs: 1       # Start with 1s delay
      retry_max_duration_secs: 300        # Give up after 5 minutes
    # Exponential backoff: 1s, 2s, 4s, 8s, 16s
Retry triggers:
  • Network errors (connection refused, timeouts)
  • HTTP 429 (rate limiting)
  • HTTP 5xx (server errors)
Not retried:
  • HTTP 4xx (except 429) - client errors
  • Authentication failures
  • Malformed requests

Error Handling

When all retries are exhausted:
  1. Event is dropped
  2. Error is logged
  3. component_errors_total metric is incremented
  4. Downstream acknowledgements fail (if enabled)
To prevent data loss:
  • Use disk buffers for critical sinks
  • Enable acknowledgements
  • Configure dead letter queues (DLQ):
sinks:
  primary:
    type: elasticsearch
    # ... config ...
  
  # Send failures to S3 for later reprocessing
  dlq:
    type: aws_s3
    inputs:
      - primary._undeliverable  # Special output (future feature)

Performance Optimization

Concurrency

Increase parallel requests for better throughput:
sinks:
  high_throughput:
    type: http
    request:
      concurrency: 20           # 20 parallel requests
    batch:
      max_events: 100
      timeout_secs: 1
Guidelines:
  • Start with low concurrency (5-10)
  • Increase until destination is saturated
  • Monitor destination metrics (CPU, response time)
  • Consider destination rate limits
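Rather than tuning a fixed number by hand, Vector's Adaptive Request Concurrency can discover an appropriate level automatically. A sketch:

```yaml
sinks:
  adaptive_sink:
    type: http
    inputs:
      - logs
    uri: https://api.example.com/logs
    encoding:
      codec: json
    request:
      concurrency: adaptive   # Adjust parallelism from observed latency and errors
```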

Batching

Larger batches improve throughput:
sinks:
  optimized:
    batch:
      max_events: 1000          # Large batches for high throughput
      max_bytes: 10485760       # 10MB
      timeout_secs: 10          # Accept higher latency

Compression

For remote sinks, compression usually helps:
sinks:
  remote:
    compression: gzip
    # Typically reduces network usage by 70-90% for text data
    # Adds a small CPU cost per batch

Network Tuning

sinks:
  tuned:
    request:
      timeout_secs: 60          # Match destination latency
      rate_limit_num: 100       # Max 100 requests...
      rate_limit_duration_secs: 1  # ...per second

Best Practices

Choose the right buffer type:
  • Memory: Fast path for non-critical data
  • Disk: Critical data, large buffers, unstable networks
sinks:
  audit_logs:
    buffer:
      type: disk            # Critical data
      max_size: 1073741824
  
  debug_logs:
    buffer:
      type: memory          # Best effort
      max_events: 500
Monitor delivery health. Watch these metrics:
  • component_sent_events_total: Events delivered
  • component_errors_total: Delivery failures
  • buffer_events: Buffer utilization
  • request_duration_seconds: Delivery latency
Expose them with the internal_metrics source and a prometheus_exporter sink:
sources:
  metrics:
    type: internal_metrics

sinks:
  prometheus:
    type: prometheus_exporter
    inputs: [metrics]
    address: 0.0.0.0:9598
Match timeouts to destination characteristics:
  • Fast APIs: 10-30 seconds
  • Slow APIs: 60-120 seconds
  • Batch uploads: 300+ seconds
Partition object storage output with dynamic key prefixes:
sinks:
  s3_dynamic:
    type: aws_s3
    bucket: logs
    key_prefix: "{{ environment }}/{{ application }}/date=%Y-%m-%d/"
Test sinks before production rollout:
  • Run health checks: vector validate --config vector.yaml
  • Test with console sink first
  • Use blackhole sink for performance testing
  • Monitor metrics during initial rollout
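For throughput testing, the blackhole sink discards everything while reporting counts (input name illustrative):

```yaml
sinks:
  throughput_test:
    type: blackhole
    inputs:
      - parsed_logs
    print_interval_secs: 1   # Log the number of events received each second
```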

Troubleshooting

Events Not Reaching Destination

  1. Check sink health:
    curl http://localhost:8686/health
    
  2. Review logs:
    VECTOR_LOG=debug vector --config vector.yaml
    
  3. Verify network connectivity:
    telnet api.example.com 443
    
  4. Check metrics (from the prometheus_exporter sink, if configured):
    curl http://localhost:9598/metrics | grep component_errors
    

High Latency

  • Reduce batch sizes
  • Increase concurrency
  • Check destination performance
  • Consider regional endpoints (reduce network distance)

Buffer Full / Backpressure

  • Increase buffer size
  • Add more sink instances (horizontal scaling)
  • Sample high-volume data
  • Optimize destination performance
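One way to sample high-volume data is Vector's sample transform placed ahead of the sink (names and rate illustrative):

```yaml
transforms:
  sample_noisy:
    type: sample
    inputs:
      - verbose_logs
    rate: 10                # Keep roughly 1 in every 10 events

sinks:
  sampled_out:
    type: http
    inputs:
      - sample_noisy
    uri: https://api.example.com/logs
    encoding:
      codec: json
```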

Authentication Errors

  • Verify credentials in environment variables
  • Check IAM roles/permissions (for cloud sinks)
  • Ensure tokens haven’t expired
  • Review destination access logs
