The AWS S3 sink stores observability events in Amazon S3 buckets. It supports automatic partitioning, multiple compression algorithms, and flexible file naming strategies.

Configuration

[sinks.s3]
type = "aws_s3"
inputs = ["my_source"]

# S3 bucket configuration
bucket = "my-logs-bucket"
key_prefix = "logs/date=%F/"

# AWS region
region = "us-east-1"

# Authentication
auth.access_key_id = "${AWS_ACCESS_KEY_ID}"
auth.secret_access_key = "${AWS_SECRET_ACCESS_KEY}"

# Encoding and compression
encoding.codec = "json"
compression = "gzip"

# Batching
batch.max_bytes = 10485760  # 10MB
batch.timeout_secs = 300

Core Parameters

bucket
string
required
The S3 bucket name. Must not include a leading s3:// or trailing /.
bucket = "my-logs-bucket"
key_prefix
string
default:"date=%F"
Prefix to apply to all object keys. Supports template syntax and strftime date formatting. Use a trailing / to create a directory-like structure.
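For example, combining strftime date partitioning with event-field templating (the service field here is illustrative and assumed to exist on your events):

```toml
# Date-partitioned prefix using strftime
key_prefix = "logs/date=%F/"

# Template syntax: partition by an event field plus date
key_prefix = "service={{ service }}/date=%F/"
```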
region
string
required
AWS region where the S3 bucket is located.
region = "us-east-1"
region = "eu-west-1"
endpoint
string
Custom endpoint for S3-compatible storage such as MinIO or Ceph.
endpoint = "https://s3.custom.com"

Authentication

The S3 sink supports multiple AWS authentication methods:
auth.access_key_id
string
AWS access key ID for authentication.
auth.secret_access_key
string
AWS secret access key for authentication.
auth.assume_role
string
ARN of an IAM role to assume for authentication.

Static Credentials

[sinks.s3.auth]
access_key_id = "${AWS_ACCESS_KEY_ID}"
secret_access_key = "${AWS_SECRET_ACCESS_KEY}"

IAM Role

When running on EC2, ECS, or EKS, Vector can automatically use IAM role credentials:
# No auth configuration needed - uses instance profile
[sinks.s3]
bucket = "my-bucket"
region = "us-east-1"

Assume Role

[sinks.s3.auth]
assume_role = "arn:aws:iam::123456789012:role/VectorS3WriteRole"

External ID

[sinks.s3.auth]
assume_role = "arn:aws:iam::123456789012:role/VectorS3WriteRole"
external_id = "external-id-12345"

File Naming

filename_time_format
string
default:"%s"
Timestamp format for the time component of object keys using strftime specifiers. Set to an empty string to disable the timestamp in the filename.
filename_time_format = "%s"          # Unix timestamp: 1658176486
filename_time_format = "%Y%m%d%H%M%S" # 20220718203446
filename_time_format = ""            # No timestamp
filename_append_uuid
boolean
default:"true"
Append a UUID v4 token to the end of object keys to ensure uniqueness. Useful in high-volume workloads to prevent name collisions.
filename_append_uuid = true
# Results in: date=2022-07-18/1658176486-30f6652c-71da-4f9f-800d-a1189c47c547.log.gz
filename_extension
string
Override the file extension. By default, the extension is determined by the compression setting.
filename_extension = "json"
filename_extension = "log"

Encoding

encoding.codec
string
required
How events are encoded before writing to S3. Options:
  • json: JSON encoding (one object per line)
  • text: Plain text (one line per event)
  • ndjson: Newline-delimited JSON
  • csv: CSV format
  • logfmt: Logfmt encoding
  • avro: Apache Avro binary format
  • parquet: Apache Parquet columnar format
[sinks.s3.encoding]
codec = "json"
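Some codecs take additional options. For example, the csv codec needs an explicit column list; a sketch, assuming the standard encoding.csv.fields option:

```toml
[sinks.s3.encoding]
codec = "csv"

[sinks.s3.encoding.csv]
# Columns are written in the order listed
fields = ["timestamp", "level", "message"]
```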
encoding.only_fields
array
Include only specified fields in the output.
[sinks.s3.encoding]
codec = "json"
only_fields = ["timestamp", "message", "level"]
encoding.except_fields
array
Exclude specified fields from the output.
[sinks.s3.encoding]
codec = "json"
except_fields = ["_metadata", "secret_token"]
encoding.timestamp_format
string
default:"rfc3339"
Format for timestamp fields. Options: rfc3339, unix, unix_ms, unix_ns.
[sinks.s3.encoding]
codec = "json"
timestamp_format = "unix"

Compression

compression
string
default:"gzip"
Compression algorithm. Options: none, gzip, zstd, snappy. Compression reduces storage costs and network bandwidth.
compression = "gzip"  # Good balance of speed and ratio
compression = "zstd"  # Better compression, slightly slower
compression = "none"  # No compression

Batching

Configure batching to control file size and flush frequency:
batch.max_bytes
integer
default:"10485760"
Maximum size of a batch in bytes before creating a new file (10MB default).
[sinks.s3.batch]
max_bytes = 52428800  # 50MB
batch.timeout_secs
float
default:"300"
Maximum time to wait before flushing a partial batch (5 minutes default).
[sinks.s3.batch]
timeout_secs = 60  # Flush every minute
[sinks.s3.batch]
max_bytes = 10485760    # 10MB files
timeout_secs = 300      # Flush every 5 minutes

S3 Options

Advanced S3-specific options:
options.acl
string
Canned ACL to apply to created objects. Options: private, public-read, public-read-write, authenticated-read, bucket-owner-read, bucket-owner-full-control.
[sinks.s3.options]
acl = "bucket-owner-full-control"
options.storage_class
string
default:"STANDARD"
S3 storage class. Options:
  • STANDARD: Standard storage
  • REDUCED_REDUNDANCY: Reduced redundancy
  • INTELLIGENT_TIERING: Automatic cost optimization
  • STANDARD_IA: Infrequent access
  • ONEZONE_IA: One zone infrequent access
  • GLACIER: Glacier storage
  • GLACIER_IR: Glacier instant retrieval
  • DEEP_ARCHIVE: Glacier deep archive
[sinks.s3.options]
storage_class = "INTELLIGENT_TIERING"
options.server_side_encryption
string
Server-side encryption algorithm. Options: AES256, aws:kms.
[sinks.s3.options]
server_side_encryption = "aws:kms"
options.ssekms_key_id
string
KMS key ID for server-side encryption with KMS. Required when server_side_encryption = "aws:kms". Supports template syntax for dynamic key selection.
[sinks.s3.options]
server_side_encryption = "aws:kms"
ssekms_key_id = "arn:aws:kms:us-east-1:123456789012:key/abcd1234-..."
options.tags
object
Tags to apply to created objects.
[sinks.s3.options.tags]
Environment = "production"
Application = "vector"
CostCenter = "engineering"
options.content_encoding
string
Override the Content-Encoding header.
[sinks.s3.options]
content_encoding = "gzip"
options.content_type
string
Override the Content-Type header.
[sinks.s3.options]
content_type = "application/json"

TLS Configuration

tls.ca_file
string
Path to CA certificate for custom endpoints.
[sinks.s3.tls]
ca_file = "/path/to/ca.pem"

Request Configuration

request.timeout_secs
integer
default:"60"
Request timeout in seconds.
request.retry_attempts
integer
default:"5"
Number of retry attempts for failed requests.
[sinks.s3.request]
timeout_secs = 30
retry_attempts = 3

Advanced Options

force_path_style
boolean
default:"false"
Use path-style addressing (bucket in path) instead of virtual-hosted style. Required for some S3-compatible services.
force_path_style = true
# https://s3.custom.com/bucket/key instead of https://bucket.s3.custom.com/key

Complete Examples

Basic Configuration

[sinks.s3_logs]
type = "aws_s3"
inputs = ["processed_logs"]

bucket = "my-logs-bucket"
key_prefix = "logs/date=%F/"
region = "us-east-1"

encoding.codec = "json"
compression = "gzip"

[sinks.s3_logs.batch]
max_bytes = 10485760
timeout_secs = 300

Partitioned by Service and Date

[sinks.s3_partitioned]
type = "aws_s3"
inputs = ["logs"]

bucket = "prod-logs"
key_prefix = "service={{ service }}/date=%Y/%m/%d/"
region = "us-west-2"

auth.assume_role = "arn:aws:iam::123456789012:role/VectorRole"

encoding.codec = "json"
compression = "zstd"

[sinks.s3_partitioned.batch]
max_bytes = 52428800  # 50MB
timeout_secs = 600

With KMS Encryption

[sinks.s3_encrypted]
type = "aws_s3"
inputs = ["sensitive_logs"]

bucket = "secure-logs-bucket"
key_prefix = "encrypted/date=%F/"
region = "us-east-1"

encoding.codec = "json"
compression = "gzip"

[sinks.s3_encrypted.options]
server_side_encryption = "aws:kms"
ssekms_key_id = "arn:aws:kms:us-east-1:123456789012:key/12345678-..."
storage_class = "STANDARD_IA"

[sinks.s3_encrypted.options.tags]
Classification = "confidential"
Retention = "7years"

High-Volume Configuration

[sinks.s3_high_volume]
type = "aws_s3"
inputs = ["metrics"]

bucket = "metrics-bucket"
key_prefix = "metrics/year=%Y/month=%m/day=%d/hour=%H/"
region = "us-east-1"

encoding.codec = "json"
compression = "zstd"

filename_time_format = "%Y%m%d%H%M%S"
filename_append_uuid = true

[sinks.s3_high_volume.batch]
max_bytes = 104857600  # 100MB
timeout_secs = 120

[sinks.s3_high_volume.request]
retry_attempts = 10
timeout_secs = 120

[sinks.s3_high_volume.options]
storage_class = "INTELLIGENT_TIERING"

S3-Compatible Storage (MinIO)

[sinks.minio]
type = "aws_s3"
inputs = ["logs"]

bucket = "vector-logs"
key_prefix = "logs/"
endpoint = "https://minio.example.com"
region = "us-east-1"  # Still required but can be any value

force_path_style = true

auth.access_key_id = "minioadmin"
auth.secret_access_key = "minioadmin"

encoding.codec = "json"
compression = "gzip"

Troubleshooting

Authentication Issues

If you encounter authentication errors:
  1. Verify AWS credentials are correct
  2. Check IAM permissions include s3:PutObject on the bucket
  3. Ensure the bucket exists and region is correct
  4. For assume role, verify trust relationships
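A minimal IAM policy covering step 2 might look like the following (the bucket name is a placeholder; depending on your options, e.g. acl or tags, additional actions such as s3:PutObjectAcl or s3:PutObjectTagging may also be required):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::my-logs-bucket/*"
    }
  ]
}
```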

Object Not Created

If objects aren’t appearing in S3:
  1. Check batch timeout - may need to wait for flush
  2. Verify bucket name and region are correct
  3. Review Vector logs for errors
  4. Ensure sufficient data to trigger batch
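To rule out batching as the cause (steps 1 and 4), temporarily lower the flush interval so partial batches are written quickly; a diagnostic sketch, not a production setting:

```toml
[sinks.s3.batch]
timeout_secs = 5  # flush almost immediately while debugging; revert afterwards
```

If objects appear with this setting, the original configuration was simply waiting for the batch timeout or size threshold.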

Performance Issues

  1. Increase batch size: Larger files reduce API calls
  2. Enable compression: Reduces upload time
  3. Adjust timeout: Balance between latency and file size
  4. Use multiple sinks: Partition across buckets/prefixes
  5. Choose appropriate storage class: Consider access patterns

Best Practices

  1. Use date-based partitioning for easier querying and lifecycle management
  2. Enable compression to reduce storage costs (30-50% savings)
  3. Set appropriate batch sizes to balance cost and latency
  4. Use IAM roles instead of static credentials when possible
  5. Enable KMS encryption for sensitive data
  6. Add meaningful tags for cost tracking and organization
  7. Use Intelligent-Tiering storage class for unknown access patterns
  8. Configure S3 Lifecycle policies to archive or delete old data
  9. Enable S3 versioning for important data
  10. Monitor CloudWatch metrics for S3 API usage

Cost Optimization

  1. Compression: Use gzip or zstd to reduce storage costs
  2. Batch size: Larger batches reduce PUT request costs
  3. Storage class: Use STANDARD_IA for infrequent access
  4. Lifecycle policies: Automatically transition to cheaper storage
  5. Partitioning: Makes selective deletion easier
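Lifecycle policies (point 4) are configured on the bucket itself, not in Vector. A rule sketch with illustrative transition days, in the JSON shape accepted by aws s3api put-bucket-lifecycle-configuration:

```json
{
  "Rules": [
    {
      "ID": "archive-old-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```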
