Transforms are the processing components in Vector. They modify, filter, aggregate, or enrich events as they flow through your pipeline.

Transform Configuration

Transforms are configured in the transforms section:
transforms:
  <transform_id>:
    type: <transform_type>
    inputs: [<source_or_transform_ids>]
    # Transform-specific options

Common Transform Parameters

All transforms support these base configuration options:
type (string, required)
  The type of the transform component.
inputs (array, required)
  Array of source or transform IDs to receive events from.

Remap Transform

The most powerful transform: it uses Vector Remap Language (VRL) to process events:
transforms:
  parse_logs:
    type: remap
    inputs: [apache_logs]
    source: |
      . = parse_apache_log!(.message, format: "combined")
      # parse_apache_log already returns status as an integer and timestamp
      # as a timestamp; cast explicitly only if you need a separate field
      .status_code = to_int!(.status)
source (string, required)
  The VRL program to execute for each event. The . refers to the event being processed.
file (string, optional)
  Path to a file containing the VRL program. Alternative to inline source.
drop_on_error (boolean, default: false)
  Drop events that cause runtime errors instead of passing them through.
drop_on_abort (boolean, default: false)
  Drop events when the VRL program calls abort.
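These two options are often combined to discard events that cannot be processed. A sketch, assuming JSON input and a hypothetical /healthz path used for health-check noise:

```yaml
transforms:
  parse_strict:
    type: remap
    inputs: [apache_logs]
    drop_on_error: true   # events that fail parse_json are dropped, not passed through
    drop_on_abort: true   # events the program aborts on are dropped as well
    source: |
      . = parse_json!(.message)
      # abort discards health-check noise when drop_on_abort is true
      if .path == "/healthz" { abort }
```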

VRL Examples

Parse JSON logs:
source: |
  . = parse_json!(.message)
  .level = upcase!(.level)
Add fields:
source: |
  .environment = "production"
  .processed_at = now()
Filter sensitive data:
source: |
  .message = redact(.message, filters: [r'\d{3}-\d{2}-\d{4}'])
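Handle errors without dropping events: VRL's two-value assignment captures a function's error instead of failing. A sketch that records a parse failure on the event rather than aborting:

```yaml
source: |
  .parsed, err = parse_json(.message)
  if err != null {
    # keep the event and note why parsing failed
    .parse_error = err
  }
```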

Filter Transform

Filter events based on conditions:
transforms:
  filter_errors:
    type: filter
    inputs: [parse_logs]
    condition:
      type: vrl
      source: '.level == "error"'
condition (object, required)
  Condition to evaluate. Events that match pass through; events that do not are dropped.
condition.type (string, required)
  Type of condition: vrl, is_log, is_metric, or is_trace.
condition.source (string, required when type is vrl)
  VRL expression that returns a boolean.
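The non-VRL condition types need no source; they match on the event kind alone. A sketch keeping only metric events from a mixed stream (mixed_events is an assumed input ID):

```yaml
transforms:
  metrics_only:
    type: filter
    inputs: [mixed_events]
    condition:
      type: is_metric   # no source needed; matches metric events only
```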

Sample Transform

Sample a percentage of events:
transforms:
  sample_half:
    type: sample
    inputs: [parse_logs]
    rate: 2  # Keep 1 out of every 2 events (50%)
rate (integer, required)
  Keep 1 out of every N events. A rate of 2 keeps 50%; a rate of 10 keeps 10%.
key_field (string, optional)
  Field to use for deterministic sampling. Events with the same key value are always sampled together.
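With key_field, sampling is decided per key rather than per event, so related events survive or drop as a unit. A sketch assuming a request_id field on each log:

```yaml
transforms:
  sample_by_request:
    type: sample
    inputs: [parse_logs]
    rate: 10                # keep roughly 1 in 10 request IDs
    key_field: request_id   # all events sharing a request_id are kept or dropped together
```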

Dedupe Transform

Remove duplicate events:
transforms:
  remove_duplicates:
    type: dedupe
    inputs: [parse_logs]
    fields:
      match: ["message", "timestamp"]
    cache:
      num_events: 5000
fields (object, required)
  Configuration for matching duplicate events.
fields.match (array, required)
  Array of field names used to identify duplicates.
cache.num_events (integer, default: 5000)
  Number of recent events to cache for comparison.

Reduce Transform

Aggregate multiple events into a single event:
transforms:
  aggregate_by_host:
    type: reduce
    inputs: [parse_logs]
    group_by:
      - host
      - level
    merge_strategies:
      count: sum
    expire_after_ms: 60000  # 1 minute
group_by (array, optional)
  Array of field names to group events by; events in different groups are reduced independently.
merge_strategies (object, optional)
  Per-field merge strategy: discard, retain, sum, max, min, array, or concat. Fields without an explicit strategy are merged with a type-based default.
expire_after_ms (integer, default: 30000)
  Milliseconds to wait after the last event in a group before flushing the combined event.
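Different strategies can be mixed per field. A sketch (field names are illustrative) collapsing a burst of related log lines into one event per host:

```yaml
transforms:
  collapse_related_lines:
    type: reduce
    inputs: [parse_logs]
    group_by: [host]
    merge_strategies:
      message: concat   # join message strings into one
      count: sum        # add numeric counts together
      tags: array       # collect each event's value into an array
```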

Throttle Transform

Limit the rate of events:
transforms:
  rate_limit:
    type: throttle
    inputs: [parse_logs]
    threshold: 100
    window_secs: 1
threshold (integer, required)
  Maximum number of events to allow per window.
window_secs (integer, required)
  Length of the time window, in seconds.
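The throttle transform also accepts a key_field option, so the limit applies per key rather than globally. A sketch assuming a host field on each event (key_field takes a template string):

```yaml
transforms:
  rate_limit_per_host:
    type: throttle
    inputs: [parse_logs]
    threshold: 100            # at most 100 events...
    window_secs: 1            # ...per second
    key_field: "{{ host }}"   # enforced separately for each host value
```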

Route Transform

Route events to different downstream components based on conditions:
transforms:
  route_by_level:
    type: route
    inputs: [parse_logs]
    route:
      error: '.level == "error"'
      warn: '.level == "warn"'
      info: '.level == "info"'

sinks:
  critical_alerts:
    type: console
    inputs: [route_by_level.error]
  
  general_logs:
    type: elasticsearch
    inputs: [route_by_level.warn, route_by_level.info]
route (object, required)
  Map of route names to VRL conditions. Each route creates an output that downstream components reference as <transform_id>.<route_name>.

Lua Transform

Custom event processing using Lua:
transforms:
  custom_processing:
    type: lua
    inputs: [parse_logs]
    version: "2"
    hooks:
      process: |-
        function (event, emit)
          event.log.custom_field = "value"
          emit(event)
        end
version (string, required)
  Lua transform API version: "1" (deprecated) or "2" (recommended).
hooks.process (string, required in version "2")
  Lua function invoked once per event; it receives the event and an emit callback used to forward results.
source (string, optional)
  Lua code executed once at startup, typically to define functions referenced from the hooks.

Pipeline Example

Chain multiple transforms together:
sources:
  app_logs:
    type: file
    include: ["/var/log/app/*.log"]

transforms:
  # 1. Parse JSON logs
  parse:
    type: remap
    inputs: [app_logs]
    source: |
      . = parse_json!(.message)
  
  # 2. Add metadata
  enrich:
    type: remap
    inputs: [parse]
    source: |
      .environment = "production"
      .processed_at = now()
  
  # 3. Filter only errors
  filter_errors:
    type: filter
    inputs: [enrich]
    condition:
      type: vrl
      source: '.level == "error"'
  
  # 4. Sample 10% for analysis
  sample:
    type: sample
    inputs: [filter_errors]
    rate: 10
  
  # 5. Remove duplicates
  dedupe:
    type: dedupe
    inputs: [sample]
    fields:
      match: ["message", "timestamp"]

sinks:
  elasticsearch:
    type: elasticsearch
    inputs: [dedupe]
    endpoint: "http://localhost:9200"

Transform Outputs

Some transforms create multiple outputs:
transforms:
  route_logs:
    type: route
    inputs: [source]
    route:
      errors: '.level == "error"'
      other: 'true'

sinks:
  # Reference specific route outputs
  error_sink:
    inputs: [route_logs.errors]
  
  regular_sink:
    inputs: [route_logs.other]
