Available Sources
Vector provides a wide range of sources for collecting logs, metrics, and traces from different systems:
File-Based Sources
File
Collect logs from files with support for globbing, rotation, and checkpointing
Message Queue Sources
Kafka
Collect logs from Apache Kafka topics with consumer group support
HTTP Sources
HTTP
Host an HTTP endpoint to receive logs via POST requests
Syslog Sources
Syslog
Collect logs sent via the Syslog protocol (TCP, UDP, or Unix sockets)
Container Sources
Docker Logs
Collect container logs directly from the Docker daemon
Common Concepts
Event Types
Sources produce one or more of the following event types:
- Logs: Structured or unstructured log data
- Metrics: Numerical measurements and time-series data
- Traces: Distributed tracing spans
Acknowledgements
Some sources support end-to-end acknowledgements, ensuring data is not lost in transit. When acknowledgements are enabled, the source only marks data as processed after it has been successfully delivered to all sinks.
Log Namespacing
Vector supports two log namespacing modes:
- Legacy: Fields are added directly to the event root
- Vector: Metadata is separated into a dedicated namespace
The mode can be set globally with the log_namespace setting, or configured per source.
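The difference between the two modes can be pictured with a simplified Python model. This is only an illustration: the function names, the dictionary shapes, and the sample metadata keys (`source_type`, `host`) are assumptions made for the sketch, not Vector's internal event representation.

```python
from typing import Any


def to_legacy(message: str, metadata: dict[str, Any]) -> dict[str, Any]:
    """Legacy-style layout: metadata fields sit in the event root next to the payload."""
    event: dict[str, Any] = {"message": message}
    event.update(metadata)
    return event


def to_namespaced(message: str, metadata: dict[str, Any]) -> dict[str, Any]:
    """Vector-style layout: the payload and the metadata are kept apart."""
    return {"event": message, "metadata": dict(metadata)}


# Hypothetical metadata a source might attach to a log line.
meta = {"source_type": "file", "host": "web-1"}

legacy = to_legacy("disk full", meta)
namespaced = to_namespaced("disk full", meta)

print(legacy)
print(namespaced)
```

In this model, a payload that already carried its own `host` key would collide with the metadata in the legacy layout, while the separated layout keeps the two apart.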
Decoding and Framing
Many sources support configurable decoding and framing:
- Framing: How to split incoming bytes into messages (newline delimited, length delimited, etc.)
- Decoding: How to parse messages into events (JSON, text, protobuf, etc.)
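The two stages above can be sketched in a few lines of Python. This is a conceptual illustration rather than Vector's implementation: the function names are hypothetical, and the 4-byte big-endian length prefix is an assumption chosen for the example.

```python
import json
import struct


def frame_newline(data: bytes) -> list[bytes]:
    """Newline-delimited framing: each non-empty line is one message."""
    return [line for line in data.split(b"\n") if line]


def frame_length_prefixed(data: bytes) -> list[bytes]:
    """Length-delimited framing: a 4-byte big-endian length precedes each message."""
    messages: list[bytes] = []
    offset = 0
    while offset + 4 <= len(data):
        (length,) = struct.unpack_from(">I", data, offset)
        end = offset + 4 + length
        if end > len(data):
            break  # incomplete frame: a real reader would wait for more bytes
        messages.append(data[offset + 4:end])
        offset = end
    return messages


def decode_json(message: bytes) -> dict:
    """JSON decoding: the message body becomes the event's fields."""
    return json.loads(message)


def decode_text(message: bytes) -> dict:
    """Text decoding: the whole message is stored under a single field."""
    return {"message": message.decode("utf-8")}


# Newline framing followed by JSON decoding.
stream = b'{"level":"info"}\n{"level":"warn"}\n'
json_events = [decode_json(m) for m in frame_newline(stream)]

# Length-prefixed framing followed by plain-text decoding.
packed = struct.pack(">I", 5) + b"hello" + struct.pack(">I", 2) + b"hi"
text_events = [decode_text(m) for m in frame_length_prefixed(packed)]
```

Keeping the stages separate is what lets the same decoder be reused over different transports: only the framing function changes when the byte stream is delimited differently.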
Configuration Example
Best Practices
- Use checkpointing: For file-based sources, ensure checkpointing is enabled to avoid duplicate data
- Configure retries: Set appropriate retry and backoff settings for network-based sources
- Filter early: Use source-level filtering when possible to reduce pipeline load
- Monitor source health: Use Vector’s internal metrics to track source performance
- Test acknowledgements: When using acknowledgements, test failure scenarios to ensure data durability
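As a starting point for thinking about acknowledgement failure scenarios, the following toy model shows the behaviour being tested: a source advances its position only after every sink reports success. The `AckingSource` class, the sink callables, and the batch size are all hypothetical names invented for this sketch; it does not reflect how Vector implements acknowledgements.

```python
from typing import Callable

# A sink is modelled as a callable that returns True when the batch was delivered.
Sink = Callable[[list[str]], bool]


class AckingSource:
    """Toy source that advances its checkpoint only after every sink succeeds."""

    def __init__(self, records: list[str]) -> None:
        self.records = records
        self.checkpoint = 0  # index of the first record not yet acknowledged

    def deliver(self, sinks: list[Sink], batch_size: int = 2) -> bool:
        batch = self.records[self.checkpoint:self.checkpoint + batch_size]
        if not batch:
            return False  # nothing left to send
        # all() stops at the first failing sink; the batch then stays unacknowledged.
        if all(sink(batch) for sink in sinks):
            self.checkpoint += len(batch)
            return True
        return False


def ok_sink(batch: list[str]) -> bool:
    return True


def failing_sink(batch: list[str]) -> bool:
    return False


source = AckingSource(["a", "b", "c"])
first = source.deliver([ok_sink, ok_sink])        # every sink succeeded
second = source.deliver([ok_sink, failing_sink])  # one sink failed
position_after_failure = source.checkpoint        # unchanged by the failed attempt
retried = source.deliver([ok_sink, ok_sink])      # same batch is sent again
```

Note that in this model the retry resends the batch to a sink that had already accepted it, so the guarantee is at-least-once delivery. Failure tests should therefore check for duplicates as well as for loss.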
Performance Considerations
- File sources can handle millions of events per second with proper tuning
- Network sources benefit from connection pooling and keep-alive settings
- Consider using multiple sources with load balancing for high-throughput scenarios
- Buffer sizes and batch settings significantly impact throughput and latency