Ship Logs with Vector
Configure Vector to collect systemd-journald logs, reshape events with VRL transforms, and ship structured data to Grafana Loki and Elasticsearch.
Before you start
- A running systemd-based Linux system with journald collecting logs
- Network access to a Loki instance (port 3100) or Elasticsearch cluster (port 9200)
- sudo / root privileges to install packages and modify system users
- Basic familiarity with YAML syntax and systemd service management
Vector is a high-performance observability data pipeline written in Rust. It reads log data from sources, optionally reshapes it through transforms, and forwards it to one or more sinks. It handles journald, files, syslog, Kafka, and dozens of other inputs, while supporting Loki, Elasticsearch, S3, and many more outputs — all in a single, statically linked binary that is trivial to deploy.
This guide walks through installing Vector, configuring it to read from systemd-journald, applying VRL (Vector Remap Language) transforms to enrich and filter log events, and shipping the results to both Grafana Loki and Elasticsearch.
Install Vector
Vector ships packages for all major distributions. Use the official repository rather than distro-packaged versions, which lag behind.
Debian / Ubuntu
curl -fsSL https://apt.vector.dev/gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/vector-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/vector-archive-keyring.gpg] https://apt.vector.dev stable vector-0" \
| sudo tee /etc/apt/sources.list.d/vector.list
sudo apt update && sudo apt install -y vector
Fedora / RHEL / Rocky
sudo bash -c 'cat > /etc/yum.repos.d/vector.repo <<EOF
[vector]
name=Vector
baseurl=https://yum.vector.dev/stable/vector-0/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://yum.vector.dev/gpg.key
EOF'
sudo dnf install -y vector
Arch Linux
sudo pacman -S vector
After installation, enable and start the service so it survives reboots:
sudo systemctl enable --now vector
Vector Configuration Basics
Vector's configuration lives at /etc/vector/vector.yaml (YAML is the modern default; TOML is also supported). The file is divided into three top-level sections: sources, transforms, and sinks. Data flows through a pipeline by wiring component IDs together using the inputs key.
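As a rough illustration of how the three sections are wired together, here is a minimal skeleton. The component IDs (my_source, my_transform, my_sink) are placeholders invented for this sketch, and the console sink is used only so the example has no external dependencies:

```yaml
# /etc/vector/vector.yaml (illustrative skeleton; all component IDs are hypothetical)
sources:
  my_source:
    type: journald

transforms:
  my_transform:
    type: remap
    inputs:
      - my_source          # consume events from the source above
    source: |
      .pipeline = "example"

sinks:
  my_sink:
    type: console
    inputs:
      - my_transform       # consume events from the transform above
    encoding:
      codec: json
```

Each component names its upstream components in inputs, so the order of the sections in the file does not matter.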
Always validate before restarting:
sudo vector validate /etc/vector/vector.yaml
Source: Reading from systemd-journald
Vector's journald source tails the system journal natively, so there is no need to pipe through journalctl. Each event keeps the journal's structured fields, such as _SYSTEMD_UNIT and PRIORITY, and Vector maps MESSAGE and _HOSTNAME to its standard message and host fields.
sources:
  system_journal:
    type: journald
    current_boot_only: true   # skip logs from previous boots
    include_units:            # omit this key to collect everything
      - sshd.service
      - nginx.service
      - postgresql.service
Vector needs read access to the journal. Add the vector user to the systemd-journal group, then restart:
sudo usermod -aG systemd-journal vector
sudo systemctl restart vector
Transforms: Reshaping Events with VRL
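VRL (Vector Remap Language) is a purpose-built, safe scripting language for transforming log events. It runs inside a remap transform. VRL refuses to compile a program that leaves a fallible function call unhandled. Appending ! to a function name tells VRL to abort processing of that event if the call fails, so use it only when you are certain the input is valid; otherwise handle the error with the ?? coalescing operator or guard the call with an if check.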
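The following sketch shows two common ways to handle a fallible call without the ! suffix. The transform ID parse_example and the assumption that .message may contain JSON are hypothetical and only serve the illustration:

```yaml
transforms:
  parse_example:             # hypothetical transform ID
    type: remap
    inputs:
      - system_journal
    source: |
      # Option 1: fall back to a default value if parsing fails
      .parsed = parse_json(.message) ?? {}

      # Option 2: capture the error and record it on the event
      result, err = parse_json(.message)
      if err != null {
        .parse_error = err
      } else {
        .parsed = result
      }
```

Either form compiles because every possible error is accounted for. In practice you would pick one of the two options rather than keeping both.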
Normalize severity and add metadata
transforms:
  normalize_logs:
    type: remap
    inputs:
      - system_journal
    source: |
      # Map numeric syslog PRIORITY to a human-readable level.
      # VRL has no ternary operator, so use an if / else if chain.
      priority = to_string(.PRIORITY) ?? ""
      .level = if priority == "0" || priority == "1" || priority == "2" {
        "critical"
      } else if priority == "3" {
        "error"
      } else if priority == "4" {
        "warning"
      } else if priority == "5" {
        "notice"
      } else if priority == "6" {
        "info"
      } else if priority == "7" {
        "debug"
      } else {
        "unknown"
      }

      # Promote useful journald fields to top-level.
      # del() returns null for a missing field, so test for null;
      # the ?? operator only handles errors, not null values.
      unit = del(._SYSTEMD_UNIT)
      .unit = if is_null(unit) { "unknown" } else { unit }

      hostname = del(._HOSTNAME)
      if !is_null(hostname) {
        .host = hostname
      } else if !exists(.host) {
        .host = get_hostname!()
      }

      # Drop noisy internal fields to save bandwidth
      del(._BOOT_ID)
      del(._MACHINE_ID)
      del(._TRANSPORT)
      del(._UID)
      del(._GID)
      del(._CAP_EFFECTIVE)

      # Tag every event with the pipeline version for future debugging
      .pipeline_version = "v1"
Filter out debug noise with a filter transform
transforms:
  drop_debug:
    type: filter
    inputs:
      - normalize_logs
    condition: '.level != "debug"'
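If a single comparison is not enough, the condition can also be written as a multi-line VRL program whose last expression evaluates to a boolean. The sketch below is an assumption-laden variant: the transform ID drop_noise and the "healthcheck" substring are made up for illustration and are not part of the pipeline above.

```yaml
transforms:
  drop_noise:                # hypothetical alternative to drop_debug
    type: filter
    inputs:
      - normalize_logs
    condition:
      type: vrl
      source: |
        # Coerce the message to a string, defaulting to "" if it is not one
        msg = string(.message) ?? ""
        # Keep the event only if it is not debug-level and not a health check
        .level != "debug" && !contains(msg, "healthcheck")
```

Events for which the condition evaluates to false are dropped; everything else passes through unchanged.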
Sink: Ship to Grafana Loki
Loki expects logs grouped by a small set of labels (indexed) and a larger log line payload (not indexed). Keep labels low-cardinality — host, unit, and level are good choices.
sinks:
  loki_out:
    type: loki
    inputs:
      - drop_debug
    endpoint: "http://loki.internal:3100"
    encoding:
      codec: json
    labels:
      host: "{{ host }}"
      unit: "{{ unit }}"
      level: "{{ level }}"
    compression: snappy
    # Batch settings: tune for your volume
    batch:
      max_bytes: 1048576 # 1 MiB
      timeout_secs: 5
    request:
      retry_attempts: 5
If Loki requires authentication (Grafana Cloud, for example), add basic auth credentials. Store sensitive values in environment variables, not directly in the config file:
    # nested inside the loki_out sink
    auth:
      strategy: basic
      user: "${LOKI_USER}"
      password: "${LOKI_PASSWORD}"
Pass the variables to the systemd unit by creating an override:
sudo systemctl edit vector
[Service]
EnvironmentFile=/etc/vector/secrets.env
Sink: Ship to Elasticsearch
The Elasticsearch sink uses the Bulk API. Specify api_version: v8 if you are on Elasticsearch 8.x, which dropped support for document type mappings.
sinks:
  elasticsearch_out:
    type: elasticsearch
    inputs:
      - drop_debug
    endpoints:
      - "https://elastic.internal:9200"
    api_version: v8
    # No encoding.codec here: this sink always sends JSON documents
    auth:
      strategy: basic
      user: "${ES_USER}"
      password: "${ES_PASSWORD}"
    tls:
      verify_certificate: true
    bulk:
      action: index
      index: "systemd-logs-%Y.%m.%d" # daily index rotation
    batch:
      max_bytes: 10485760 # 10 MiB
      timeout_secs: 10
You can fan out to both Loki and Elasticsearch simultaneously by pointing both sinks at the same transform output (drop_debug). Vector sends the events to all connected sinks.
Verify the Pipeline
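Validate config syntax first, then watch live output using Vector's built-in tap subcommand, which streams events at any component in the running pipeline. tap connects to Vector's API, so the API must be enabled in the configuration (api.enabled: true) before it will work: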
sudo vector validate /etc/vector/vector.yaml
sudo systemctl restart vector
sudo systemctl status vector
# Tap events after the normalize transform (output varies)
sudo vector tap normalize_logs
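Because vector tap talks to the running instance through Vector's API, the API has to be switched on in the configuration. A minimal sketch, assuming you want it bound to localhost only on Vector's usual API port:

```yaml
# Top-level block in /etc/vector/vector.yaml
api:
  enabled: true
  address: "127.0.0.1:8686"   # keep the API on localhost unless you need remote access
```

Restart Vector after adding this block, then re-run the tap command.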
Vector does not expose metrics unless you configure it to: pair an internal_metrics source with a prometheus_exporter sink, which listens on port 9598 by default. During testing, you can also add a simple console sink to see exactly what is flowing through:
sinks:
  debug_console:
    type: console
    inputs:
      - normalize_logs
    encoding:
      codec: json
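When Vector runs as a service, its own logs and the console sink's output go to the journal, so follow them with:
sudo journalctl -u vector -f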
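To scrape Vector's internal metrics with Prometheus, a configuration along these lines should work. The component IDs vector_metrics and prometheus_out are names chosen for this sketch:

```yaml
sources:
  vector_metrics:              # hypothetical ID
    type: internal_metrics

sinks:
  prometheus_out:              # hypothetical ID
    type: prometheus_exporter
    inputs:
      - vector_metrics
    address: "0.0.0.0:9598"    # the exporter's default listen address
```

Point a Prometheus scrape job at port 9598 on the host; restrict the bind address if the metrics should not be reachable from other machines.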
Troubleshooting
- No events arriving at the sink: Run sudo vector tap system_journal first to confirm the source is producing events. If it is empty, check that the vector user is in the systemd-journal group and that you restarted Vector after adding it.
- VRL parse errors: Use vector vrl to test expressions against a sample event before committing them to config. Pass the program as an argument (the --program flag expects a path to a file): echo '{"PRIORITY":"3"}' | vector vrl '.level = "error"'
- Loki 400 / out-of-order errors: Loki rejects entries older than its configured limits and, depending on its version and configuration, entries that arrive out of order within a stream. Setting current_boot_only: true avoids replaying logs from old boots. On the Loki side, reject_old_samples and reject_old_samples_max_age control how old an entry may be; on the Vector side, the loki sink's out_of_order_action option controls how out-of-order events are handled.
- Elasticsearch index mapping conflicts: If you change field types in VRL, delete or roll the index. Elasticsearch rejects documents that violate existing mappings, but the bulk request as a whole still succeeds, so the failures are easy to miss. Check the per-item errors in the _bulk response or Vector's internal logs.
- High memory usage: Vector buffers in memory by default. Switch to disk-backed buffers for high-volume pipelines: add buffer: { type: disk, max_size: 268435488 } to the relevant sink block. 268435488 bytes (roughly 256 MiB) is the smallest size Vector accepts for a disk buffer.
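The buffer and out-of-order settings mentioned in the list above could be combined on the Loki sink roughly as follows. This is a sketch, not a tuned configuration: the when_full and out_of_order_action values are assumptions that you should adapt to your Loki version and durability needs.

```yaml
sinks:
  loki_out:
    # type, inputs, endpoint, encoding, and labels stay as configured earlier
    buffer:
      type: disk
      max_size: 268435488        # bytes; the minimum Vector accepts for a disk buffer
      when_full: block           # apply backpressure instead of dropping events
    out_of_order_action: accept  # assumes a Loki version that tolerates out-of-order writes
```

Disk buffers are stored under Vector's data directory, so make sure that path has enough free space for the configured max_size.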
Frequently asked questions
- Can Vector send logs to both Loki and Elasticsearch at the same time?
- Yes. Point both sink definitions at the same transform's output ID. Vector fans out events to all connected sinks without duplicating processing work.
- How do I test a VRL expression before adding it to my config?
- Use the vector vrl command. Pipe a sample JSON event into it and pass the program as an argument, for example: echo '{"PRIORITY":"3"}' | vector vrl '.level = "error"'. Run without a program, vector vrl opens an interactive REPL. Either way you can iterate quickly without restarting the daemon.
- What happens if a sink is unreachable — will logs be lost?
- By default Vector buffers in memory and retries. For durability on high-volume or unreliable connections, switch to disk-backed buffers by adding a buffer block with type: disk to the sink. This survives Vector restarts.
- Does Vector add significant CPU or memory overhead compared to Filebeat or Promtail?
- Vector is generally reported to use less CPU and comparable or lower memory than Filebeat for similar workloads, though results depend heavily on the pipeline and configuration. For very low-volume hosts the difference is negligible; it becomes relevant at thousands of events per second.
- Can I reload the Vector config without restarting and dropping buffered events?
- Yes, Vector supports a graceful reload via SIGHUP or systemctl reload vector. It re-reads the configuration and applies changes without dropping buffered data, though adding or removing sources does cause a brief reconnect on those components.