Ship Logs with Vector

Configure Vector to collect systemd-journald logs, reshape events with VRL transforms, and ship structured data to Grafana Loki and Elasticsearch.

Intermediate · Ubuntu, Debian, Fedora, Arch · 10 min read · Updated June 1, 2026

Before you start

  • A running systemd-based Linux system with journald collecting logs
  • Network access to a Loki instance (port 3100) or Elasticsearch cluster (port 9200)
  • sudo / root privileges to install packages and modify system users
  • Basic familiarity with YAML syntax and systemd service management

Vector is a high-performance observability data pipeline written in Rust. It reads log data from sources, optionally reshapes it through transforms, and forwards it to one or more sinks. It handles journald, files, syslog, Kafka, and dozens of other inputs, while supporting Loki, Elasticsearch, S3, and many more outputs — all in a single, statically linked binary that is trivial to deploy.

This guide walks through installing Vector, configuring it to read from systemd-journald, applying VRL (Vector Remap Language) transforms to enrich and filter log events, and shipping the results to both Grafana Loki and Elasticsearch.

Install Vector

Vector publishes packages for all major distributions. Where an official repository exists, prefer it over distro-packaged versions, which often lag behind.

Debian / Ubuntu

curl -fsSL https://apt.vector.dev/gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/vector-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/vector-archive-keyring.gpg] https://apt.vector.dev stable vector-0" \
  | sudo tee /etc/apt/sources.list.d/vector.list
sudo apt update && sudo apt install -y vector

Fedora / RHEL / Rocky

sudo bash -c 'cat > /etc/yum.repos.d/vector.repo <<EOF
[vector]
name=Vector
baseurl=https://yum.vector.dev/stable/vector-0/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://yum.vector.dev/gpg.key
EOF'
sudo dnf install -y vector

Arch Linux

sudo pacman -S vector

After installation, enable and start the service so it survives reboots:

sudo systemctl enable --now vector

Vector Configuration Basics

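Vector's configuration lives at /etc/vector/vector.yaml (YAML is the modern default; TOML is also supported). The file has three main top-level sections: sources, transforms, and sinks. Data flows through the pipeline by wiring component IDs together with the inputs key. The snippets in this guide each repeat their top-level key for readability; when you combine them in one file, merge the entries under a single sources, transforms, and sinks key, because YAML does not allow duplicate keys.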
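The wiring can be sketched with a throwaway pipeline. The component IDs (demo_in, tag_events, demo_out) and the pipeline field are hypothetical, and the demo_logs source is used here only to generate sample events without touching the journal:

```yaml
# Minimal three-stage pipeline: source -> transform -> sink
sources:
  demo_in:                 # hypothetical ID
    type: demo_logs        # built-in generator of fake log lines
    format: syslog

transforms:
  tag_events:              # hypothetical ID
    type: remap
    inputs:
      - demo_in            # consume everything demo_in emits
    source: |
      .pipeline = "demo"

sinks:
  demo_out:                # hypothetical ID
    type: console
    inputs:
      - tag_events         # consume the transform's output
    encoding:
      codec: json
```

Each inputs list names the upstream component ID, which is all Vector needs to build the graph.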

Always validate before restarting:

sudo vector validate /etc/vector/vector.yaml

Source: Reading from systemd-journald

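Vector's journald source tails the system journal for you, so there is no need to pipe journalctl output yourself (the source runs the journalctl binary under the hood, so it must be installed). Each event keeps journald's structured fields such as _SYSTEMD_UNIT and PRIORITY, and the source exposes MESSAGE as message and _HOSTNAME as host.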

sources:
  system_journal:
    type: journald
    current_boot_only: true          # skip logs from previous boots
    include_units:                   # omit this key to collect everything
      - sshd.service
      - nginx.service
      - postgresql.service

Vector needs read access to the journal. Add the vector user to the systemd-journal group, then restart:

sudo usermod -aG systemd-journal vector
sudo systemctl restart vector

Transforms: Reshaping Events with VRL

VRL (Vector Remap Language) is a purpose-built, safe scripting language for transforming log events. It runs inside a remap transform. The compiler requires every fallible function call to be handled: the ! suffix aborts the program for that event if the call fails, so use it only when you are certain the input is valid. Otherwise supply a fallback with ??, or capture the error and branch on it.
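As an illustration of the two non-aborting styles, the sketch below handles fallible calls once with ?? and once by capturing the error. The transform ID handle_errors and the payload, pid, and pid_error field names are hypothetical; parse_json and to_int are standard VRL functions.

```yaml
transforms:
  handle_errors:           # hypothetical ID
    type: remap
    inputs:
      - system_journal
    source: |
      # Style 1: fall back to an empty object if the message is not JSON
      .payload = parse_json(.message) ?? {}

      # Style 2: capture the error and decide what to do with it
      pid, err = to_int(._PID)
      if err == null {
        .pid = pid
      } else {
        .pid_error = err
      }
```

Either style keeps the event flowing when the input is malformed, whereas parse_json!(.message) would abort the program for that event.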

Normalize severity and add metadata

transforms:
  normalize_logs:
    type: remap
    inputs:
      - system_journal
    source: |
      # Map numeric syslog PRIORITY to a human-readable level.
      # VRL has no ternary operator, so use an if / else if chain.
      priority = to_string(.PRIORITY) ?? ""
      .level = if !exists(.PRIORITY) {
        "unknown"
      } else if priority == "0" || priority == "1" || priority == "2" {
        "critical"
      } else if priority == "3" {
        "error"
      } else if priority == "4" {
        "warning"
      } else if priority == "5" {
        "notice"
      } else if priority == "6" {
        "info"
      } else {
        "debug"
      }

      # Promote the unit name; del() returns null when the field is missing
      .unit = del(._SYSTEMD_UNIT)
      if is_null(.unit) { .unit = "unknown" }

      # The journald source normally sets .host; fall back to the local hostname
      if !exists(.host) { .host = get_hostname!() }

      # Drop noisy internal fields to save bandwidth
      del(._BOOT_ID)
      del(._MACHINE_ID)
      del(._TRANSPORT)
      del(._UID)
      del(._GID)
      del(._CAP_EFFECTIVE)

      # Tag every event with the pipeline version for future debugging
      .pipeline_version = "v1"

Filter out debug noise with a filter transform

transforms:
  drop_debug:
    type: filter
    inputs:
      - normalize_logs
    condition: '.level != "debug"'

Sink: Ship to Grafana Loki

Loki expects logs grouped by a small set of labels (indexed) and a larger log line payload (not indexed). Keep labels low-cardinality — host, unit, and level are good choices.

sinks:
  loki_out:
    type: loki
    inputs:
      - drop_debug
    endpoint: "http://loki.internal:3100"
    encoding:
      codec: json
    labels:
      host: "{{ host }}"
      unit: "{{ unit }}"
      level: "{{ level }}"
    compression: snappy
    # Batch settings — tune for your volume
    batch:
      max_bytes: 1048576    # 1 MiB
      timeout_secs: 5
    request:
      retry_attempts: 5

If Loki requires authentication (Grafana Cloud, for example), add basic auth credentials. Store sensitive values in environment variables, not directly in the config file:

    auth:
      strategy: basic
      user: "${LOKI_USER}"
      password: "${LOKI_PASSWORD}"

Pass the variables to the systemd unit by creating an override:

sudo systemctl edit vector

In the editor that opens, add:

[Service]
EnvironmentFile=/etc/vector/secrets.env

Sink: Ship to Elasticsearch

The Elasticsearch sink uses the Bulk API. Specify api_version: v8 if you are on Elasticsearch 8.x, which dropped support for document type mappings.

sinks:
  elasticsearch_out:
    type: elasticsearch
    inputs:
      - drop_debug
    endpoints:
      - "https://elastic.internal:9200"
    api_version: v8
    encoding:
      codec: json
    auth:
      strategy: basic
      user: "${ES_USER}"
      password: "${ES_PASSWORD}"
    tls:
      verify_certificate: true
    bulk:
      action: index
      index: "systemd-logs-%Y.%m.%d"    # daily index rotation; current Vector releases expect this under bulk
    batch:
      max_bytes: 10485760   # 10 MiB
      timeout_secs: 10

You can fan out to both Loki and Elasticsearch simultaneously by pointing both sinks at the same transform output (drop_debug). Vector sends the events to all connected sinks.

Verify the Pipeline

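Validate the config first, then watch live output with Vector's built-in tap subcommand, which streams events from any component in the running pipeline. tap talks to Vector's API, which is disabled by default, so set api.enabled: true in the config before using it: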

sudo vector validate /etc/vector/vector.yaml
sudo systemctl restart vector
sudo systemctl status vector
# Tap events after the normalize transform (output varies)
sudo vector tap normalize_logs
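A minimal sketch of the API block that tap relies on, assuming the default listen address of 127.0.0.1:8686 is free on your host:

```yaml
# Top-level key in /etc/vector/vector.yaml, alongside sources/transforms/sinks
api:
  enabled: true
  address: "127.0.0.1:8686"   # keep it on loopback unless you need remote access
```

Restart Vector after adding it; vector tap connects to this address by default.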

Vector does not expose its own metrics unless you ask it to: connect an internal_metrics source to a prometheus_exporter sink, which listens on port 9598 by default. During testing, add a simple console sink to see exactly what is flowing through:

sinks:
  debug_console:
    type: console
    inputs:
      - normalize_logs
    encoding:
      codec: json
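
The console sink writes to Vector's stdout, which journald captures, so follow the service log to see the events:

sudo journalctl -u vector -f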
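For Vector's own metrics, a sketch of a Prometheus scrape endpoint might look like the following. The component IDs vector_metrics and prometheus_out are hypothetical; internal_metrics and prometheus_exporter are the built-in component types.

```yaml
sources:
  vector_metrics:              # hypothetical ID
    type: internal_metrics     # Vector's own throughput, error, and buffer metrics

sinks:
  prometheus_out:              # hypothetical ID
    type: prometheus_exporter
    inputs:
      - vector_metrics
    address: "0.0.0.0:9598"    # exporter default; use 127.0.0.1 to restrict to local scrapes
```

Once this is loaded, curl http://localhost:9598/metrics should return the metrics in Prometheus text format.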

Troubleshooting

  • No events arriving at the sink: Run sudo vector tap system_journal first to confirm the source is producing events. If it is empty, check that the vector user is in the systemd-journal group and that you restarted after adding it.
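  • VRL parse errors: Run vector vrl with no arguments to open an interactive REPL, or test an expression against a sample event before committing it to config. Pass the program as a positional argument (the --program flag expects a file path): echo '{"PRIORITY":"3"}' | vector vrl '.level = "error"'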
  • Loki 400 / out-of-order errors: Loki rejects log streams where timestamps arrive out of order. Ensure current_boot_only: true is set, or configure Loki's reject_old_samples and reject_old_samples_max_age to match your ingestion window.
  • Elasticsearch index mapping conflicts: If you change field types in VRL, delete or roll the index. Elasticsearch will reject documents that violate existing mappings silently in bulk requests — check the _bulk response errors or Vector's internal logs.
  • High memory usage: Vector buffers in memory by default. Switch to disk-backed buffers for high-volume pipelines: add buffer: { type: disk, max_size: 536870912 } to the relevant sink block. max_size is in bytes, and Vector rejects disk buffers smaller than roughly 256 MiB.
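Written out in block style, the disk buffer from the last item might look like this on the Loki sink. This is a sketch: the 512 MiB size and the when_full choice are assumptions to tune for your volume.

```yaml
sinks:
  loki_out:
    type: loki
    inputs:
      - drop_debug
    endpoint: "http://loki.internal:3100"
    encoding:
      codec: json
    labels:
      host: "{{ host }}"
    buffer:
      type: disk
      max_size: 536870912    # 512 MiB, in bytes (assumed value)
      when_full: block       # apply backpressure upstream instead of dropping events
```

Disk buffers are stored under Vector's data_dir, so make sure that filesystem has room for max_size per buffered sink.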

Tested on: Ubuntu 24.04, Fedora 40, Arch rolling, Rocky 9

Frequently asked questions

Can Vector send logs to both Loki and Elasticsearch at the same time?
Yes. Point both sink definitions at the same transform's output ID. Vector fans out events to all connected sinks without duplicating processing work.
How do I test a VRL expression before adding it to my config?
Use the vector vrl command. Run it with no arguments for an interactive REPL, or pipe a sample JSON event into it and pass the program as a positional argument, for example: echo '{"PRIORITY":"3"}' | vector vrl '.level = "error"'. This lets you iterate quickly without restarting the daemon.
What happens if a sink is unreachable — will logs be lost?
By default Vector buffers in memory and retries. For durability on high-volume or unreliable connections, switch to disk-backed buffers by adding a buffer block with type: disk to the sink. This survives Vector restarts.
Does Vector add significant CPU or memory overhead compared to Filebeat or Promtail?
In most benchmarks Vector uses less CPU and comparable or lower memory than Filebeat for similar workloads, largely because it is written in Rust. For very low-volume hosts the difference is negligible; it becomes relevant at thousands of events per second.
Can I reload the Vector config without restarting and dropping buffered events?
Yes, Vector supports a graceful reload via SIGHUP or systemctl reload vector. It re-reads the configuration and applies changes without dropping buffered data, though adding or removing sources does cause a brief reconnect on those components.
