Log Management | Advanced Linux Administration

Slide 1 of 35  |  ALA-09  |  Week 4 of 8
Log Management
and Analysis
journalctl  •  rsyslog  •  Log Forwarding  •  Centralized Logging  •  Analysis Patterns
An incident happened at 02:47 AM. The database went down, three services cascaded, and 400 users saw errors. Your job is to reconstruct the exact sequence of events, identify the root cause, and prove it was not a security incident. You have two logging systems and 48 hours of data. This lecture is how you work that problem.
Ubuntu 22.04 LTS
Slide 2 of 35
Two Logging Systems: journald and syslog
Modern Ubuntu runs both. They serve different purposes and complement each other.
[Diagram: kernel, systemd units, and app stdout -> journald (binary journal, read with journalctl); rsyslog -> /var/log/*.log]
systemd-journald
Captures everything: kernel messages, systemd unit output, application stdout/stderr, audit events, and syslog messages. Stored in binary format in /run/log/journal/ or /var/log/journal/. Queried with journalctl. Rich filtering by unit, user, priority, time, and field value.
rsyslog
Traditional syslog daemon. Reads from the syslog socket. Writes plain-text files to /var/log/. Forwards to remote servers. Supports complex routing, filtering, and transformation with its rule language. Still the standard for centralized log collection and SIEM forwarding.
When to Use journalctl
Investigating what a specific service did. Real-time following of a service's output. Boot-time messages. Kernel messages. Anything that does not need to leave the local machine immediately.
When to Use rsyslog
Forwarding logs to a SIEM or central server. Long-term retention beyond the journal's rotation window. Integration with legacy tools that expect plain text. High-volume application logs with custom routing requirements.
Slide 3 of 35
journalctl: Core Commands
The primary interface to the systemd journal. Learn these commands until they are reflexive.
# Show all journal entries (most recent last)
journalctl
# Follow in real time (like tail -f)
journalctl -f
# Last N lines
journalctl -n 50
journalctl -n 50 -f              # follow AND show last 50 on start
# Filter by systemd unit
journalctl -u nginx
journalctl -u nginx -u mysql     # multiple units
journalctl -u nginx -f           # follow nginx specifically
# Filter by priority (emerg alert crit err warning notice info debug)
journalctl -p err                # only error-level and above
journalctl -p err..warning       # range: error through warning
journalctl -p 0..4               # numeric: 0=emerg 4=warning
# Kernel messages only
journalctl -k
# Boot messages for specific boot
journalctl -b                    # current boot
journalctl -b -1                 # previous boot
journalctl --list-boots          # list all available boots
Slide 4 of 35
journalctl: Time-Based Queries
Time filtering is the most critical skill for incident investigation. Master the since/until syntax.
# Absolute time ranges
journalctl --since "2026-04-09 02:30:00"
journalctl --since "2026-04-09 02:30" --until "2026-04-09 03:00"
journalctl --since "2026-04-09 02:45:00" --until "2026-04-09 02:46:00"
# Relative time expressions
journalctl --since "1 hour ago"
journalctl --since "2 days ago"
journalctl --since today
journalctl --since yesterday
journalctl --since "30 minutes ago" --until "5 minutes ago"
# Combine time with unit for incident investigation
journalctl -u nginx --since "2026-04-09 02:40" --until "2026-04-09 03:00"
# Combine time with priority: errors in the last hour
journalctl -p err --since "1 hour ago" --no-pager
# Reverse order (most recent first)
journalctl -r --since today -p err
Incident Investigation Pattern
When investigating a 02:47 incident: run journalctl --since "2026-04-09 02:40" --until "2026-04-09 03:00" -p err first to see all errors across all services in the incident window. This gives you the sequence of failures before you dive into any specific service.
Slide 5 of 35
journalctl: Field Filtering and JSON Output
The journal stores structured key=value fields for every entry. Filter on any field for precise queries.
# Filter by executable path (_EXE is the resolved binary, so a symlink such
# as /usr/bin/python3 does not match; Ubuntu 22.04 ships python3.10)
journalctl _EXE=/usr/bin/python3.10
# Filter by process UID (useful for multi-tenant systems)
journalctl _UID=1001
# Filter by hostname (relevant in centralized logging)
journalctl _HOSTNAME=web-01
# Filter by syslog identifier (the tag field)
journalctl SYSLOG_IDENTIFIER=nginx
# Combine multiple fields (AND logic)
journalctl _UID=1001 _EXE=/usr/bin/ssh
# Discover all fields in a journal entry
journalctl -o verbose -n 1 -u nginx
# JSON output for programmatic processing
journalctl -u nginx -o json --since "1 hour ago" | jq .
# JSON reduced to one tab-separated line per entry with key fields
journalctl -u nginx -o json --since "10 minutes ago" | \
    jq -r '[.__REALTIME_TIMESTAMP, .PRIORITY, .MESSAGE] | @tsv'
# Export format for archiving
journalctl --since today -o export > /var/archive/journal-$(date +%F).export
Slide 6 of 35
journald Configuration: Storage and Retention
The journal's storage location, size limits, and retention determine how far back you can look during an investigation.
[Diagram: journal file lifecycle: active writing -> rotate (MaxFileSize) -> compress (Compress=yes) -> vacuum (MaxRetention)]
# /etc/systemd/journald.conf (key settings)
[Journal]
# Storage: auto (persistent if /var/log/journal/ exists, volatile otherwise)
#          persistent: always write to /var/log/journal/
#          volatile: only /run/log/journal/ (lost on reboot)
Storage=persistent
# Max disk space for journal (total across all journals)
SystemMaxUse=2G
# Maximum size of a single journal file
SystemMaxFileSize=256M
# Maximum age to retain journal entries
MaxRetentionSec=30day
# Forward entries to syslog socket (enables rsyslog to read them)
ForwardToSyslog=yes
# Compress journal entries on disk
Compress=yes
# Rate limiting per service (prevent log flooding)
RateLimitIntervalSec=30s
RateLimitBurst=10000
# Create persistent journal directory and apply config
mkdir -p /var/log/journal
systemd-tmpfiles --create --prefix /var/log/journal
systemctl restart systemd-journald
# Check journal disk usage
journalctl --disk-usage
# Vacuum old entries
journalctl --vacuum-time=30d     # remove entries older than 30 days
journalctl --vacuum-size=1G      # shrink to 1GB total
Slide 7 of 35
Syslog: Facilities and Priorities
The syslog protocol uses a facility.priority taxonomy to classify messages. rsyslog routes messages based on these classifications.
[Diagram: severity scale: 0 emerg, 1 alert, 2 crit, 3 err, 4 warning, 5 notice, 6 info, 7 debug; the lowest number is the highest severity]
Facilities (Source)
kern kernel messages. auth/authpriv authentication events. mail mail system. cron cron jobs. daemon system daemons. local0..local7 custom applications. Use local0 through local7 for your own applications and automation scripts.
Priorities (Severity)
emerg(0) system unusable. alert(1) immediate action required. crit(2) critical conditions. err(3) error conditions. warning(4) warning conditions. notice(5) normal but significant. info(6) informational. debug(7) debug messages.
# Syslog priority rule syntax: facility.priority
# Lower number = higher severity
# Routing examples in rsyslog.conf:
kern.*              /var/log/kern.log          # all kernel messages
auth,authpriv.*     /var/log/auth.log          # auth messages
*.emerg             :omusrmsg:*                # emergency: alert all users
*.warning           /var/log/syslog            # warning and above: syslog
local0.*            /var/log/sector/app.log    # custom app logs
local0.err          @@siem.sector.local:514    # forward errors to SIEM
# The selector * means all priorities, = means only that priority
# mail.none means exclude mail facility
# *.warning;mail.none means all warnings except mail
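On the wire, a syslog message carries its facility and severity as one number, the PRI value, computed as facility x 8 + severity (RFC 3164 and RFC 5424). The sketch below works that arithmetic for the names on this slide. The function name syslog_pri is hypothetical, and only the facilities listed above are mapped.

```bash
# syslog_pri FACILITY SEVERITY -> prints the numeric PRI value.
# Hypothetical helper for illustration; covers only the facilities on this slide.
syslog_pri() {
    case "$1" in
        kern)       fac=0 ;;
        mail)       fac=2 ;;
        daemon)     fac=3 ;;
        auth)       fac=4 ;;
        cron)       fac=9 ;;
        authpriv)   fac=10 ;;
        local[0-7]) fac=$((16 + ${1#local})) ;;   # local0..local7 are 16..23
        *) echo "unknown facility: $1" >&2; return 1 ;;
    esac
    case "$2" in
        emerg) sev=0 ;; alert)  sev=1 ;; crit) sev=2 ;; err)   sev=3 ;;
        warning) sev=4 ;; notice) sev=5 ;; info) sev=6 ;; debug) sev=7 ;;
        *) echo "unknown severity: $2" >&2; return 1 ;;
    esac
    echo $((fac * 8 + sev))
}

syslog_pri local0 err    # 16*8 + 3
syslog_pri auth info     # 4*8 + 6
```

A receiver reverses the calculation: integer division of PRI by 8 gives the facility and the remainder gives the severity.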
Slide 8 of 35
rsyslog: Input, Filter, Output Pipeline
rsyslog processes every log message through a three-stage pipeline. Understanding this architecture is the key to building correct configurations.
[Diagram: inputs (imuxsock, imjournal, imtcp/imudp) -> filter rules (facility.priority) -> outputs: omfile (/var/log/), omfwd TCP/UDP (remote SIEM), omelasticsearch (Elasticsearch)]
Input Modules (imXXX)
imuxsock reads from the local syslog socket (/dev/log). imklog reads kernel messages. imjournal reads from the systemd journal. imtcp/imudp receive remote syslog over TCP/UDP. imfile tails log files.
Filter Rules
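Three filter types: facility.priority selectors (traditional), property-based filters (:msg, contains, "error"), and RainerScript expression filters (if $msg contains "CRITICAL" then ...). Rules are evaluated in order. Use stop to end processing of a message. The & operator does not stop anything by itself: it chains a further action onto the previous filter, so "& stop" discards the message after that filter's action has run.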
Output Modules (omXXX)
omfile writes to files. omfwd forwards over TCP/UDP syslog. ommysql/ompgsql writes to databases. omrelp uses the reliable RELP protocol. omelasticsearch indexes into Elasticsearch. ommail sends email alerts.
# rsyslog configuration locations
ls /etc/rsyslog.conf         # main configuration file
ls /etc/rsyslog.d/           # drop-in configuration files
rsyslogd -N1                 # syntax check without running (dry run)
systemctl restart rsyslog    # apply changes
Slide 9 of 35
rsyslog Configuration: Local Routing
A well-organized rsyslog configuration separates messages into appropriate files for easy access and log rotation.
# /etc/rsyslog.conf (core structure)
#### MODULES ####
module(load="imuxsock")      # local syslog socket
module(load="imklog")        # kernel log module
# Load imjournal only if journald is not also forwarding to the syslog
# socket (ForwardToSyslog=yes); otherwise each message arrives twice
module(load="imjournal"      # read from systemd journal
       StateFile="imjournal.state")
#### GLOBAL DIRECTIVES ####
global(workDirectory="/var/lib/rsyslog")
#### RULES ####
# Kernel messages to dedicated file
kern.*              /var/log/kern.log
# Authentication: separate file for security review
auth,authpriv.*     /var/log/auth.log
# Mail system: its own log (rarely needed but noisy if not separated)
mail.*              -/var/log/mail.log
# Cron jobs: separate for easy troubleshooting
cron.*              /var/log/cron.log
# Custom application on local0 facility
local0.*            /var/log/sector/app.log
# Everything else at warning or above: main syslog file
*.warning;kern.none;auth,authpriv.none;mail.none;cron.none    /var/log/syslog
Slide 10 of 35
rsyslog Templates: Formatting Log Output
Templates control the format of every log line written by rsyslog. Structured formats enable automated parsing.
# Traditional syslog format (default)
template(name="TraditionalFormat" type="string"
         string="%TIMESTAMP% %HOSTNAME% %syslogtag%%msg%\n")
# RFC 3339 / ISO 8601 timestamp with fractional seconds (better for correlation)
template(name="ISO8601Format" type="string"
         string="%TIMESTAMP:::date-rfc3339% %HOSTNAME% %syslogtag%%msg%\n")
# JSON format (machine-parseable, SIEM-friendly)
# pid is quoted because PROCID is "-" when a message carries no PID
template(name="JSONFormat" type="string"
         string="{\"ts\":\"%TIMESTAMP:::date-rfc3339%\",\"host\":\"%HOSTNAME%\",\"prog\":\"%PROGRAMNAME%\",\"pid\":\"%PROCID%\",\"fac\":\"%syslogfacility-text%\",\"sev\":\"%syslogseverity-text%\",\"msg\":\"%msg:::json%\"}\n")
# Structured key=value format
template(name="KVFormat" type="string"
         string="ts=%TIMESTAMP:::date-rfc3339% host=%HOSTNAME% prog=%PROGRAMNAME% sev=%syslogseverity-text% msg=%msg%\n")
# Apply a template to a file output
local0.* action(type="omfile" file="/var/log/sector/app.log" template="JSONFormat")
Slide 11 of 35
Log Forwarding: TCP vs UDP vs RELP
Forwarding logs off the source host is the foundation of centralized logging. Protocol choice determines reliability guarantees.
[Diagram: source -> central SIEM over UDP :514 (fire and forget), TCP :514 (buffered), or RELP :2514 (acknowledged)]
UDP (port 514)
Fire and forget. No acknowledgment, no retransmit. If the receiver is down or the network is congested, messages are dropped silently. Fast and simple. Acceptable only for non-critical monitoring where losing some events is tolerable.
TCP (port 514)
Connection-oriented. If the receiver is down, rsyslog buffers messages and retransmits when the connection is restored (with disk queue configured). Better than UDP for production use. Still can lose messages during large bursts without disk queuing.
RELP (port 2514)
Reliable Event Logging Protocol. Application-layer acknowledgment -- the sender knows each message was received and persisted. No data loss even during receiver restarts. Use RELP for compliance environments where every log entry must be accounted for.
# Forward all messages via TCP with disk queue (production standard)
action(type="omfwd"
       target="siem.sector.local"
       port="514"
       protocol="tcp"
       action.resumeRetryCount="-1"
       queue.type="LinkedList"
       queue.filename="fwd-buffer"
       queue.maxDiskSpace="1g"
       queue.saveOnShutdown="on")
Slide 12 of 35
Reliable Forwarding: Disk-Assisted Queues
When the SIEM or central log server goes down, a disk queue holds messages and replays them when the connection is restored.
# /etc/rsyslog.d/50-forward.conf
# Forward everything to central log server with reliable disk queue
# omfwd is built into rsyslog, so no module(load=...) line is needed
# Forward rule with full queue configuration
action(
    type="omfwd"
    target="logs.sector.local"
    port="514"
    protocol="tcp"
    # Queue type: LinkedList lives in memory; giving it a queue.filename
    # makes it disk-assisted, so it spills to disk when the remote is down.
    # (queue.type="Disk" is a pure on-disk queue and is slower.)
    queue.type="LinkedList"
    # Base filename for queue files in workDirectory
    queue.filename="logs-sector-queue"
    # Queue size limits (queue.size is the maximum number of messages)
    queue.maxDiskSpace="2g"
    queue.size="500000"
    # Persist queue to disk on rsyslog shutdown
    queue.saveOnShutdown="on"
    # Retry forever if remote is down
    action.resumeRetryCount="-1"
    # Seconds between retries
    action.resumeInterval="30"
)
Queue Directory
Queue files are written to the rsyslog work directory: the workDirectory set in global(), which is /var/lib/rsyslog in this deck's configuration (the stock Ubuntu rsyslog.conf uses /var/spool/rsyslog). Monitor this directory's size. If the remote log server is down for an extended period, queue files will grow and can fill the disk. Set queue.maxDiskSpace to prevent this.
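Monitoring the queue directory can be as simple as comparing du output with a threshold. The following is a minimal sketch, not part of rsyslog: the function name dir_over_limit, the QUEUE_DIR variable, and the 1.5 GB warning level are all assumptions chosen to sit below the 2g cap used above.

```bash
# dir_over_limit DIR LIMIT_KB -> exit status 0 if DIR uses more than LIMIT_KB kilobytes.
# Hypothetical helper; du -sk reports the directory total in kilobytes.
dir_over_limit() {
    used_kb=$(du -sk "$1" 2>/dev/null | awk '{print $1}')
    [ "${used_kb:-0}" -gt "$2" ]
}

# Assumed path and threshold: adjust to the workDirectory in your own config.
QUEUE_DIR="${QUEUE_DIR:-/var/lib/rsyslog}"
if [ -d "$QUEUE_DIR" ] && dir_over_limit "$QUEUE_DIR" 1572864; then
    logger -t rsyslog-queue "queue directory $QUEUE_DIR exceeds 1.5 GB"
fi
```

Run from cron, this gives a warning through syslog itself before queue.maxDiskSpace is reached and new messages start being discarded.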
Slide 13 of 35
Central Log Server: Receiving Remote Logs
Configure a dedicated rsyslog server to receive logs from all fleet nodes and store them in per-host files.
[Diagram: web-01, db-01, app-01 -> central rsyslog (imtcp :514) -> /var/log/remote/%HOST% -> SIEM / ELK]
# /etc/rsyslog.conf on the CENTRAL LOG SERVER
# Enable TCP listener on port 514
module(load="imtcp")
input(type="imtcp" port="514")
# Enable UDP listener (for legacy devices)
module(load="imudp")
input(type="imudp" port="514")
# Template: per-host log files under /var/log/remote/
template(name="PerHostLogs" type="string"
         string="/var/log/remote/%HOSTNAME%/%PROGRAMNAME%.log")
# Route all incoming remote messages to per-host files
# (the ISO8601Format template must also be defined in this server's config)
if $fromhost-ip != "127.0.0.1" then {
    action(type="omfile" DynaFile="PerHostLogs" template="ISO8601Format")
    stop    # do not process further (avoid writing to local syslog)
}
# Shell commands, run once on the server:
# Create log directory structure
mkdir -p /var/log/remote
chown syslog:adm /var/log/remote
# Open firewall for syslog
ufw allow from 10.0.0.0/8 to any port 514
Slide 14 of 35
TLS Log Forwarding: Encrypted Syslog
Logs contain sensitive information. Forward them over TLS to prevent interception and ensure authentication.
# Install TLS support for rsyslog (on client and server)
apt install rsyslog-gnutls
# Client configuration (/etc/rsyslog.d/60-tls-forward.conf)
# omfwd is built in, so it needs no module(load=...) line
global(
    DefaultNetstreamDriver="gtls"
    DefaultNetstreamDriverCAFile="/etc/ssl/sector/ca.pem"
    DefaultNetstreamDriverCertFile="/etc/ssl/sector/client-cert.pem"
    DefaultNetstreamDriverKeyFile="/etc/ssl/sector/client-key.pem"
)
action(
    type="omfwd"
    target="logs.sector.local"
    port="6514"
    protocol="tcp"
    StreamDriver="gtls"
    StreamDriverMode="1"    # 1 = TLS enabled
    StreamDriverAuthMode="x509/name"
    StreamDriverPermittedPeers="logs.sector.local"
)
# Server configuration (/etc/rsyslog.d/60-tls-server.conf)
# x509/name checks each client certificate name against a permitted-peer
# list; add PermittedPeer=[...] here with your own client names
module(load="imtcp"
       StreamDriver.Name="gtls"
       StreamDriver.Mode="1"
       StreamDriver.Authmode="x509/name")
global(
    DefaultNetstreamDriverCAFile="/etc/ssl/sector/ca.pem"
    DefaultNetstreamDriverCertFile="/etc/ssl/sector/server-cert.pem"
    DefaultNetstreamDriverKeyFile="/etc/ssl/sector/server-key.pem"
)
input(type="imtcp" port="6514")
Slide 15 of 35
rsyslog Filtering: Property-Based Rules
Route messages based on their content, not just facility and priority. Essential for multi-application deployments.
# Property-based filter syntax: :property, comparison, "value"
# Route messages from nginx to a dedicated file
:programname, isequal, "nginx"    /var/log/nginx/syslog.log
# Route messages containing "CRITICAL" to alert file
:msg, contains, "CRITICAL"        /var/log/sector/critical-alerts.log
# Write critical messages to root's terminal (omusrmsg does not send
# email; email alerts need the ommail module)
:msg, contains, "CRITICAL"        :omusrmsg:root
# Filter using regex: match any message containing "fail" or "error"
# (ereregex is required for "|"; plain regex is POSIX basic syntax)
:msg, ereregex, "[Ff]ail|[Ee]rror"    /var/log/sector/errors.log
# RainerScript (modern rsyslog v7+ syntax: more powerful)
if ($programname == "sshd" and $msg contains "Failed password") then {
    action(type="omfile" file="/var/log/ssh-failures.log")
    action(type="omfwd" target="siem.sector.local" port="514" protocol="tcp")
    stop
}
# Forward only error+ to remote, keep everything locally
if prifilt("*.err") then {
    action(type="omfwd" target="logs.sector.local" port="514" protocol="tcp")
}
# No stop -- local rules continue to write to /var/log/syslog
Slide 16 of 35
Log Analysis: grep, awk, and cut Patterns
The fastest log analysis tools are the ones already on the system. Master these pipelines for rapid incident triage.
# Count authentication failures by source IP
# (the IP is the fourth field from the end: "... from IP port N ssh2")
grep 'Failed password' /var/log/auth.log | \
    awk '{print $(NF-3)}' | \
    sort | uniq -c | sort -rn | head -20
# Extract the request lines that returned 5xx from the nginx access log
awk '$9 >= 500' /var/log/nginx/access.log | \
    cut -d'"' -f2 | sort | uniq -c | sort -rn
# Count events per minute (time histogram); the file is already in
# chronological order, so uniq -c needs no sort
awk '{print $1, $2, substr($3, 1, 5)}' /var/log/syslog | \
    uniq -c | \
    tail -30
# Find all unique programs logging errors in the last hour
journalctl -p err --since "1 hour ago" -o json | \
    jq -r '.SYSLOG_IDENTIFIER' | \
    sort | uniq -c | sort -rn
# Timeline of sudo activity in the current auth.log
grep 'sudo:' /var/log/auth.log | \
    awk '{print $1" "$2" "$3" "$6" "$NF}'
Slide 17 of 35
nginx Access Log Analysis: Traffic Patterns
The nginx combined log format is a gold mine. These patterns answer the most common operational questions.
# Top 10 IPs by request count
awk '{print $1}' /var/log/nginx/access.log | \
    sort | uniq -c | sort -rn | head -10
# Top 10 URLs by request count
awk '{print $7}' /var/log/nginx/access.log | \
    sort | uniq -c | sort -rn | head -10
# HTTP status code distribution
awk '{print $9}' /var/log/nginx/access.log | \
    sort | uniq -c | sort -rn
# Requests per second (approximate): the five busiest seconds in the log
awk '{print $4}' /var/log/nginx/access.log | \
    tr -d '[' | cut -c1-20 | \
    sort | uniq -c | sort -rn | head -5
# Average response size in bytes
awk '{sum+=$10; count++} END{print "avg:", sum/count, "bytes"}' /var/log/nginx/access.log
# Slow requests (response time > 1 second, if $request_time is logged as the last field)
awk '$NF > 1.0 {print $NF, $7}' /var/log/nginx/access.log | \
    sort -rn | head -10
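During triage the exact status code often matters less than its class. A minimal sketch that groups the ninth field of the combined log format into 2xx, 4xx, 5xx buckets follows. The function name status_classes is hypothetical, and it assumes the default combined format in which field 9 is the status code.

```bash
# status_classes: read nginx combined-format lines on stdin and print
# one "CLASS COUNT" line per status class (2xx, 3xx, 4xx, 5xx), sorted.
# Lines whose ninth field is not a three-digit status are ignored.
status_classes() {
    awk '$9 ~ /^[1-5][0-9][0-9]$/ { n[substr($9, 1, 1) "xx"]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# Usage (path is the one used on this slide):
#   status_classes < /var/log/nginx/access.log
```

A sudden shift from 2xx toward 5xx between two runs is usually a faster signal than scanning the full per-code distribution.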
Slide 18 of 35
journalctl Analysis: Incident Reconstruction
Combine journalctl with standard text processing tools to reconstruct exactly what happened during an incident.
# Count errors per service in the last 24 hours
journalctl -p err --since "24 hours ago" -o json | \
    jq -r '.SYSLOG_IDENTIFIER' | \
    sort | uniq -c | sort -rn | head -10
# Error rate histogram by hour (today); todate prints UTC
journalctl -p err --since today -o json | \
    jq -r '.__REALTIME_TIMESTAMP | tonumber / 1000000 | floor | todate | .[0:13]' | \
    sort | uniq -c
# Find the exact sequence of events during the incident window
journalctl --since "2026-04-09 02:44" \
    --until "2026-04-09 02:52" \
    -o short-iso-precise \
    --no-pager | less
# Show messages from multiple units in chronological order
# (journalctl already interleaves units by time; piping through sort
# would reorder the lines incorrectly)
journalctl -u nginx -u postgresql -u sector-worker \
    --since "2026-04-09 02:44" \
    --until "2026-04-09 02:52" \
    --no-pager
# Extract just the MESSAGE field for grepping
journalctl -u nginx --since today -o json | \
    jq -r '.MESSAGE' | \
    grep -i 'connect\|timeout\|refused'
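For a per-minute view of an incident window without jq, the ISO timestamps from -o short-iso can be cut directly: the first 16 characters of each line identify the minute. This is a sketch under that assumption; the function name per_minute is hypothetical.

```bash
# per_minute: read journalctl -o short-iso output on stdin and print
# "MINUTE COUNT" per minute, in the order the minutes appear.
# Marker lines such as "-- Boot ... --" are dropped by the grep.
per_minute() {
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}T' |
        cut -c1-16 |
        uniq -c |
        awk '{print $2, $1}'
}

# Usage (assumes journalctl is available on the host):
#   journalctl -p err --since "2026-04-09 02:40" --until "2026-04-09 03:00" \
#       -o short-iso --no-pager | per_minute
```

No sort is needed because journalctl emits entries in chronological order, so identical minutes are already adjacent for uniq -c.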
Slide 19 of 35
Real-Time Monitoring: Watching Multiple Streams
During an active incident, monitor multiple log streams simultaneously without losing context.
# multitail: watch multiple log files in split-screen
apt install multitail
multitail /var/log/syslog /var/log/nginx/error.log /var/log/auth.log
# Watch three journalctl streams side by side with tmux
# Pane 1: journalctl -u nginx -f -o short-iso
# Pane 2: journalctl -u postgresql -f -o short-iso
# Pane 3: journalctl -p err -f
# Alert on keywords in live log stream
# (the kernel logs the OOM killer as "oom-killer" and "Out of memory")
journalctl -f | while IFS= read -r line; do
    if echo "$line" | grep -qiE 'CRITICAL|segfault|oom[- ]killer|Out of memory|panic'; then
        echo "[ALERT] $line"
        echo "$line" | mail -s "[LOG ALERT] $(hostname)" ops@sector.local
    fi
done
# Running event count, reported every 100 events
count=0
journalctl -f -o json | while read -r line; do
    count=$(( count + 1 ))
    if (( count % 100 == 0 )); then echo "$count events"; fi
done
Slide 20 of 35
Centralized Logging: Loki and Grafana
Grafana Loki is a horizontally-scalable log aggregation system designed for modern infrastructure. It indexes metadata, not content.
[Diagram: hosts (promtail) -> Loki (label index) -> Grafana (LogQL queries, alerts, dashboard)]
Architecture
promtail is the log shipper agent (similar to Filebeat). It tails local logs and the journal, adds labels, and pushes to Loki. Loki stores log streams indexed by label. Grafana provides the query UI. The stack is lightweight -- Loki does not index message content.
Labels vs Full-Text Index
Loki indexes only labels (metadata: hostname, service, environment). Full-text search is done by streaming content at query time. This makes Loki much cheaper to run than Elasticsearch for log data. But it makes ad-hoc text searches slower.
LogQL
Loki's query language. {job="nginx"} |= "error" selects nginx logs containing "error". Supports regex, JSON parsing, metric extraction, and rate calculations. Integrated with Grafana alerting for threshold-based log alerts.
# promtail config: tail systemd journal and nginx logs
scrape_configs:
  - job_name: systemd
    journal:
      max_age: 12h
      labels:
        job: systemd
        host: sector-node-01
  - job_name: nginx
    static_configs:
      - targets: [localhost]
        labels:
          job: nginx
          __path__: /var/log/nginx/*.log
Slide 21 of 35
ELK Stack: Elasticsearch, Logstash, Kibana
The ELK stack is the traditional enterprise log platform. Full-text indexed, powerful query language, mature ecosystem.
Elasticsearch
Full-text search engine that indexes every word in every log line. Supports complex queries: range filters, boolean logic, fuzzy matching, aggregations. High storage cost due to full indexing. Best for environments where ad-hoc investigation needs are high.
Logstash / Filebeat
Logstash is a processing pipeline: input, filter (grok, mutate, date), output. Heavyweight. Filebeat is a lightweight shipper -- reads files and forwards to Elasticsearch or Logstash. Use Filebeat for standard log shipping; Logstash only when transformation is needed.
Kibana
Web UI for Elasticsearch. Discover view for ad-hoc search. Dashboard for visualization. Alerts for threshold-based notifications. Lens for custom charts. SIEM app for security analytics. The ELK stack is operationally expensive but feature-complete.
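# rsyslog forwarding to Elasticsearch via omelasticsearch
module(load="omelasticsearch")
# A date-based index name has to come from a template, referenced by
# name with dynSearchIndex="on"; a literal searchIndex is not expanded
template(name="DailyIndex" type="string"
         string="syslog-%$year%.%$month%.%$day%")
action(
    type="omelasticsearch"
    server="elasticsearch.sector.local"
    serverport="9200"
    searchIndex="DailyIndex"
    dynSearchIndex="on"
    template="JSONFormat"
)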
Slide 22 of 35
Log Retention: Compliance and Operational Requirements
How long you keep logs is not an operational preference -- it is determined by legal requirements and incident response needs.
Compliance Minimums
PCI-DSS: 1 year minimum, 3 months immediately available. HIPAA: 6 years. SOX: 7 years for financial records. GDPR: minimum necessary (but security logs can justify longer retention). NIST 800-53: agency-defined, typically 3 years for AU-11.
Operational Reality
Security incidents are often discovered months after they occur. A 90-day local retention minimum is defensible. Compress and archive to cold storage (S3 Glacier, tape) for long-term compliance retention. On-disk for fast queries; archive for compliance evidence.
#!/usr/bin/env bash
# log-archive.sh -- compress and archive logs older than 30 days
set -euo pipefail
ARCHIVE_DIR="/mnt/log-archive"
LOG_DIR="/var/log/remote"
CUTOFF_DAYS=30
find "$LOG_DIR" -name '*.log' -mtime +"$CUTOFF_DAYS" | \
while IFS= read -r logfile; do
    # Mirror the path below LOG_DIR under ARCHIVE_DIR
    DEST="${ARCHIVE_DIR}${logfile#"$LOG_DIR"}.gz"
    mkdir -p "$(dirname "$DEST")"
    gzip -9 -c "$logfile" > "$DEST"
    # Record a checksum so the archive can be verified later
    sha256sum "$DEST" >> "${ARCHIVE_DIR}/MANIFEST.sha256"
    # Reached only if gzip and sha256sum succeeded (set -e)
    rm "$logfile"
    logger -t log-archive "Archived: $logfile"
done
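The manifest written by the archive script is only useful if it is checked. A minimal sketch of that check uses sha256sum -c, which re-hashes every file listed in the manifest. The function name verify_manifest is hypothetical; the manifest path in the usage line is the one from the script above.

```bash
# verify_manifest MANIFEST -> exit status 0 if every listed file still
# matches its recorded SHA-256 hash, non-zero on any mismatch or missing file.
verify_manifest() {
    if sha256sum -c "$1" >/dev/null 2>&1; then
        echo "manifest OK: $1"
    else
        echo "manifest FAILED: rerun 'sha256sum -c $1' to list mismatches" >&2
        return 1
    fi
}

# Usage:
#   verify_manifest /mnt/log-archive/MANIFEST.sha256
```

Running this on a schedule, and before handing archives to an auditor, turns the manifest into evidence that the cold-tier files were not altered after archiving.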
Slide 23 of 35
Parsing Logs: grok and awk Patterns
Unstructured logs require parsing before they can be queried as data. These patterns cover the most common log formats.
# Parse nginx combined log format into fields with awk
# Format: $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes
awk 'BEGIN{FS=" "} {
    ip=$1; ts=$4; request=$7; status=$9; size=$10;
    gsub(/\[/, "", ts);
    printf "IP=%s STATUS=%s URL=%s SIZE=%s\n", ip, status, request, size
}' /var/log/nginx/access.log | head -5
# Parse syslog format: extract program, PID, and message
awk '{
    month=$1; day=$2; time=$3; host=$4;
    prog_pid=$5; sub(/:$/, "", prog_pid);
    split(prog_pid, a, "[");
    prog=a[1]; pid=a[2]; gsub(/\]/, "", pid);
    msg=""; for(i=6;i<=NF;i++) msg=msg" "$i;
    printf "PROG=%s PID=%s MSG=%s\n", prog, pid, msg
}' /var/log/syslog | head -3
# logstash grok pattern for nginx combined format
# %{IPORHOST:client_ip} - %{USER:ident} \[%{HTTPDATE:timestamp}\]
# "%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}"
# %{NUMBER:response_code} %{NUMBER:bytes}
Slide 24 of 35
Security Analysis: Patterns to Look For
Specific log patterns correlate with specific attack types. Know them on sight.
# SSH brute force: failed password attempts
grep 'Failed password' /var/log/auth.log | \
    awk '{print $(NF-3)}' | \
    sort | uniq -c | sort -rn | \
    awk '$1 > 10'    # IPs with >10 failures = brute force
# Successful logins after failures (potential successful breach)
SUSPICIOUS_IP="203.0.113.42"
grep "$SUSPICIOUS_IP" /var/log/auth.log | \
    grep -E 'Failed|Accepted'
# sudo privilege escalation by non-standard users
grep 'sudo:' /var/log/auth.log | \
    grep -v 'deploy\|ansible\|root'    # filter known-good accounts
# New user accounts created (potential backdoor accounts)
grep 'useradd\|adduser' /var/log/auth.log
# Large data transfers in nginx logs (potential exfiltration)
awk '$10 > 10000000' /var/log/nginx/access.log | \
    awk '{printf "%s MB from %s to %s\n", $10/1048576, $1, $7}'
# Check for log gaps (sign of tampering)
journalctl --list-boots | grep -v '^No cached'
journalctl -o json --since today | jq -r '.__REALTIME_TIMESTAMP' | \
    awk 'NR>1{diff=$1-prev; if(diff>120000000) print "GAP: "diff/1000000"s at "prev} {prev=$1}'
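The brute-force pattern above can be wrapped as a reusable filter. The following is a sketch, assuming the standard OpenSSH message layout "Failed password for USER from IP port N ssh2", in which the source IP is the fourth field from the end. The function name failed_by_ip and its threshold argument are hypothetical.

```bash
# failed_by_ip [MIN]: read auth.log lines on stdin and print "COUNT IP"
# for every source IP with more than MIN failed password attempts
# (MIN defaults to 10), highest count first.
failed_by_ip() {
    grep 'Failed password' |
        awk '{print $(NF-3)}' |
        sort | uniq -c | sort -rn |
        awk -v min="${1:-10}" '$1 > min {print $1, $2}'
}

# Usage:
#   failed_by_ip 10 < /var/log/auth.log
```

The same field position holds for "Failed password for invalid user NAME from IP ..." lines, so attempts against nonexistent accounts are counted too.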
Slide 25 of 35
imfile: Ingesting Application Log Files
Applications that write to files instead of syslog need the imfile module to bring their output into the rsyslog pipeline.
# /etc/rsyslog.d/40-appfiles.conf
# Tail custom application log files and route to syslog pipeline
# (PollingInterval applies only when imfile runs in polling mode; rsyslog
# v8 tracks read positions itself, so per-input StateFile is not needed)
module(load="imfile" PollingInterval="10")
# Tail the sector application log
# Caution: do not also route local0.* back into this same file, or every
# line is re-read and re-written in a loop
input(type="imfile"
      File="/var/log/sector/app.log"
      Tag="sector-app"
      Severity="info"
      Facility="local0"
      PersistStateInterval="200")    # save position every 200 lines
# Tail nginx application error log separately from access log
input(type="imfile"
      File="/var/log/nginx/error.log"
      Tag="nginx-error"
      Severity="err"
      Facility="daemon")
# Wildcard: tail all .log files in a directory (wildcards in File are
# supported directly; no separate switch is required)
input(type="imfile"
      File="/var/log/apps/*.log"
      Tag="apps"
      Facility="local1")
Slide 26 of 35
journald to rsyslog: Bridging the Two Systems
Configure both logging systems to work together: journald captures everything locally, rsyslog forwards to central servers.
[Diagram: two paths from the journald binary store to rsyslog and on to the SIEM: ForwardToSyslog=yes through the syslog socket, or the imjournal module reading the journal directly]
Option 1: journald Forwards to Syslog
Set ForwardToSyslog=yes in journald.conf. journald forwards copies of all entries to the syslog socket. rsyslog receives them as normal syslog messages. Simple to configure but duplicates all entries.
Option 2: rsyslog Reads Journal Directly
Load the imjournal module in rsyslog. rsyslog reads directly from the journal binary. More efficient than the syslog socket path. Preserves journal-native fields like _SYSTEMD_UNIT. Disable ForwardToSyslog to avoid double-processing.
# Recommended configuration: imjournal + no syslog forwarding
# /etc/systemd/journald.conf
[Journal]
ForwardToSyslog=no     # rsyslog reads directly, no need to forward
Storage=persistent
# /etc/rsyslog.conf
# Use imjournal instead of imuxsock
module(load="imjournal"
       StateFile="/var/lib/rsyslog/imjournal.state"
       Ratelimit.Interval="600"
       Ratelimit.Burst="20000"
       IgnorePreviousMessages="off"
       UsePid="system")
# Now rsyslog receives all journal entries natively
# and can forward them with the full rsyslog rule set
Slide 27 of 35
Anomaly Detection: Volume-Based Alerting
A sudden spike in error rates is often the first visible symptom of an incident, before users start reporting problems.
#!/usr/bin/env bash
# log-rate-monitor.sh -- alert if error rate spikes above threshold
set -euo pipefail
THRESHOLD=50    # errors per minute threshold
WINDOW=1        # minutes to look back
RECENT_ERRORS="$(journalctl -p err --since "${WINDOW} minute ago" -q \
    --no-pager 2>/dev/null | wc -l)"
if (( RECENT_ERRORS > THRESHOLD )); then
    MSG="ERROR SPIKE: $RECENT_ERRORS errors in last ${WINDOW}min on $(hostname)"
    logger -t log-monitor -p local0.crit "$MSG"
    echo "$MSG" | mail -s "[LOG SPIKE] $(hostname)" ops@sector.local
    # Capture the top error sources for the alert
    journalctl -p err --since "${WINDOW} minute ago" -o json --no-pager | \
        jq -r '.SYSLOG_IDENTIFIER' | \
        sort | uniq -c | sort -rn | head -5 | \
        mail -s "[LOG SPIKE] Top error sources" ops@sector.local
fi
# Run every minute (entry for /etc/cron.d/, which includes the user field):
# */1 * * * * root flock -n /var/lock/log-rate.lock /usr/local/bin/log-rate-monitor.sh
Slide 28 of 35
logwatch: Automated Daily Log Digest
logwatch summarizes yesterday's logs across all services into a readable daily digest email.
# Install logwatch
apt install logwatch
# Generate a report to stdout (test before scheduling)
logwatch --output stdout --format text --detail high
# Generate report for a specific service
logwatch --service sshd --output stdout
# Email an HTML report
logwatch --output mail --format html \
    --mailto ops@sector.local \
    --detail high
# /etc/logwatch/conf/logwatch.conf (key settings)
# Output = mail
# Format = html
# MailTo = ops@sector.local
# MailFrom = logwatch@$(hostname)
# Range = yesterday
# Detail = high
# Service = All
# Mailer = /usr/sbin/sendmail -t
# logwatch runs automatically via cron: /etc/cron.daily/00logwatch
# Customize what it reports: /etc/logwatch/conf/services/
Slide 29 of 35
Log Tampering: Detecting and Preventing
An attacker who can delete logs can hide their activity. Detect tampering and make it harder to accomplish silently.
# Detect log file truncation (mtime newer than expected for size)
stat /var/log/auth.log | grep -E 'Size|Modify'
# Check for gaps in journal (unexpected restarts of journald)
journalctl --list-boots
# Multiple unexpected short-duration boots = potential tampering or crash
# auditd: watch auth.log for write/truncate (catches log clearing)
# Add to audit rules:
#   -w /var/log/auth.log -p wa -k log-tampering
#   -w /var/log/syslog -p wa -k log-tampering
#   -w /var/log/audit/ -p wa -k log-tampering
# append-only flag: log files can be written but not truncated or deleted
chattr +a /var/log/auth.log
chattr +a /var/log/syslog
# NOTE: logrotate cannot rotate +a files -- clear the flag (chattr -a) in
# prerotate and re-add it (chattr +a) in postrotate
# Forward to remote server immediately (golden rule)
# Once an event is on the remote server, local deletion cannot erase it
# This is the most effective anti-tampering control
# Verify log continuity: check last entry timestamp matches expected rate
journalctl --since today -o json | \
    jq -r '.__REALTIME_TIMESTAMP' | tail -1
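The continuity check can go one step further by scanning for silent periods between consecutive entries. The sketch below reads one __REALTIME_TIMESTAMP value (microseconds since the epoch) per line and reports gaps longer than a threshold. The function name find_gaps and the 120-second default are assumptions; a quiet host can have legitimate gaps, so tune the threshold to the host's normal log rate.

```bash
# find_gaps [MAX_SECONDS]: read microsecond timestamps on stdin, one per
# line in chronological order, and report every gap longer than
# MAX_SECONDS (default 120) between consecutive entries.
find_gaps() {
    awk -v max="${1:-120}" '
        NR > 1 {
            diff = ($1 - prev) / 1000000          # microseconds -> seconds
            if (diff > max) printf "GAP %ds after %s\n", diff, prev
        }
        { prev = $1 }'
}

# Usage (assumes journalctl and jq are available):
#   journalctl -o json --since today | jq -r '.__REALTIME_TIMESTAMP' | find_gaps 120
```

A reported gap is a prompt to compare against the central log server's copy for the same window, not proof of tampering on its own.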
Slide 30 of 35
Log Correlation: Connecting Events Across Services
Incidents rarely manifest in a single service. Correlating logs across services reveals the causal chain.
#!/usr/bin/env bash
# correlate.sh -- combine logs from multiple services around a time window
set -euo pipefail
# No inner quotes in the defaults: they would become part of the value
SINCE="${1:-30 minutes ago}"
UNTIL="${2:-now}"
echo "=== CORRELATION REPORT: $SINCE to $UNTIL ==="
# All errors in the window, chronological, all services
journalctl -p err --since "$SINCE" --until "$UNTIL" \
    -o short-iso-precise --no-pager 2>/dev/null
echo "=== AUTH EVENTS ==="
# The awk filter program is elided ('...') in the source slide
grep -h '' /var/log/auth.log | \
    awk -v since="$SINCE" '...' 2>/dev/null || true
echo "=== NGINX ERRORS ==="
awk '$9 >= 500' /var/log/nginx/access.log 2>/dev/null || true
echo "=== NETWORK CONNECTIONS ==="
ss -tp state established
echo "=== DISK ==="
df -h | awk 'int($5) > 80'
echo "=== LOAD ==="
uptime
Slide 31 of 35
Log Lifecycle: From Generation to Archive
A complete log lifecycle design ensures logs are available when needed, do not fill disks, and meet retention requirements.
Hot Tier (0-30 days)
Uncompressed on local disk in /var/log/. journald binary journal. Full-text searchable with journalctl and grep. logrotate manages rotation and compression. Maximum query speed for recent incidents.
Warm Tier (30-90 days)
Compressed files on a central log server. Queryable but requires decompression (zcat | grep). rsync from hot tier nightly. Still on spinning disk or fast NAS. Used for post-incident investigations that cross the 30-day local retention window.
Cold Tier (90 days - retention limit)
Compressed archives on object storage (S3, Backblaze, tape). Retrieval time measured in minutes to hours. Integrity-protected with SHA-256 manifest. Primarily for compliance evidence, not operational use.
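A rough sketch of the SHA-256 manifest mentioned for the cold tier, assuming GNU coreutils sha256sum and an archive staged in a local directory before upload. The function names and the MANIFEST.sha256 file name are illustrative choices, not a standard.

```shell
# write_manifest DIR -- hypothetical helper: record a SHA-256 hash for every
# file under DIR in DIR/MANIFEST.sha256 (the file name is an assumption).
write_manifest() {
  (
    cd "$1" &&
    find . -type f ! -name MANIFEST.sha256 -exec sha256sum {} + |
      sort -k 2 > MANIFEST.sha256
  )
}

# verify_manifest DIR -- return non-zero if any listed file is missing or altered.
verify_manifest() {
  ( cd "$1" && sha256sum --check --quiet MANIFEST.sha256 )
}
```

The manifest is only useful as evidence if it is stored, or at least its own hash is recorded, somewhere an attacker with access to the archive cannot rewrite it.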
Design Principle
Design the lifecycle before the incident happens. When an auditor asks for all authentication events from 90 days ago and you discover you only kept 7 days of logs, it is too late. Archive decisions must be made at deployment time, not during an audit.
Slide 32 of 35
Troubleshooting rsyslog: When Logs Stop Arriving
A systematic diagnostic process for rsyslog forwarding failures.
1. Check rsyslog is running: systemctl status rsyslog. If it is stopped, logs are not being processed. Check journalctl -u rsyslog for the crash reason.
2. Syntax check: rsyslogd -N1. A config error will prevent rsyslog from starting. This command validates the configuration without running the daemon.
3. Test the network path: nc -zv logs.sector.local 514. If the connection fails, check firewall rules on both ends. Logs will queue in the disk buffer but not forward.
4. Check the queue files in rsyslog's work directory: ls -lh /var/spool/rsyslog/ (the $WorkDirectory in Ubuntu's stock config; adjust if your config sets another path such as /var/lib/rsyslog). Queue files that keep growing mean the remote is down or rejecting messages.
5. Enable debug logging: temporarily set $DebugFile /var/log/rsyslog-debug.log and $DebugLevel 2 in rsyslog.conf, then restart rsyslog. Verbose output goes to the named file. Remove both lines after diagnosis.
6. Check disk space: df -h /var/log. rsyslog cannot write once the filesystem is full. A failed logrotate run is a common cause.
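The non-invasive checks above (steps 1, 2, 3 and 6) can be strung together in one pass. This is a sketch rather than a finished tool: the function names, the 95 percent disk threshold and the 5-second connect timeout are assumptions, and it needs the privileges to read the rsyslog config.

```shell
# run_check LABEL CMD... -- hypothetical helper: run one diagnostic quietly
# and print PASS or FAIL next to its label; returns 1 on failure.
run_check() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    return 1
  fi
}

# log_fs_below PCT -- succeed if /var/log's filesystem is under PCT percent used.
log_fs_below() {
  local used
  used="$(df -P /var/log | awk 'NR == 2 { print int($5) }')"
  [ "$used" -lt "$1" ]
}

# diagnose_rsyslog HOST PORT -- run the checks in order and stop at the first
# failure, so the earliest broken layer is the one reported.
diagnose_rsyslog() {
  run_check "rsyslog service active" systemctl is-active --quiet rsyslog &&
  run_check "config syntax valid"    rsyslogd -N1 &&
  run_check "remote $1:$2 reachable" nc -z -w 5 "$1" "$2" &&
  run_check "/var/log below 95%"     log_fs_below 95
}
```

Possible usage: diagnose_rsyslog logs.sector.local 514. The queue inspection and debug logging steps are left manual because they need interpretation or a config change.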
Slide 33 of 35  |  Applied Configuration
Applied: Complete Node Logging Setup
The full configuration stack for a production node: journald persistent, rsyslog with forwarding, and logrotate.
# 1. journald: persistent, 30-day retention
# /etc/systemd/journald.conf
[Journal]
Storage=persistent
SystemMaxUse=2G
MaxRetentionSec=30day
# rsyslog reads the journal directly via imjournal
# (comments must sit on their own line; systemd does not accept trailing comments)
ForwardToSyslog=no
Compress=yes

# 2. rsyslog: local routing + forwarding
# /etc/rsyslog.conf (summary)
module(load="imjournal" StateFile="/var/lib/rsyslog/imjournal.state")

auth,authpriv.*   /var/log/auth.log
kern.*            /var/log/kern.log
local0.*          /var/log/sector/app.log
*.warning         /var/log/syslog

# Forward everything to central with disk queue
action(type="omfwd" target="logs.sector.local" port="514" protocol="tcp"
       queue.type="Disk" queue.filename="fwd-q"
       queue.saveOnShutdown="on")

# 3. logrotate: /etc/logrotate.d/sector
# /var/log/syslog /var/log/auth.log /var/log/sector/app.log {
#     daily
#     rotate 14
#     compress
#     delaycompress
#     missingok
#     notifempty
#     postrotate
#         systemctl restart rsyslog
#     endscript
# }
Slide 34 of 35  |  Applied Automation
Applied: Log-Driven Alert Pipeline
A complete alerting pipeline from log event to ops notification without a SIEM -- using only rsyslog and bash.
# /etc/rsyslog.d/70-alerts.conf
# Route critical events to an alert script via omprog
module(load="omprog")

# Send critical-and-above messages to the alerting script
# (the JSONFormat template must be defined elsewhere in the rsyslog config)
if prifilt("*.crit") then {
    action(type="omprog"
           binary="/usr/local/bin/log-alert.sh"
           template="JSONFormat"
           output="/var/log/alert-pipe.log")
}
#!/usr/bin/env bash
# /usr/local/bin/log-alert.sh -- receive JSON log events from rsyslog omprog
# stdin receives one JSON object per line
set -euo pipefail

# Fail fast with a clear message if the webhook is not configured
: "${WEBHOOK_URL:?WEBHOOK_URL must be set}"

while IFS= read -r event; do
  SEV="$(echo "$event" | jq -r '.sev' 2>/dev/null || echo 'unknown')"
  MSG="$(echo "$event" | jq -r '.msg' 2>/dev/null || echo 'unknown')"
  HOST="$(echo "$event" | jq -r '.host' 2>/dev/null || echo 'unknown')"
  PROG="$(echo "$event" | jq -r '.prog' 2>/dev/null || echo 'unknown')"

  # Build the payload with jq so quotes and backslashes in MSG are escaped
  PAYLOAD="$(jq -cn --arg text "[$SEV] $HOST/$PROG: $MSG" '{text: $text}')"

  # Webhook alert (Slack/PagerDuty); a failed POST must not end the loop
  curl -sf -X POST "${WEBHOOK_URL}" \
    -H 'Content-Type: application/json' \
    -d "$PAYLOAD" >/dev/null || echo "alert delivery failed: $HOST/$PROG" >&2
done
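The script above posts one webhook per event, so a burst of identical critical messages becomes a burst of notifications. One possible mitigation is a small suppression filter, sketched here under stated assumptions: events have already been reduced to lines of the form "epoch-seconds key text", the key (for example host/prog) is chosen by you, and the function name and 300-second default window are hypothetical.

```shell
# suppress_repeats -- hypothetical pre-filter for an alert pipeline.
# stdin:  lines of the form "<epoch-seconds> <key> <free text...>"
# stdout: the same lines, minus repeats of a key that last passed through
#         less than WINDOW seconds earlier.
# $1:     WINDOW in seconds (assumed default: 300).
suppress_repeats() {
  awk -v window="${1:-300}" '
    !($2 in last) || $1 - last[$2] >= window {
      last[$2] = $1      # remember when this key last produced an alert
      print
      fflush()           # emit immediately; the input may be a long-lived pipe
    }
  '
}
```

The window is measured from the last alert that was let through, not from the last suppressed repeat, so a continuous storm still produces one alert per window. Some awk builds buffer piped input, which can delay delivery on a slow stream; that is worth testing before relying on this in front of a pager.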
Slide 35 of 35  |  ALA-09 Summary
Log Management: What You Now Know
You can now reconstruct any incident on your infrastructure. You know how to query journald for the incident window, forward logs reliably to central servers, parse and correlate events across services, and build alerting pipelines that catch problems before users do.
1. journald stores structured binary logs. rsyslog handles routing and forwarding. They are complementary -- use both: journald for local investigation, rsyslog for centralized collection.
2. journalctl --since "2026-04-09 02:40" --until "2026-04-09 03:00" -p err is the first command to run during any incident investigation.
3. Enable persistent journald storage: set Storage=persistent in journald.conf (or run mkdir -p /var/log/journal under the default Storage=auto), then restart systemd-journald. Without this, logs are lost on reboot and you lose your post-incident evidence.
4. rsyslog forwarding over TCP with a disk queue is the production standard. Without a disk queue, messages are lost when the remote server is down.
5. Use facilities local0 through local7 for your applications. Route them to dedicated files. Never write everything to /var/log/syslog -- it becomes unsearchable.
6. Forward logs off the source node in real time. A local-only log that an attacker can delete is not an audit trail -- it is a suggestion.
7. Log format matters. JSON output from applications enables automated parsing and SIEM ingestion. Unstructured text logs require fragile regex extraction.
8. Design retention tiers before an incident. Hot (local, 30 days), warm (central, 90 days), cold (archive, compliance period). The decision cannot be made retroactively.
9. An unmonitored log is the same as no log. Automated daily review with logwatch plus threshold-based alerting ensures logs are actually used, not just collected.