ALA-R5: Signal Processing

Operational Briefing

Mission Context:

"Raw output from system commands is almost never what you want. It is the raw signal. Pipelines transform that signal into actionable intelligence. The five tools in this module are the core of every log analysis, config audit, and forensic extraction you will perform in this course."

The Pipeline Architecture

A pipeline connects the stdout of one command to the stdin of the next. Each command in the pipeline runs as a separate subprocess. The shell wires them together with kernel pipes, not temporary files.

    # Filter process list to find sshd without showing grep itself
    ps aux | grep sshd | grep -v grep

    # Count error-level entries in syslog
    cat /var/log/syslog | grep 'error' | wc -l

    # Pipe both stdout and stderr together (bash syntax)
    command |& grep 'WARN'

    # Fail the pipeline if any command in it fails
    set -o pipefail

By default, a pipeline's exit code is the exit code of the last command only. With set -o pipefail active, the pipeline instead returns the exit code of the last (rightmost) command that failed, or zero if every command succeeded. Set this in any script that uses pipelines for critical operations. Without it, a failure mid-pipeline is silently swallowed.
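The difference is easy to observe directly. The following is a minimal sketch, assuming bash: false and cat stand in for a failing producer and a healthy consumer, and the variable names are illustrative only. PIPESTATUS is a bash-specific array that holds the exit status of every stage of the most recent pipeline.

```shell
#!/usr/bin/env bash
# The `|| var=$?` form records a non-zero status without aborting
# a script that runs under `set -e`.

# Default behaviour: only the last command (cat) decides the status.
status_default=0
false | cat || status_default=$?

# With pipefail, the failing stage is reported instead of being hidden.
set -o pipefail
status_pipefail=0
false | cat || status_pipefail=$?
set +o pipefail

# PIPESTATUS records each stage's status. Copy it immediately,
# because the next command overwrites it.
false | true
stage_statuses="${PIPESTATUS[*]}"

echo "default=$status_default pipefail=$status_pipefail stages=$stage_statuses"
# prints: default=0 pipefail=1 stages=1 0
```

In a real script, set -o pipefail normally goes once near the top rather than being toggled on and off as it is here for comparison.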

Operational Context:

Pipelines transform raw output from system commands into actionable intelligence. This is the fundamental pattern of all log analysis work.

grep: Pattern Extraction

grep is a forensic instrument as much as a search tool. In log analysis, incident response, and config auditing, it is almost always the first step in a pipeline.

    # Case-insensitive search in auth log
    grep -i 'failed' /var/log/auth.log

    # Recursive search with permission error suppression
    grep -r 'password' /etc/ 2>/dev/null

    # Exclude comment lines (lines starting with #)
    grep -v '^#' /etc/ssh/sshd_config

    # Extended regex: match IPv4 addresses in a log file
    grep -E '([0-9]{1,3}\.){3}[0-9]{1,3}' access.log

    # Count matching lines
    grep -c 'FAILED' /var/log/auth.log

    # Show only filenames that contain a match
    grep -rl 'PasswordAuthentication yes' /etc/

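Key flags: -i (case-insensitive), -r (recursive), -v (invert match), -n (line numbers), -c (count), -E (extended regex), -o (print only matching text, not full line), -l (filenames only). -E switches to extended regular expressions, so the grouping, alternation, and repetition operators ( ), |, +, ?, and { } work without backslashes. To match one of those characters literally, or a literal dot, you still escape it, as with \. in the IPv4 pattern.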
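To see how the flags combine, here is a small sketch that runs against a throwaway file instead of a real auth log. The three log lines are invented sample data using documentation-range IP addresses, and the variable names are illustrative.

```shell
# Build a tiny sample log (hypothetical content, not real auth.log output).
log=$(mktemp)
printf '%s\n' \
  'Failed password for root from 203.0.113.7 port 4022 ssh2' \
  'Accepted password for alice from 198.51.100.4 port 5100 ssh2' \
  'failed password for bob from 203.0.113.7 port 4023 ssh2' > "$log"

# -c counts matching lines; -i makes "Failed" and "failed" both match.
failed_count=$(grep -ci 'failed' "$log")

# -o prints only the matched text, so each IP lands on its own line;
# sort -u then reduces the list to unique addresses.
unique_ips=$(grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$log" | sort -u)

echo "failed lines: $failed_count"
echo "$unique_ips"

rm -f "$log"
```

The -o flag is what makes grep usable as an extractor rather than only a filter: downstream tools receive just the addresses, not the surrounding log text.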

Operational Context:

Searching auth logs for failed login patterns, scanning configs for insecure settings: grep is a forensic instrument.

sed: Stream Editing

sed processes text line by line and applies transformations: substitute, print a range, delete, extract time windows. It is the standard tool for programmatic config file edits and log slicing.

    # Print only lines 100-200 of a log file
    sed -n '100,200p' /var/log/syslog

    # Substitute a config value (dry run to stdout)
    sed 's/PasswordAuthentication yes/PasswordAuthentication no/' sshd_config

    # In-place edit with a backup (the .bak file is your safety net)
    sed -i.bak '/^#/d' config.conf

    # Print lines between two timestamp patterns
    sed -n '/2026-04-11 03:47/,/2026-04-11 03:58/p' auth.log

Always use -i.bak (not bare -i) when editing files in place. With a suffix, sed saves the original as file.conf.bak before writing the edited file, giving you a one-command recovery path: mv file.conf.bak file.conf. The s/old/new/g syntax substitutes every occurrence on each line; without the trailing g, sed replaces only the first match per line.
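The backup-and-restore cycle can be rehearsed safely in a scratch directory. This is a sketch only: demo.conf and its contents are made up for the exercise, and the suffix is attached directly to -i (as -i.bak) because that form is accepted by both GNU and BSD sed.

```shell
workdir=$(mktemp -d)
conf="$workdir/demo.conf"
printf '%s\n' '# demo file' 'PasswordAuthentication yes' > "$conf"

# In-place substitution; sed keeps the original as demo.conf.bak.
sed -i.bak 's/PasswordAuthentication yes/PasswordAuthentication no/' "$conf"
edited=$(grep 'PasswordAuthentication' "$conf")
backup=$(grep 'PasswordAuthentication' "$conf.bak")

# One-command recovery from the backup.
mv "$conf.bak" "$conf"
restored=$(grep 'PasswordAuthentication' "$conf")

# Effect of the trailing g flag on a line with several matches.
first_only=$(echo 'a-a-a' | sed 's/a/b/')
every_match=$(echo 'a-a-a' | sed 's/a/b/g')

echo "edited:      $edited"
echo "backup:      $backup"
echo "restored:    $restored"
echo "without g:   $first_only"
echo "with g:      $every_match"

rm -rf "$workdir"
```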

Operational Context:

When hardening a cell, you will edit config files programmatically. sed -i with a backup extension is the safe way to do it.

awk: Structured Data Processing

awk treats text as rows and columns. System logs, /etc/passwd, and command output all have consistent field structures that awk can process directly without intermediate parsing.

    # Print first field and last field of each line
    awk '{print $1, $NF}' /var/log/access.log

    # Use : as field separator; print usernames with UID >= 1000
    awk -F: '$3 >= 1000 {print $1}' /etc/passwd

    # Count FAILED lines, print total at the end
    # (count+0 prints 0 instead of a blank line when nothing matches)
    awk '/FAILED/ {count++} END {print count+0}' auth.log

    # Print header, then process lines, then print footer
    awk 'BEGIN {print "=== Report ==="} /Accepted/ {print $1, $9} END {print "=== Done ==="}' auth.log

Built-in variables: $0 (entire line), $1..$N (fields), NF (number of fields), NR (current record number), FS (field separator, set with -F). The BEGIN block runs before any input; END runs after all input. A pattern or condition placed in front of an action block selects which lines trigger that action; a block with no pattern runs on every line.
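The pieces fit together as follows. This sketch uses three invented passwd-style records in a temporary file so it does not depend on the accounts present on your system; the report layout is illustrative.

```shell
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
  'alice:x:1000:1000:Alice:/home/alice:/bin/bash' > "$sample"

# BEGIN runs once before input, the middle rule runs only on lines whose
# UID field ($3) is 0 or at least 1000, and END runs once after the last
# line, when NR holds the total number of records read.
report=$(awk -F: '
  BEGIN { print "user shell" }
  $3 == 0 || $3 >= 1000 { print $1, $NF }
  END { print NR " records" }
' "$sample")

echo "$report"
# prints:
# user shell
# root /bin/bash
# alice /bin/bash
# 3 records

rm -f "$sample"
```

Because the condition sits outside the braces, the daemon line is still read and counted in NR; it just does not trigger the print action.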

Operational Context:

System logs have consistent field structures. awk processes them as rows and columns, enabling extraction from even enormous log files.

tee, xargs, and Process Substitution

These three tools cover the cases where simple pipelines fall short: when you need to split output, pass filenames instead of content, or compare two command outputs side by side.

    # Write to file and stdout simultaneously (for live view + saved record)
    command 2>&1 | tee /tmp/diagnostic.log

    # Find log files containing 'error' - xargs passes filenames, not content
    # (-print0 and -0 keep filenames with spaces or newlines intact)
    find /var/log -name "*.log" -print0 | xargs -0 grep -l 'error'

    # Back up every .conf file in the current directory
    find . -maxdepth 1 -name '*.conf' -print0 | xargs -0 -I{} cp {} {}.bak

    # Process substitution: compare sorted versions of two files
    diff <(sort file1.txt) <(sort file2.txt)

tee duplicates a stream: it writes to a file and also passes the stream downstream. Use it whenever you want both a saved log and live terminal output. xargs -I{} lets you specify exactly where the input goes in the command template. Process substitution <(command) is a bash feature that expands to a filename (typically a /dev/fd path) connected to the command's output, allowing tools that expect file arguments (like diff) to read command output directly.
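All three behaviours can be checked in a scratch directory. The sketch below assumes bash (process substitution is not available in plain sh) and a find/xargs pair that supports -print0 and -0, which GNU and BSD versions do. File names and contents are invented for the exercise.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
printf '%s\n' b a c > "$dir/one.txt"
printf '%s\n' c b a > "$dir/two.txt"

# tee: saved.txt receives the stream unchanged, while the same stream
# continues down the pipeline to sort.
sorted=$(tee "$dir/saved.txt" < "$dir/one.txt" | sort | tr -d '\n')
saved=$(tr -d '\n' < "$dir/saved.txt")

# Process substitution: diff sees two file-like arguments whose contents
# are the sorted versions of each file.
if diff <(sort "$dir/one.txt") <(sort "$dir/two.txt") > /dev/null; then
  same_after_sort=yes
else
  same_after_sort=no
fi

# xargs: grep receives the filenames as arguments and opens each file,
# instead of searching the list of names arriving on stdin.
files_with_a=$(find "$dir" -name '*.txt' -print0 | xargs -0 grep -l '^a$' | wc -l | tr -d ' ')

echo "sorted=$sorted saved=$saved same=$same_after_sort matches=$files_with_a"
# prints: sorted=abc saved=bac same=yes matches=3

rm -rf "$dir"
```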

Operational Context:

When you need to pass filenames instead of content, use xargs. When you need to compare two command outputs, use process substitution.

Redirection: Complete Reference

Redirection controls where stdin, stdout, and stderr go. Mastering it fully is prerequisite to writing clean bash scripts in Week 3. Errors that disappear into /dev/null during an incident cannot be debugged.

Operator      Meaning
> file        Redirect stdout, overwrite file
>> file       Redirect stdout, append to file
< file        Read stdin from file
2> file       Redirect stderr only
2>&1          Merge stderr into stdout stream
&> file       Redirect both stdout and stderr to file (bash shorthand)
> /dev/null   Discard stdout entirely

    # Here document: feed multi-line text as stdin to a command
    cat << 'EOF'
    line one
    line two
    EOF

    # Redirect both streams to a file, silent operation
    ./script.sh &> /var/log/script.log

    # Discard stderr noise while keeping stdout
    find / -name "target" 2>/dev/null

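Order matters. 2>&1 >file and >file 2>&1 are not equivalent, because the shell processes redirections left to right. In 2>&1 >file, stderr is first pointed at wherever stdout currently goes (usually the terminal), and only then is stdout moved to the file, so errors never reach the file. To send both streams to a file, the correct order is >file 2>&1, or use the bash shorthand &>file.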
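The ordering rule can be confirmed with a function that writes one line to each stream. This is a sketch: emit and the log file names are hypothetical, and command substitution plays the role of the terminal by capturing whatever still reaches the original stdout.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

dir=$(mktemp -d)

# Wrong order: stderr is duplicated onto the current stdout first,
# then stdout alone is moved to the file. The error line escapes.
leaked=$(emit 2>&1 > "$dir/wrong.log")
wrong_log=$(cat "$dir/wrong.log")

# Right order: stdout goes to the file, then stderr follows it there.
kept=$(emit > "$dir/right.log" 2>&1)
right_lines=$(wc -l < "$dir/right.log" | tr -d ' ')

echo "leaked past the file: '$leaked'"      # 'to stderr'
echo "wrong.log contains:   '$wrong_log'"   # 'to stdout'
echo "escaped right order:  '$kept'"        # ''
echo "lines in right.log:   $right_lines"   # 2

rm -rf "$dir"
```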

Operational Context:

Understanding redirection completely is prerequisite to writing clean bash scripts in Week 3. Errors that disappear into /dev/null cannot be debugged.

Self-Check

  1. Write a single pipeline that counts the number of failed SSH login attempts in /var/log/auth.log using only grep and wc.
  2. What does set -o pipefail do, and why does it matter in a script that pipes output through several filters?
  3. Write an awk command that reads /etc/passwd and prints only the usernames of accounts with UID 0 (root-level UIDs).
  4. Write the one-line pipeline that finds the top 5 IP addresses in /var/log/nginx/access.log by request count, using only awk, sort, uniq, and head.

If you are comfortable with all of these, you have completed the Week 0 refresher sequence. Week 1 begins with ALA-01.