CH03 — PRESENTATION

Grep, Pipes & Text Processing

The power of Linux lies in combining small tools through pipes. Master grep and its flags, understand I/O redirection operators, build multi-stage pipelines, and leverage awk and sed for text transformation — the skills that separate Linux beginners from administrators.

Slide 1 — grep: Global Regular Expression Print

Search for Patterns in Files and Streams

grep searches input for lines matching a pattern and prints matching lines to stdout. It reads files or standard input, making it the universal "find what I'm looking for" tool in Linux pipelines. Named after the g/re/p command in the ed text editor: global / regular expression / print.

The basic syntax: grep [OPTIONS] PATTERN [FILE...]. Without a file argument, grep reads from stdin, which is what lets it filter the output of any other command in a pipe.

# Basic grep usage
$ grep "error" /var/log/syslog        # Find "error" in syslog
$ grep "Failed" /var/log/auth.log     # Find failed login attempts
$ grep "root" /etc/passwd             # Find lines containing "root"

# grep reads from stdin when no file is given
$ cat /etc/passwd | grep "root"       # Pipe to grep
$ dmesg | grep "eth0"                 # Filter kernel messages for network
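To see the two input modes side by side, here is a minimal sketch that runs the same search against a file and against stdin. The file passwd.sample and its three entries are made-up stand-ins for /etc/passwd, and the exit-status check at the end is an addition that the slide does not cover.

```shell
# Hypothetical sample data in a temporary directory (not the real /etc/passwd)
workdir=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
  'alice:x:1000:1000:Alice:/home/alice:/bin/bash' > "$workdir/passwd.sample"

# 1) File argument: grep opens the file itself
from_file=$(grep "root" "$workdir/passwd.sample")

# 2) No file argument: grep reads whatever arrives on stdin
from_stdin=$(cat "$workdir/passwd.sample" | grep "root")

# 3) Exit status: 0 when at least one line matched, 1 when none did
no_match_status=0
grep -q "nosuchuser" "$workdir/passwd.sample" || no_match_status=$?

echo "from file:  $from_file"
echo "from stdin: $from_stdin"
echo "exit status when nothing matches: $no_match_status"

rm -rf "$workdir"
```

Both variables hold the same single line, so the choice between a file argument and a pipe is a matter of where the data already lives. The exit status is what makes grep usable in shell conditions such as if grep -q PATTERN FILE; then ... fi.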
Slide 2 — grep Flags and Options
-i
Case-insensitive — matches regardless of case. grep -i "error" matches "Error", "ERROR", "error".
-r
Recursive — search all files in a directory tree. grep -r "password" /etc/ searches every file under /etc.
-v
Invert match — print lines that do NOT match. grep -v "^#" removes comment lines from config output.
-c
Count — print only the number of matching lines, not the lines themselves. Useful for quick statistics.
-n
Line numbers — prefix each output line with its line number in the file. Essential for locating config errors.
-l
Files only — print only filenames of files containing matches. grep -rl "TODO" /src/
-A/-B/-C
Context lines: -A 3 shows 3 lines after each match, -B 2 shows 2 lines before, and -C 2 shows 2 lines on both sides. Critical for log analysis.
-E
Extended regex: equivalent to the older egrep command, which current GNU grep marks as obsolescent. Enables +, ?, |, and () without backslashes. grep -E "error|warning"
-w
Whole word — matches pattern only as a complete word. grep -w "root" matches "root" but not "chroot".
-o
Only matching — print only the matched portion, not the whole line. Useful for extracting specific data from lines.
# Real-world grep examples
$ grep -in "failed" /var/log/auth.log          # Case-insensitive, with line numbers
$ grep -rn "password" /etc/ 2>/dev/null        # Recursive search, suppress errors
$ grep -v "^#" /etc/ssh/sshd_config            # Show non-comment lines only
$ grep -c "404" /var/log/nginx/access.log      # Count lines containing "404"
$ grep -A 5 "OOM killer" /var/log/syslog       # OOM event + 5 lines of context
$ grep -E "error|warning|critical" app.log     # Match multiple patterns
$ grep -o "192\.168\.[0-9]*\.[0-9]*" log       # Extract only the 192.168.x.x addresses
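The flags are easier to compare when they all run against the same small input. The sketch below assumes a made-up six-line file, app.log, whose contents are invented for illustration.

```shell
# Hypothetical log file used only for this demonstration
workdir=$(mktemp -d)
log="$workdir/app.log"
printf '%s\n' \
  '# app log' \
  'INFO  service started' \
  'ERROR disk full on /dev/sda1' \
  'error: retrying write' \
  'WARN  chroot jail missing' \
  'INFO  login by root from 192.168.1.10' > "$log"

count_ci=$(grep -ci "error" "$log")      # -c + -i: lines 3 and 4 match
plain_root=$(grep -c "root" "$log")      # matches "chroot" and "root"
whole_root=$(grep -cw "root" "$log")     # -w: "chroot" no longer counts
no_comments=$(grep -vc "^#" "$log")      # -v: every line except the comment
numbered=$(grep -n "WARN" "$log")        # -n: prefixes the line number
context=$(grep -A 1 "ERROR" "$log")      # -A 1: the match plus one line after
ips=$(grep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}" "$log")   # -o: matched text only

echo "case-insensitive count: $count_ci"
echo "root without -w: $plain_root, with -w: $whole_root"
echo "non-comment lines: $no_comments"
echo "numbered: $numbered"
echo "extracted: $ips"

rm -rf "$workdir"
```

Flags combine freely, as -ci and -cw show. The difference between plain_root and whole_root is the practical reason to reach for -w when searching for short names.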
Slide 3 — Regular Expression Basics

Pattern Matching Language

Regular expressions (regex) are a pattern-matching language used by grep, sed, awk, and many other tools. Learning even the basics dramatically expands what you can do with text processing.

Pattern   Meaning                                 Example        Matches
.         Any single character (except newline)   r..t           root, raat, r00t
^         Start of line                           ^root          Lines starting with "root"
$         End of line                             sh$            Lines ending with "sh"
*         Zero or more of preceding               er*or          eror, error, errror
+         One or more (extended regex)            er+or          eror, error (not "eor")
?         Zero or one (extended regex)            colou?r        color, colour
[abc]     Character class                         [Ff]ailed      Failed, failed
[^abc]    Negated class                           [^0-9]         Any non-digit
\b        Word boundary                           \broot\b       "root" but not "chroot"
(a|b)     Alternation (extended)                  (error|warn)   error or warn
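The patterns in the table can be checked against a short word list. This is a sketch under the assumption that grep behaves as documented for basic and extended syntax; the helper function words and its word list are invented for the example.

```shell
# Hypothetical helper that prints a fixed word list, one word per line
words() { printf '%s\n' eor eror error color colour chroot root warn; }

star=$(words | grep -c 'er*or')                   # eor, eror, error
plus=$(words | grep -cE 'er+or')                  # eror, error
optional=$(words | grep -cE 'colou?r')            # color, colour
dot=$(words | grep -c '^r..t$')                   # root
anchored=$(words | grep -c '^root')               # root (not chroot)
negated=$(words | grep -c '^[^ce]')               # root, warn
alternation=$(words | grep -cE '^(eror|warn)$')   # eror, warn

# Without -E, + is an ordinary character, so nothing matches.
# grep exits with status 1 in that case, hence the "|| true".
bre_plus=$(words | grep -c 'er+or' || true)

echo "er*or: $star   er+or (-E): $plus   er+or (no -E): $bre_plus"
echo "colou?r: $optional   ^r..t\$: $dot   ^root: $anchored"
echo "^[^ce]: $negated   ^(eror|warn)\$: $alternation"
```

The last comparison is the one that trips people up most often: the same pattern er+or finds two words with -E and none without it.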
Slide 4 — The Pipe Operator and I/O Redirection

Connecting Commands: The Unix Philosophy in Action

The Unix philosophy: write programs that do one thing well and work with other programs. The pipe | makes this possible — it connects the stdout of one command to the stdin of the next, creating data transformation pipelines without temporary files.

>
Redirect stdout
Write stdout to a file. Creates the file if needed. Overwrites existing content.
>>
Append stdout
Append stdout to a file. Creates file if needed. Does NOT overwrite existing content.
<
Redirect stdin
Read stdin from a file instead of keyboard. Feed file contents to a command as input.
2>
Redirect stderr
Redirect error messages (fd 2) to a file. 2>/dev/null silences all errors.
|
Pipe
Connect stdout of left command to stdin of right command. Chains commands into a pipeline.
tee
Tee
Read stdin, write to both stdout AND a file simultaneously. Split the stream.
# I/O Redirection examples
$ date > timestamp.txt                        # Write date to file (overwrites)
$ date >> log.txt                             # Append date to log file
$ sort < unsorted.txt                         # Read file into sort's stdin
$ grep "error" log 2>/dev/null                # Discard error messages
$ grep "warn" log 2>&1                        # Redirect stderr to stdout
$ cat access.log | grep "500" > errors.txt    # Filter and save
$ ls -la | tee listing.txt                    # Display AND save to file
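A small experiment makes the separation of stdout and stderr concrete. The function emit below is a made-up stand-in for any command that writes to both streams, and the file names are arbitrary.

```shell
workdir=$(mktemp -d)

# Hypothetical command: one line to stdout (fd 1), one line to stderr (fd 2)
emit() {
  echo "normal output"
  echo "error output" >&2
}

emit >  "$workdir/out.txt" 2> "$workdir/err.txt"   # split the two streams
emit >> "$workdir/out.txt" 2>/dev/null             # append stdout, discard stderr
emit >  "$workdir/both.txt" 2>&1                   # send stderr wherever stdout goes

# tee passes its stdin through to stdout and keeps a copy in a file
seen=$(emit 2>/dev/null | tee "$workdir/copy.txt")

stdout_lines=$(grep -c "normal output" "$workdir/out.txt")
stderr_text=$(cat "$workdir/err.txt")
both_lines=$(grep -c "output" "$workdir/both.txt")
copy_text=$(cat "$workdir/copy.txt")

echo "out.txt holds $stdout_lines stdout lines"
echo "err.txt holds: $stderr_text"
echo "both.txt holds $both_lines lines"
echo "tee showed '$seen' and saved '$copy_text'"

rm -rf "$workdir"
```

Order matters in the third call: 2>&1 copies the current target of stdout, so it has to come after the > redirection to land in the file.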
Slide 5 — Building Pipelines

Multi-Stage Data Processing

Pipelines chain multiple commands together, each transforming the data stream. The output of each command flows as input into the next. Classic security analysis workflow: extract raw data, filter to relevant lines, sort, deduplicate, count.

Example: Count unique failed SSH IPs in auth.log

$ cat /var/log/auth.log | grep "Failed password" | awk '{print $11}' | sort | uniq -c | sort -rn | head -20
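The pipeline can be tried without touching a real system log. The sketch below assumes a five-line sample in the usual sshd message layout; hostnames, PIDs, and addresses are invented. It also shows one caveat: in "invalid user" lines the address is not field 11, so a second variant (an assumption of this example, not part of the slide) picks the field that follows the word "from".

```shell
workdir=$(mktemp -d)
log="$workdir/auth.log"   # hypothetical sample, not /var/log/auth.log
printf '%s\n' \
  'Jan 10 10:00:01 host sshd[101]: Failed password for root from 203.0.113.5 port 4242 ssh2' \
  'Jan 10 10:00:03 host sshd[102]: Failed password for root from 203.0.113.5 port 4243 ssh2' \
  'Jan 10 10:00:05 host sshd[103]: Failed password for invalid user admin from 198.51.100.7 port 5151 ssh2' \
  'Jan 10 10:00:09 host sshd[104]: Accepted password for alice from 192.0.2.44 port 6161 ssh2' \
  'Jan 10 10:00:11 host sshd[105]: Failed password for root from 203.0.113.5 port 4244 ssh2' > "$log"

# Stage by stage: filter, extract field 11, sort, count, rank
grep "Failed password" "$log" | awk '{print $11}' | sort | uniq -c | sort -rn | head -20

# Field 11 is a user name, not an address, on "invalid user" lines
naive_fields=$(grep "Failed password" "$log" | awk '{print $11}' | sort -u)

# Variant: print whichever field follows the word "from"
ranked=$(grep "Failed password" "$log" \
  | awk '{ for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1) }' \
  | sort | uniq -c | sort -rn)

top=$(printf '%s\n' "$ranked" | head -1 | awk '{print $1, $2}')
unique_ips=$(printf '%s\n' "$ranked" | grep -c '')

echo "top offender: $top"
echo "distinct source addresses: $unique_ips"

rm -rf "$workdir"
```

The first sort is required because uniq only collapses adjacent duplicates; the second sort, with -rn, orders the counts numerically from highest to lowest.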
# Classic pipeline toolkit
$ sort file.txt | uniq -c | sort -rn                      # Count occurrences, most frequent first
$ ps aux | grep nginx | grep -v grep                      # Find nginx processes (exclude grep itself)
$ cat /etc/passwd | cut -d: -f1 | sort                    # Extract and sort usernames
$ find /var/log -name "*.log" | xargs grep -l "error"     # Find logs with errors
$ netstat -tuln | grep LISTEN | awk '{print $4}'          # Show listening addresses and ports

# awk basics: process text column by column
$ awk '{print $1}' /var/log/apache2/access.log            # Print first field (IP)
$ awk -F: '{print $1, $3}' /etc/passwd                    # Print username and UID (: delimiter)
$ awk -F: '$3 > 1000' /etc/passwd                         # Lines where field 3 (UID) > 1000; needs -F:

# sed basics: stream edit (find and replace)
$ sed 's/error/ERROR/g' log.txt                           # Replace all "error" with "ERROR"
$ sed '/^#/d' /etc/ssh/sshd_config                        # Delete comment lines
$ sed -n '10,20p' file.txt                                # Print lines 10 through 20

# xargs: build commands from stdin
$ cat hosts.txt | xargs -n 1 ping -c 1                    # Ping each host in file, one ping per host
$ find . -name "*.tmp" -print0 | xargs -0 rm              # Delete all .tmp files found, even with spaces in names
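The awk, sed, and xargs one-liners can be exercised on throwaway data. This sketch assumes a four-entry passwd.sample invented for the purpose, and uses echo as a harmless stand-in for a command such as ping.

```shell
workdir=$(mktemp -d)
passwd="$workdir/passwd.sample"   # hypothetical stand-in for /etc/passwd
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
  'alice:x:1000:1000:Alice:/home/alice:/bin/bash' \
  'bob:x:1001:1001:Bob:/home/bob:/bin/sh' > "$passwd"

# awk: -F: splits on colons, so $1 is the user name and $3 the UID
regular_users=$(awk -F: '$3 >= 1000 {print $1}' "$passwd")
name_uid=$(awk -F: 'NR == 1 {print $1, $3}' "$passwd")

# sed: without the g flag only the first match on each line is replaced
first_only=$(echo "error: disk error" | sed 's/error/ERROR/')
every=$(echo "error: disk error" | sed 's/error/ERROR/g')

# sed -n with a range prints only the selected lines
middle=$(sed -n '2,3p' "$passwd" | cut -d: -f1)

# xargs: by default all items go to one command; -n 1 runs it once per item
batched=$(printf '%s\n' a b c | xargs echo host:)
per_item=$(printf '%s\n' a b c | xargs -n 1 echo host:)

echo "regular users: $(echo $regular_users)"
echo "first record:  $name_uid"
echo "sed once:      $first_only"
echo "sed global:    $every"
echo "xargs batched: $batched"
printf 'xargs -n 1:\n%s\n' "$per_item"

rm -rf "$workdir"
```

The xargs comparison explains why a command that accepts only one target at a time needs -n 1: without it, every input item is appended to a single invocation.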
SECURITY ANALYST'S SUPERPOWER

Grep and pipes are the most important tools in a security analyst's CLI arsenal. A typical log analysis workflow: cat /var/log/auth.log | grep "Failed" | grep -oE "([0-9]+\.){3}[0-9]+" | sort | uniq -c | sort -rn | head -10 — this extracts all IPs that caused failed SSH logins, counts attempts per IP, and shows the top 10 attackers in a single line. No Python script required. The same pattern applies to web server logs, firewall logs, and IDS alerts. Master this pattern before reaching for a scripting language.
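As an illustration of carrying the same pattern over to web server logs, here is a sketch that ranks clients by 404 responses. It assumes a log in the common access-log layout, where the status code is the ninth whitespace-separated field; the five sample entries are invented.

```shell
workdir=$(mktemp -d)
log="$workdir/access.log"   # hypothetical sample in common log format
printf '%s\n' \
  '203.0.113.5 - - [10/Jan/2025:10:00:01 +0000] "GET /admin HTTP/1.1" 404 153' \
  '203.0.113.5 - - [10/Jan/2025:10:00:02 +0000] "GET /login HTTP/1.1" 404 153' \
  '198.51.100.7 - - [10/Jan/2025:10:00:03 +0000] "GET / HTTP/1.1" 200 1024' \
  '198.51.100.7 - - [10/Jan/2025:10:00:05 +0000] "GET /404.html HTTP/1.1" 200 512' \
  '192.0.2.44 - - [10/Jan/2025:10:00:06 +0000] "GET /wp-login.php HTTP/1.1" 404 153' > "$log"

# A plain text match also counts the request for /404.html
text_matches=$(grep -c "404" "$log")

# Testing the status field itself avoids that false positive
status_matches=$(awk '$9 == 404' "$log" | grep -c '')

# Same extract / sort / count / rank shape as the auth.log workflow
top_404=$(awk '$9 == 404 {print $1}' "$log" \
  | sort | uniq -c | sort -rn | head -1 | awk '{print $1, $2}')

echo "lines containing the text 404: $text_matches"
echo "responses with status 404:     $status_matches"
echo "client with the most 404s:     $top_404"

rm -rf "$workdir"
```

Only the extraction stage changes between log types; the sort | uniq -c | sort -rn | head tail of the pipeline stays the same.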
