Bash Scripting Fundamentals | Advanced Linux Administration

Slide 1 of 35  |  ALA-05  |  Week 3 of 8
Bash Scripting
Fundamentals
Variables  •  Conditionals  •  Loops  •  Functions  •  Arrays  •  getopts
Cell operations do not run by hand. Every repeatable action in the Matrix gets scripted, tested, and deployed. This lecture takes you from knowing bash syntax to building scripts that are safe under failure, readable by anyone, and production-worthy from the first line.
35 Slides ALA-05 Week 3 of 8 Ubuntu 22.04 LTS
Slide 2 of 35
Script Structure: Shebang to Execution
Every production script has a consistent anatomy. Learn it once, apply it everywhere.
#!/bin/bash PARSE EXECUTE EXIT CODE $? = 0 | 1
#!/usr/bin/env bash # script-name.sh — one-line description of what this script does # Usage: ./script-name.sh [OPTIONS] <argument> # Author: sector-admin | Updated: 2026-04-09 set -euo pipefail # exit on error, unset vars, pipe failures IFS=$'\n\t' # safe word splitting: newlines and tabs only # --- Constants (UPPER_CASE by convention) --- readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" readonly LOG_FILE="/var/log/sector/$(basename "$0" .sh).log" # --- Functions go before main logic --- log() { echo "[$(date +%T)] $*" | tee -a "$LOG_FILE"; } # --- Main logic at the bottom --- log "Script started."
#!/usr/bin/env bash
Use env bash over /bin/bash. It finds bash in PATH, making the script portable across systems where bash lives in non-standard locations.
set -euo pipefail
Non-negotiable safety net. Exit on error, treat unset variables as errors, fail pipelines if any stage fails. Every production script starts with this.
IFS=$'\n\t'
The default IFS includes spaces, which causes word splitting bugs with filenames containing spaces. Narrowing IFS prevents a class of silent, destructive bugs.
Slide 3 of 35
Variables: Declaration and Scope
Bash variables have no types by default. Every value is a string unless you declare otherwise.
export (child processes inherit) global (current shell) local var="fn only" NAME="global"
# Assignment: no spaces around the = sign NAME="sector-node-42" # local to current shell export REGION="grid-north" # available to child processes readonly MAX_RETRIES=3 # cannot be reassigned or unset declare -i COUNT=0 # integer type: arithmetic operations enforced declare -l username="ADMIN" # auto-lowercases on assignment: admin declare -u STATUS="active" # auto-uppercases on assignment: ACTIVE # Safe expansion with defaults HOST="${TARGET_HOST:-localhost}" # use localhost if TARGET_HOST is unset or empty PORT="${TARGET_PORT:=9000}" # assign 9000 if unset, then use it echo "${REQUIRED_VAR:?'REQUIRED_VAR must be set'}" # abort if unset # Integer arithmetic COUNT=$((COUNT + 1)) echo $((2 ** 10)) # 1024 (( COUNT++ )) # increment in-place
Quoting Rule
Always double-quote variable expansions: "$VAR" not $VAR. Unquoted expansions undergo word splitting and globbing. A variable containing a space or * will silently break your command. Quote everything unless you have a deliberate reason not to.
Slide 4 of 35
Special Variables: $?, $$, $!, $@, $#
Bash provides a set of built-in variables that expose runtime information about the script and its arguments.
./script.sh arg1 arg2 arg3 $0 script.sh $1 arg1 $2 arg2 $3 arg3 $# = 3 count $@ = "arg1" "arg2" "arg3" $? = 0 last exit
Exit and Process
$? exit code of the last command (0 = success). $$ PID of the current shell -- used to create unique temp files. $! PID of the last background process -- used with wait. $0 name of the script itself.
Positional Arguments
$1...$9 positional parameters. ${10} for tenth and beyond. $# count of arguments. $@ all arguments as separate words (safe). $* all arguments as a single word (usually wrong).
#!/usr/bin/env bash set -euo pipefail # $# — argument count check if [[ $# -lt 1 ]]; then echo "Usage: $0 <node-name>" >&2 exit 1 fi NODE="$1" # Safe temp file using $$ TMPFILE="/tmp/scan-$$-${NODE}.txt" trap 'rm -f "$TMPFILE"' EXIT # cleanup on exit, even on error # $@ — iterate over all arguments safely for arg in "$@"; do echo "Processing: $arg" done # $? — capture exit code explicitly ping -c1 "$NODE" >/dev/null 2>&1 STATUS=$? [[ $STATUS -eq 0 ]] && echo "$NODE is reachable" || echo "$NODE is DOWN"
Slide 5 of 35
Conditionals: [[ ]] vs [ ]
Use [[ ]] in bash scripts. It is safer, more powerful, and avoids the word-splitting traps of the POSIX [ ] test command.
# String tests if [[ "$USER" == "root" ]]; then echo "Running as root"; fi if [[ "$HOST" != "localhost" ]]; then echo "Remote host"; fi if [[ -z "$VAR" ]]; then echo "VAR is empty"; fi if [[ -n "$VAR" ]]; then echo "VAR has a value"; fi # Glob patterns inside [[ ]] (no quoting around pattern) if [[ "$FILENAME" == *.log ]]; then echo "Is a log file"; fi # Regex matching with =~ if [[ "$IP" =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]]; then echo "Valid IPv4: $IP" fi # Numeric tests: use -eq -ne -lt -le -gt -ge if [[ $LOAD -gt 4 ]]; then echo "High load detected"; fi # File tests [[ -f "$FILE" ]] # exists and is a regular file [[ -d "$DIR" ]] # exists and is a directory [[ -r "$FILE" ]] # readable [[ -x "$FILE" ]] # executable [[ -s "$FILE" ]] # exists and is non-empty [[ -L "$FILE" ]] # is a symlink
Slide 6 of 35
if / elif / else: Decision Branches
Structured conditional logic for scripts that adapt to environment state.
if true action_1 false elif true action_2 false else fallback fi
#!/usr/bin/env bash set -euo pipefail check_service() { local svc="$1" if systemctl is-active --quiet "$svc"; then echo "[OK] $svc is running" elif systemctl is-enabled --quiet "$svc"; then echo "[WARN] $svc is enabled but stopped" return 1 else echo "[FAIL] $svc is not installed or enabled" return 2 fi } # case statement: cleaner than long if/elif chains for discrete values apply_action() { local action="$1" case "$action" in start) systemctl start nginx ;; stop) systemctl stop nginx ;; restart) systemctl restart nginx ;; status) systemctl status nginx ;; *) echo "Unknown action: $action" >&2; return 1 ;; esac } check_service nginx apply_action "${1:-status}"
case vs if/elif
Use case when testing a single variable against multiple discrete values. Use if/elif when conditions involve different variables or complex expressions. The wildcard *) as the last branch catches anything unrecognized.
Slide 7 of 35
Loops: for
Three forms of for loop. Know when to use each.
for item in list more items? yes do ... done continue break no (list exhausted) done
# Form 1: iterate over a list of words for host in web-01 web-02 db-01 cache-01; do ping -c1 -W1 "$host" >/dev/null 2>&1 && echo "$host UP" || echo "$host DOWN" done # Form 2: iterate over command output (use mapfile instead for large sets) for logfile in /var/log/*.log; do SIZE="$(du -sh "$logfile" | cut -f1)" echo "$SIZE $logfile" done # Form 3: C-style arithmetic loop for (( i=1; i<=5; i++ )); do echo "Attempt $i of 5" curl -sf "http://api.sector.local/health" && break sleep 2 done # Iterating over file contents safely (read line by line) while IFS= read -r line; do echo "Host: $line" done < /etc/sector/hosts.txt
Safe File Reading Pattern
Never use for line in $(cat file) -- it splits on whitespace and gobbles special characters. Use while IFS= read -r line; do ... done < file. The IFS= clears field splitting. The -r prevents backslash interpretation.
Slide 8 of 35
Loops: while and until
Event-driven iteration for polling, retry logic, and daemon-style scripts.
while cond true? yes do ... done loop back false (exit loop) done until = inverted
# while: loop while condition is true RETRIES=0 MAX=5 while ! pg_isready -h db-01 >/dev/null 2>&1; do (( RETRIES++ )) if [[ $RETRIES -ge $MAX ]]; then echo "Database unreachable after $MAX attempts" >&2 exit 1 fi echo "Waiting for DB... attempt $RETRIES" sleep 3 done # until: loop until condition becomes true (opposite of while) until systemctl is-active --quiet nginx; do echo "Waiting for nginx to start..." sleep 1 done echo "nginx is up" # Infinite loop with break condition (monitoring pattern) while true; do LOAD="$(cut -d' ' -f1 /proc/loadavg)" if (( $(echo "$LOAD > 8.0" | bc -l) )); then echo "[ALERT] Load spike: $LOAD" | mail -s "Load Alert" ops@sector.local fi sleep 30 done
Slide 9 of 35
Functions: Reusable Logic Blocks
Functions in bash eliminate repetition and make scripts testable and maintainable.
main script caller call($1,$2) my_func() local var1 var2 scoped to function return $? echo output $(my_func) $? 0-255
# Declaration syntax (prefer this form over function keyword) log_info() { echo "[INFO] $(date +%T) $*"; } log_warn() { echo "[WARN] $(date +%T) $*" >&2; } log_error() { echo "[ERROR] $(date +%T) $*" >&2; } # local variables prevent scope leakage deploy_config() { local src="$1" local dest="$2" local backup="${dest}.bak.$(date +%s)" [[ -f "$dest" ]] && cp "$dest" "$backup" cp "$src" "$dest" log_info "Deployed $src -> $dest (backup: $backup)" } # Return values: functions return exit codes (0-255) # To return data, use echo + command substitution get_primary_ip() { local iface="${1:-eth0}" ip -4 addr show "$iface" | awk '/inet /{gsub(/\/.*/, "", $2); print $2}' } PRIMARY_IP="$(get_primary_ip eth0)" log_info "Primary IP: $PRIMARY_IP" deploy_config /etc/sector/nginx.conf.new /etc/nginx/nginx.conf
Scope Rule
All variables in bash are global unless declared with local. A variable set inside a function without local will overwrite any same-named variable in the caller. Always use local for function-internal variables.
Slide 10 of 35
Arrays: Indexed and Associative
Bash supports two array types. Indexed arrays for ordered lists, associative arrays for key-value lookup.
# Indexed array NODES=("web-01" "web-02" "db-01" "cache-01") echo "${NODES[0]}" # web-01 (zero-indexed) echo "${NODES[-1]}" # cache-01 (last element, bash 4.3+) echo "${#NODES[@]}" # 4 (array length) echo "${NODES[@]}" # all elements NODES+=("mon-01") # append element unset 'NODES[2]' # remove element (leaves gap -- use mapfile for dense arrays) # Safe iteration (always use "${array[@]}" not "${array[*]}") for node in "${NODES[@]}"; do ssh "$node" 'hostname -f' 2>&1 done # Associative array (requires declare -A) declare -A SERVICES SERVICES["web"]="nginx" SERVICES["db"]="postgresql" SERVICES["cache"]="redis" for role in "${!SERVICES[@]}"; do # ${!array[@]} = keys systemctl is-active --quiet "${SERVICES[$role]}" && \ echo "[$role] ${SERVICES[$role]}: OK" done
Slide 11 of 35
String Manipulation: Parameter Expansion
Bash has a powerful built-in string processing engine. Use it before reaching for awk or sed.
# Length, substring, and slicing PATH_VAL="/var/log/sector/audit-2026-04-09.log" echo ${#PATH_VAL} # length: 41 echo ${PATH_VAL:10:6} # substring: sector # Remove prefix / suffix (greedy and non-greedy) FILE="audit-2026-04-09.log" echo ${FILE%.log} # remove shortest .log suffix: audit-2026-04-09 echo ${FILE%%-*} # remove longest -* suffix: audit echo ${FILE#audit-} # remove prefix: 2026-04-09.log # Search and replace MSG="node_down node_unreachable node_rebooting" echo ${MSG/node_/NODE_} # replace first: NODE_down node_unreachable... echo ${MSG//node_/NODE_} # replace all: NODE_down NODE_unreachable... # Case conversion (bash 4+) STATUS="critical" echo ${STATUS^} # capitalize first: Critical echo ${STATUS^^} # all caps: CRITICAL echo ${STATUS,,} # all lower: critical # Split string into array on delimiter CSV="web-01,web-02,db-01" IFS=',' read -ra PARTS <<< "$CSV" echo "${PARTS[1]}" # web-02
Slide 12 of 35
Exit Codes: The Language of Script Results
Exit codes are how scripts communicate success or failure to callers, cron, and orchestration systems.
0 success 1 general 2 - 125 user-defined errors 126-127 cmd errors 128+N signal kills 130 137 143
0 = Success
Exit code 0 means the script completed successfully. This is what && tests for. If your script reaches the end without calling exit, it exits with the exit code of the last command run.
1 = General Error
Exit code 1 is the conventional general-purpose failure code. Use it when no more specific code applies. For scripts, define your own codes (2-125) to distinguish different failure modes and help callers take targeted action.
126, 127, 128+
Reserved: 126 = command found but not executable, 127 = command not found. Codes 128+ are used for fatal signal exits: 130 = SIGINT (Ctrl+C), 137 = SIGKILL, 143 = SIGTERM.
# Define meaningful exit codes as named constants readonly E_OK=0 readonly E_USAGE=1 readonly E_DEPENDENCY=2 readonly E_NETWORK=3 readonly E_PERMISSION=4 check_deps() { for cmd in curl jq awk rsync; do if ! command -v "$cmd" >/dev/null 2>&1; then log_error "Required command not found: $cmd" exit $E_DEPENDENCY fi done } # Caller can then act on specific exit codes # ./deploy.sh || { echo "Deploy failed with code $?"; pagerduty_alert; }
Slide 13 of 35
Argument Parsing: getopts
The POSIX-standard built-in for parsing short options. Handles flags, arguments, and error reporting.
A script with hardcoded behavior is brittle. A script that accepts flags like -v for verbose, -n hostname for a target, and -d for dry-run is reusable across every context your team encounters.
./deploy.sh -v -n web-01 -p 8080 getopts ":vdn:p:" case -v flag -n $OPTARG -p $OPTARG shift $((OPTIND-1)) $@ rest
#!/usr/bin/env bash set -euo pipefail VERBOSE=0; DRY_RUN=0; TARGET=""; PORT=22 usage() { echo "Usage: $0 [-v] [-d] [-n hostname] [-p port]" echo " -v verbose output" echo " -d dry run (show what would happen)" echo " -n target hostname (required)" echo " -p SSH port (default: 22)" exit 1 } while getopts ":vdn:p:" opt; do # leading : = silent error mode case "$opt" in v) VERBOSE=1 ;; d) DRY_RUN=1 ;; n) TARGET="$OPTARG" ;; # OPTARG holds the option's argument p) PORT="$OPTARG" ;; :) echo "Option -$OPTARG requires an argument" >&2; usage ;; ?) echo "Unknown option: -$OPTARG" >&2; usage ;; esac done shift $((OPTIND - 1)) # remove parsed options; $@ now holds positional args [[ -z "$TARGET" ]] && { echo "-n is required" >&2; usage; } [[ $VERBOSE -eq 1 ]] && echo "Connecting to $TARGET:$PORT (dry=$DRY_RUN)"
Slide 14 of 35
trap: Guaranteed Cleanup
trap registers code to run when the script exits, regardless of how it exits -- success, error, or signal.
set -euo pipefail -e exit err -u unset err pipefail pipe errs + trap cleanup EXIT = fail loud + clean up production-safe script ERR INT TERM
#!/usr/bin/env bash set -euo pipefail # Create temp directory; trap ensures it is always removed WORKDIR="$(mktemp -d /tmp/sector-deploy-XXXXXX)" cleanup() { rm -rf "$WORKDIR" log_info "Cleaned up $WORKDIR" } trap cleanup EXIT # runs on normal exit AND errors # Multiple signals: exit on Ctrl+C or SIGTERM with cleanup trap 'log_warn "Interrupted"; exit 130' INT TERM # ERR trap: called whenever a command exits non-zero on_error() { log_error "Script failed at line $1: $2" # send alert, write status, etc. } trap 'on_error $LINENO "$BASH_COMMAND"' ERR # DEBUG trap: inspect every command before it runs (verbose tracing) # trap 'echo "CMD: $BASH_COMMAND"' DEBUG log_info "Work directory: $WORKDIR" # ... rest of script; cleanup runs no matter what happens next
Critical Pattern
The combination of set -euo pipefail plus trap cleanup EXIT is the gold standard for safe automation. The script fails loudly on any error and leaves the system clean. Without these, partial failures leave temp files, incomplete configs, and silent corruption.
Slide 15 of 35
Input Validation: Reject Bad Data Early
Scripts that run with untrusted input or unexpected values cause outages. Validate everything at the entry point.
# IP address format validation using regex validate_ip() { local ip="$1" if [[ ! "$ip" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]; then echo "Invalid IP address: $ip" >&2; return 1 fi } # Integer range validation validate_port() { local port="$1" if [[ ! "$port" =~ ^[0-9]+$ ]] || [[ $port -lt 1 || $port -gt 65535 ]]; then echo "Invalid port (1-65535): $port" >&2; return 1 fi } # Sanitize hostname: allow only alphanumeric, hyphens, dots sanitize_host() { local host="$1" if [[ ! "$host" =~ ^[a-zA-Z0-9._-]+$ ]]; then echo "Invalid hostname characters: $host" >&2; return 1 fi echo "$host" } # Privilege check: abort if not root [[ $(id -u) -ne 0 ]] && { echo "Must run as root" >&2; exit 1; }
Slide 16 of 35
Here-Documents and Here-Strings
Feed multiline or string data to commands inline without creating temp files.
# Heredoc: multi-line input to a command cat > /etc/sector/worker.conf <<EOF listen_port = 9000 log_level = INFO node_id = $(hostname -s) deployed_at = $(date +%F) EOF # Quoted delimiter: no variable expansion (write literal bash code) cat <<'EOF' for node in "${NODES[@]}"; do echo "Processing $node" done EOF # Indented heredoc (bash 4.3+): leading tabs stripped cat <<-EOF This line has its leading tab stripped. So does this one. Node: $(hostname) EOF # Here-string: feed a single string to a command's stdin read -r year month day <<< "$(date +%Y-%m-%d | tr '-' ' ')" echo "Year=$year Month=$month Day=$day" # grep a string without a file grep -o '[0-9]\+' <<< "Load: 3.42 4.11 4.55"
Slide 17 of 35
Process Substitution in Scripts
Eliminate temporary files from comparison, merge, and multi-source pipeline patterns.
# Compare current running services against expected list diff <(systemctl list-units --state=active --type=service --no-legend | awk '{print $1}' | sort) \ <(sort /etc/sector/required-services.txt) # Verify two remote nodes have identical /etc/passwd diff <(ssh node-a 'sort /etc/passwd') \ <(ssh node-b 'sort /etc/passwd') # Feed two sorted streams into comm to find differences # Lines in node-a packages but not in node-b comm -23 <(ssh node-a 'dpkg -l | awk '"'"'NR>5{print $2}'"'"' | sort') \ <(ssh node-b 'dpkg -l | awk '"'"'NR>5{print $2}'"'"' | sort') # Join log entries from two services by timestamp paste <(grep 'ERROR' /var/log/app.log) \ <(grep 'ERROR' /var/log/worker.log)
When Not to Use It
Process substitution requires bash. POSIX sh does not support it. If the script must run under #!/bin/sh (e.g., in /etc/init.d/ or BusyBox environments), write to an actual temp file with mktemp and clean up with trap.
Slide 18 of 35
Arithmetic: $(( )) and declare -i
Integer arithmetic is built in. Floating-point requires awk or bc.
# Integer arithmetic context: $(( expression )) CORES=$(nproc) WORKERS=$((CORES * 2)) TIMEOUT=$((WORKERS * 5 + 10)) # Modulo, power, bitwise echo $(( 100 % 7 )) # 2 echo $(( 2 ** 8 )) # 256 echo $(( 0xFF & 0x0F )) # 15 (bitwise AND) # (( )) in conditionals (no $ needed inside) (( WORKERS > 0 )) && echo "workers configured" if (( CORES >= 4 )); then echo "Enabling parallel compile mode" fi # Floating point with awk (awk is always available) LOAD="$(awk '{print $1}' /proc/loadavg)" awk -v load="$LOAD" -v thresh="4.0" 'BEGIN{if(load+0 > thresh+0) exit 0; exit 1}' \ && echo "Load critical: $LOAD" # bc for interactive-style floating point DISK_PCT="$(df / | awk 'NR==2{gsub(/%/,""); print $5}')" WARN_THRESH=80 (( DISK_PCT > WARN_THRESH )) && echo "Disk at ${DISK_PCT}% -- approaching limit"
Slide 19 of 35
Debugging: set -x and bash -n
Bash provides first-class debugging tools. Use them before any script goes near production.
bash -n script.sh
Syntax check without execution. Catches parse errors, unclosed brackets, mismatched quotes. Run this on every new script before the first execution.
set -x (xtrace)
Prints each command to stderr before executing it, with all expansions resolved. The most powerful debugging tool in bash. Prefix output with PS4 for file/line numbers.
set -v
Prints each line before expansion (raw source). Combined with set -x, you see both the raw syntax and the expanded command. Useful for debugging complex parameter expansions.
# Enhanced xtrace: show file, line number, and function name export PS4='+(${BASH_SOURCE##*/}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }' set -x # Scope xtrace to just one function deploy() { set -x # enable tracing for this function only rsync -av /src/ /dst/ set +x # disable tracing after function } # Syntax check only (no execution) # bash -n deploy.sh && echo "Syntax OK" || echo "Syntax errors found" # shellcheck: install with: apt install shellcheck # shellcheck deploy.sh # catches style issues, common bugs, and SC advisories
Slide 20 of 35
Signal Handling: Graceful Shutdown in Scripts
Scripts run by systemd, cron, and orchestrators receive signals. Handle them or lose work.
#!/usr/bin/env bash set -euo pipefail # State flag for graceful loop termination RUNNING=1 # Signal handler: set flag, do not exit immediately handle_stop() { echo "[INFO] Shutdown signal received -- finishing current task" RUNNING=0 } trap handle_stop SIGTERM SIGINT SIGHUP # Worker loop: checks RUNNING flag after each iteration process_queue() { while [[ $RUNNING -eq 1 ]]; do # Simulate processing one item ITEM="$(dequeue_item 2>/dev/null || echo '')" [[ -z "$ITEM" ]] && { sleep 1; continue; } process_item "$ITEM" done echo "[INFO] Worker exited cleanly" } process_queue exit 0
systemd and SIGTERM
When systemd stops a service, it sends SIGTERM followed (after TimeoutStopSec) by SIGKILL. A script that handles SIGTERM can finish its current unit of work and write state before systemd forces termination. Scripts that ignore SIGTERM get killed mid-operation.
Slide 21 of 35
Logging Patterns: Structured Output
Scripts need consistent, machine-parseable log output from the start, not added as an afterthought.
#!/usr/bin/env bash # Logging framework: levels, timestamps, stderr routing readonly LOG_LEVEL_INFO=1 readonly LOG_LEVEL_WARN=2 readonly LOG_LEVEL_ERROR=3 CURRENT_LOG_LEVEL=1 _log() { local level="$1" label="$2"; shift 2 [[ $level -lt $CURRENT_LOG_LEVEL ]] && return local ts="$(date '+%Y-%m-%dT%H:%M:%S')" printf '%s [%s] %s\n' "$ts" "$label" "$*" } log_info() { _log $LOG_LEVEL_INFO "INFO " "$@"; } log_warn() { _log $LOG_LEVEL_WARN "WARN " "$@" >&2; } log_error() { _log $LOG_LEVEL_ERROR "ERROR" "$@" >&2; } # Route all output to log file and keep stderr on terminal readonly LOGFILE="/var/log/sector/$(basename "$0" .sh)-$(date +%F).log" mkdir -p "$(dirname "$LOGFILE")" exec 1>>"$LOGFILE" # stdout appends to log file exec 2>&1 # stderr goes to same log file log_info "Script started: $0 $*" log_warn "Low disk space on /var/log: $(df -h /var/log | awk 'NR==2{print $5}')" log_error "Deployment failed -- see above"
Slide 22 of 35
Parallel Execution: Background Jobs and wait
Fan out work across multiple hosts or tasks, then collect results -- without leaving zombie jobs behind.
#!/usr/bin/env bash set -euo pipefail NODES=("web-01" "web-02" "db-01" "mon-01") declare -A PIDS RESULTS_DIR="$(mktemp -d)" trap 'rm -rf "$RESULTS_DIR"' EXIT # Launch one background job per node for node in "${NODES[@]}"; do { ssh -o ConnectTimeout=5 "$node" \ 'df -h / | awk "NR==2{print $5}"' \ > "${RESULTS_DIR}/${node}.txt" 2>&1 } & PIDS["$node"]="$!" done # Collect results: wait for each PID and check its exit code FAILURES=0 for node in "${!PIDS[@]}"; do if ! wait "${PIDS[$node]}"; then echo "FAILED: $node"; (( FAILURES++ )) else USAGE="$(cat "${RESULTS_DIR}/${node}.txt")" echo "$node: disk usage $USAGE" fi done (( FAILURES > 0 )) && exit 1
Slide 23 of 35
Library Pattern: Sourcing Shared Functions
Avoid duplicating logging, validation, and utility functions across scripts. Build a shared library.
# /etc/sector/lib/common.sh -- shared functions # Source this in every script: source /etc/sector/lib/common.sh # Guard against double-sourcing [[ -n "${_SECTOR_COMMON_LOADED:-}" ]] && return 0 readonly _SECTOR_COMMON_LOADED=1 # Require a command to exist before proceeding require_cmd() { local cmd="$1" command -v "$cmd" >/dev/null 2>&1 || { echo "[ERROR] Required command not found: $cmd" >&2; exit 2 } } # Retry a command up to N times with delay retry() { local max="$1" delay="$2"; shift 2 local attempt=1 until "$@"; do (( attempt++ )) [[ $attempt -gt $max ]] && return 1 sleep "$delay" done } # In your script: source /etc/sector/lib/common.sh require_cmd rsync retry 3 5 rsync -av /data/ backup-node:/data/
Slide 24 of 35
Config Files: Sourcing vs Parsing
Scripts need external configuration. Two safe patterns: source a bash config file, or parse a structured format.
# Pattern 1: source a KEY=VALUE config file (trusted source only) # /etc/sector/deploy.conf: # TARGET_HOST="web-01" # DEPLOY_USER="deploy" # BACKUP_RETAIN_DAYS=7 readonly CONFIG="/etc/sector/deploy.conf" [[ -f "$CONFIG" ]] || { echo "Config not found: $CONFIG" >&2; exit 1; } [[ -r "$CONFIG" ]] || { echo "Config not readable: $CONFIG" >&2; exit 1; } source "$CONFIG" # Pattern 2: parse a structured config safely with awk (no code execution) get_config_value() { local key="$1" awk -F'=' -v key="$key" '$1==key{gsub(/^["'"'"']+|["'"'"']+$/, "", $2); print $2; exit}' \ /etc/sector/deploy.conf } RETAIN="$(get_config_value BACKUP_RETAIN_DAYS)" echo "Retaining backups for $RETAIN days" # Always validate required keys after sourcing : "${TARGET_HOST:?'TARGET_HOST must be set in $CONFIG'}" : "${DEPLOY_USER:?'DEPLOY_USER must be set in $CONFIG'}"
Slide 25 of 35
mapfile: File Lines to Arrays
Read file contents into an array safely, preserving whitespace and handling all edge cases.
# mapfile (alias: readarray) reads lines from stdin into an indexed array mapfile -t HOSTS < /etc/sector/hosts.txt # -t strips trailing newlines from each element echo "Loaded ${#HOSTS[@]} hosts" # Process a subset: start at index 2, read 5 lines mapfile -t -s2 -n5 SUBSET < /etc/sector/hosts.txt # Read from command output into array mapfile -t FAILED_SVCS < <( systemctl list-units --state=failed --type=service --no-legend | awk '{print $1}' ) if [[ ${#FAILED_SVCS[@]} -gt 0 ]]; then echo "ALERT: ${#FAILED_SVCS[@]} failed services detected" for svc in "${FAILED_SVCS[@]}"; do echo " - $svc" systemctl status "$svc" --no-pager -l 2>&1 | tail -5 done fi
mapfile vs for line in $(cat)
mapfile is safe with filenames containing spaces, special characters, and empty lines. It does not trigger globbing. It does not split on whitespace. For any file with more than trivial content, mapfile is the correct tool.
Slide 26 of 35
Testing: Bash Automated Testing System (bats)
Scripts at production scale require automated tests. BATS provides a TAP-compliant test framework for bash.
# Install bats: apt install bats OR git clone https://github.com/bats-core/bats-core # tests/validate_ip.bats load '../src/common.sh' # source the library to test @test "validate_ip accepts a valid IPv4 address" { run validate_ip "192.168.1.100" [ "$status" -eq 0 ] } @test "validate_ip rejects an address with letters" { run validate_ip "192.168.x.100" [ "$status" -eq 1 ] } @test "validate_ip rejects empty string" { run validate_ip "" [ "$status" -eq 1 ] } @test "deploy_config backs up existing file" { DEST="$(mktemp)" echo "old config" > "$DEST" run deploy_config /dev/null "$DEST" [ "$status" -eq 0 ] # verify backup was created BACKUP="$(ls "${DEST}.bak."* 2>/dev/null | head -1)" [ -f "$BACKUP" ] } # Run tests: bats tests/
Slide 27 of 35  |  Applied Script
Applied Script: Node Health Check
A complete, production-grade health check script using everything from this lecture.
#!/usr/bin/env bash # node-health.sh — check system health, exit non-zero if unhealthy set -euo pipefail source /etc/sector/lib/common.sh THRESHOLDS=([cpu]="80" [mem]="90" [disk]="85" [load]="8") FAIL=0 check_disk() { local pct pct="$(df / | awk 'NR==2{gsub(/%/,""); print $5}')" (( pct > THRESHOLDS[disk] )) && { log_warn "Disk ${pct}% -- threshold ${THRESHOLDS[disk]}%"; FAIL=1; } \ || log_info "Disk OK: ${pct}%" } check_load() { local load load="$(awk '{print int($1)}' /proc/loadavg)" (( load > THRESHOLDS[load] )) && { log_warn "Load ${load} -- threshold ${THRESHOLDS[load]}"; FAIL=1; } \ || log_info "Load OK: ${load}" } check_services() { local -a required=("ssh" "cron" "rsyslog") for svc in "${required[@]}"; do systemctl is-active --quiet "$svc" || { log_error "$svc is not running"; FAIL=1; } done } check_disk; check_load; check_services exit $FAIL
Slide 28 of 35  |  Applied Script
Applied Script: Bulk Config Deployer
Deploy a config file to multiple hosts in parallel with rollback on failure.
#!/usr/bin/env bash # bulk-deploy.sh -f config.conf -t /etc/app/app.conf [-d] [-v] set -euo pipefail; source /etc/sector/lib/common.sh SRC=""; DEST=""; DRY=0; VERBOSE=0 while getopts ":f:t:dv" o; do case $o in f) SRC="$OPTARG" ;; t) DEST="$OPTARG" ;; d) DRY=1 ;; v) VERBOSE=1 ;; :) echo "-$OPTARG needs arg" >&2; exit 1 ;; ?) echo "bad flag" >&2; exit 1 ;; esac done : "${SRC:?'-f required'}" "${DEST:?'-t required'}" mapfile -t HOSTS < /etc/sector/hosts.txt declare -A PIDS; TMPDIR="$(mktemp -d)" trap 'rm -rf "$TMPDIR"' EXIT deploy_one() { local h="$1" [[ $DRY -eq 1 ]] && { log_info "DRY: $h <- $SRC"; return; } scp -q "$SRC" "${h}:${DEST}" && log_info "OK: $h" || { log_error "FAIL: $h"; return 1; } } for h in "${HOSTS[@]}"; do deploy_one "$h" >"${TMPDIR}/${h}.log" 2>&1 &; PIDS["$h"]="$!" done FAIL=0 for h in "${!PIDS[@]}"; do wait "${PIDS[$h]}" || (( FAIL++ )); [[ $VERBOSE -eq 1 ]] && cat "${TMPDIR}/${h}.log" done; exit $FAIL
Slide 29 of 35
Common Pitfalls: Mistakes That Cost Production Hours
These are not theory. Every one of these has caused a real outage.
Unquoted Variables
rm -rf $DIR/* -- if DIR is empty or unset, this becomes rm -rf /*. Always quote: rm -rf "${DIR:?}/". Use set -u to catch unset variables early.
Forgetting set -e
Without set -e, a failed command is silently ignored and the script continues. A failed mkdir followed by a chmod will chmod the wrong directory. Always add set -euo pipefail.
Parsing ls Output
Never parse ls output in scripts. Filenames with spaces, newlines, and special characters break word splitting. Use glob expansion (for f in /dir/*) or find -print0 | xargs -0 instead.
cd without error check
cd /somewhere; rm -rf * -- if cd fails (dir doesn't exist), rm runs in the current directory. Use cd /somewhere || exit 1 or cd /somewhere && rm -rf *.
Backtick syntax
Backtick command substitution `cmd` cannot be nested, is hard to read, and escaping is inconsistent. Always use $(cmd). It nests cleanly and reads clearly.
Slide 30 of 35
ShellCheck: Static Analysis for Bash
ShellCheck catches bugs that would otherwise only appear at runtime. Run it before every commit.
# Install apt install shellcheck # Ubuntu/Debian brew install shellcheck # macOS # Basic usage shellcheck myscript.sh # Example findings ShellCheck catches: # SC2086 — unquoted variable (word splitting risk) echo $var # ShellCheck warns: use "$var" # SC2164 — cd without error handling cd /tmp; rm -rf * # warn: use cd /tmp || exit 1 # SC2006 — backtick usage version=`python3 --version` # warn: use $() # SC2148 — missing shebang # Suppress a warning when you know what you're doing (use sparingly) # shellcheck disable=SC2034 UNUSED_VAR="intentional" # Integrate with CI: exit non-zero on any finding shellcheck -S error myscript.sh # fail only on errors, not warnings
Slide 31 of 35
Portability: Bash vs sh vs dash
Ubuntu's /bin/sh is dash, not bash. Scripts run as /bin/sh will fail if they use bash-specific features.
Bash-Only Features (require #!/usr/bin/env bash)
[[ ]] extended test. arrays and declare -A. $(( )) arithmetic. getopts behavior (POSIX getopts is limited). Process substitution <(). mapfile/readarray. String case operators ^^ ,,.
POSIX sh Features (safe in /bin/sh)
[ ] test. $() command substitution. case/esac. for/while/until. trap. ${var:-default} parameter expansion. No arrays. No [[ ]]. No declare.
# Check what /bin/sh actually is on this system ls -la /bin/sh # On Ubuntu: /bin/sh -> dash # On RHEL/CentOS: /bin/sh -> bash (historically) # Tell bash to complain about POSIX violations (development aid) bash --posix myscript.sh # When you NEED portability (Docker entrypoints, /etc/init.d/, # BusyBox environments): use #!/bin/sh and POSIX-only syntax # When you are writing admin automation: use #!/usr/bin/env bash # and use all bash features freely -- they are available everywhere # you will realistically deploy to in a managed Linux environment
Slide 32 of 35
Organization at Scale: Script as a Project
When your script grows beyond 100 lines, treat it like software. Structure matters.
Recommended Directory Layout
bin/ -- executable scripts. lib/ -- shared function libraries. tests/ -- bats test files. conf/ -- default config files. docs/ -- usage and architecture notes. Makefile -- run tests, lint, install targets.
Version Control Practices
Every script gets committed to git. Tag releases. Use branches for breaking changes. Scripts that run on production servers should be deployed from a known git tag, not copied from a developer workstation by hand.
# Makefile for a bash project SCRIPTS := $(wildcard bin/*.sh lib/*.sh) .PHONY: lint test install lint: @shellcheck $(SCRIPTS) @echo "Lint passed" test: @bats tests/ @echo "Tests passed" install: @install -m 0755 bin/node-health.sh /usr/local/bin/node-health @install -m 0644 lib/common.sh /etc/sector/lib/common.sh @echo "Installed" check: lint test
Slide 33 of 35
Performance: Avoid Unnecessary Subshells
Each $() forks a subshell. In tight loops or large data sets, this cost accumulates.
# SLOW: fork per iteration for f in /var/log/*.log; do SIZE=$(du -sk "$f" | cut -f1) # two forks: du + cut (( SIZE > 100000 )) && echo "$f" done # FAST: single external call, awk replaces cut, no loop subshells du -sk /var/log/*.log | awk '$1 > 100000 {print $2}' # Use parameter expansion instead of awk/sed for simple substitutions FILE="audit-2026.log" # SLOW: fork BASE=$(basename "$FILE" .log) # forks basename # FAST: built-in expansion BASE="${FILE%.log}" # no fork, faster # Read file content without fork # SLOW: $(cat file) CONTENT="$(cat /etc/hostname)" # forks cat # FAST: built-in read read -r HOSTNAME < /etc/hostname # no fork
Slide 34 of 35  |  Applied Script
Applied Script: TLS Certificate Expiry Monitor
Check TLS certificates on multiple domains and alert when expiry is within threshold days.
#!/usr/bin/env bash set -euo pipefail WARN_DAYS=30; CRITICAL_DAYS=7 mapfile -t DOMAINS < /etc/sector/monitored-domains.txt check_cert() { local domain="$1" local expiry_raw days_left expiry_raw="$(echo | openssl s_client -connect "${domain}:443" \ -servername "$domain" 2>/dev/null \ | openssl x509 -noout -enddate 2>/dev/null \ | cut -d= -f2)" [[ -z "$expiry_raw" ]] && { log_error "Could not read cert: $domain"; return 1; } local expiry_epoch now_epoch expiry_epoch="$(date -d "$expiry_raw" +%s)" now_epoch="$(date +%s)" days_left=$(( (expiry_epoch - now_epoch) / 86400 )) if (( days_left <= CRITICAL_DAYS )); then log_error "CRITICAL: $domain expires in ${days_left} days" elif (( days_left <= WARN_DAYS )); then log_warn "WARNING: $domain expires in ${days_left} days" else log_info "OK: $domain valid for ${days_left} days" fi } for domain in "${DOMAINS[@]}"; do check_cert "$domain"; done
Slide 35 of 35  |  ALA-05 Summary
Bash Scripting: What You Now Know
A script is not a trick. It is a system. Every operator in this cell is now capable of building automation that fails loudly, cleans up after itself, accepts arguments cleanly, and can be tested, reviewed, and trusted under pressure.
1Every production script starts with set -euo pipefail and a trap on EXIT for cleanup. No exceptions.
2Always double-quote variable expansions: "$VAR". Unquoted variables undergo word splitting and globbing.
3Use [[ ]] not [ ] in bash scripts. Supports regex via =~, glob patterns, and avoids word-splitting traps.
4Use local for all variables inside functions. Without it, every variable is global and will stomp on callers.
5Read files with while IFS= read -r line; do ... done < file -- never with for line in $(cat file).
6Use getopts to parse flags. Define usage() as a function. Check required args after parsing with shift $((OPTIND-1)).
7Return data from functions via echo and capture with $(). Return codes (0-255) signal success or failure type.
8Run shellcheck on every script before deployment. Run bash -n for syntax checks. Use bats for automated tests.
9Avoid subshells in tight loops. Use built-in parameter expansion instead of forking basename, cat, or cut.