Service Authority — systemd Deep Dive | Advanced Linux Administration

Slide 1 of 36 | ALA-02 | Week 1 of 4

Service Authority
systemd Deep Dive

Unit Types • Unit File Anatomy • Dependencies • systemctl • journalctl

Sector command requires three services to come online in the correct order -- the database before the API, the API before the gateway. One misconfigured dependency and the gateway starts against a cold database. This module is how you specify that order and guarantee it.

36 Slides ALA-02 Week 1 of 4 Ubuntu 22.04 LTS

Slide 2 of 36

What Is systemd?

PID 1. The first process the kernel starts. Everything else depends on it.

PID 1 and the Init System

When the Linux kernel finishes loading, it executes exactly one process: init. Since Ubuntu 15.04, that process is systemd. It is responsible for starting all other processes, managing their lifecycles, and shutting the system down cleanly. If PID 1 ever exits, the kernel panics.

What Replaced

systemd replaced SysV init (the legacy System V init scripts). SysV used shell scripts in /etc/init.d/ executed sequentially. systemd uses declarative unit files and parallel activation, which shortens boot considerably because independent services no longer wait on each other.

What systemd Controls

Services (daemons), sockets, timers, mount points, swap devices, system targets (runlevels), device nodes, and more. It is not just an init system -- it is a system and service manager. journalctl, hostnamectl, timedatectl, localectl are all part of the ecosystem.

Exam Note

On CompTIA Linux+, RHCSA, and LPIC-1 exams, systemd unit file syntax, systemctl commands, and journalctl filtering are heavily tested. Everything in this module appears on certification exams.

Slide 3 of 36

Unit Types: What systemd Manages

Every resource systemd manages is a unit. Each type has its own file extension and behavior.

.service

A daemon or one-shot process. The most common unit type. Defines how to start, stop, and restart a service process. Examples: nginx.service, ssh.service, postgresql.service.

.socket

A socket activation unit. systemd listens on a socket (TCP port, Unix socket, FIFO) and only starts the associated service when a connection arrives. Saves resources -- the service does not run until it is needed.

.timer

A scheduled unit. Replaces cron for systemd-managed systems. Activates another unit on a schedule. Supports monotonic timers (after boot, after last run) and calendar timers (specific dates/times). Has better logging than cron.

.target

A synchronization point. Groups related units. multi-user.target is the traditional equivalent of runlevel 3 (multi-user, network up, no GUI). graphical.target = runlevel 5. network-online.target means the network is fully up.

.mount

Manages filesystem mount points. Automatically generated from /etc/fstab entries. Can be manually written for complex mount configurations with proper dependency ordering. Mount unit names must match the mount path (slashes become dashes).

.path

Activates a unit when a filesystem path changes (created, modified, deleted). Uses inotify. Useful for watch-and-process workflows -- e.g., start a processing service when a new file arrives in a directory.
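The mount-unit naming rule is easy to get wrong by hand, so here is a minimal sketch that derives the name from a path. It assumes the systemd-escape tool is installed; the function name mount_unit_name and the fallback branch are illustrations only, and the fallback handles only simple paths made of letters, digits, and slashes.

```shell
#!/bin/sh
# Hypothetical helper: print the mount unit name for a mount path.
mount_unit_name() {
    if command -v systemd-escape >/dev/null 2>&1; then
        # Let systemd apply its own escaping rules (covers every edge case).
        systemd-escape --path --suffix=mount "$1"
    else
        # Fallback for simple paths only: strip leading/trailing slashes,
        # then turn the remaining slashes into dashes.
        printf '%s.mount\n' "$(printf '%s' "$1" | sed -e 's|^/*||' -e 's|/*$||' -e 's|/|-|g')"
    fi
}
```

Calling mount_unit_name /var/lib/data prints var-lib-data.mount, which is the file name the mount unit must use.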

Slide 4 of 36

Unit File Locations

Three directories. Precedence matters. Editing the wrong one is a common mistake.

/lib/systemd/system/

Vendor-provided units. Installed by packages (apt, dpkg). Never edit files here directly -- package updates will overwrite your changes silently. On Ubuntu 22.04, /lib is a symlink to /usr/lib, so /lib/systemd/system/ and /usr/lib/systemd/system/ are the same directory.

/etc/systemd/system/

Administrator-created and overriding units. Files here take precedence over /lib/systemd/system/. This is where you put custom service files and drop-in override files. Survives package updates. Always work here.

/run/systemd/system/

Runtime units created dynamically. Not persistent across reboots. Created by systemd itself or by other programs at runtime. You rarely write files here manually -- these are managed programmatically.

# List all unit files and their states
systemctl list-unit-files

# Find where a specific unit's file lives
systemctl cat nginx.service          # shows the file with its path in a header comment

# Show the effective configuration (vendor + overrides merged)
systemctl show nginx.service

# Your custom unit files go here
ls -la /etc/systemd/system/
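To make the precedence rule concrete, the sketch below walks the directories from this slide in priority order and prints the first copy of a unit file it finds. The function name effective_unit_path and its optional root-prefix argument are assumptions added for illustration and testing; systemd itself searches a few more directories than the ones listed here, so treat systemctl cat as the authoritative answer.

```shell
#!/bin/sh
# Hypothetical helper: print the unit file that wins for a given unit name.
# $1 = unit name, $2 = optional root prefix (an assumption, useful for tests).
effective_unit_path() {
    unit=$1
    root=${2:-}
    # Highest precedence first: administrator, runtime, vendor.
    for dir in /etc/systemd/system /run/systemd/system \
               /usr/lib/systemd/system /lib/systemd/system; do
        if [ -e "$root$dir/$unit" ] || [ -L "$root$dir/$unit" ]; then
            printf '%s\n' "$root$dir/$unit"
            return 0
        fi
    done
    return 1   # no copy found in the directories covered here
}
```

On a real system, effective_unit_path nginx.service should agree with the path shown in the header comment of systemctl cat nginx.service.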

Slide 5 of 36

Service Unit: [Unit] Section

Metadata and dependency declarations. Read before the service is started.

# /etc/systemd/system/sector-api.service
# [Unit] section: describes the unit and declares dependencies

[Unit]
Description=Sector API Application Server
Documentation=https://internal.docs/sector-api

# After: this unit starts AFTER the listed units are active
After=network-online.target sector-db.service

# Requires: hard dependency -- if sector-db.service fails to start or is explicitly stopped, this stops too
Requires=sector-db.service

# Wants: soft dependency — try to start, but continue even if it fails
Wants=network-online.target

# ConditionPathExists: only start if this file exists
ConditionPathExists=/opt/sector-api/sector-api.jar

Description

A human-readable name shown in systemctl status output. Make it descriptive. This is the first thing an operator reads when troubleshooting an unknown service at 3 AM.

Documentation

A URI pointing to documentation. Supports http://, https://, man:, info:, and file: URIs. The URIs are listed in systemctl status output, and systemctl help unit-name opens any man: pages named here. Include it -- your future self will thank you.

Slide 6 of 36

Service Unit: [Service] Section

How to start, stop, and supervise the process. The most complex section.

[Service]
# Type: defines how systemd tracks when the service is "ready"
# simple is the default: process started = service ready
Type=simple
# Type=forking      # old-style daemons that fork and exit parent
# Type=notify       # service sends sd_notify() when ready
# Type=oneshot      # short-lived task, not a daemon

# Run as this user/group
User=sector-svc
Group=sector-svc

# Working directory
WorkingDirectory=/opt/sector-api

# Environment variables
EnvironmentFile=/etc/sector-api/env
Environment="LOG_LEVEL=INFO"

# The actual command to start the service
ExecStart=/usr/bin/java -jar /opt/sector-api/sector-api.jar

# Optional: run before ExecStart for setup
ExecStartPre=/usr/bin/test -f /opt/sector-api/sector-api.jar

# Restart policy
Restart=on-failure
RestartSec=5
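One syntax detail matters when copying directives into a real unit file: systemd treats a line as a comment only when it begins with # or ;. Text placed after a value on the same line is read as part of the value. The sketch below is a heuristic lint for that mistake; the function name find_trailing_comments is hypothetical, and a value that legitimately contains a space followed by # will also be reported.

```shell
#!/bin/sh
# Hypothetical lint: report "Key=value   # comment" lines in a unit file.
# Prints matching lines with their line numbers; exit status 0 means
# at least one suspicious line was found.
find_trailing_comments() {
    grep -nE '^[A-Za-z][A-Za-z0-9]*=.*[[:space:]]#' "$1"
}
```

Run it against a unit before daemon-reload, for example find_trailing_comments /etc/systemd/system/sector-api.service, and move any reported comment onto its own line above the directive.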

Slide 7 of 36

Service Type Values

The wrong Type causes systemd to mistrack your service. Understand each one.

Type=simple

The default. systemd considers the service started as soon as ExecStart runs. The process does NOT fork. Use for modern daemons that stay in the foreground. Most Python, Java, and Node.js services use this type.

Type=forking

For traditional Unix daemons that fork a child and exit the parent. systemd waits for the parent to exit, then tracks the child. You should also set PIDFile= so systemd knows which PID to track after the fork; without it systemd has to guess the main process. Legacy -- avoid in new code.

Type=notify

The service calls sd_notify("READY=1") when it is fully initialized. systemd waits for this notification before marking the service active. Use for services with non-trivial startup (loading config, connecting to DB). systemd-networkd and systemd-resolved use this, and PostgreSQL supports it when built with systemd support; the nginx unit shipped by Ubuntu uses Type=forking instead.

Type=oneshot

For short-lived tasks that run and exit (not daemons). systemd waits for ExecStart to exit before marking the unit active. Set RemainAfterExit=yes if you want the unit to show as "active" after the task completes.

Type=dbus

Service is considered ready when it acquires a specific name on the D-Bus system bus. Specify BusName=. Used by desktop services and some system daemons. Rare in server environments.

Slide 8 of 36

Service Unit: [Install] Section

Controls what happens when you run systemctl enable. Not read at runtime.

[Install]
# WantedBy: which target "wants" this unit when enabled
# multi-user.target = start at boot in multi-user mode (no GUI)
# graphical.target  = start when GUI is available
WantedBy=multi-user.target

# RequiredBy: hard dependency from the target to this unit
# If the target fails to start this unit, the target itself fails
# RequiredBy=multi-user.target

# Alias: alternative names for the unit
# Alias=api.service

What systemctl enable Does

Creates a symlink in /etc/systemd/system/multi-user.target.wants/ pointing to your unit file. This symlink is what causes the service to start at boot. systemctl disable removes the symlink. Neither starts nor stops the service immediately.

enable vs start

enable = set to auto-start at boot. start = start right now. Do both after writing a new service: systemctl enable --now unit-name enables AND starts in one command. This is the standard deployment pattern.

Deployment Pattern

After writing a new unit file: systemctl daemon-reload (load the new file), then systemctl enable --now sector-api.service (enable at boot and start now). Two commands, correct order, every time.
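The deployment pattern above can be wrapped in a small function so the order is never skipped. This is a sketch, not a standard tool: the function name deploy_unit and the SYSTEMCTL variable are assumptions, the latter added so the sequence can be dry-run with SYSTEMCTL=echo.

```shell
#!/bin/sh
# SYSTEMCTL is an assumption for dry runs; it defaults to the real systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Hypothetical helper: load a new unit file, enable it, start it, report state.
deploy_unit() {
    unit=$1
    $SYSTEMCTL daemon-reload || return 1          # pick up the new or edited file
    $SYSTEMCTL enable --now "$unit" || return 1   # boot persistence + start now
    $SYSTEMCTL is-active "$unit"                  # prints the state; non-zero if not active
}
```

Usage as root: deploy_unit sector-api.service. With SYSTEMCTL=echo set beforehand, the function only prints the three systemctl invocations in order.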

Slide 9 of 36

Restart Policies: Service Resilience

systemd can automatically recover from service failures. Configure it explicitly.

Restart=no

Default. systemd does not restart the service on any exit. If it crashes, it stays down until an operator manually starts it. Appropriate for one-shot tasks or services where unexpected restarts would cause harm.

Restart=on-failure

Restart only if the process exits with a non-zero code, is killed by a signal, times out, or hits a watchdog. Does NOT restart on clean exit (exit 0). The most common setting for production services -- handles crashes without restarting intentional shutdowns.

Restart=always

Always restart, regardless of exit code. Even a clean exit triggers a restart. Use for services that are expected to run forever and where no exit is intentional from systemd's perspective. Combine with StartLimitIntervalSec to prevent restart storms.

[Unit]
# Prevent infinite restart storms: allow max 5 starts in 60 seconds
# (these two settings belong in [Unit], not [Service])
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Restart=on-failure
# wait 5 seconds before restarting
RestartSec=5

# After hitting the burst limit, the unit enters "failed" state
# Recovery: systemctl reset-failed unit-name && systemctl start unit-name

Slide 10 of 36

Dependencies: Requires, Wants, After, Before

Four directives. Requires and Wants declare what must exist. After and Before declare order.

These four directives answer two distinct questions: "What do I need?" (Requires/Wants) and "When do I start relative to them?" (After/Before). They are orthogonal -- declaring Requires does NOT imply After. You must specify both if you need both.

Requires=unit

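Hard dependency. If the required unit cannot be started, this unit is not started either (pair it with After= so the result is known first). If the required unit is explicitly stopped or restarted while this unit is running, this unit is stopped or restarted too. A required unit that exits on its own does not take this unit down; use BindsTo= for that. Use when the dependency is truly non-negotiable.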

Wants=unit

Soft dependency. systemd will try to start the wanted unit alongside this one, but this unit is not affected if the wanted unit fails. Use for optional dependencies or services that enhance but are not required.

After=unit

Ordering only. This unit starts AFTER the listed unit is considered active. Does NOT imply Requires or Wants -- just controls sequence. If both units would start at boot, this one waits. If only this unit is started, After has no effect.

Before=unit

This unit must be active before the listed unit starts. The inverse of After. If unit A has Before=B, it is equivalent to B having After=A. Use in units that are consumed by others, rather than in consumers.
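Because Requires does not imply After, a quick check for units that declare the first without the second can catch ordering bugs early. The sketch below reads a single unit file and prints every Requires= entry that has no matching After= entry. The function name requires_without_after is hypothetical, and the check ignores drop-ins and Before= lines in other units, so systemctl show -p After remains the authoritative view.

```shell
#!/bin/sh
# Hypothetical check: list units named in Requires= but missing from After=.
# $1 = path to a unit file.
requires_without_after() {
    awk -F= '
        /^Requires=/ { n = split($2, a, " "); for (i = 1; i <= n; i++) req[a[i]] = 1 }
        /^After=/    { n = split($2, a, " "); for (i = 1; i <= n; i++) aft[a[i]] = 1 }
        END          { for (u in req) if (!(u in aft)) print u }
    ' "$1"
}
```

An empty result means every hard dependency in that file is also ordered; any unit name printed would be started in parallel with the unit that requires it.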

Slide 11 of 36

System Targets

Targets are synchronization points. They replaced SysV runlevels.

multi-user.target

The standard server target. Multi-user, non-graphical, network up. Equivalent to SysV runlevel 3. Most server services declare WantedBy=multi-user.target in their [Install] section. The default target on Ubuntu Server installs.

network-online.target

Reached when at least one network interface is configured and online. Critical distinction: network.target only means the network management stack has been started; interfaces may not have an address yet. network-online.target means an address is assigned and routing works. Use the latter, with both Wants= and After=, for services that need connectivity.

rescue.target / emergency.target

rescue.target = single-user mode, minimal services, root shell. emergency.target = most minimal state possible, read-only root. Used for system repair. Boot into them by adding systemd.unit=rescue.target to the kernel command line in GRUB.

# See the current default target (boots to)
systemctl get-default

# Change the default target
systemctl set-default multi-user.target

# Switch to a target immediately (without rebooting)
systemctl isolate rescue.target

# List all targets and their active state
systemctl list-units --type=target

Slide 12 of 36

systemctl Command Reference

The primary interface for managing units. Know every command in this slide.

# Service lifecycle
systemctl start   nginx.service      # start the service now
systemctl stop    nginx.service      # send SIGTERM, then SIGKILL after timeout
systemctl restart nginx.service      # stop then start
systemctl reload  nginx.service      # run the unit's ExecReload= (often sends SIGHUP): reload config, no downtime
systemctl reload-or-restart nginx    # reload if supported, restart otherwise

# Enable/disable (boot persistence)
systemctl enable  nginx.service      # create symlink, starts at boot
systemctl disable nginx.service      # remove symlink
systemctl enable --now nginx.service # enable AND start immediately
systemctl mask    nginx.service      # prevent start entirely (symlink to /dev/null)
systemctl unmask  nginx.service

# Status and inspection
systemctl status  nginx.service      # human-readable status with recent logs
systemctl is-active  nginx.service   # prints "active" or "inactive"
systemctl is-enabled nginx.service   # prints "enabled" or "disabled"
systemctl is-failed  nginx.service   # returns 0 if in failed state

Slide 13 of 36

daemon-reload and Why It Matters

After editing any unit file, you must tell systemd to re-read its configuration.

What daemon-reload Does

systemd caches unit file contents in memory. When you create or modify a unit file on disk, systemd does not see the changes until you run systemctl daemon-reload. This command re-reads all unit files. It does NOT restart any services.

What Happens Without It

You edit a unit file, run systemctl restart service, and wonder why nothing changed. The service restarted using the old cached unit definition. This is the number one cause of "I edited the file but nothing changed" confusion in systemd.

# Correct workflow for any unit file change:

# 1. Edit or create the unit file
nano /etc/systemd/system/sector-api.service

# 2. Reload systemd's unit file cache
systemctl daemon-reload

# 3. Restart the service to apply changes
systemctl restart sector-api.service

# 4. Verify the new configuration is running
systemctl status sector-api.service

Never Skip Step 2

Always run daemon-reload after editing unit files. Make it muscle memory. The sequence is: edit, reload, restart, verify. Skipping daemon-reload wastes time and causes baffling debugging sessions.

Slide 14 of 36

Drop-In Override Files

Customize vendor units without touching the original file. Survives package upgrades.

nginx is installed via apt. The vendor unit file in /lib/systemd/system/ does not have your required environment variables or restart policy. You cannot edit the vendor file without losing changes on the next apt upgrade. Drop-in files solve this.

# Method 1: systemctl edit (recommended -- creates the directory for you)
systemctl edit nginx.service
# Opens an editor. Save your overrides. Automatically runs daemon-reload.
# Creates: /etc/systemd/system/nginx.service.d/override.conf

# Method 2: manual creation
mkdir -p /etc/systemd/system/nginx.service.d/

# /etc/systemd/system/nginx.service.d/override.conf
[Service]
Environment="NGINX_ENV=production"
Restart=always
RestartSec=3

# Reload and verify the override was applied
systemctl daemon-reload
systemctl cat nginx.service    # shows vendor file + override file concatenated

Rule

Drop-in files are merged with the original unit file. You only need to specify the directives you are changing -- everything else is inherited. A 3-line override file is cleaner and safer than copying and modifying a 50-line vendor file.
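One exception to "only specify what you change" is worth a sketch: settings that hold a list, such as ExecStart=, are appended to by a drop-in rather than replaced, so the inherited value has to be cleared with an empty assignment first. The helper below writes such a drop-in. The function name write_execstart_dropin, its base-directory argument, and the example command are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that replaces ExecStart for a unit.
# $1 = unit name, $2 = new command line, $3 = base dir (default /etc/systemd/system).
write_execstart_dropin() {
    unit=$1
    new_cmd=$2
    base=${3:-/etc/systemd/system}
    mkdir -p "$base/$unit.d" || return 1
    cat > "$base/$unit.d/override.conf" <<EOF
[Service]
# An empty assignment clears the inherited ExecStart list first.
ExecStart=
ExecStart=$new_cmd
EOF
}
```

After writing the file, run systemctl daemon-reload and confirm the result with systemctl cat on the unit; without the empty ExecStart= line, systemd rejects a non-oneshot service that ends up with two ExecStart entries.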

Slide 15 of 36

Service Security: Sandboxing Directives

systemd provides namespace-based isolation without containers. Use it for every production service.

[Service]
# Run as non-root user
User=sector-svc
Group=sector-svc

# Read-only filesystem -- except specified directories
# strict: the entire filesystem is read-only for the service
ProtectSystem=strict
# exception: this path is writable
ReadWritePaths=/opt/sector/data

# Hide sensitive paths from the service
InaccessiblePaths=/etc/shadow /etc/gshadow /root

# Prevent the service from gaining new privileges
NoNewPrivileges=true

# Limit which system calls the service can make (allowlist approach)
# @system-service: predefined set of syscalls commonly needed by services
SystemCallFilter=@system-service

# Private /tmp: service gets its own isolated /tmp
PrivateTmp=true

# Private network namespace (no network access)
# PrivateNetwork=true  -- use only if service needs no networking

Exam and Real-World Note

systemd-analyze security unit-name gives your service a security score and specific recommendations. Run it on every service you write and address the high-severity items. This is production-quality hardening with zero additional software.

Slide 16 of 36

journalctl — The systemd Journal

Structured, indexed, queryable logs. Replaces scattered text files for systemd-managed services.

Why Journal Over Text Files

Traditional logs are plain text. Searching them with grep is slow on large files. The journal stores logs in a binary indexed format. Queries by time range, unit, priority, or PID are fast regardless of log volume. Timestamps are stored with microsecond precision.

Journal Persistence

On Ubuntu 22.04, journald runs with the default Storage=auto setting: it writes to /var/log/journal/ if that directory exists, and otherwise keeps logs in /run/log/journal/, where they are lost on reboot. Create the directory to make the journal persistent. Check with journalctl --disk-usage.

# View all journal entries (most recent last)
journalctl

# Follow new entries in real time (like tail -f)
journalctl -f

# View journal for a specific unit
journalctl -u nginx.service

# Follow a specific unit in real time
journalctl -fu nginx.service

# Show only the last 50 lines of a unit's log
journalctl -u nginx.service -n 50

# Show disk usage of the journal
journalctl --disk-usage
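The persistence rule described on this slide can be checked with a few lines of shell. This sketch assumes journald is using Storage=auto, where the existence of /var/log/journal decides the mode; the function name journal_storage_mode and its root-prefix argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical check, valid only under Storage=auto:
# prints "persistent" if the journal directory exists, "volatile" otherwise.
# $1 = optional root prefix (an assumption, useful for tests and chroots).
journal_storage_mode() {
    root=${1:-}
    if [ -d "$root/var/log/journal" ]; then
        echo persistent
    else
        echo volatile
    fi
}

# To switch a volatile system to persistent storage (as root):
#   mkdir -p /var/log/journal
#   systemd-tmpfiles --create --prefix /var/log/journal
#   journalctl --flush
```

If the function reports volatile on a server, logs from before the last reboot are already gone, so make the switch before the next incident rather than after it.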

Slide 17 of 36

journalctl: Advanced Filtering

Time ranges, priority levels, and structured field queries.

# Filter by time range
journalctl --since "2026-04-09 08:00:00"
journalctl --since "1 hour ago"
journalctl --since "today" --until "now"
journalctl -u nginx.service --since "2026-04-09" --until "2026-04-09 23:59:59"

# Filter by priority level (0=emerg, 1=alert, 2=crit, 3=err, 4=warning, 5=notice, 6=info, 7=debug)
journalctl -p err                    # errors and above (0-3)
journalctl -p warning..err           # range: warning to error
journalctl -p crit -u nginx.service  # critical events from nginx

# Filter by PID or executable path
journalctl _PID=14823
journalctl _EXE=/usr/sbin/sshd

# JSON output for machine processing
journalctl -u nginx.service -o json-pretty | head -40

# Export for sharing or archival
journalctl -u nginx.service --since "today" > /tmp/nginx-today.log

Slide 18 of 36

journalctl: Boot Logs and Boot Analysis

Diagnose boot failures and track changes across reboots.

# List all recorded boot sessions
journalctl --list-boots

# View logs from the current boot
journalctl -b

# View logs from the previous boot (useful after a crash)
journalctl -b -1

# View logs from two boots ago
journalctl -b -2

# Errors and above from the previous boot
journalctl -b -1 -p err

# systemd-analyze: boot time performance breakdown
systemd-analyze                        # total boot time
systemd-analyze blame                  # which units took longest
systemd-analyze critical-chain         # the critical path that determined total boot time

# Plot boot sequence to an SVG
systemd-analyze plot > /tmp/boot-chart.svg

Slide 19 of 36

Writing a Custom Service Unit

Full working example: a Node.js API server with security hardening.

# /etc/systemd/system/sector-api.service
[Unit]
Description=Sector API Service
Documentation=https://internal.docs/sector-api
After=network-online.target
Wants=network-online.target
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Type=simple
User=sector-svc
Group=sector-svc
WorkingDirectory=/opt/sector-api
# leading - means: ignore if missing
EnvironmentFile=-/etc/sector-api/env
ExecStartPre=/usr/bin/node --check /opt/sector-api/server.js
ExecStart=/usr/bin/node /opt/sector-api/server.js
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=sector-api
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ReadWritePaths=/opt/sector-api/data /var/log/sector-api

[Install]
WantedBy=multi-user.target

Slide 20 of 36

Timer Units: Replacing Cron

Scheduled tasks with full systemd logging, dependencies, and failure detection.

A cron job that fails silently is a compliance nightmare. systemd timers write every execution to the journal, support systemd dependencies, and can be inspected with systemctl status. Migrate critical cron jobs to timers.

# /etc/systemd/system/sector-backup.service (the task to run)
[Unit]
Description=Sector Database Backup
After=sector-db.service

[Service]
Type=oneshot
User=backup-svc
ExecStart=/opt/sector/scripts/backup.sh

---
# /etc/systemd/system/sector-backup.timer (the schedule)
[Unit]
Description=Run Sector Backup Nightly

[Timer]
# every day at 02:30
OnCalendar=*-*-* 02:30:00
# allow 1 min drift for system load
AccuracySec=1m
# run missed execution on next boot
Persistent=true

[Install]
WantedBy=timers.target

---
# Enable and start the timer (not the service -- the timer fires the service)
systemctl enable --now sector-backup.timer
systemctl list-timers                   # see all timers and next fire time

Slide 21 of 36

Timer: Calendar Syntax Reference

OnCalendar expressions. More expressive than cron and verifiable with systemd-analyze.

# systemd calendar event format: DOW YYYY-MM-DD HH:MM:SS
# * = any value  , = list  .. = range  / = step

OnCalendar=daily                     # 00:00:00 every day (shorthand)
OnCalendar=hourly                    # top of every hour
OnCalendar=weekly                    # Monday 00:00:00
OnCalendar=monthly                   # first of each month at 00:00:00

OnCalendar=*-*-* 02:30:00           # every day at 02:30
OnCalendar=Mon *-*-* 06:00:00       # every Monday at 06:00
OnCalendar=Mon..Fri *-*-* 09:00:00  # weekdays at 09:00
OnCalendar=*-*-1 00:00:00           # first of every month
OnCalendar=*-*-* *:00/15:00         # every 15 minutes

# Verify a calendar expression before deploying
systemd-analyze calendar --iterations=5 "Mon..Fri *-*-* 09:00:00"
# Output shows the normalized form and the next 5 scheduled fire times

Tip

Always run systemd-analyze calendar "your-expression" before deploying a timer. It shows the normalized expression and the next scheduled execution time (add --iterations=N to see more), so you can verify you wrote the expression correctly before it misses a production backup window.

Slide 22 of 36

Socket Activation: On-Demand Services

systemd listens on a socket. The service only starts when a connection arrives.

The Problem It Solves

Some services are needed infrequently. Running them continuously wastes RAM. With socket activation, systemd holds the socket open. When a client connects, systemd starts the service and hands it the socket. No connection is refused -- the kernel queues the first connection while the service starts, so from the client's perspective the service was always listening.

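SSH Can Use This

Ubuntu's openssh-server package ships ssh.socket, which listens on port 22 and activates the SSH service when a connection arrives. On Ubuntu 22.04 the socket unit is disabled by default and ssh.service runs continuously; newer Ubuntu releases enable socket activation by default. On low-traffic servers this avoids keeping sshd running for connections that happen twice a day. Check with systemctl status ssh.socket.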

# /etc/systemd/system/sector-worker.socket
[Unit]
Description=Sector Worker Socket

[Socket]
# TCP socket on 127.0.0.1, port 9000
ListenStream=127.0.0.1:9000
# Accept=false: hand the listening socket to a single service instance
Accept=false

[Install]
WantedBy=sockets.target

---
# /etc/systemd/system/sector-worker.service
[Unit]
Description=Sector Worker (Socket Activated)

[Service]
ExecStart=/opt/sector/worker
# stdin is the listening socket passed by systemd; the worker calls accept() on it
StandardInput=socket

Slide 23 of 36

Dependency Chains: Multi-Service Startup

Design a correct startup sequence for a three-tier application stack.

Three services: PostgreSQL database, a REST API, and an nginx reverse proxy. PostgreSQL must be active before the API. The API must be active before nginx serves traffic. If the database dies, the API must stop. If the API dies, nginx must stop.

# sector-db.service (PostgreSQL wrapper)
[Unit]
Description=Sector Database (PostgreSQL)
After=network-online.target

---
# sector-api.service
[Unit]
Description=Sector API
# hard: if db stops or dies, api is stopped (BindsTo= is Requires= plus this guarantee)
BindsTo=sector-db.service
# ordering: start after db is active
After=sector-db.service

---
# sector-proxy.service (nginx)
[Unit]
Description=Sector Proxy (nginx)
# hard: if api stops or dies, proxy is stopped (BindsTo= is Requires= plus this guarantee)
BindsTo=sector-api.service
# ordering: start after api is active
After=sector-api.service

---
# Visualize the dependency graph
systemd-analyze dot sector-proxy.service | dot -Tsvg > /tmp/deps.svg

Slide 24 of 36

systemd-analyze — Diagnostics Toolkit

Analyze boot performance, unit dependencies, and security posture.

# Overall boot time breakdown
systemd-analyze
# Startup finished in 1.923s (kernel) + 4.516s (userspace) = 6.439s graphical.target

# Which units slowed boot the most?
systemd-analyze blame
# 3.201s apt-daily-upgrade.service
# 1.893s snapd.service

# Show the dependency chain that determined total boot time
systemd-analyze critical-chain

# Exposure score for a unit (0.0 = best sandboxed, 10.0 = most exposed)
systemd-analyze security nginx.service
# Unhardened vendor units typically land in the UNSAFE range (around 9)

# Validate a unit file for syntax errors
systemd-analyze verify /etc/systemd/system/sector-api.service

# Generate full dependency graph (requires graphviz)
systemd-analyze dot --require | dot -Tsvg > /tmp/full-deps.svg

Slide 25 of 36

Unit States: Reading systemctl status

Understand every field in the status output before you can diagnose failures.

# systemctl status nginx.service output anatomy
* nginx.service - A high performance web server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
#              ^path                                ^boot-state  ^whether enabled by default
     Active: active (running) since Thu 2026-04-09 08:00:01 UTC; 2h 17min ago
#             ^active-state  ^sub-state  ^timestamp
       Docs: man:nginx(8)
   Main PID: 1847 (nginx)
      Tasks: 5 (limit: 9446)
     Memory: 6.8M
        CPU: 345ms
     CGroup: /system.slice/nginx.service
             |- 1847 "nginx: master process /usr/sbin/nginx"
             |- 1848 "nginx: worker process"

Active States

active (running) = one or more processes active. active (exited) = oneshot completed successfully. active (waiting) = waiting for an event (path, timer, socket).

Inactive and Transitional States

inactive (dead) = not running, no failure. failed = exited with error or killed by signal. activating = starting up. deactivating = shutting down. reloading = config reload in progress.

Load States

loaded = file read successfully. not-found = unit file does not exist. masked = unit is symlinked to /dev/null -- cannot be started. bad-setting = unit file has a syntax error.

Slide 26 of 36

cgroups: Resource Limits for Services

Prevent a runaway service from consuming all CPU, RAM, or I/O on the system.

[Service]
# CPU: limit to 50% of one CPU core
CPUQuota=50%

# Memory: hard limit -- kernel OOM-kills at this threshold
MemoryMax=512M

# Memory: soft limit -- kernel applies memory pressure above this
MemoryHigh=400M

# I/O weight (relative to other services, 100 = default)
IOWeight=50

# Limit the number of tasks (threads + processes)
TasksMax=64

Inspect Live cgroup Usage

systemctl status unit-name shows Memory and CPU in its output. systemd-cgtop shows real-time resource usage by cgroup -- like htop but organized by service. cat /sys/fs/cgroup/system.slice/nginx.service/memory.current shows raw bytes.

Why This Matters

Without resource limits, a memory leak in one service can trigger the OOM killer system-wide, taking down unrelated services. CPUQuota prevents one service from monopolizing cores during a spike. These directives are mandatory for multi-tenant servers.
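To compare a service's real memory use against a planned MemoryMax value, the raw cgroup counter can be read directly. This sketch assumes cgroup v2 (the default on Ubuntu 22.04) and a unit that lives in system.slice; the function name unit_memory_mib and the optional cgroup-root argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a unit's current memory use in whole MiB.
# $1 = unit name, $2 = cgroup root (default /sys/fs/cgroup, cgroup v2 assumed).
unit_memory_mib() {
    unit=$1
    cgroot=${2:-/sys/fs/cgroup}
    file="$cgroot/system.slice/$unit/memory.current"
    [ -r "$file" ] || return 1          # unit not running, or not in system.slice
    bytes=$(cat "$file")
    echo $(( bytes / 1048576 ))         # bytes to MiB, rounded down
}
```

Sampling unit_memory_mib nginx.service under normal load gives a baseline; setting MemoryHigh somewhat above that baseline and MemoryMax above MemoryHigh is one reasonable starting point.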

Slide 27 of 36

Transient Units: systemd-run

Run a command as a systemd service without writing a unit file. Useful for testing and one-off jobs.

# Run a command as a transient service unit
systemd-run --unit=scan-job /opt/scripts/network-scan.sh

# Check its status and logs like any unit
systemctl status scan-job.service
journalctl -u scan-job.service

# Run with resource limits (test your limits before writing unit files)
systemd-run --unit=limited-job --property=MemoryMax=100M python3 /opt/process-data.py

# Run as a different user
systemd-run --uid=www-data --gid=www-data /usr/bin/php /opt/task.php

# Run with a specific environment variable
systemd-run -E SECTOR_ENV=staging /opt/sector/deploy.sh

# Interactive shell in a transient service unit (useful for debugging cgroup isolation)
systemd-run --user --pty -p MemoryMax=512M /bin/bash

Use Case

Use systemd-run to test a command under the same cgroup constraints you plan to use in a unit file. Validate that MemoryMax values are appropriate before writing the production unit. Faster than writing, deploying, testing, and rolling back a unit file.

Slide 28 of 36

Debugging Failed Services

A systematic approach. Follow this sequence before escalating or googling.

systemctl start sector-api returns immediately. systemctl status shows "failed." You need to find the root cause in under 3 minutes. This is the sequence.

# Step 1: Read the status output (includes last 10 log lines)
systemctl status sector-api.service

# Step 2: Read the full journal for this unit, this boot
journalctl -u sector-api.service -b --no-pager

# Step 3: If it failed to start, check ExecStart path and permissions
systemctl cat sector-api.service    # see the unit file
ls -la /opt/sector-api/server.js    # does the file exist?
stat /opt/sector-api/server.js       # what user:group owns it?

# Step 4: Check if the unit file has a syntax error
systemd-analyze verify sector-api.service

# Step 5: Reset failure state and try again with verbose output
systemctl reset-failed sector-api.service
SYSTEMD_LOG_LEVEL=debug systemctl start sector-api.service
journalctl -u sector-api.service -n 50 -b

Slide 29 of 36

Environment Files: Secrets and Config

Keep credentials out of unit files and version control. Load them at service start.

# /etc/sector-api/env  (mode 640, owner root, group sector-svc)
DB_PASSWORD=s3cr3t-db-key
API_SECRET=a1b2c3d4e5f6
LOG_LEVEL=INFO
LISTEN_PORT=8080

---
# Reference the env file in the unit (leading - means OK if missing)
[Service]
EnvironmentFile=-/etc/sector-api/env

# The environment variables are then available to the process
ExecStart=/usr/bin/node /opt/sector-api/server.js

---
# Verify which env file the unit loads (values set with Environment= appear under -p Environment)
systemctl show sector-api.service -p EnvironmentFiles -p Environment

# Lock down the env file -- only sector-svc and root can read it
chown root:sector-svc /etc/sector-api/env
chmod 640 /etc/sector-api/env

Never Do This

Never hardcode credentials in the unit file itself. Unit files are world-readable by default (-rw-r--r--). Any user on the system can run systemctl cat sector-api.service and read them. Use EnvironmentFile with locked-down permissions instead.
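A locked-down env file only helps if the permissions stay locked down, so a small check is worth adding to deployment scripts. The sketch below fails when the file grants any access to "other". The function name env_file_is_private is hypothetical, and it relies on GNU stat, which Ubuntu provides.

```shell
#!/bin/sh
# Hypothetical check: succeed only if "other" has no access to the file.
# Exit status: 0 = private, 1 = too permissive, 2 = file could not be read.
env_file_is_private() {
    mode=$(stat -c '%a' "$1") || return 2   # octal mode, e.g. 640
    case $mode in
        *0) return 0 ;;                     # last digit 0: no access for others
        *)  return 1 ;;
    esac
}
```

Usage: env_file_is_private /etc/sector-api/env || echo "env file is readable by other users". Modes 600 and 640 pass; 644 fails.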

Slide 30 of 36

Watchdog: Liveness Monitoring

systemd can restart services that stop responding, even if they do not crash.

What the Watchdog Does

With WatchdogSec= set, systemd expects the service to send sd_notify("WATCHDOG=1") at least once per watchdog interval. If the service goes silent (deadlock, infinite loop, hung thread), systemd kills and restarts it after the interval expires.

Service Requirements

The service must be instrumented to call sd_notify("WATCHDOG=1") periodically. Read the interval from the WATCHDOG_USEC environment variable (a value in microseconds) and notify at half that interval. Most modern service frameworks support this natively.

[Service]
# Type=notify lets the service report readiness and heartbeats to systemd
Type=notify
# Restart if no WATCHDOG=1 arrives for 30 seconds
WatchdogSec=30
Restart=on-failure

---
# In the service process (Python example using systemd bindings)
# from systemd.daemon import notify
# import time, os
# interval = int(os.environ.get('WATCHDOG_USEC', 30000000)) / 2000000
# while True:
#     do_work()
#     notify('WATCHDOG=1')   # tell systemd "I'm alive"
#     time.sleep(interval)
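
The same heartbeat pattern can be sketched for a service written as a shell script, using the systemd-notify command. This is a sketch under stated assumptions: do_work is a hypothetical placeholder for the real service logic, the 30-second fallback mirrors the WatchdogSec=30 example, and the unit will typically need NotifyAccess=all because systemd-notify runs as a child process rather than as the main PID.

```shell
# Half the watchdog interval, in whole seconds (minimum 1).
# systemd sets WATCHDOG_USEC when WatchdogSec= is configured;
# 30000000 (30 s) is an assumed fallback for runs outside systemd.
watchdog_sleep_seconds() {
    usec=${WATCHDOG_USEC:-30000000}
    secs=$((usec / 2000000))
    [ "$secs" -ge 1 ] || secs=1
    echo "$secs"
}

# Heartbeat loop: signal readiness once, then do one unit of work
# and tell systemd the service is alive before sleeping.
heartbeat_loop() {
    systemd-notify --ready
    while true; do
        do_work                      # hypothetical: the service's real work
        systemd-notify WATCHDOG=1    # tell systemd "I'm alive"
        sleep "$(watchdog_sleep_seconds)"
    done
}
```

Each pass through do_work plus the sleep has to finish inside the full watchdog interval, otherwise systemd treats the service as hung.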

Slide 31 of 36

Inspecting the System: List Commands

Find services, understand what is running, and audit the system state.

# List loaded units (active, failed, or with pending jobs; add --all for inactive)
systemctl list-units

# List only service units
systemctl list-units --type=service

# List only failed units
systemctl list-units --state=failed

# List all unit files and their enablement state
systemctl list-unit-files --type=service

# List all timers with next activation time
systemctl list-timers --all

# List dependencies of a unit (what it depends on)
systemctl list-dependencies nginx.service

# List reverse dependencies (what depends on this unit)
systemctl list-dependencies --reverse nginx.service

# Show all properties of a unit
systemctl show nginx.service

# Check whether a specific property is set
systemctl show nginx.service -p Restart -p RestartSec

Slide 32 of 36

Journal Management

Control journal size, rotation, and retention. Prevent unbounded disk growth.

# /etc/systemd/journald.conf -- key settings
[Journal]

# Maximum journal size on disk
SystemMaxUse=500M

# Maximum size of a single journal file
SystemMaxFileSize=50M

# Keep journals for this many days
MaxRetentionSec=30day

# Compress journal files
Compress=yes

---
# Apply changes after editing journald.conf
systemctl restart systemd-journald

# Manually vacuum old journal files
journalctl --vacuum-size=200M       # keep only last 200 MB of journals
journalctl --vacuum-time=7d         # delete journals older than 7 days
journalctl --vacuum-files=5         # keep only last 5 archive journal files

# Check current disk usage
journalctl --disk-usage
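
An alternative to editing journald.conf in place is a drop-in file under /etc/systemd/journald.conf.d/, which keeps local settings separate from the packaged file. This is a minimal sketch using the values from this slide. The function name write_journald_dropin and the file name size.conf are hypothetical, and the directory argument exists so the function can be tried against a scratch directory first.

```shell
# Write the size and retention limits as a journald drop-in under DIR.
# DIR defaults to the system drop-in directory (writing there requires root).
write_journald_dropin() {
    dir=${1:-/etc/systemd/journald.conf.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/size.conf" <<'EOF'
[Journal]
SystemMaxUse=500M
SystemMaxFileSize=50M
MaxRetentionSec=30day
Compress=yes
EOF
}
```

After writing the real drop-in, apply it the same way as any journald.conf change: systemctl restart systemd-journald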

Slide 33 of 36 | Legacy Reference

Legacy Service Control: Before systemctl

systemd is the modern default, but you will still meet SysVinit boxes -- older RHEL/CentOS, SUSE, appliances, embedded systems. Know the commands and how they map.

# SysVinit / LSB era -- managing services WITHOUT systemd

# --- Enable / disable a service at boot: chkconfig (RHEL + SUSE SysV; service name varies by distro) ---
chkconfig apache2 on          # add to the boot runlevels  (RHEL names it httpd; systemd: systemctl enable apache2)
chkconfig apache2 off         # remove from the init runlevels -- disable at boot (systemd: systemctl disable apache2)
chkconfig --list              # per-runlevel on/off state  (systemd: systemctl list-unit-files)

# --- Start / stop / check a running service (LSB wrapper: Debian + RHEL) ---
service apache2 start         # systemd: systemctl start apache2
service apache2 stop          # systemd: systemctl stop apache2
service apache2 status        # systemd: systemctl status apache2

# --- SUSE convenience symlinks: rc<service> ---
rcapache2 start             # start the apache2 service                  (systemd: systemctl start apache2)
rcapache2 stop              # stop a running instance of apache2     (systemd: systemctl stop apache2)

# --- Or call the SysV init script directly ---
/etc/init.d/apache2 restart  # the actual script the wrappers above invoke

# Runlevels were SysV's targets: 3 = multi-user (multi-user.target), 5 = GUI (graphical.target), 0 = halt, 6 = reboot
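
The runlevel mapping can be written as a small lookup. This is a sketch for reference when translating old runbooks. The function name runlevel_to_target is hypothetical, and the mapping follows systemd's runlevelN.target aliases, where runlevel 0 corresponds to poweroff.target.

```shell
# Translate a SysV runlevel number into the equivalent systemd target.
runlevel_to_target() {
    case "$1" in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}
```

Usage: systemctl isolate "$(runlevel_to_target 3)" switches to the multi-user target, roughly what telinit 3 did under SysV.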

Slide 34 of 36 | Lab Exercises

Practice Exercises

Complete these before the lab ends. Use your Ubuntu 22.04 VM.

1 Write a complete service unit for a Python script of your choice. Include all three sections ([Unit], [Service], [Install]), a non-root User, PrivateTmp=true, and Restart=on-failure. Deploy it with daemon-reload and verify it starts.

2 Create a drop-in override for ssh.service that adds Restart=always and RestartSec=3. Verify with systemctl cat ssh.service that the override is merged correctly.

3 Write a timer + service pair that runs a one-line script every 5 minutes. Verify with systemctl list-timers that it fires correctly. Check the journal to confirm execution.

4 Use journalctl to find all errors from the current boot for all units. Then narrow it to just the ssh service for the past 24 hours. Export the result to a file.

5 Run systemd-analyze security nginx.service and identify the top 3 security improvements recommended. Implement at least one using a drop-in override. Re-run the analysis and verify the score improved.

Slide 35 of 36

What's Next

systemd runs your services. The next modules dig into what those services connect to.

ALA-03: Network Configuration

Netplan YAML, ip link and ip addr, NetworkManager vs systemd-networkd, bonding modes, VLAN configuration. network-online.target will make sense in new ways after this module.

ALA-04: Grid Diagnostics

ss, ip route, dig, tcpdump, nmap. When services start but clients cannot reach them, you need these tools. The ss output maps directly to the socket units you just learned to configure.

Week 2: Storage and Security

LVM volume management, LUKS encryption, filesystem tuning. The ProtectSystem and ReadWritePaths directives you wrote today will interact with everything you learn about storage in Week 2.

Key Integration Point

Every service you administer -- nginx, PostgreSQL, custom applications -- is now a unit you can manage, constrain, monitor, and schedule with the tools from this module. systemd is not a complexity tax. It is the control plane for the entire system.

Slide 36 of 36 | ALA-02

ALA-02 Summary: Key Takeaways

You can now design, deploy, harden, and troubleshoot systemd services. You understand the difference between Requires and Wants, why After and Requires are orthogonal, and why daemon-reload is never optional. These are not basics -- this is how professional Linux administrators operate.

8 Facts to Carry Out of This Lecture

1 systemd is PID 1. Unit files live in /lib/systemd/system/ (vendor) and /etc/systemd/system/ (admin). Always work in /etc/.

2 Three sections: [Unit] (metadata + dependencies), [Service] (process config), [Install] (boot integration). systemctl enable reads [Install], so a service meant to start at boot needs all three.

3 Requires = hard dependency (stop together). Wants = soft dependency. After = ordering only. You must specify both Requires and After to get both behavior and order.

4 After editing any unit file: daemon-reload then restart. Never skip daemon-reload. It costs 0.1 seconds and saves hours of confusion.

5 Use drop-in overrides (systemctl edit) to modify vendor units. Never edit files in /lib/systemd/system/ directly -- package updates will overwrite them.

6 journalctl -u unit -b -p err is your first debugging tool. systemd-analyze verify catches syntax errors before deployment.

7 Timers replace cron. They log to the journal, support dependencies, and can be inspected with systemctl list-timers. Migrate critical cron jobs.

8 PrivateTmp, ProtectSystem, NoNewPrivileges, and User= are the minimum security baseline for every new service unit. Run systemd-analyze security to verify.
