Service Authority -- systemd Deep Dive | Advanced Linux Administration

Slide 1 of 35  |  ALA-02  |  Week 1 of 8
Service Authority
systemd Deep Dive
Unit Types  •  Unit File Anatomy  •  Dependencies  •  systemctl  •  journalctl
Sector command requires three services to come online in the correct order -- the database before the API, the API before the gateway. One misconfigured dependency and the gateway starts against a cold database. This module is how you specify that order and guarantee it.
35 Slides  |  ALA-02  |  Week 1 of 8  |  Ubuntu 22.04 LTS
Slide 2 of 35
What Is systemd?
PID 1. The first process the kernel starts. Everything else depends on it.
[Diagram: the kernel starts systemd as PID 1, which in turn starts services such as sshd, nginx, and postgresql.]
PID 1 and the Init System
When the Linux kernel finishes loading, it executes exactly one process: init. Since Ubuntu 15.04, that process is systemd. It is responsible for starting all other processes, managing their lifecycles, and shutting the system down cleanly. If PID 1 ever exits, the kernel panics.
What Replaced
systemd replaced SysV init (the legacy System V init scripts). SysV used shell scripts in /etc/init.d/ executed sequentially. systemd uses declarative unit files and parallel activation. Because independent units start in parallel, boot is typically much faster than under sequential SysV scripts.
What systemd Controls
Services (daemons), sockets, timers, mount points, swap devices, system targets (runlevels), device nodes, and more. It is not just an init system -- it is a system and service manager. journalctl, hostnamectl, timedatectl, localectl are all part of the ecosystem.
Exam Note
On CompTIA Linux+, RHCSA, and LPIC-1 exams, systemd unit file syntax, systemctl commands, and journalctl filtering are heavily tested. Everything in this module appears on certification exams.
Slide 3 of 35
Unit Types: What systemd Manages
Every resource systemd manages is a unit. Each type has its own file extension and behavior.
[Diagram: systemd at the center, managing six unit types: .service, .socket, .timer, .target, .mount, .path.]
.service
A daemon or one-shot process. The most common unit type. Defines how to start, stop, and restart a service process. Examples: nginx.service, ssh.service, postgresql.service.
.socket
A socket activation unit. systemd listens on a socket (TCP port, Unix socket, FIFO) and only starts the associated service when a connection arrives. Saves resources -- the service does not run until it is needed.
.timer
A scheduled unit. Replaces cron for systemd-managed systems. Activates another unit on a schedule. Supports monotonic timers (after boot, after last run) and calendar timers (specific dates/times). Has better logging than cron.
.target
A synchronization point. Groups related units. multi-user.target is the traditional equivalent of runlevel 3 (multi-user, network up, no GUI). graphical.target = runlevel 5. network-online.target means the network is fully up.
.mount
Manages filesystem mount points. Automatically generated from /etc/fstab entries. Can be manually written for complex mount configurations with proper dependency ordering. Mount unit names must match the mount path (slashes become dashes).
.path
Activates a unit when a filesystem path changes (created, modified, deleted). Uses inotify. Useful for watch-and-process workflows -- e.g., start a processing service when a new file arrives in a directory.
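The naming rule for .mount units (slashes become dashes) can be sketched in a few lines. This is a simplified approximation of what systemd-escape --path --suffix=mount does, not a replacement for it: the function name is hypothetical, and it does not collapse repeated slashes the way the real tool does.

```python
def path_to_mount_unit(path: str) -> str:
    """Approximate `systemd-escape --path --suffix=mount` for a mount point."""
    trimmed = path.strip("/")
    if not trimmed:
        # The root filesystem is the special case "-.mount".
        return "-.mount"
    allowed = set(
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789:_."
    )
    out = []
    for i, byte in enumerate(trimmed.encode("utf-8")):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch in allowed and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            # Everything else, including a literal dash, is hex-escaped.
            out.append(f"\\x{byte:02x}")
    return "".join(out) + ".mount"


print(path_to_mount_unit("/var/lib/data"))  # var-lib-data.mount
print(path_to_mount_unit("/mnt/my-disk"))   # mnt-my\x2ddisk.mount
```

Note the second case: a dash inside a path component has to be escaped, because an unescaped dash always means a slash. On a real system, prefer the output of systemd-escape over any hand-rolled conversion.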
Slide 4 of 35
Unit File Locations
Three directories. Precedence matters. Editing the wrong one is a common mistake.
/lib/systemd/system/
Vendor-provided units. Installed by packages (apt, dpkg). Never edit files here directly -- package updates will overwrite your changes silently. On Ubuntu 22.04, /lib is a symlink to /usr/lib, so this path resolves to /usr/lib/systemd/system/.
/etc/systemd/system/
Administrator-created and overriding units. Files here take precedence over /lib/systemd/system/. This is where you put custom service files and drop-in override files. Survives package updates. Always work here.
/run/systemd/system/
Runtime units created dynamically. Not persistent across reboots. Created by systemd itself or by other programs at runtime. Files here override /lib/systemd/system/ but are themselves overridden by /etc/systemd/system/. You rarely write files here manually -- these are managed programmatically.
# List all unit files and their states
systemctl list-unit-files

# Find where a specific unit's file lives
# (shows the file with its path in a header comment)
systemctl cat nginx.service

# Show the effective configuration (vendor + overrides merged)
systemctl show nginx.service

# Your custom unit files go here
ls -la /etc/systemd/system/
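The precedence rule can be illustrated with a small lookup that walks the three directories from highest to lowest priority. This is a sketch under simplifying assumptions: the function name is hypothetical, and the real systemd search path contains more directories than the three covered on this slide.

```python
from pathlib import Path
from typing import Optional, Sequence

# Highest precedence first, matching the three directories on this slide.
SEARCH_PATH = (
    "/etc/systemd/system",
    "/run/systemd/system",
    "/lib/systemd/system",
)


def resolve_unit_file(
    name: str, search_path: Sequence[str] = SEARCH_PATH
) -> Optional[Path]:
    """Return the first unit file found for `name`, or None if there is none."""
    for directory in search_path:
        candidate = Path(directory) / name
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Prints None on a machine without nginx installed.
    print(resolve_unit_file("nginx.service"))
```

On a live system, systemctl cat unit-name remains the authoritative answer, since it also shows any drop-in files that apply.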
Slide 5 of 35
Service Unit: [Unit] Section
Metadata and dependency declarations. Read before the service is started.
[Diagram: sector-api.service has three sections: [Unit] (Description, After, Requires), [Service] (ExecStart, Type, Restart, User), and [Install] (WantedBy, RequiredBy).]
# /etc/systemd/system/sector-api.service
# [Unit] section: describes the unit and declares dependencies
[Unit]
Description=Sector API Application Server
Documentation=https://internal.docs/sector-api

# After: this unit starts AFTER the listed units are active
After=network-online.target sector-db.service

# Requires: hard dependency; if sector-db.service is stopped, this stops too
Requires=sector-db.service

# Wants: soft dependency; try to start, but continue even if it fails
Wants=network-online.target

# ConditionPathExists: only start if this file exists
ConditionPathExists=/opt/sector-api/sector-api.jar
Description
A human-readable name shown in systemctl status output. Make it descriptive. This is the first thing an operator reads when troubleshooting an unknown service at 3 AM.
Documentation
A URI pointing to documentation. Supports http://, https://, man:, file: URIs. The URIs are listed in systemctl status output, and systemctl help unit-name opens any man: pages among them. Include it -- your future self will thank you.
Slide 6 of 35
Service Unit: [Service] Section
How to start, stop, and supervise the process. The most complex section.
[Service]
# Type: defines how systemd decides the service is "ready"
# simple (default): ready as soon as the process is started
Type=simple
# Type=forking   (old-style daemons that fork and exit the parent)
# Type=notify    (service sends sd_notify() when ready)
# Type=oneshot   (short-lived task, not a daemon)

# Run as this user/group
User=sector-svc
Group=sector-svc

# Working directory
WorkingDirectory=/opt/sector-api

# Environment variables
EnvironmentFile=/etc/sector-api/env
Environment="LOG_LEVEL=INFO"

# Optional: runs before ExecStart for setup checks
ExecStartPre=/usr/bin/test -f /opt/sector-api/sector-api.jar

# The actual command to start the service
ExecStart=/usr/bin/java -jar /opt/sector-api/sector-api.jar

# Restart policy
Restart=on-failure
RestartSec=5

# Note: unit files do not support trailing comments; keep each comment on its own line.
Slide 7 of 35
Service Type Values
The wrong Type causes systemd to mistrack your service. Understand each one.
Type=simple
The default. systemd considers the service started as soon as ExecStart runs. The process does NOT fork. Use for modern daemons that stay in the foreground. Most Python, Java, and Node.js services use this type.
Type=forking
For traditional Unix daemons that fork a child and exit the parent. systemd waits for the parent to exit, then tracks the child. You should also set PIDFile= so systemd reliably knows which PID to track after the fork; without it, systemd has to guess the main process. Legacy -- avoid in new code.
Type=notify
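The service calls sd_notify("READY=1") when it is fully initialized. systemd waits for this notification before marking the service active. Use for services with non-trivial startup (loading config, connecting to DB). systemd's own daemons such as systemd-networkd and systemd-journald use this, and PostgreSQL supports it when built with systemd support. Ubuntu's packaged nginx unit uses Type=forking instead.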
Type=oneshot
For short-lived tasks that run and exit (not daemons). systemd waits for ExecStart to exit before marking the unit active. Set RemainAfterExit=yes if you want the unit to show as "active" after the task completes.
Type=dbus
Service is considered ready when it acquires a specific name on the D-Bus system bus. Specify BusName=. Used by desktop services and some system daemons. Rare in server environments.
Slide 8 of 35
Service Unit: [Install] Section
Controls what happens when you run systemctl enable. Not read at runtime.
[Install]
# WantedBy: which target "wants" this unit when enabled
# multi-user.target = start at boot in multi-user mode (no GUI)
# graphical.target = start when GUI is available
WantedBy=multi-user.target

# RequiredBy: hard dependency from the target to this unit
# If the target fails to start this unit, the target itself fails
# RequiredBy=multi-user.target

# Alias: alternative names for the unit
# Alias=api.service
What systemctl enable Does
Creates a symlink in /etc/systemd/system/multi-user.target.wants/ pointing to your unit file. This symlink is what causes the service to start at boot. systemctl disable removes the symlink. Neither starts nor stops the service immediately.
enable vs start
enable = set to auto-start at boot. start = start right now. Do both after writing a new service: systemctl enable --now unit-name enables AND starts in one command. This is the standard deployment pattern.
Deployment Pattern
After writing a new unit file: systemctl daemon-reload (load the new file), then systemctl enable --now sector-api.service (enable at boot and start now). Two commands, correct order, every time.
Slide 9 of 35
Restart Policies: Service Resilience
systemd can automatically recover from service failures. Configure it explicitly.
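[Diagram: restart decision flow. On a non-zero exit, systemd consults Restart=, waits RestartSec, and restarts the service; once the burst limit is hit, the unit stays failed.]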
Restart=no
Default. systemd does not restart the service on any exit. If it crashes, it stays down until an operator manually starts it. Appropriate for one-shot tasks or services where unexpected restarts would cause harm.
Restart=on-failure
Restart only if the process exits with a non-zero code, is killed by a signal, times out, or hits a watchdog. Does NOT restart on clean exit (exit 0). The most common setting for production services -- handles crashes without restarting intentional shutdowns.
Restart=always
Always restart, regardless of exit code. Even a clean exit triggers a restart. Use for services that are expected to run forever and where no exit is intentional from systemd's perspective. Combine with StartLimitIntervalSec to prevent restart storms.
[Unit]
# Prevent infinite restart storms: allow at most 5 starts in 60 seconds.
# These two settings belong in [Unit]; StartLimitIntervalSec= is ignored in [Service].
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Restart=on-failure
# Wait 5 seconds before restarting
RestartSec=5

# After hitting the burst limit, the unit enters "failed" state
# Recovery: systemctl reset-failed unit-name && systemctl start unit-name
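How StartLimitIntervalSec and StartLimitBurst interact with RestartSec is easier to see with numbers. The following is a simplified model of a burst limiter, not systemd's actual implementation: the function name is hypothetical, and it assumes the service crashes immediately after each start.

```python
from typing import Iterable, Optional


def first_refused_start(
    start_times: Iterable[float], interval: float = 60.0, burst: int = 5
) -> Optional[float]:
    """Return the time of the first start attempt the limiter refuses, or None."""
    window_begin: Optional[float] = None
    count = 0
    for now in start_times:
        if window_begin is None or now - window_begin > interval:
            # The previous window has expired: begin a new one.
            window_begin, count = now, 0
        if count >= burst:
            return now
        count += 1
    return None


# Crash loop with RestartSec=5: start attempts at t = 0, 5, 10, ... seconds.
print(first_refused_start(range(0, 100, 5)))   # 25
# A failure every 20 seconds never exceeds 5 starts within 60 seconds.
print(first_refused_start(range(0, 400, 20)))  # None
```

In the first case the five permitted starts are used up at t = 20, so the attempt at t = 25 is refused and the unit would land in the failed state. Raising RestartSec above interval divided by burst (12 seconds here) means the limit is never reached, which may or may not be what you want.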
Slide 10 of 35
Dependencies: Requires, Wants, After, Before
Four directives. Requires and Wants declare what must exist. After and Before declare order.
[Diagram: sector-api Requires sector-db, Wants network.target, is ordered After sector-db, and Before sector-proxy.]
These four directives answer two distinct questions: "What do I need?" (Requires/Wants) and "When do I start relative to them?" (After/Before). They are orthogonal -- declaring Requires does NOT imply After. You must specify both if you need both.
Requires=unit
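Hard dependency. If the required unit cannot be started, this unit fails. If the required unit is explicitly stopped or restarted while this unit is running, this unit is stopped or restarted too. Requires does not react when the required unit exits on its own; use BindsTo= if this unit must also stop in that case. Use when the dependency is truly non-negotiable.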
Wants=unit
Soft dependency. systemd will try to start the wanted unit alongside this one, but this unit is not affected if the wanted unit fails. Use for optional dependencies or services that enhance but are not required.
After=unit
Ordering only. This unit starts AFTER the listed unit is considered active. Does NOT imply Requires or Wants -- just controls sequence. If both units would start at boot, this one waits. If only this unit is started, After has no effect.
Before=unit
This unit must be active before the listed unit starts. The inverse of After. If unit A has Before=B, it is equivalent to B having After=A. Use in units that are consumed by others, rather than in consumers.
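Since Before=B on unit A is equivalent to After=A on unit B, both directives reduce to the same kind of edge in an ordering graph. A minimal sketch of that idea, using Python's standard graphlib module (Python 3.9+); the function name and the dictionary layout are illustrative assumptions, and systemd's real transaction logic does considerably more.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def start_order(after: Dict[str, List[str]], before: Dict[str, List[str]]) -> List[str]:
    """Compute a start order from After= and Before= declarations.

    `after[u]` lists the units named in u's After=;
    `before[u]` lists the units named in u's Before=.
    """
    graph: Dict[str, set] = {}
    for unit, predecessors in after.items():
        graph.setdefault(unit, set()).update(predecessors)
    for unit, successors in before.items():
        graph.setdefault(unit, set())
        for successor in successors:
            # "unit Before=successor" is the same edge as "successor After=unit".
            graph.setdefault(successor, set()).add(unit)
    return list(TopologicalSorter(graph).static_order())


order = start_order(
    after={"sector-api.service": ["sector-db.service"]},
    before={"sector-api.service": ["sector-proxy.service"]},
)
print(order)
# ['sector-db.service', 'sector-api.service', 'sector-proxy.service']
```

Nothing in this graph says whether a unit gets started at all. That is the job of Requires and Wants, which is why the two kinds of directive have to be declared separately.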
Slide 11 of 35
System Targets
Targets are synchronization points. They replaced SysV runlevels.
[Diagram: boot sequence: sysinit.target --> basic.target --> network.target --> multi-user.target]
multi-user.target
The standard server target. Multi-user, non-graphical, network up. Equivalent to SysV runlevel 3. Most server services declare WantedBy=multi-user.target in their [Install] section. Some installs are configured with graphical.target as the default, which pulls in multi-user.target; confirm with systemctl get-default.
network-online.target
Reached when at least one network interface is configured and online. Critical distinction: network.target only means the network management stack has been started; interfaces may not be configured and may not yet have an address. network-online.target means an address is assigned and routing works. Use the latter for services that need connectivity.
rescue.target / emergency.target
rescue.target = single-user mode, minimal services, root shell. emergency.target = most minimal state possible, read-only root. Used for system repair. Boot into them by adding systemd.unit=rescue.target to the kernel command line in GRUB.
# See the current default target (boots to)
systemctl get-default

# Change the default target
systemctl set-default multi-user.target

# Switch to a target immediately (without rebooting)
systemctl isolate rescue.target

# List all targets and their active state
systemctl list-units --type=target
Slide 12 of 35
systemctl Command Reference
The primary interface for managing units. Know every command in this slide.
# Service lifecycle
systemctl start nginx.service          # start the service now
systemctl stop nginx.service           # send SIGTERM, then SIGKILL after timeout
systemctl restart nginx.service        # stop then start
systemctl reload nginx.service         # run the unit's reload action (often SIGHUP); no downtime
systemctl reload-or-restart nginx      # reload if supported, restart otherwise

# Enable/disable (boot persistence)
systemctl enable nginx.service         # create symlink, starts at boot
systemctl disable nginx.service        # remove symlink
systemctl enable --now nginx.service   # enable AND start immediately
systemctl mask nginx.service           # prevent start entirely (symlink to /dev/null)
systemctl unmask nginx.service

# Status and inspection
systemctl status nginx.service         # human-readable status with recent logs
systemctl is-active nginx.service      # prints the state, e.g. "active", "inactive", or "failed"
systemctl is-enabled nginx.service     # prints the state, e.g. "enabled", "disabled", or "masked"
systemctl is-failed nginx.service      # returns 0 if in failed state
Slide 13 of 35
daemon-reload and Why It Matters
After editing any unit file, you must tell systemd to re-read its configuration.
[Diagram: workflow: edit unit (step 1) --> daemon-reload to re-read the cache (step 2) --> restart (step 3) --> verify.]
What daemon-reload Does
systemd caches unit file contents in memory. When you create or modify a unit file on disk, systemd does not see the changes until you run systemctl daemon-reload. This command re-reads all unit files. It does NOT restart any services.
What Happens Without It
You edit a unit file, run systemctl restart service, and wonder why nothing changed. The service restarted using the old cached unit definition. This is the number one cause of "I edited the file but nothing changed" confusion in systemd.
# Correct workflow for any unit file change:

# 1. Edit or create the unit file
nano /etc/systemd/system/sector-api.service

# 2. Reload systemd's unit file cache
systemctl daemon-reload

# 3. Restart the service to apply changes
systemctl restart sector-api.service

# 4. Verify the new configuration is running
systemctl status sector-api.service
Never Skip Step 2
Always run daemon-reload after editing unit files. Make it muscle memory. The sequence is: edit, reload, restart, verify. Skipping daemon-reload wastes time and causes baffling debugging sessions.
Slide 14 of 35
Drop-In Override Files
Customize vendor units without touching the original file. Survives package upgrades.
nginx is installed via apt. The vendor unit file in /lib/systemd/system/ does not have your required environment variables or restart policy. You cannot edit the vendor file without losing changes on the next apt upgrade. Drop-in files solve this.
# Method 1: systemctl edit (recommended -- creates the directory for you)
systemctl edit nginx.service
# Opens an editor. Save your overrides. Automatically runs daemon-reload.
# Creates: /etc/systemd/system/nginx.service.d/override.conf

# Method 2: manual creation
mkdir -p /etc/systemd/system/nginx.service.d/

# /etc/systemd/system/nginx.service.d/override.conf
[Service]
Environment="NGINX_ENV=production"
Restart=always
RestartSec=3

# Reload and verify the override was applied
systemctl daemon-reload
systemctl cat nginx.service   # shows vendor file + override file concatenated
Rule
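Drop-in files are merged with the original unit file. You only need to specify the directives you are changing -- everything else is inherited. Directives that accept multiple values, such as ExecStart=, are appended to rather than replaced, so clear them first with an empty assignment (ExecStart=) before setting the new value. A 3-line override file is cleaner and safer than copying and modifying a 50-line vendor file.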
Slide 15 of 35
Service Security: Sandboxing Directives
systemd provides namespace-based isolation without containers. Use it for every production service.
[Service]
# Run as non-root user
User=sector-svc
Group=sector-svc

# Mount the entire filesystem read-only for this service
# (except /dev, /proc, and /sys)...
ProtectSystem=strict
# ...with this path as the writable exception
ReadWritePaths=/opt/sector/data

# Hide sensitive paths from the service
InaccessiblePaths=/etc/shadow /etc/gshadow /root

# Prevent the service from gaining new privileges
NoNewPrivileges=true

# Limit which system calls the service can make (allowlist approach);
# @system-service is a predefined set of syscalls commonly needed by services
SystemCallFilter=@system-service

# Private /tmp: service gets its own isolated /tmp
PrivateTmp=true

# Private network namespace (no network access);
# use only if the service needs no networking
# PrivateNetwork=true
Exam and Real-World Note
systemd-analyze security unit-name gives your service an exposure score (0 to 10, lower is better) and specific recommendations. Run it on every service you write and address the high-severity items. This is production-quality hardening with zero additional software.
Slide 16 of 35
journalctl — The systemd Journal
Structured, indexed, queryable logs. Replaces scattered text files for systemd-managed services.
[Diagram: services (nginx, sshd, postgresql) send logs to systemd-journald, which writes a binary indexed log to /var/log/journal/ (persistent) or /run/log/journal/ (volatile).]
Why Journal Over Text Files
Traditional logs are plain text. Searching them with grep is slow on large files. The journal stores logs in a binary indexed format. Queries by time range, unit, priority, or PID stay fast even as log volume grows. Timestamps are stored with microsecond precision.
Journal Persistence
With journald's default Storage=auto setting, the journal is written to /var/log/journal/ if that directory exists. If it does not, logs are kept in /run/log/journal/ and lost on reboot; create the directory to make them persistent. Check the space used with journalctl --disk-usage.
# View all journal entries (most recent last)
journalctl

# Follow new entries in real time (like tail -f)
journalctl -f

# View journal for a specific unit
journalctl -u nginx.service

# Follow a specific unit in real time
journalctl -fu nginx.service

# Show only the last 50 lines of a unit's log
journalctl -u nginx.service -n 50

# Show disk usage of the journal
journalctl --disk-usage
Slide 17 of 35
journalctl: Advanced Filtering
Time ranges, priority levels, and structured field queries.
# Filter by time range
journalctl --since "2026-04-09 08:00:00"
journalctl --since "1 hour ago"
journalctl --since "today" --until "now"
journalctl -u nginx.service --since "2026-04-09" --until "2026-04-09 23:59:59"

# Filter by priority level
# (0=emerg, 1=alert, 2=crit, 3=err, 4=warning, 5=notice, 6=info, 7=debug)
journalctl -p err                     # errors and above (0-3)
journalctl -p warning..err            # range: warning to error
journalctl -p crit -u nginx.service   # critical events from nginx

# Filter by PID or executable path
journalctl _PID=14823
journalctl _EXE=/usr/sbin/sshd

# JSON output for machine processing
journalctl -u nginx.service -o json-pretty | head -40

# Export for sharing or archival
journalctl -u nginx.service --since "today" > /tmp/nginx-today.log
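The JSON output is what makes the journal convenient to post-process. As a sketch, the following reads entries in the one-object-per-line form that journalctl -o json produces and keeps those at or above a given severity, mirroring what -p err does. The sample records are made up for illustration, the function name is hypothetical, and treating a missing PRIORITY field as info (6) is an assumption of this example.

```python
import json
from typing import Iterable, Iterator, Tuple

# Index in this list == numeric syslog priority used by the journal.
PRIORITY_NAMES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def filter_by_priority(
    json_lines: Iterable[str], max_priority: str = "err"
) -> Iterator[Tuple[str, str]]:
    """Yield (priority_name, message) for entries at least as severe as max_priority."""
    threshold = PRIORITY_NAMES.index(max_priority)
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Journal field values are strings, so PRIORITY arrives as "3", not 3.
        priority = int(entry.get("PRIORITY", 6))
        if priority <= threshold:  # lower number == more severe
            yield PRIORITY_NAMES[priority], entry.get("MESSAGE", "")


# Made-up sample entries in the shape of `journalctl -o json` output.
SAMPLE = [
    '{"PRIORITY": "3", "_SYSTEMD_UNIT": "nginx.service", "MESSAGE": "upstream timed out"}',
    '{"PRIORITY": "6", "_SYSTEMD_UNIT": "nginx.service", "MESSAGE": "worker started"}',
    '{"PRIORITY": "2", "_SYSTEMD_UNIT": "nginx.service", "MESSAGE": "out of memory"}',
]

for name, message in filter_by_priority(SAMPLE, "err"):
    print(f"{name:8} {message}")
```

For simple cases journalctl -p is faster and simpler. A script like this becomes worthwhile when the filter depends on several fields at once or feeds another system.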
Slide 18 of 35
journalctl: Boot Logs and Boot Analysis
Diagnose boot failures and track changes across reboots.
# List all recorded boot sessions
journalctl --list-boots

# View logs from the current boot
journalctl -b

# View logs from the previous boot (useful after a crash)
journalctl -b -1

# View logs from two boots ago
journalctl -b -2

# Errors and above from the previous boot
journalctl -b -1 -p err

# systemd-analyze: boot time performance breakdown
systemd-analyze                  # total boot time
systemd-analyze blame            # which units took longest
systemd-analyze critical-chain   # the critical path that determined total boot time

# Plot boot sequence to an SVG
systemd-analyze plot > /tmp/boot-chart.svg
Slide 19 of 35
Writing a Custom Service Unit
Full working example: a Node.js API server with security hardening.
# /etc/systemd/system/sector-api.service
[Unit]
Description=Sector API Service
Documentation=https://internal.docs/sector-api
After=network-online.target
Wants=network-online.target
# Start rate limiting is configured in [Unit], not [Service]
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Type=simple
User=sector-svc
Group=sector-svc
WorkingDirectory=/opt/sector-api
# Leading - means: ignore if the file is missing
EnvironmentFile=-/etc/sector-api/env
ExecStartPre=/usr/bin/node --check /opt/sector-api/server.js
ExecStart=/usr/bin/node /opt/sector-api/server.js
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=sector-api
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ReadWritePaths=/opt/sector-api/data /var/log/sector-api

[Install]
WantedBy=multi-user.target
Slide 20 of 35
Timer Units: Replacing Cron
Scheduled tasks with full systemd logging, dependencies, and failure detection.
[Diagram: a schedule (.timer with OnCalendar=) triggers a .service (Type=oneshot), whose output goes to the journal.]
A cron job that fails silently is a compliance nightmare. systemd timers write every execution to the journal, support systemd dependencies, and can be inspected with systemctl status. Migrate critical cron jobs to timers.
# /etc/systemd/system/sector-backup.service (the task to run)
[Unit]
Description=Sector Database Backup
After=sector-db.service

[Service]
Type=oneshot
User=backup-svc
ExecStart=/opt/sector/scripts/backup.sh
---
# /etc/systemd/system/sector-backup.timer (the schedule)
[Unit]
Description=Run Sector Backup Nightly

[Timer]
# Every day at 02:30
OnCalendar=*-*-* 02:30:00
# Allow 1 min drift for system load
AccuracySec=1m
# Run a missed execution on next boot
Persistent=true

[Install]
WantedBy=timers.target
---
# Enable and start the timer (not the service -- the timer fires the service)
systemctl enable --now sector-backup.timer
systemctl list-timers   # see all timers and next fire time
Slide 21 of 35
Timer: Calendar Syntax Reference
OnCalendar expressions. More expressive than cron and verifiable with systemd-analyze.
# systemd calendar event format: DOW YYYY-MM-DD HH:MM:SS
# * = any value    , = list    .. = range    / = step
# (In a real unit file, keep comments on their own lines as shown here.)

# 00:00:00 every day (shorthand)
OnCalendar=daily
# Top of every hour
OnCalendar=hourly
# Monday 00:00:00
OnCalendar=weekly
# First of each month at 00:00:00
OnCalendar=monthly
# Every day at 02:30
OnCalendar=*-*-* 02:30:00
# Every Monday at 06:00
OnCalendar=Mon *-*-* 06:00:00
# Weekdays at 09:00
OnCalendar=Mon..Fri *-*-* 09:00:00
# First of every month
OnCalendar=*-*-1 00:00:00
# Every 15 minutes
OnCalendar=*-*-* *:00/15:00

# Verify a calendar expression before deploying
systemd-analyze calendar --iterations=10 "Mon..Fri *-*-* 09:00:00"
# Output shows the normalized form and the next 10 scheduled fire times
# (without --iterations only the next one is shown)
Tip
Always run systemd-analyze calendar "your-expression" before deploying a timer. It shows the next scheduled execution times, so you can verify you wrote the expression correctly before it misses a production backup window.
Slide 22 of 35
Socket Activation: On-Demand Services
systemd listens on a socket. The service only starts when a connection arrives.
[Diagram: a client connects to the .socket unit listening on :9000 (idle, waiting); the connection activates the .service on demand, which handles it and responds. From the client's perspective there is no downtime.]
The Problem It Solves
Some services are needed infrequently. Running them continuously wastes RAM. With socket activation, systemd holds the socket open. When a client connects, systemd starts the service and hands it the socket: the listening socket with Accept=false, or the accepted connection with Accept=true. The first connection waits briefly while the service starts, but it is queued rather than refused, so from the client's perspective the service was always listening.
SSH Can Use This
Ubuntu's openssh-server package ships an ssh.socket unit that listens on port 22 and starts an sshd process only when a connection arrives. On low-traffic servers this avoids keeping sshd running constantly for connections that happen twice a day. On Ubuntu 22.04 the socket unit is not enabled by default. Check with systemctl status ssh.socket.
# /etc/systemd/system/sector-worker.socket
[Unit]
Description=Sector Worker Socket

[Socket]
# TCP socket on port 9000, loopback only
ListenStream=127.0.0.1:9000
# Accept=false: hand the listening socket to a single service instance
# (Accept=true would spawn one service instance per connection)
Accept=false

[Install]
WantedBy=sockets.target
---
# /etc/systemd/system/sector-worker.service
[Unit]
Description=Sector Worker (Socket Activated)

[Service]
ExecStart=/opt/sector/worker
# Connect the socket handed over by systemd to the service's stdin
StandardInput=socket
Slide 23 of 35
Dependency Chains: Multi-Service Startup
Design a correct startup sequence for a three-tier application stack.
[Diagram: sector-db (PostgreSQL) starts 1st; sector-api (REST API) Requires it and is ordered After it, starting 2nd; sector-proxy (nginx) Requires and is ordered After sector-api, starting 3rd. If db stops, api stops, then proxy stops (Requires chain).]
Three services: PostgreSQL database, a REST API, and an nginx reverse proxy. PostgreSQL must be active before the API. The API must be active before nginx serves traffic. If the database dies, the API must stop. If the API dies, nginx must stop.
# sector-db.service (PostgreSQL wrapper)
[Unit]
Description=Sector Database (PostgreSQL)
After=network-online.target
---
# sector-api.service
[Unit]
Description=Sector API
# Hard dependency: if db is stopped (or cannot start), api stops too.
# Requires= does not react to db crashing on its own; BindsTo= is the stricter variant.
Requires=sector-db.service
# Ordering: start after db is active
After=sector-db.service
---
# sector-proxy.service (nginx)
[Unit]
Description=Sector Proxy (nginx)
# Hard dependency: if api is stopped (or cannot start), proxy stops too
Requires=sector-api.service
# Ordering: start after api is active
After=sector-api.service
---
# Visualize the dependency graph
systemd-analyze dot sector-proxy.service | dot -Tsvg > /tmp/deps.svg
Slide 24 of 35
systemd-analyze — Diagnostics Toolkit
Analyze boot performance, unit dependencies, and security posture.
# Overall boot time breakdown
systemd-analyze
# Startup finished in 1.923s (kernel) + 4.516s (userspace) = 6.439s graphical.target

# Which units slowed boot the most?
systemd-analyze blame
# 3.201s apt-daily-upgrade.service
# 1.893s snapd.service

# Show the dependency chain that determined total boot time
systemd-analyze critical-chain

# Exposure level for a unit (0.0 = most locked down, 10.0 = most exposed)
systemd-analyze security nginx.service
# Most vendor units are lightly sandboxed by default and rate as EXPOSED or UNSAFE

# Validate a unit file for syntax errors
systemd-analyze verify /etc/systemd/system/sector-api.service

# Generate full dependency graph (requires graphviz)
systemd-analyze dot --require | dot -Tsvg > /tmp/full-deps.svg
Slide 25 of 35
Unit States: Reading systemctl status
Understand every field in the status output before you can diagnose failures.
[Diagram: service lifecycle: inactive --> activating --> active (running) --> deactivating --> inactive; a crash or signal leads to failed.]
# systemctl status nginx.service output anatomy
* nginx.service - A high performance web server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
     #       ^load-state  ^path  ^boot-state  ^whether enabled by default
     Active: active (running) since Thu 2026-04-09 08:00:01 UTC; 2h 17min ago
     #       ^active-state  ^sub-state  ^timestamp
       Docs: man:nginx(8)
   Main PID: 1847 (nginx)
      Tasks: 5 (limit: 9446)
     Memory: 6.8M
        CPU: 345ms
     CGroup: /system.slice/nginx.service
             |- 1847 "nginx: master process /usr/sbin/nginx"
             |- 1848 "nginx: worker process"
Active States
active (running) = one or more processes active. active (exited) = oneshot completed successfully. active (waiting) = waiting for an event (path, timer, socket).
Inactive States
inactive (dead) = not running, no failure. failed = exited with error or killed by signal. activating = starting up. deactivating = shutting down. reloading = config reload in progress.
Load States
loaded = file read successfully. not-found = unit file does not exist. masked = unit is symlinked to /dev/null -- cannot be started. bad-setting = unit file has a syntax error.
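For scripts, the same three states are easier to read from systemctl show, which prints KEY=VALUE lines instead of the human-oriented status layout. A minimal parsing sketch follows. LoadState, ActiveState, SubState, and MainPID are real property names; the function names and the sample text are illustrative.

```python
from typing import Dict


def parse_show_output(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip anything that is not KEY=VALUE
            props[key] = value
    return props


def summarize(props: Dict[str, str]) -> str:
    """Render load, active, and sub state the way the status header reads."""
    return "{} / {} ({})".format(
        props.get("LoadState", "?"),
        props.get("ActiveState", "?"),
        props.get("SubState", "?"),
    )


# Illustrative sample in the shape of:
#   systemctl show nginx.service -p LoadState,ActiveState,SubState,MainPID
SAMPLE = "LoadState=loaded\nActiveState=active\nSubState=running\nMainPID=1847\n"

print(summarize(parse_show_output(SAMPLE)))  # loaded / active (running)
```

Splitting on the first = only matters for properties such as Environment= whose values can themselves contain equals signs.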
Slide 26 of 35
cgroups: Resource Limits for Services
Prevent a runaway service from consuming all CPU, RAM, or I/O on the system.
[Service]
# CPU: limit to 50% of one CPU core
CPUQuota=50%

# Memory: hard limit -- kernel OOM-kills at this threshold
MemoryMax=512M

# Memory: soft limit -- kernel applies memory pressure above this
MemoryHigh=400M

# I/O weight (relative to other services, 100 = default)
IOWeight=50

# Limit the number of tasks (threads + processes)
TasksMax=64
Inspect Live cgroup Usage
systemctl status unit-name shows Memory and CPU in its output. systemd-cgtop shows real-time resource usage by cgroup -- like htop but organized by service. cat /sys/fs/cgroup/system.slice/nginx.service/memory.current shows raw bytes.
Why This Matters
Without resource limits, a memory leak in one service can trigger the OOM killer system-wide, taking down unrelated services. CPUQuota prevents one service from monopolizing cores during a spike. These directives are mandatory for multi-tenant servers.
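To judge whether a MemoryMax value leaves enough headroom, compare it with the raw byte count from memory.current. A small sketch of that arithmetic follows, assuming the suffixes K, M, G, and T are powers of 1024. The function names are hypothetical, and this simplified parser does not handle fractional values, percentages, or "infinity".

```python
_SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_memory_limit(value: str) -> int:
    """Convert a limit such as '512M' into bytes (integer values only)."""
    value = value.strip()
    if value[-1:] in _SUFFIXES:
        return int(value[:-1]) * _SUFFIXES[value[-1]]
    return int(value)  # a bare number is already in bytes


def percent_used(current_bytes: int, limit: str) -> float:
    """Share of the limit currently in use, as a percentage."""
    return 100.0 * current_bytes / parse_memory_limit(limit)


print(parse_memory_limit("512M"))          # 536870912
print(percent_used(268435456, "512M"))     # 50.0
```

In practice current_bytes would be read from the memory.current file shown above. A service that regularly sits above the MemoryHigh percentage is a candidate for either a larger limit or a leak investigation.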
Slide 27 of 35
Transient Units: systemd-run
Run a command as a systemd service without writing a unit file. Useful for testing and one-off jobs.
# Run a command as a transient service unit
systemd-run --unit=scan-job /opt/scripts/network-scan.sh

# Check its status and logs like any unit
systemctl status scan-job.service
journalctl -u scan-job.service

# Run with resource limits (test your limits before writing unit files)
systemd-run --unit=limited-job --property=MemoryMax=100M python3 /opt/process-data.py

# Run as a different user
systemd-run --uid=www-data --gid=www-data /usr/bin/php /opt/task.php

# Run with a specific environment variable
systemd-run -E SECTOR_ENV=staging /opt/sector/deploy.sh

# Interactive shell in a transient user service (useful for debugging cgroup limits)
systemd-run --user --pty -p MemoryMax=512M /bin/bash
Use Case
Use systemd-run to test a command under the same cgroup constraints you plan to use in a unit file. Validate that MemoryMax values are appropriate before writing the production unit. Faster than writing, deploying, testing, and rolling back a unit file.
Slide 28 of 35
Debugging Failed Services
A systematic approach. Follow this sequence before escalating or googling.
systemctl start sector-api returns immediately. systemctl status shows "failed." You need to find the root cause in under 3 minutes. This is the sequence.
# Step 1: Read the status output (includes last 10 log lines)
systemctl status sector-api.service

# Step 2: Read the full journal for this unit, this boot
journalctl -u sector-api.service -b --no-pager

# Step 3: If it failed to start, check ExecStart path and permissions
systemctl cat sector-api.service   # see the unit file
ls -la /opt/sector-api/server.js   # does the file exist?
stat /opt/sector-api/server.js     # what user:group owns it?

# Step 4: Check if the unit file has a syntax error
systemd-analyze verify sector-api.service

# Step 5: Reset failure state and try again with verbose systemctl output
systemctl reset-failed sector-api.service
SYSTEMD_LOG_LEVEL=debug systemctl start sector-api.service
journalctl -u sector-api.service -n 50 -b
Slide 29 of 35
Environment Files: Secrets and Config
Keep credentials out of unit files and version control. Load them at service start.
# /etc/sector-api/env (mode 640, owner root, group sector-svc)
DB_PASSWORD=s3cr3t-db-key
API_SECRET=a1b2c3d4e5f6
LOG_LEVEL=INFO
LISTEN_PORT=8080
---
# Reference the env file in the unit
[Service]
# Leading - means: OK if the file is missing
EnvironmentFile=-/etc/sector-api/env
# The environment variables are then available to the process
ExecStart=/usr/bin/node /opt/sector-api/server.js
---
# Verify which environment files the unit loads
# (-p Environment only lists Environment= settings, not file contents)
systemctl show sector-api.service -p EnvironmentFiles

# Inspect the environment of the running process (as root)
tr '\0' '\n' < /proc/$(systemctl show -p MainPID --value sector-api.service)/environ

# Lock down the env file -- only sector-svc and root can read it
chown root:sector-svc /etc/sector-api/env
chmod 640 /etc/sector-api/env
Never Do This
Never hardcode credentials in the unit file itself. Unit files are world-readable by default (-rw-r--r--). Any user on the system can run systemctl cat sector-api.service and read them. Use EnvironmentFile with locked-down permissions instead.
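A quick audit of the env file can be scripted. This is a minimal sketch, assuming a plain KEY=VALUE file with no quoting or line continuations; parse_env_file and is_world_accessible are hypothetical helper names, and the parser is far simpler than what systemd itself does when it reads EnvironmentFile=.

```python
import os
import stat


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # or ; comments."""
    env = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith(("#", ";")):
                continue
            key, sep, value = line.partition("=")
            if sep:
                env[key.strip()] = value.strip()
    return env


def is_world_accessible(path):
    """Return True if the 'other' permission bits grant any access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & stat.S_IRWXO)
```

With mode 640, is_world_accessible("/etc/sector-api/env") should be False. With the default 644 it is True, which is the case to fix before the service ships.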
Slide 30 of 35
Watchdog: Liveness Monitoring
systemd can restart services that stop responding, even if they do not crash.
What the Watchdog Does
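With WatchdogSec= set, systemd expects the service to send sd_notify("WATCHDOG=1") at least once per watchdog interval. If the service goes silent (deadlock, infinite loop, hung thread), systemd terminates it once the interval expires (SIGABRT by default) and marks it failed. It is restarted only if Restart= is set to on-failure, on-abnormal, on-watchdog, or always.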
Service Requirements
The service must be instrumented to call sd_notify("WATCHDOG=1") periodically. Read the interval from the WATCHDOG_USEC environment variable (microseconds) and notify at half that interval. Many service frameworks and language bindings provide a helper for this.
[Service]
# Type=notify implies NotifyAccess=main, so systemd accepts the service's notifications
Type=notify
# Restart if silent for 30 seconds
WatchdogSec=30
Restart=on-failure
---
# In the service process (Python example using systemd bindings)
# from systemd.daemon import notify
# import os, time
# # WATCHDOG_USEC is in microseconds; sleep for half the interval, in seconds
# interval = int(os.environ.get('WATCHDOG_USEC', 30000000)) / 2000000
# notify('READY=1')          # Type=notify: report that startup is complete
# while True:
#     do_work()
#     notify('WATCHDOG=1')   # tell systemd "I'm alive"
#     time.sleep(interval)
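If the systemd Python bindings are not installed, the same protocol can be spoken with the standard library alone: a notification is a datagram sent to the Unix socket named in NOTIFY_SOCKET. The sketch below is a stdlib-only alternative to the bindings, not the bindings' implementation. The names sd_notify and watchdog_interval_seconds are chosen for illustration, and the 30-second default is an assumption that mirrors the WatchdogSec=30 example.

```python
import os
import socket


def watchdog_interval_seconds(environ=os.environ, default_usec=30_000_000):
    """Return half of WATCHDOG_USEC, converted from microseconds to seconds."""
    usec = int(environ.get("WATCHDOG_USEC", default_usec))
    return usec / 2 / 1_000_000


def sd_notify(message, environ=os.environ):
    """Send one notification datagram; return False if NOTIFY_SOCKET is unset."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by systemd with a notify socket
    if path.startswith("@"):
        # A leading @ denotes a Linux abstract-namespace socket (NUL-prefixed).
        address = b"\0" + path[1:].encode()
    else:
        address = path.encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), address)
    return True
```

A service loop would call sd_notify("READY=1") once after startup, then sd_notify("WATCHDOG=1") every watchdog_interval_seconds() seconds. Because sd_notify returns False outside systemd, the same code runs unchanged from an interactive shell.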
Slide 31 of 35
Inspecting the System: List Commands
Find services, understand what is running, and audit the system state.
# List all active units
systemctl list-units
# List only service units
systemctl list-units --type=service
# List only failed units
systemctl list-units --state=failed
# List all unit files and their enablement state
systemctl list-unit-files --type=service
# List all timers with next activation time
systemctl list-timers --all
# List dependencies of a unit (what it depends on)
systemctl list-dependencies nginx.service
# List reverse dependencies (what depends on this unit)
systemctl list-dependencies --reverse nginx.service
# Show all properties of a unit
systemctl show nginx.service
# Check whether a specific property is set
systemctl show nginx.service -p Restart -p RestartSec
Slide 32 of 35
Journal Management
Control journal size, rotation, and retention. Prevent unbounded disk growth.
# /etc/systemd/journald.conf -- key settings (all belong under [Journal])
[Journal]
# Maximum journal size on disk
SystemMaxUse=500M
# Maximum size of a single journal file
SystemMaxFileSize=50M
# Keep journals for this many days
MaxRetentionSec=30day
# Compress journal files
Compress=yes
---
# Apply changes after editing journald.conf
systemctl restart systemd-journald
# Manually vacuum old journal files
journalctl --vacuum-size=200M   # shrink archived journals to about 200 MB
journalctl --vacuum-time=7d     # delete journals older than 7 days
journalctl --vacuum-files=5     # keep only last 5 archive journal files
# Check current disk usage
journalctl --disk-usage
Slide 33 of 35  |  Lab Exercises
Practice Exercises
Complete these before the lab ends. Use your Ubuntu 22.04 VM.
1 Write a complete service unit for a Python script of your choice. Include all three sections ([Unit], [Service], [Install]), a non-root User, PrivateTmp=true, and Restart=on-failure. Deploy it with daemon-reload and verify it starts.
2 Create a drop-in override for ssh.service that adds Restart=always and RestartSec=3. Verify with systemctl cat ssh.service that the override is merged correctly.
3 Write a timer + service pair that runs a one-line script every 5 minutes. Verify with systemctl list-timers that it fires correctly. Check the journal to confirm execution.
4 Use journalctl to find all errors from the current boot for all units. Then narrow it to just the ssh service for the past 24 hours. Export the result to a file.
5 Run systemd-analyze security nginx.service and identify the top 3 security improvements recommended. Implement at least one using a drop-in override. Re-run the analysis and verify the score improved.
Slide 34 of 35
What's Next
systemd runs your services. Week 2 digs into what those services connect to.
ALA-03: Network Configuration
Netplan YAML, ip link and ip addr, NetworkManager vs systemd-networkd, bonding modes, VLAN configuration. network-online.target will make sense in new ways after this module.
ALA-04: Grid Diagnostics
ss, ip route, dig, tcpdump, nmap. When services start but clients cannot reach them, you need these tools. The ss output maps directly to the socket units you just learned to configure.
Week 2: Storage and Security
LVM volume management, LUKS encryption, filesystem tuning. The ProtectSystem and ReadWritePaths directives you wrote today will interact with everything you learn about storage in Week 2.
Key Integration Point
Every service you administer -- nginx, PostgreSQL, custom applications -- is now a unit you can manage, constrain, monitor, and schedule with the tools from this module. systemd is not a complexity tax. It is the control plane for the entire system.
Slide 35 of 35  |  ALA-02
ALA-02 Summary: Key Takeaways
You can now design, deploy, harden, and troubleshoot systemd services. You understand the difference between Requires and Wants, why After and Requires are orthogonal, and why daemon-reload is never optional. These are not basics -- this is how professional Linux administrators operate.
1 systemd is PID 1. Unit files live in /lib/systemd/system/ (vendor) and /etc/systemd/system/ (admin). Always work in /etc/.
2 Three sections: [Unit] (metadata + dependencies), [Service] (process config), [Install] (boot integration). A service that should be enabled at boot needs all three; without [Install], systemctl enable has nothing to act on.
3 Requires = hard dependency (stop together). Wants = soft dependency. After = ordering only. You must specify both Requires and After to get both behavior and order.
4 After editing any unit file: daemon-reload then restart. Never skip daemon-reload. It costs 0.1 seconds and saves hours of confusion.
5 Use drop-in overrides (systemctl edit) to modify vendor units. Never edit files in /lib/systemd/system/ directly -- package updates will overwrite them.
6 journalctl -u unit -b -p err is your first debugging tool. systemd-analyze verify catches syntax errors before deployment.
7 Timers replace cron. They log to the journal, support dependencies, and can be inspected with systemctl list-timers. Migrate critical cron jobs.
8 PrivateTmp, ProtectSystem, NoNewPrivileges, and User= are the minimum security baseline for every new service unit. Run systemd-analyze security to verify.