DNS Fundamentals | Advanced Linux Administration

Slide 1 of 34  |  ALA-W3-DNS  |  Week 3 of 8
DNS Fundamentals
Hierarchy, Records & Resolution
DNS Hierarchy  •  Record Types  •  Resolution Process  •  Caching & TTL
A sector node is reporting resolution failures. Half the cell cannot reach external services. You have no GUI, no NOC ticket, and thirty minutes before the operation resumes. Understanding DNS from the ground up is not optional -- it is the prerequisite to fixing anything in this stack.
34 Slides ALA-W3-DNS Week 3 of 8 Ubuntu 22.04 LTS
Slide 2 of 34
What DNS Actually Does
A distributed, hierarchical, globally consistent database that maps names to data.
The Core Problem It Solves
Humans use names. Computers route on IP addresses. DNS is the translator. Without it, every application would require hardcoded IPs, and moving a server would break every client that pointed to it.
Distributed by Design
No single server holds all DNS data. Responsibility is delegated down a hierarchy. The root servers know who is authoritative for .com. The .com servers know who runs example.com. Delegation is the core architectural principle.
Not Just Hostname to IP
DNS stores mail routing (MX), name server delegation (NS), IPv6 addresses (AAAA), text verification tokens (TXT), service discovery (SRV), and reverse lookups (PTR). It is a general-purpose record database, not just an address book.
Operational Reality
DNS is involved in almost every network operation your systems perform. Authentication, software updates, log shipping, API calls, email delivery -- all of them resolve names first. A misconfigured DNS server is a silent killer that makes everything appear broken for unrelated reasons.
Slide 3 of 34
The DNS Hierarchy
A tree of authority rooted at the 13 root server clusters. Delegation flows downward.
[Diagram: DNS hierarchy tree. Root (.) at the top; TLDs .com, .org, .net, .io below it; second-level domains hexworth and example; hosts www and ops; arrows labeled "delegation".]
# The fully qualified domain name (FQDN) reads right-to-left through the hierarchy
#   mail.ops.hexworth.com.
#   |    |   |           |
#   host sub domain      root (trailing dot = explicit root)
#
# Hierarchy levels:
#   Root (.) --> TLD (.com, .org, .net, .io) --> SLD (hexworth.com) --> subdomains
# Root zone: 13 logical root server addresses (operated by 12 organizations)
#   a.root-servers.net through m.root-servers.net
#   Each is actually an anycast cluster -- hundreds of physical servers worldwide

# Verify root servers from the command line
dig NS .               # query root zone NS records
dig NS com.            # who is authoritative for .com TLD?
dig NS hexworth.com    # who is authoritative for hexworth.com?
Delegation
Each level delegates authority downward via NS records. The root delegates .com to Verisign. Verisign delegates hexworth.com to your name servers. You control everything below that cut.
The Trailing Dot
A fully qualified domain name ends with a dot representing the root zone, as in hexworth.com. (final dot included). Zone files depend on it: when you omit the dot, BIND treats the name as relative and appends the zone origin. Getting this wrong causes silent misdelegation.
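The trailing-dot rule can be sketched as a small shell function. This is an illustration of the expansion described above, not BIND's actual parser; the function name qualify is hypothetical.

```shell
# qualify NAME ORIGIN -> the name a zone-file parser would end up with (sketch).
qualify() {
    local name="$1"
    local origin="${2%.}."          # normalize the origin to end in exactly one dot
    case "$name" in
        @)  printf '%s\n' "$origin" ;;             # @ is the origin itself
        *.) printf '%s\n' "$name" ;;               # trailing dot: already fully qualified
        *)  printf '%s.%s\n' "$name" "$origin" ;;  # relative name: origin is appended
    esac
}

qualify ops               hexworth.com.   # ops.hexworth.com.
qualify mail.example.org  hexworth.com.   # mail.example.org.hexworth.com.  (the classic mistake)
qualify mail.example.org. hexworth.com.   # mail.example.org.
```

The second call shows how an external name written without its final dot silently becomes a name inside your own zone.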
Slide 4 of 34
The Resolution Process
How your system gets from a hostname to an IP address. Eight steps, all transparent.
[Diagram: Client -> Resolver -> Root (.), TLD, Authority. Queries 1, 2, and 3 return a referral, a referral, and the authoritative answer; the resolver passes the answer back to the client.]
You type ssh ops.hexworth.com. Before the TCP connection begins, your system walks through this entire resolution chain -- usually in under 50 milliseconds.
1. Your application calls getaddrinfo() (libc). The OS checks its local cache. Cache hit -- done in microseconds.
2. Cache miss: OS checks /etc/hosts. Any matching entry short-circuits DNS entirely.
3. No local match: OS sends a recursive query to the resolver configured in /etc/resolv.conf (your ISP, corporate DNS, or 8.8.8.8).
4. Recursive resolver checks its own cache. Cache hit -- answer returned immediately with remaining TTL.
5. Cache miss: resolver queries a root server. Root returns NS records for the TLD (e.g., .com). Not the answer -- a referral.
6. Resolver queries the TLD server. TLD returns NS records for hexworth.com. Another referral.
7. Resolver queries the authoritative server for hexworth.com. That server holds the actual A record. Answer returned.
8. Resolver caches the answer for the record's TTL duration, then returns it to the client. Your SSH connection begins.
Slide 5 of 34
Recursive vs Authoritative
Two fundamentally different server roles. Many admins confuse them. Do not be one of them.
[Diagram: RECURSIVE (walks hierarchy, caches results, serves clients host1/host2/host3; e.g. 8.8.8.8 / 1.1.1.1; flags: rd ra) vs AUTHORITATIVE (holds zone data, answers with aa, no recursion; e.g. BIND9 / NSD; zone file with SOA, NS, A, MX, CNAME, TXT; flags: aa).]
Recursive Resolver
A resolver that does the work of walking the hierarchy on behalf of clients. It queries root, TLD, and authoritative servers. It caches results. Clients only talk to the resolver -- they never talk to root servers directly. Examples: unbound, 8.8.8.8, 1.1.1.1. systemd-resolved sits in front of these as a local caching stub; it forwards queries rather than walking the hierarchy itself.
Authoritative Server
A server that holds the actual zone data and answers with authority. It does not recurse. When queried for a zone it owns, it answers definitively. When queried for a zone it does not own, it typically returns REFUSED. It should not return NXDOMAIN, which is an authoritative statement that a name does not exist. Examples: BIND9, NSD, PowerDNS.
# See the difference in dig output

# Query a recursive resolver -- it will go find the answer for you
dig @8.8.8.8 A ops.hexworth.com
# Look for "RECURSION AVAILABLE" in the flags section: ra

# Query an authoritative server directly -- ask for its zone
dig @ns1.hexworth.com A ops.hexworth.com
# Look for "AUTHORITATIVE ANSWER" flag: aa

# Ask an authoritative server for something outside its zones
dig @ns1.hexworth.com A google.com
# Result: REFUSED -- authoritative servers don't recurse for external queries
Security Implication
Running an open recursive resolver (one that recurses for any client on the internet) is a security liability. It enables DNS amplification DDoS attacks. BIND9 restricts recursion to defined networks by default in modern versions.
Slide 6 of 34
Record Types: A and AAAA
The most fundamental records. A maps name to IPv4. AAAA maps name to IPv6.
; Zone file syntax for A and AAAA records (zone files use ; for comments)
; Format: [name] [ttl] [class] [type] [data]

; A record -- name to IPv4 address
ops   300   IN  A     10.0.1.50
www   3600  IN  A     203.0.113.10
@     3600  IN  A     203.0.113.10    ; @ means the zone apex (hexworth.com itself)

; AAAA record -- name to IPv6 address
ops   300   IN  AAAA  2001:db8::50
www   3600  IN  AAAA  2001:db8::10

# Query with dig (shell)
dig A ops.hexworth.com
dig AAAA ops.hexworth.com
dig +short A www.hexworth.com    # +short returns just the answer data
Multiple A Records = Round-Robin
You can publish multiple A records for the same name. Resolvers return them in rotation. This is primitive load balancing with no health checking. If one server goes down, clients still receive its IP and fail. Use a proper load balancer for production.
TTL Strategy
High TTL (3600+) reduces resolver load and query latency for stable records. Low TTL (60-300) allows faster failover. Lower your TTL at least one full old-TTL period (24 hours is a comfortable margin) before planned IP changes so clients are not served stale records during the migration window.
Slide 7 of 34
Record Types: CNAME
Canonical Name -- an alias that points one name to another name, not an IP address.
; CNAME maps an alias name to its canonical (true) name
; The resolver follows the chain until it reaches an A record
www   IN  CNAME  ops.hexworth.com.      ; www is an alias for ops
ftp   IN  CNAME  ops.hexworth.com.      ; ftp also aliases ops
blog  IN  CNAME  hexworth.github.io.    ; alias to an external canonical name

# dig shows the full chain
dig www.hexworth.com
# ANSWER SECTION:
# www.hexworth.com.  3600  IN  CNAME  ops.hexworth.com.
# ops.hexworth.com.  300   IN  A      10.0.1.50
CNAME at Zone Apex: Forbidden
You cannot put a CNAME at the zone apex (@, the domain root). The apex must have an SOA and NS record -- a CNAME would conflict. Use an A record there, or use your DNS provider's ALIAS/ANAME extension if you need apex aliasing.
CNAME Cannot Coexist
A name with a CNAME record cannot have any other record type at that name. You cannot add an MX or TXT to a name that is already a CNAME. This is one of the most common zone configuration errors.
Chained CNAMEs
CNAMEs can chain (A CNAME to B which CNAMEs to C). Resolvers follow the chain. However, deep chains add resolution latency and extra queries. Keep chains to one hop where possible. Some resolvers cap chain depth at 8.
Slide 8 of 34
Record Types: MX
Mail Exchanger records route email for a domain to the correct mail servers.
; MX format: [name] [ttl] IN MX [priority] [mail-server-hostname]
; Lower priority number = higher preference
@      IN  MX  10  mail1.hexworth.com.    ; primary mail server
@      IN  MX  20  mail2.hexworth.com.    ; secondary (used if mail1 is down)
@      IN  MX  30  mail3.hexworth.com.    ; tertiary fallback

; The mail server names must resolve to address records -- NOT CNAMEs
; RFC 2181 explicitly prohibits CNAME targets in MX records
mail1  IN  A   10.0.2.10
mail2  IN  A   10.0.2.11

# Query MX records
dig MX hexworth.com
dig +short MX hexworth.com    # concise output: priority hostname
Priority Tie
When two MX records have equal priority, the sending MTA selects one at random. This provides primitive load balancing across mail servers at the same preference level. Different priority values create explicit primary/backup ordering.
Common Mistake
MX records must point to hostnames, not IP addresses. Publishing an MX record like @ IN MX 10 10.0.2.10 is an RFC violation. The MTA will attempt to resolve 10.0.2.10 as a hostname, fail, and either skip delivery or bounce the message.
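A quick way to review MX data is to sort it by preference and flag targets that look like IP addresses. The sketch below reads lines in the "priority hostname" form that dig +short MX prints; the function name mx_lint and the wording of its verdicts are hypothetical, and the IP check is a simple pattern match rather than a full validation.

```shell
# mx_lint: read "priority target" lines on stdin, print them in preference
# order with a verdict column (sketch).
mx_lint() {
    sort -n -k1,1 | awk '{
        # A target that starts with a dotted quad is almost certainly a mistake.
        verdict = ($2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) ? "SUSPECT: looks like an IP address" : "ok"
        print $1, $2, verdict
    }'
}

# Typical use (needs network access):
#   dig +short MX hexworth.com | mx_lint
```

Note that an MX written as 10.0.2.10 without a trailing dot is stored as 10.0.2.10.hexworth.com., which is why the pattern only anchors at the start of the target.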
Slide 9 of 34
Record Types: NS and SOA
NS delegates authority. SOA defines the zone itself. Both are required in every zone.
[Diagram: hexworth.com zone file. SOA (ns1.hexworth.com. admin.hexworth.com., serial refresh retry expire min) and NS (ns1.hexworth.com. ns2.hexworth.com.) are marked "required"; the remaining record types shown are A, AAAA, MX, CNAME, PTR, TXT, and SRV.]
; SOA -- Start of Authority: defines zone parameters
; Format: [name] IN SOA [primary-ns] [admin-email] (serial refresh retry expire minimum)
; The admin email uses a dot in place of @: admin.hexworth.com. means admin@hexworth.com
@    IN  SOA  ns1.hexworth.com. admin.hexworth.com. (
         2026040901  ; serial: YYYYMMDDnn -- increment on every change
         3600        ; refresh: seconds before secondary re-checks primary
         900         ; retry: if refresh fails, retry after this many seconds
         604800      ; expire: secondary stops answering after this many seconds without refresh
         300         ; minimum TTL: used for negative caching (NXDOMAIN)
)

; NS records -- which servers are authoritative for this zone
@    IN  NS   ns1.hexworth.com.
@    IN  NS   ns2.hexworth.com.

; NS targets that live IN the zone need address records here (and glue at the parent)
ns1  IN  A    10.0.1.10
ns2  IN  A    10.0.1.11
Serial Number Convention
The serial YYYYMMDDNN format (year, month, day, two-digit sequence) is a widely used convention. Increment the sequence by 1 for each edit within a day: 2026040901, 2026040902. The secondary server only triggers a zone transfer when the serial increases. If you forget to increment, secondaries stay stale.
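The serial convention is easy to get wrong by hand, so here is a minimal sketch of the bump logic. The function name next_serial is hypothetical, the first serial of a day is assumed to end in 01 as in the example above, and the optional second argument exists only so the date can be fixed for testing.

```shell
# next_serial CURRENT [TODAY] -> next YYYYMMDDNN serial (sketch).
# TODAY defaults to the current UTC date in YYYYMMDD form.
next_serial() {
    local current="$1"
    local today="${2:-$(date -u +%Y%m%d)}"
    local day="${current%??}"          # first 8 digits: the date part
    local seq="${current#????????}"    # last 2 digits: the sequence
    if [ "$day" = "$today" ]; then
        seq="${seq#0}"                 # drop a leading zero so 08 is not read as octal
        seq=$(( seq + 1 ))
        if [ "$seq" -gt 99 ]; then
            echo "sequence exhausted for $today" >&2
            return 1
        fi
        printf '%s%02d\n' "$today" "$seq"
    elif [ "$day" -lt "$today" ]; then
        printf '%s01\n' "$today"       # first change on a new day
    else
        echo $(( current + 1 ))        # serial is already ahead of today: just add 1
    fi
}

next_serial 2026040901 20260409   # 2026040902
next_serial 2026040902 20260410   # 2026041001
```

The last branch matters because a serial must never go backwards; if someone already pushed it past today's date, plain increment is the only safe move.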
Slide 10 of 34
Record Types: PTR and TXT
PTR enables reverse DNS. TXT carries arbitrary text data used for verification and security policies.
[Diagram: FORWARD LOOKUP (A), name -> IP: ops.hexworth.com -> 10.0.1.50, served from the hexworth.com zone. REVERSE LOOKUP (PTR), IP -> name: 10.0.1.50 -> ops.hexworth.com, served from the 1.0.10.in-addr.arpa zone.]
PTR -- Reverse DNS
A PTR record maps an IP address back to a hostname. PTR records for IPv4 live under the special in-addr.arpa domain. Mail servers check PTR records to fight spam. Server logs show hostnames instead of raw IPs. For public address space, the ISP or hosting provider must delegate the matching arpa zone to you.
TXT -- Text Records
TXT records hold arbitrary quoted strings. Modern uses: SPF (email authorization), DKIM (email signing key), DMARC (email policy), domain ownership verification (Google, Let's Encrypt), and DNSBL policy data.
; PTR record in a reverse zone file for the 10.0.1.0/24 network
; Zone name: 1.0.10.in-addr.arpa.
; Format: last-octet of IP IN PTR hostname.
50  IN  PTR  ops.hexworth.com.
10  IN  PTR  ns1.hexworth.com.

# Query a PTR record with dig -x (reverse lookup)
dig -x 10.0.1.50
dig PTR 50.1.0.10.in-addr.arpa.    # equivalent explicit form

; TXT records in a forward zone
; SPF lives at the apex; DMARC must be published at the _dmarc label
@       IN  TXT  "v=spf1 mx a:mail1.hexworth.com -all"
_dmarc  IN  TXT  "v=DMARC1; p=quarantine; rua=mailto:dmarc@hexworth.com"

# Query TXT records
dig +short TXT hexworth.com
dig +short TXT _dmarc.hexworth.com
Slide 11 of 34
Caching and TTL
TTL controls how long records are cached. It is the primary lever for DNS propagation speed vs. server load.
[Diagram: cache layers from client to source. Browser (~60s cache) -> OS stub (systemd-resolved) -> Recursive resolver (TTL-based cache) -> Authoritative server (source of truth). Each layer answers on a cache hit and passes the query onward on a miss; only the authoritative server is definitive.]
; TTL is set per record in zone files, or globally with the $TTL directive
$TTL 3600                           ; default TTL for all records unless overridden
ops     300    IN  A  10.0.1.50     ; this record caches for 5 minutes
stable  86400  IN  A  10.0.1.20     ; this record caches for 24 hours

# Check current TTL of a cached record (dig shows remaining TTL)
dig A ops.hexworth.com
# First query:   ;; ANSWER SECTION: ops... 300 IN A 10.0.1.50
# After 60 sec:  ;; ANSWER SECTION: ops... 240 IN A 10.0.1.50   (countdown visible)

# Flush local resolver cache
resolvectl flush-caches                    # systemd-resolved
sudo systemctl restart systemd-resolved    # restart if flush fails
High TTL (3600-86400)
Appropriate for stable records that rarely change: NS records, mail server A records, CDN origins. Reduces query load on authoritative servers and speeds up client resolution. Risk: slow propagation on change.
Low TTL (60-300)
Appropriate for records that might change: load-balanced app servers, records before a planned migration. Allows rapid failover. Cost: higher query volume, every client re-queries more frequently.
Negative TTL
NXDOMAIN responses (name does not exist) are also cached. Per RFC 2308 the negative TTL is the smaller of the SOA record's own TTL and its minimum field. If your application generates many lookups for nonexistent names, these negative cache entries still count against resolver memory.
Slide 12 of 34
dig: Your Primary DNS Tool
dig is the standard CLI for DNS diagnostics. Learn it thoroughly. Treat nslookup as a legacy fallback.
# Basic query: dig [type] [name] [@server]
dig A ops.hexworth.com              # A record, use /etc/resolv.conf resolver
dig A ops.hexworth.com @8.8.8.8     # use Google Public DNS
dig MX hexworth.com                 # MX records for the domain
dig ANY hexworth.com                # all records (often refused or minimized, see RFC 8482)

# Output control
dig +short A ops.hexworth.com            # only the answer data
dig +noall +answer A ops.hexworth.com    # suppress all sections except ANSWER
dig +stats A ops.hexworth.com            # include query timing statistics

# Trace the full resolution chain from root
dig +trace A ops.hexworth.com       # shows every delegation step

# Batch queries from a file
dig -f /tmp/hostnames.txt +short    # query each name in the file

# Reverse lookup
dig -x 10.0.1.50                    # PTR lookup for 10.0.1.50

# Check if a resolver returns the same answer as another
diff <(dig +short A ops.hexworth.com @8.8.8.8) \
     <(dig +short A ops.hexworth.com @1.1.1.1)
Slide 13 of 34
Reading dig Output
Every section of dig output carries diagnostic information. Learn to read all of it.
; <<>> DiG 9.18.1 <<>> A ops.hexworth.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;;   qr = query response   rd = recursion desired   ra = recursion available
;;   aa = authoritative answer (not set here -- came from cache)

;; QUESTION SECTION:
;ops.hexworth.com.        IN    A

;; ANSWER SECTION:
ops.hexworth.com.    240    IN    A    10.0.1.50
;;                   ^TTL -- 240 seconds remain in cache, not the original 300

;; Query time: 8 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)    <-- systemd-resolved
;; WHEN: Thu Apr 09 10:00:00 UTC 2026
;; MSG SIZE rcvd: 62
Status Codes to Know
NOERROR = query succeeded. NXDOMAIN = name does not exist. SERVFAIL = server could not complete query (misconfiguration or upstream failure). REFUSED = server declines to answer (policy or recursion restriction). FORMERR = malformed query.
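When you are checking many names, it helps to pull just the status and flags out of the header. The sketch below reads ordinary dig output on stdin; the function names dig_status and explain_status are hypothetical, and the explanations simply restate the list above.

```shell
# dig_status: read dig output on stdin, print "STATUS FLAGS" (sketch).
dig_status() {
    awk '
        /->>HEADER<<-/ { s = $0; sub(/.*status: /, "", s); sub(/,.*/, "", s); status = s }
        /^;; flags:/   { f = $0; sub(/^;; flags: */, "", f); sub(/ *;.*/, "", f); flags = f }
        END            { print status, flags }
    '
}

# explain_status STATUS -> one-line meaning, taken from the status code list.
explain_status() {
    case "$1" in
        NOERROR)  echo "query succeeded" ;;
        NXDOMAIN) echo "name does not exist" ;;
        SERVFAIL) echo "server could not complete the query" ;;
        REFUSED)  echo "server declined to answer" ;;
        FORMERR)  echo "malformed query" ;;
        *)        echo "unrecognized status: $1" ;;
    esac
}

# Typical use (needs network access):
#   dig A ops.hexworth.com | dig_status
```

A missing aa in the flags tells you the answer came from a cache or a forwarder, which is often the first clue when two resolvers disagree.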
Slide 14 of 34
/etc/resolv.conf and systemd-resolved
How Ubuntu 22.04 manages DNS configuration. The symlink matters.
# On Ubuntu 22.04, /etc/resolv.conf is a symlink managed by systemd-resolved
ls -la /etc/resolv.conf
# lrwxrwxrwx -> ../run/systemd/resolve/stub-resolv.conf

# The stub resolver listens on 127.0.0.53:53
# All DNS queries go through systemd-resolved first

# View current DNS configuration
resolvectl status        # per-interface DNS servers and settings
resolvectl dns           # DNS servers currently in use
resolvectl statistics    # cache hits, misses, queries sent

# Override DNS for a specific interface (not persistent)
resolvectl dns eth0 8.8.8.8 1.1.1.1

# Persistent configuration via netplan
# /etc/netplan/00-installer-config.yaml
#   nameservers:
#     addresses: [10.0.1.10, 10.0.1.11]
sudo netplan apply
Do Not Edit resolv.conf Directly
On Ubuntu 22.04, direct edits to /etc/resolv.conf are overwritten on reboot or network restart. Use netplan for persistent changes or configure /etc/systemd/resolved.conf for system-wide resolver settings.
Slide 15 of 34
/etc/hosts: Local Override
The hosts file short-circuits DNS for specific names. Checked before any DNS query is made.
# /etc/hosts format: IP canonical-hostname [aliases...]
# Lines beginning with # are comments
127.0.0.1   localhost
127.0.1.1   sector-node-01
::1         localhost ip6-localhost ip6-loopback

# Lab/dev overrides -- force a name to a specific IP
10.0.1.50   ops.hexworth.com ops
10.0.1.10   ns1.hexworth.com ns1

# Test the hosts file lookup
getent hosts ops.hexworth.com    # uses NSS order: hosts file then DNS
ping -c 1 ops                    # uses alias defined on the same line

# NSS (Name Service Switch) order is in /etc/nsswitch.conf
grep '^hosts' /etc/nsswitch.conf
# hosts: files mdns4_minimal [NOTFOUND=return] dns
# "files" = /etc/hosts checked FIRST
Operational Warning
Hosts file entries are invisible to DNS. A misconfigured hosts entry on a server will cause that server to behave differently from every other host on the network. This creates intermittent, hard-to-diagnose failures. Document all non-standard entries. Always check hosts files when debugging resolution anomalies.
Slide 16 of 34
Negative Caching and NXDOMAIN
What happens when a name does not exist -- and why that result is also cached.
[Diagram: two exchanges between Client and Resolver. NOERROR (record exists): A 10.0.1.50, cached for TTL seconds. NXDOMAIN (does not exist): negative answer, cached for the SOA minimum TTL.]
NXDOMAIN
Non-Existent Domain. The authoritative server confirms the name does not exist in that zone. This is a definitive negative answer -- different from SERVFAIL (server could not answer) or a timeout (server unreachable).
Negative Cache TTL
NXDOMAIN responses are cached for the negative TTL: the smaller of the SOA record's TTL and the SOA minimum field (the last field in the SOA record). If you create a new DNS record, any resolver that recently looked up that name will not see it until its negative cache entry expires. This is why new records appear to not propagate immediately.
# Demonstrate negative caching
dig A ghost.hexworth.com    # returns NXDOMAIN
# ;; status: NXDOMAIN
# ;; AUTHORITY SECTION shows SOA with minimum TTL
# hexworth.com. 300 IN SOA ns1...   (300 = negative cache TTL)

# After adding ghost.hexworth.com to the zone, wait 300 seconds
# OR flush the resolver cache
resolvectl flush-caches

# Verify the flush worked
resolvectl statistics       # cache size should drop to near zero
dig A ghost.hexworth.com    # should now return the new A record
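To work out how long a missing name will stay cached, read the SOA line from the AUTHORITY section of the NXDOMAIN response. The sketch below applies the RFC 2308 rule (the smaller of the SOA record's TTL and its minimum field); the function name negative_ttl is hypothetical and it expects the one-line SOA form that dig prints by default.

```shell
# negative_ttl: read dig output (or just its AUTHORITY section) on stdin and
# print the negative-cache TTL in seconds for the first SOA line found (sketch).
negative_ttl() {
    awk '$4 == "SOA" {
        ttl = $2 + 0        # TTL of the SOA record as returned
        min = $NF + 0       # last SOA field: minimum
        print (ttl < min ? ttl : min)
        exit
    }'
}

# Typical use (needs network access):
#   dig A ghost.hexworth.com | negative_ttl
```

If the number is larger than you can tolerate during a rollout, lower the SOA minimum ahead of time rather than relying on cache flushes you do not control.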
Slide 17 of 34
Zone Transfers: AXFR
The mechanism by which secondary name servers get a full copy of the zone from the primary.
[Diagram: Primary (ns1), serial 2026040902, zone master copy. Secondary (ns2), serial 2026040901, zone replica. The secondary sends an SOA check (serial?); because the serial is higher on the primary, the secondary triggers an AXFR (full zone transfer).]
# AXFR (Authoritative Transfer) -- full zone copy
# The secondary checks the primary's SOA serial (on the refresh timer or after a NOTIFY)
# and pulls the zone when that serial is higher than its own

# Request a full zone transfer from a name server
dig AXFR hexworth.com @ns1.hexworth.com
# On a properly secured server this will be REFUSED unless:
#   - you are listed in allow-transfer in named.conf
#   - you are the configured secondary for that zone

# IXFR (Incremental Zone Transfer) -- only changed records since a serial
dig IXFR=2026040801 hexworth.com @ns1.hexworth.com
# If serial difference is too large, server falls back to full AXFR

# Force an immediate zone check from secondary
rndc refresh hexworth.com    # on the secondary server
Security -- Restrict Zone Transfers
An open AXFR reveals your entire zone to any requester -- every hostname, IP, mail server, and internal structure. This is a reconnaissance goldmine. BIND9 should have allow-transfer { none; }; as a global default, with per-zone exceptions only for your secondaries by IP.
Slide 18 of 34
DNS Tools: dig vs host vs nslookup
Know all three. Use dig for anything serious. Understand why nslookup is legacy.
dig (preferred)
Full control over query type, server, flags, output format. Shows all response sections. Supports tracing, batch queries, and timing. Standard tool for diagnostic work. Part of bind9-dnsutils package.
host (quick lookups)
Concise output, good for quick scripting. host ops.hexworth.com returns A, AAAA, and MX in human-readable form. Less control than dig but faster to type for quick checks. Also in bind9-dnsutils.
nslookup (avoid)
Interactive mode is confusing and the output format is inconsistent across platforms. ISC once marked it deprecated in BIND, although that notice was later withdrawn. It ships with Windows by default and on Linux alongside dig, so treat it as a fallback for hosts where dig is unavailable. Never write nslookup into scripts.
# Equivalent queries in each tool
dig +short A ops.hexworth.com    # dig: just the IP
host ops.hexworth.com            # host: "ops.hexworth.com has address 10.0.1.50"
nslookup ops.hexworth.com        # nslookup: verbose, old-style output

# host: reverse lookup
host 10.0.1.50
# "50.1.0.10.in-addr.arpa domain name pointer ops.hexworth.com."

# Install if missing
sudo apt install -y bind9-dnsutils    # provides dig, host, nslookup, nsupdate
Slide 19 of 34
Split-Horizon DNS
Serve different answers for the same name depending on the client's network location.
Internal systems must resolve ops.hexworth.com to the internal IP 10.0.1.50. External clients must resolve the same name to the public IP 203.0.113.10. Same name, two different answers, based on where the query originates.
Internal View
Internal resolvers are configured to use an internal authoritative server. That server holds the zone with private IP addresses. Internal clients get direct RFC 1918 addresses and bypass NAT entirely. Reduces latency and load on firewall/NAT.
External View
External resolvers (internet) query your public authoritative servers. Those zones contain only public-facing IP addresses. Internal hostnames and RFC 1918 addresses are never exposed to the internet. Separation is a security requirement.
// BIND9 split-horizon using views in named.conf
// Views are evaluated in order; the first matching view wins
view "internal" {
    match-clients { 10.0.0.0/8; 192.168.0.0/16; };
    recursion yes;
    zone "hexworth.com" {
        type master;
        file "/etc/bind/internal/db.hexworth.com";    // internal zone data
    };
};

view "external" {
    match-clients { any; };
    recursion no;
    zone "hexworth.com" {
        type master;
        file "/etc/bind/external/db.hexworth.com";    // external zone data
    };
};
Slide 20 of 34
DNSSEC: DNS Security Extensions
Cryptographic signatures that prove DNS responses are authentic and unmodified.
What DNSSEC Provides
Cryptographic proof that a DNS response came from the authoritative server and was not tampered with in transit. Protects against cache poisoning attacks (Kaminsky attack). Does not encrypt DNS traffic -- use DNS-over-TLS (DoT) or DNS-over-HTTPS (DoH) for that.
What DNSSEC Does NOT Provide
DNSSEC does not prevent DDoS against DNS servers, does not encrypt queries (an observer can still see what you looked up), and does not protect against a compromised authoritative server. It is one layer of defense, not the whole stack.
# Check if a domain has DNSSEC enabled
dig +dnssec A ops.hexworth.com    # request DNSSEC records in response
dig DNSKEY hexworth.com           # show the zone signing keys
dig DS hexworth.com @parent-ns    # delegation signer record at parent

# Validate a DNSSEC chain with delv (installed with bind9-dnsutils)
delv @8.8.8.8 A ops.hexworth.com +multiline
# Look for "fully validated" in output

# Check DNSSEC validation status via systemd-resolved
resolvectl query ops.hexworth.com    # shows validation status
Slide 21 of 34  |  Diagnostic Pattern
DNS Troubleshooting Workflow
A systematic approach. Start local and work outward. Never guess.
[Diagram: troubleshooting path from local to root. /etc/hosts (local first) -> @127.0.0.53 (stub resolver) -> @8.8.8.8 (bypass local) -> @ns1.hex (authoritative) -> dig +trace (full chain trace). Start local, work outward.]
1. Check /etc/hosts first. Is the name hardcoded? getent hosts TARGET shows which source wins.
2. Check your resolver: resolvectl status. Is it running? Is it using the right upstream DNS server?
3. Query the resolver directly: dig TARGET @127.0.0.53. Does it answer? What status code?
4. Bypass the local resolver: dig TARGET @8.8.8.8. If this works but @127.0.0.53 fails, the local resolver is the problem.
5. Query the authoritative server directly: dig TARGET @ns1.hexworth.com. Does it return the right answer?
6. Trace from root: dig +trace TARGET. Where does the chain break? Which delegation returns unexpected data?
7. Check for stale cache: flush with resolvectl flush-caches, then retry. Negative cache entries (NXDOMAIN) block new records.
8. Verify zone file serial: if the primary was updated but the secondary did not refresh, compare dig SOA hexworth.com @ns2 against @ns1.
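Step 8 above can be scripted. The sketch below extracts the serial from dig +short SOA output and compares two servers; the function names soa_serial and serials_match are hypothetical, and serials_match needs dig plus network access to the name servers you pass in.

```shell
# soa_serial: read `dig +short SOA` output on stdin, print the serial.
# The short form is: mname rname serial refresh retry expire minimum
soa_serial() {
    awk 'NR == 1 { print $3 }'
}

# serials_match ZONE SERVER_A SERVER_B: succeed only if both servers answer
# with the same SOA serial. Prints what each server reported.
serials_match() {
    local zone="$1" a b
    a=$(dig +short SOA "$zone" "@$2" | soa_serial)
    b=$(dig +short SOA "$zone" "@$3" | soa_serial)
    echo "$2=$a $3=$b"
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Typical use:
#   serials_match hexworth.com ns1.hexworth.com ns2.hexworth.com \
#       || echo "serials differ: check zone transfer"
```

An empty serial from either side fails the check too, since a server that does not answer is as much a problem as one that is stale.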
Slide 22 of 34
Reverse DNS Zones
The in-addr.arpa zone maps IP addresses back to hostnames. Required for mail and logging.
[Diagram: IP address 10.0.1.50 reversed gives the PTR name 50.1.0.10.in-addr.arpa., which points to ops.hexworth.com.]
; Reverse zone naming: network address written BACKWARDS + .in-addr.arpa.
; Network 10.0.1.0/24 --> reverse zone name: 1.0.10.in-addr.arpa.

; Reverse zone file: /etc/bind/db.10.0.1
$TTL 3600
@   IN  SOA  ns1.hexworth.com. admin.hexworth.com. (
        2026040901  ; serial
        3600        ; refresh
        900         ; retry
        604800      ; expire
        300         ; minimum
)
@   IN  NS   ns1.hexworth.com.
@   IN  NS   ns2.hexworth.com.

; PTR records: last octet only (zone provides the rest)
10  IN  PTR  ns1.hexworth.com.
11  IN  PTR  ns2.hexworth.com.
50  IN  PTR  ops.hexworth.com.
51  IN  PTR  db1.hexworth.com.

# Test reverse lookup
dig -x 10.0.1.50    # should return ops.hexworth.com.
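The octet reversal is mechanical, so it is worth having as a helper when you build reverse zones by hand. This is a minimal IPv4-only sketch with no input validation; the function names ptr_name and reverse_zone_24 are hypothetical.

```shell
# ptr_name IPV4 -> the full PTR owner name for that address.
ptr_name() {
    echo "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1 }'
}

# reverse_zone_24 IPV4 -> the name of the /24 reverse zone containing it.
reverse_zone_24() {
    echo "$1" | awk -F. '{ printf "%s.%s.%s.in-addr.arpa.\n", $3, $2, $1 }'
}

ptr_name 10.0.1.50          # 50.1.0.10.in-addr.arpa.
reverse_zone_24 10.0.1.50   # 1.0.10.in-addr.arpa.
```

The first result is what dig -x queries for you; the second is the zone name to declare in named.conf.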
Slide 23 of 34
Common DNS Configuration Errors
The mistakes that appear in every real environment. Know them before you make them.
Missing Trailing Dot
In zone files, a hostname without a trailing dot is relative to the zone origin. mail1 IN MX 10 mailhost becomes mailhost.hexworth.com. -- not what you intended if mailhost is external. Always use trailing dots for external names.
Stale Serial Number
Editing the zone file without incrementing the serial means secondaries never pull the update. The primary serves new data; secondaries serve old data. Clients get different answers depending on which server they hit.
CNAME at Zone Apex
@ IN CNAME something.else.com. is illegal. The apex must have SOA and NS records. RFC 1912 prohibits a CNAME at the apex. BIND will reject this zone configuration entirely.
MX Pointing to CNAME
RFC 2181 prohibits MX (and NS) records from pointing to names that are themselves CNAMEs. Some mail servers will silently refuse to deliver. Always point MX to A records.
Validate Before Reload
Always run named-checkzone hexworth.com /etc/bind/db.hexworth.com before reloading BIND. A syntax error in a zone file can cause BIND to stop serving that zone entirely.
Slide 24 of 34
DNS Cache Poisoning
The attack that DNSSEC was designed to defeat. Understand the mechanism first.
[Diagram: a resolver cache receives the real answer from the Real Authority and a forged answer from an Attacker racing it; the Client receives the poisoned IP. If the attacker wins the race, the cache stores the forged record for the TTL duration.]
An attacker races a legitimate DNS response. If they can guess the transaction ID and source port, they can inject a forged response into a resolver's cache. Every client using that resolver is then misdirected -- transparently -- to the attacker's server.
Kaminsky Attack (2008)
Dan Kaminsky discovered that DNS was structurally vulnerable. By flooding a resolver with forged responses for subdomains, an attacker could poison the cached NS delegation for an entire domain, not just one record. Nearly every resolver implementation was affected, which led to a coordinated multi-vendor patch.
Mitigations
Source port randomization (now standard). 0x20 encoding (randomize case in queries). DNSSEC (cryptographic proof). DNS-over-TLS (prevents observation of queries). Rate limiting on resolvers. Using modern resolver software with all mitigations enabled.
# Verify your resolver uses source port randomization
ss -unap | grep ':53'    # should see many different source ports, not just one

# Check if DNSSEC validation is enabled in systemd-resolved
grep DNSSEC /etc/systemd/resolved.conf
# DNSSEC=yes or DNSSEC=allow-downgrade

# Test DNSSEC validation (a known broken DNSSEC zone)
dig A dnssec-failed.org    # should return SERVFAIL if validation is on
Slide 25 of 34
DNS-over-TLS and DNS-over-HTTPS
Encrypting DNS queries to prevent observation and interception. Increasingly standard.
DNS-over-TLS (DoT)
Wraps DNS queries in TLS. Uses port 853. Encrypts the hop between the client and its resolver. Supported natively by systemd-resolved. Firewalls can detect and block DoT on port 853 if needed, which is sometimes desirable in enterprise environments.
DNS-over-HTTPS (DoH)
Wraps DNS queries in HTTPS on port 443. Designed to be censorship-resistant -- nearly impossible to block without blocking all HTTPS traffic. Used primarily by browsers. Harder to inspect in enterprise environments. Conflicts with some proxy configurations.
# Enable DoT in systemd-resolved
# /etc/systemd/resolved.conf
[Resolve]
DNS=1.1.1.1#cloudflare-dns.com 9.9.9.9#dns.quad9.net
DNSOverTLS=yes
DNSSEC=yes

sudo systemctl restart systemd-resolved

# Verify DoT is working
resolvectl status    # look for +DNSOverTLS in the Protocols line

# Test with kdig (knot-dnsutils package) -- explicit DoT query
kdig -d @1.1.1.1 +tls-ca +tls-hostname=cloudflare-dns.com A ops.hexworth.com
Slide 26 of 34
Record Types: SRV
Service records enable automatic discovery of services without hardcoded hostnames or ports.
; SRV format: _service._proto.name TTL IN SRV priority weight port target
_sip._tcp.hexworth.com.     IN  SRV  10  20   5060  sip1.hexworth.com.
_sip._tcp.hexworth.com.     IN  SRV  10  20   5060  sip2.hexworth.com.
_ldap._tcp.hexworth.com.    IN  SRV  0   100  389   ldap.hexworth.com.
_xmpp-client._tcp.hex...    IN  SRV  5   0    5222  xmpp.hexworth.com.

; Fields: priority (lower = preferred), weight (load distribution at same priority),
;         port (TCP/UDP port of the service), target (hostname serving the service)

# Query SRV records
dig SRV _sip._tcp.hexworth.com
dig +short SRV _ldap._tcp.hexworth.com    # returns: priority weight port target
Real-World Usage
SRV records are used by Kerberos, SIP, XMPP, LDAP, Minecraft server discovery, and Kubernetes service DNS. When you join a Windows domain, the client finds domain controllers via SRV lookups for _ldap._tcp.dc._msdcs.DOMAIN. This is why DNS must be healthy before Active Directory functions.
Slide 27 of 34
Operational DNS: Lab and Test Environments
How to set up DNS for internal lab networks without a public domain.
# Use a private domain that does not conflict with public DNS
# Common choices: .internal, .lab (avoid .local -- it conflicts with mDNS)

# Option 1: /etc/hosts on each machine (small labs only)
10.0.1.10   ns1.matrix.internal ns1
10.0.1.50   ops.matrix.internal ops
10.0.1.60   db1.matrix.internal db1

# Option 2: Run a local resolver with forwarding
# systemd-resolved with a DNS stub and upstream forwarders
# /etc/systemd/resolved.conf (comments must sit on their own lines)
[Resolve]
# internal DNS for .internal queries
DNS=10.0.1.10
# only route this domain to the internal server
Domains=~matrix.internal
# everything else goes to the per-link DNS servers;
# FallbackDNS applies only when no other server is configured
FallbackDNS=8.8.8.8

# Option 3: Full BIND9 server (next module)
# Deploy ns1 with authoritative zones for matrix.internal
# Point all lab hosts to ns1 as their resolver
Slide 28 of 34
TTL Math: Planning DNS Changes
Practical TTL management for migrations, failovers, and zero-downtime deployments.
[Figure: DNS migration timeline: lower TTL before, change, then restore. Phases: TTL=3600 (1hr) until T-48h -> TTL=60, wait -> change IP at T=0 (60s TTL countdown) -> TTL=3600 (stable) at T+24h.]
You are migrating ops.hexworth.com from 10.0.1.50 to 10.0.1.55 in 48 hours. The current TTL is 3600. Here is the correct procedure to minimize client impact.
T-48h: Before migration, lower the TTL from 3600 to 60, increment the serial, and reload the zone. Wait at least 3600 seconds (the old TTL) so that every resolver expires the high-TTL cached record and picks up the new low TTL.
T=0: Migration window. Change the A record from 10.0.1.50 to 10.0.1.55. Increment serial. Reload BIND: rndc reload hexworth.com.
T+1m: Within 60 seconds, every resolver that honors the TTL expires the old cached record. New queries get the updated IP. Verify with dig +short A ops.hexworth.com @8.8.8.8.
T+24h: After the migration is confirmed stable, raise the TTL back to 3600. This reduces query load and improves resolution speed for clients.
The Cost of Skipping Step 1
If you change the IP without first lowering the TTL and waiting, some resolvers will serve the old IP for up to another 3600 seconds (1 hour) after the change. Users will experience intermittent failures as some hit the new IP and some hit the old one. This is not a propagation mystery -- it is expected TTL behavior.
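The arithmetic behind this procedure can be sketched as a small helper. The function name ttl_plan and its parameters are hypothetical. It assumes resolvers honor TTLs and that all authoritative servers start serving the change at the same moment; secondaries that lag behind the primary would extend the window.

```shell
# ttl_plan OLD_TTL LOW_TTL LEAD_SECONDS
#   OLD_TTL       TTL in effect before the migration (seconds)
#   LOW_TTL       temporary TTL published ahead of the change (seconds)
#   LEAD_SECONDS  how long before the change the TTL was lowered
# Prints the longest time a resolver can keep serving the old IP after the change.
ttl_plan() {
  local old_ttl="$1" low_ttl="$2" lead="$3" stale
  if [ "$lead" -ge "$old_ttl" ]; then
    # Every cache has already expired the old-TTL record and holds the low TTL.
    stale="$low_ttl"
  else
    # A resolver that cached just before the TTL was lowered still has this long left.
    stale=$(( old_ttl - lead ))
    if [ "$stale" -lt "$low_ttl" ]; then stale="$low_ttl"; fi
  fi
  echo "worst_case_stale_seconds=$stale"
}

ttl_plan 3600 60 172800   # lowered 48h ahead: worst_case_stale_seconds=60
ttl_plan 3600 60 1800     # lowered only 30 min ahead: worst_case_stale_seconds=1800
ttl_plan 3600 3600 0      # TTL never lowered: worst_case_stale_seconds=3600
```

The last call reproduces the skipped-step case described above: with no lead time, the stale window equals the full original TTL.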
Slide 29 of 34
dnsmasq: Lightweight Local DNS
A lightweight DHCP and DNS server ideal for lab environments and small networks.
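# Install dnsmasq. It conflicts with the systemd-resolved stub listener on port 53.
# Install first, while name resolution still works; the service may fail to
# start until the stub is disabled.
sudo apt install -y dnsmasq
sudo systemctl disable --now systemd-resolved
# /etc/resolv.conf normally points at the resolved stub; repoint it at dnsmasq
sudo rm /etc/resolv.conf
echo "nameserver 127.0.0.1" | sudo tee /etc/resolv.conf

# /etc/dnsmasq.conf -- minimal configuration (comments start with #)
domain=matrix.internal
# Answer matrix.internal locally; never forward these queries upstream
local=/matrix.internal/
# Records served locally
address=/ops.matrix.internal/10.0.1.50
address=/ns1.matrix.internal/10.0.1.10
# Upstream resolvers for everything else
server=8.8.8.8
server=1.1.1.1
# Cache up to 1000 records
cache-size=1000
# Log all queries (useful for debugging)
log-queries

# Start and verify
sudo systemctl enable --now dnsmasq
dig A ops.matrix.internal @127.0.0.1   # should return 10.0.1.50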
Slide 30 of 34
Verification: Record-by-Record Audit
A systematic checklist for auditing a zone's DNS records after any change.
#!/bin/bash
# dns-audit.sh -- verify all critical records for hexworth.com
DOMAIN="hexworth.com"
NS="ns1.hexworth.com"

echo "=== SOA ==="
dig +short SOA "$DOMAIN" @"$NS"

echo "=== NS ==="
dig +short NS "$DOMAIN" @"$NS"

echo "=== MX ==="
dig +short MX "$DOMAIN" @"$NS"

echo "=== A records ==="
for host in ops www mail1 mail2 ns1 ns2; do
  # Collapse multiple answers onto one line and flag hosts with no A record
  answer=$(dig +short A "${host}.${DOMAIN}" @"$NS" | tr '\n' ' ')
  printf '  %s: %s\n' "$host" "${answer:-(no answer)}"
done

echo "=== TXT (SPF) ==="
dig +short TXT "$DOMAIN" @"$NS" | grep 'v=spf1'

echo "=== Reverse for ops ==="
dig +short -x 10.0.1.50 @"$NS"
Slide 31 of 34  |  Applied Scenario
Applied: Diagnosing Email Delivery Failures
DNS is the first thing to check when email stops flowing. Here is the procedure.
Users report outbound email is bouncing. Remote servers are rejecting delivery from mail1.hexworth.com. The error message references DNS. You have five minutes to diagnose.
# Step 1: Verify MX records are published correctly
dig +short MX hexworth.com
# Expected:
#   10 mail1.hexworth.com.
#   20 mail2.hexworth.com.

# Step 2: Verify MX targets resolve to A records (not CNAMEs)
dig A mail1.hexworth.com
dig CNAME mail1.hexworth.com   # should return nothing

# Step 3: Verify PTR record exists for the sending IP
dig -x 203.0.113.20            # should resolve to mail1.hexworth.com.

# Step 4: Verify SPF record covers the sending IP
dig +short TXT hexworth.com | grep 'v=spf1'

# Step 5: Check if PTR matches the forward A record (FCrDNS)
# Forward-confirmed reverse DNS: PTR --> hostname --> same IP
IP="203.0.113.20"
# Named PTR_NAME rather than HOSTNAME, which bash sets to the local machine name
PTR_NAME=$(dig +short -x "$IP" | head -n 1)
dig +short A "${PTR_NAME%.}"   # should return $IP
Slide 32 of 34
Subdomain Delegation
Delegate authority for a subdomain to a separate set of name servers.
# Delegating ops.hexworth.com to its own name servers
# In the hexworth.com zone file, add NS records for the subdomain

; Delegation for ops subdomain
ops      IN NS ns1.ops.hexworth.com.
ops      IN NS ns2.ops.hexworth.com.

; Glue records -- required because the NS targets are IN the delegated zone
ns1.ops  IN A  10.0.2.10
ns2.ops  IN A  10.0.2.11

# Now the ops.hexworth.com zone file (on ns1.ops) is authoritative
# for everything under ops.hexworth.com

# Verify delegation
dig NS ops.hexworth.com               # should show ns1.ops, ns2.ops
dig +trace A web.ops.hexworth.com     # trace shows delegation at ops level
When to Delegate
Delegate when a team or department needs autonomous control over their subdomain, when the subdomain has different availability requirements, or when you are segmenting large internal namespaces. The parent zone only needs NS and glue records -- the delegated team manages all content records independently.
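The glue rule can be expressed as a simple name comparison: glue is needed when an NS target is the delegated zone itself or a name beneath it, because a resolver cannot look up that name without first reaching the zone it lives in. A minimal sketch follows; the function name needs_glue and the third sample name (ns1.dnshost.example) are hypothetical.

```shell
# needs_glue ZONE NS_TARGET: succeed (exit 0) when NS_TARGET is the delegated
# zone itself or a name beneath it, meaning the parent must publish glue.
needs_glue() {
  local zone target
  # Drop the trailing dot and lowercase both names (DNS names are case-insensitive)
  zone=$(printf '%s' "${1%.}" | tr 'A-Z' 'a-z')
  target=$(printf '%s' "${2%.}" | tr 'A-Z' 'a-z')
  case "$target" in
    "$zone"|*."$zone") return 0 ;;
    *) return 1 ;;
  esac
}

for ns in ns1.ops.hexworth.com. ns1.hexworth.com. ns1.dnshost.example.; do
  if needs_glue ops.hexworth.com. "$ns"; then
    echo "$ns: glue required in parent zone"
  else
    echo "$ns: no glue needed"
  fi
done
```

Only ns1.ops.hexworth.com. is reported as needing glue. The other two names can be resolved without touching the delegated zone.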
Slide 33 of 34
dig Operational Cheatsheet
The commands you will use in the field every day. Commit these to muscle memory.
# Query types (pick one TYPE per query)
dig TYPE NAME          # TYPE = A, AAAA, MX, NS, TXT, SOA, PTR, SRV, CNAME, ANY
                       # ANY is unreliable: many servers return a minimal answer (RFC 8482)

# Output modifiers
dig +short             # just the answer data
dig +noall +answer     # only the ANSWER section
dig +multiline         # expand SOA/DNSKEY to readable multiline format
dig +trace             # trace full resolution from root
dig +stats             # show query timing at end
dig +dnssec            # request DNSSEC records
dig +tcp               # force query over TCP (test if UDP is blocked)

# Targeting
dig NAME @SERVER       # query a specific server
dig -p PORT            # non-standard port
dig -x IP              # reverse lookup
dig -f FILE            # batch from file

# Common diagnostics
dig +short A NAME @127.0.0.53            # test local resolver
dig +short A NAME @8.8.8.8               # bypass local, use Google
dig AXFR ZONE @NS                        # attempt zone transfer
dig SOA ZONE @NS1 && dig SOA ZONE @NS2   # compare serials
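The serial comparison in the last diagnostic can be scripted rather than read by eye. The sketch below parses the one-line form that dig +short SOA prints (mname, rname, serial, refresh, retry, expire, minimum). The function names and the sample SOA values are hypothetical; in live use the two strings would come from dig against each name server.

```shell
# soa_serial: read one line of `dig +short SOA` output on stdin and print
# the serial, which is the third of the seven fields.
soa_serial() {
  awk 'NF >= 7 { print $3; exit }'
}

# serials_match SOA1 SOA2: succeed only when both lines parse and carry the same serial
serials_match() {
  local s1 s2
  s1=$(printf '%s\n' "$1" | soa_serial)
  s2=$(printf '%s\n' "$2" | soa_serial)
  [ -n "$s1" ] && [ "$s1" = "$s2" ]
}

# Hypothetical sample answers. Live usage would be:
#   primary=$(dig +short SOA hexworth.com @ns1.hexworth.com)
#   secondary=$(dig +short SOA hexworth.com @ns2.hexworth.com)
primary='ns1.hexworth.com. admin.hexworth.com. 2024031502 3600 900 604800 300'
secondary='ns1.hexworth.com. admin.hexworth.com. 2024031501 3600 900 604800 300'

if serials_match "$primary" "$secondary"; then
  echo "serials in sync"
else
  echo "serial mismatch: secondary has not picked up the latest zone"
fi
```

An empty answer from either server is treated as a mismatch, so an unreachable secondary is not reported as in sync.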
Slide 34 of 34  |  Module Summary
DNS Fundamentals: Key Takeaways
What you must be able to do without hesitation from this module.
1. DNS is a delegated hierarchy. Authority flows from root to TLD to domain. Zone transfers propagate data between primary and secondary servers when the serial increments.
2. Record types: A (IPv4), AAAA (IPv6), CNAME (alias), MX (mail), NS (delegation), SOA (zone definition), PTR (reverse), TXT (policy/verification), SRV (service discovery).
3. TTL controls cache lifetime. Lower before migrations. Higher for stable records. NXDOMAIN responses are also cached, for the lesser of the SOA record's TTL and its MINIMUM field (RFC 2308).
4. Recursive resolvers walk the hierarchy for clients. Authoritative servers hold zone data and answer with authority. The two roles are distinct. Running both on the same server is valid but requires careful configuration.
5. dig is your primary diagnostic tool. Use +trace to find where the chain breaks. Compare @127.0.0.53 vs @8.8.8.8 to isolate local vs. upstream resolution failures.
6. Zone file rules: increment the serial on every change, trailing dots on absolute names, no CNAME at apex, no CNAME as MX or NS target. Validate with named-checkzone before reloading.
7. Security: restrict AXFR to known secondaries, enable DNSSEC validation, consider DoT for privacy, never run an open recursive resolver on a public IP.
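For the serial rule in takeaway 6, one way to avoid forgetting the increment is to compute it. The sketch below assumes the common YYYYMMDDNN serial convention, which is an assumption here rather than something this module requires; the function name next_serial is hypothetical.

```shell
# next_serial CURRENT [TODAY]: print the next serial under the YYYYMMDDNN
# convention (assumed). TODAY defaults to the current date as YYYYMMDD.
next_serial() {
  local current="$1" today="${2:-$(date +%Y%m%d)}" base
  base="${today}00"
  if [ "$current" -lt "$base" ]; then
    # First change today: jump to today's date with revision 00
    echo "$base"
  else
    # Already changed today (or serial is ahead): just add one
    echo $(( current + 1 ))
  fi
}

next_serial 2024031409 20240315   # prints 2024031500
next_serial 2024031501 20240315   # prints 2024031502
```

Whatever scheme you use, the new serial must be numerically greater than the old one, otherwise secondaries will not transfer the zone.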