Understanding event sources in security monitoring
A SOC analyst's effectiveness depends on understanding where security events originate and how to correlate data across multiple source technologies. Modern security operations aggregate events from dozens of sources into a centralized SIEM. Without this aggregation, an analyst would need to log into 20+ different systems to investigate a single alert.
Network sources: Firewalls (Palo Alto, Fortinet, Cisco ASA), IDS/IPS (Snort, Suricata, Zeek), proxy servers (Zscaler, Squid), NetFlow collectors, packet captures (full PCAP), and DNS servers. These provide perimeter and internal network visibility.
Endpoint sources: EDR agents (CrowdStrike, SentinelOne, Carbon Black), antivirus/EPP, host-based IDS, Windows Event Logs, Sysmon, and syslog (Linux). These provide process-level visibility on individual hosts.
Identity and authentication sources: Active Directory (Event IDs 4624-4634), LDAP, SSO providers (Okta, Azure AD), MFA systems (Duo), and PAM solutions (CyberArk). These are critical for detecting credential abuse and unauthorized access.
Cloud and SaaS sources: AWS CloudTrail (API audit), Azure Monitor / Sentinel, GCP Audit Logs, O365 Unified Audit Log, and Salesforce event monitoring. Cloud environments require API-based log collection rather than traditional syslog.
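Understanding how logs flow from source to SIEM is essential -- if a link breaks, you lose visibility. The table below summarizes common source types, the formats they emit, and the fields that matter most during an investigation: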
| Source Type | Log Format | Key Fields | Use Case |
|---|---|---|---|
| Firewall (Palo Alto) | Syslog/CEF | src_ip, dst_ip, action, rule, app, bytes | Perimeter traffic analysis, blocked C2 detection |
| Windows Event Log | XML/EVTX | EventID, Account, LogonType, SourceIP | Authentication tracking, process execution audit |
| Snort/Suricata IDS | Unified2/EVE JSON | signature_id, priority, protocol, payload | Intrusion detection, exploit identification |
| Zeek (Bro) | TSV/JSON logs | conn_state, service, duration, JA3 hash | Network metadata analysis, protocol inspection |
| AWS CloudTrail | JSON | eventName, userIdentity, sourceIPAddress | Cloud API auditing, IAM activity |
| Proxy (Zscaler/Squid) | CSV/W3C | user, URL, category, action, content_type | Web filtering, data exfil via uploads |
| EDR (CrowdStrike) | JSON/API | process, command_line, parent, file_hash | Malware execution, LOLBin abuse, process trees |
| DNS Server | Syslog/ETW | query, query_type, response, client_ip | DNS tunneling, DGA detection, C2 identification |
| Email Gateway | Syslog/CEF | sender, recipient, subject, attachment_hash | Phishing detection, malicious attachment analysis |
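Several rows in the table above list CEF as a log format. As a rough illustration of how the key fields are pulled out of such a record, here is a minimal Python sketch of a CEF parser. It assumes the standard pipe-delimited CEF header followed by `key=value` extensions. The sample record, vendor, and product names are invented for illustration, and the sketch does not unescape special characters inside values.

```python
import re

# The seven pipe-delimited CEF header fields, in order (after the "CEF:" prefix).
CEF_HEADER_FIELDS = [
    "version", "device_vendor", "device_product", "device_version",
    "signature_id", "name", "severity",
]

def parse_cef(line: str):
    """Split a CEF record into (header dict, extension dict)."""
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF record")
    # Split on pipes that are not escaped with a backslash; the 8th part is the extension.
    parts = re.split(r"(?<!\\)\|", line[4:], maxsplit=7)
    if len(parts) < 8:
        raise ValueError("incomplete CEF header")
    header = dict(zip(CEF_HEADER_FIELDS, parts[:7]))
    # Extension values may contain spaces, so read up to the next " key=" or end of line.
    extension = dict(re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", parts[7]))
    return header, extension

# Synthetic sample record (not taken from any real product).
SAMPLE = ("CEF:0|ExampleVendor|ExampleFW|1.0|100|Connection denied|5|"
          "src=10.1.1.5 dst=203.0.113.9 act=deny msg=blocked outbound")

if __name__ == "__main__":
    header, extension = parse_cef(SAMPLE)
    print(header["name"], extension["src"], extension["dst"], extension["act"])
```

In practice the SIEM or a pipeline tool usually does this parsing; the sketch only shows where fields such as the source address, destination address, and action come from.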
These are the event IDs you will encounter most frequently during SOC investigations:
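As a quick reference, a small lookup such as the following Python sketch can help during triage. The IDs and descriptions are standard Windows Security log events, but the selection is an assumption made for illustration and is not necessarily the list this module has in mind.

```python
# Commonly referenced Windows Security log event IDs (illustrative selection).
WINDOWS_EVENT_IDS = {
    4624: "An account was successfully logged on",
    4625: "An account failed to log on",
    4634: "An account was logged off",
    4648: "A logon was attempted using explicit credentials",
    4672: "Special privileges assigned to new logon",
    4688: "A new process has been created",
    4720: "A user account was created",
    1102: "The audit log was cleared",
}

def describe_event(event_id: int) -> str:
    """Return a short description for a Windows event ID, if it is in the table."""
    return WINDOWS_EVENT_IDS.get(event_id, "Unknown or unlisted event ID")

if __name__ == "__main__":
    for event_id in (4624, 4625, 4672):
        print(event_id, "-", describe_event(event_id))
```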
Log Source Coverage Gaps: During your first week, ask: "What systems are NOT sending logs to the SIEM?" Common blind spots: IoT devices, OT/SCADA systems, shadow IT cloud services, BYOD devices, legacy systems running unsupported OS versions. An attacker who discovers a blind spot has an unmonitored attack path.
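One way to answer the coverage question is to compare the asset inventory against the set of hosts that have recently sent events to the SIEM. The Python sketch below assumes both lists are already available as hostnames; the host names are hypothetical, and in a real environment they would come from a CMDB export and a SIEM query.

```python
def find_coverage_gaps(inventory, reporting):
    """Return inventory assets that have not sent any logs to the SIEM."""
    return sorted(set(inventory) - set(reporting))

# Hypothetical data for illustration only.
INVENTORY = ["fw-edge-01", "dc-01", "hvac-controller", "legacy-xp-07"]
REPORTING = ["fw-edge-01", "dc-01"]

if __name__ == "__main__":
    for host in find_coverage_gaps(INVENTORY, REPORTING):
        print("No logs received from:", host)
```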
Syslog: The traditional method, sent via UDP/514 (unreliable, no encryption) or TCP/6514 with TLS (reliable, encrypted). Most network devices support syslog natively. UDP syslog can lose events under load -- use TCP for critical sources.
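The transport choice can be seen in Python's standard-library `logging.handlers.SysLogHandler`, where the `socktype` argument selects UDP or TCP. The sketch below is illustrative only: `make_syslog_handler` and `udp_roundtrip` are hypothetical helper names, the demo sends to a throwaway listener on the loopback interface, and the standard-library handler does not add TLS, so encrypted syslog on TCP/6514 would need additional work.

```python
import logging
import logging.handlers
import socket

def make_syslog_handler(host: str, port: int, reliable: bool) -> logging.handlers.SysLogHandler:
    """Build a syslog handler: TCP when reliable=True, UDP otherwise."""
    socktype = socket.SOCK_STREAM if reliable else socket.SOCK_DGRAM
    return logging.handlers.SysLogHandler(address=(host, port), socktype=socktype)

def udp_roundtrip(message: str) -> bytes:
    """Send one event over UDP to a local listener and return the raw datagram."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
    listener.settimeout(2.0)
    handler = make_syslog_handler("127.0.0.1", listener.getsockname()[1], reliable=False)
    logger = logging.getLogger("syslog-demo")
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.warning(message)
        data, _ = listener.recvfrom(4096)
        return data
    finally:
        logger.removeHandler(handler)
        handler.close()
        listener.close()

if __name__ == "__main__":
    print(udp_roundtrip("action=deny src=10.1.1.5 dst=203.0.113.9"))
```

With UDP the sender gets no acknowledgement, so if the listener were absent or overloaded the event would simply be lost. With `reliable=True` the handler opens a TCP connection, and a missing collector surfaces as a connection error instead.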
Agent-based collection: Splunk Universal Forwarder, Elastic Agent, or Wazuh agent, installed on each endpoint. Agents are more reliable than syslog, can handle parsing at the source, and survive network outages by buffering locally. The trade-off is higher deployment overhead.
API-based collection: REST APIs for cloud services (AWS CloudTrail, Azure Activity Log, O365 API, Okta). Collectors must handle rate limits and pagination, and authenticate via API keys, OAuth tokens, or service principals.
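The pagination and rate-limit handling can be sketched as a small Python loop. Everything here is an assumption made for illustration: `fetch_page`, `RateLimited`, and the token-based paging scheme are hypothetical and do not correspond to any specific vendor API, and the fake API stands in for a real HTTP client.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Tuple

class RateLimited(Exception):
    """Raised by the (hypothetical) client when the API signals throttling."""
    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after

# fetch_page(token) returns (events, next_token); next_token is None on the last page.
FetchPage = Callable[[Optional[str]], Tuple[List[Dict], Optional[str]]]

def collect_all(fetch_page: FetchPage, max_retries: int = 3,
                sleep: Callable[[float], None] = time.sleep) -> Iterator[Dict]:
    """Yield every event across all pages, backing off when throttled."""
    token: Optional[str] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                events, next_token = fetch_page(token)
                break
            except RateLimited as exc:
                if attempt == max_retries:
                    raise               # give up after the final retry
                sleep(exc.retry_after)  # wait as long as the API asked
        yield from events
        if next_token is None:
            return
        token = next_token

def make_fake_api() -> FetchPage:
    """Two pages of synthetic events; the second call is throttled once."""
    pages = {None: ([{"id": 1}, {"id": 2}], "p2"), "p2": ([{"id": 3}], None)}
    calls = {"n": 0}
    def fetch_page(token):
        calls["n"] += 1
        if calls["n"] == 2:
            raise RateLimited(retry_after=0.0)
        return pages[token]
    return fetch_page

if __name__ == "__main__":
    print([event["id"] for event in collect_all(make_fake_api())])
```

A collector that ignores the next-page token silently drops everything after the first page, which looks like a quiet log source rather than a failure.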
Log pipelines and message queues: Kafka, Logstash, Fluentd, and Cribl act as intermediate layers. They handle scale (millions of EPS), transformation (field renaming, enrichment), and routing (for example, firewall logs to the SIEM and debug logs to cold storage).
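The enrichment and routing steps of such a pipeline can be illustrated with a few lines of Python. This is a toy sketch, not how any of the named products is configured: the `ASSET_OWNERS` lookup, the `level` field, and the destination names are all hypothetical.

```python
from typing import Dict, List

# Hypothetical enrichment table: source IP -> owning team.
ASSET_OWNERS = {"10.1.1.5": "finance", "10.1.2.9": "engineering"}

def enrich(event: Dict) -> Dict:
    """Return a copy of the event with an asset_owner field added."""
    enriched = dict(event)
    enriched["asset_owner"] = ASSET_OWNERS.get(event.get("src_ip"), "unknown")
    return enriched

def route(event: Dict) -> str:
    """Send debug-level noise to cold storage and everything else to the SIEM."""
    return "cold_storage" if event.get("level") == "debug" else "siem"

def run_pipeline(events: List[Dict]) -> Dict[str, List[Dict]]:
    """Enrich each event and place it in the output for its destination."""
    outputs: Dict[str, List[Dict]] = {"siem": [], "cold_storage": []}
    for event in events:
        outputs[route(event)].append(enrich(event))
    return outputs

if __name__ == "__main__":
    sample = [
        {"source_type": "firewall", "src_ip": "10.1.1.5", "action": "deny"},
        {"source_type": "app", "level": "debug", "src_ip": "10.9.9.9"},
    ]
    for destination, routed in run_pipeline(sample).items():
        print(destination, routed)
```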
Different sources use different field names for the same data. Without normalization, you cannot write a single query that searches across sources:
| Concept | Firewall Log | Windows Event | Zeek conn.log | Normalized (CIM) |
|---|---|---|---|---|
| Source IP | src | IpAddress | id.orig_h | src_ip |
| Destination IP | dst | TargetServerName | id.resp_h | dest_ip |
| Username | user | TargetUserName | --- | user |
| Action | action | Keywords | conn_state | action |
Normalization is Critical: Splunk uses CIM (Common Information Model). Elastic uses ECS (Elastic Common Schema). Sentinel uses ASIM. Learn your SIEM's schema -- it determines whether your queries work across all log sources or only against one source type. A query that only searches firewall logs will miss the same attacker in endpoint logs.
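The field mapping in the table above can be sketched as a per-source rename step in Python. The raw and normalized field names are taken from the table; the `normalize` helper, the `source_type` labels, and the sample events are assumptions made for illustration, and a real SIEM applies its schema through its own configuration rather than code like this.

```python
from typing import Dict

# Raw field name -> normalized field name, per source type (from the table above).
FIELD_MAPS = {
    "firewall": {"src": "src_ip", "dst": "dest_ip", "user": "user", "action": "action"},
    "windows": {"IpAddress": "src_ip", "TargetServerName": "dest_ip",
                "TargetUserName": "user", "Keywords": "action"},
    "zeek_conn": {"id.orig_h": "src_ip", "id.resp_h": "dest_ip", "conn_state": "action"},
}

def normalize(source_type: str, raw: Dict) -> Dict:
    """Rename known fields to the normalized schema and drop the rest."""
    mapping = FIELD_MAPS[source_type]
    event = {"source_type": source_type}
    for raw_name, value in raw.items():
        if raw_name in mapping:
            event[mapping[raw_name]] = value
    return event

if __name__ == "__main__":
    events = [
        normalize("firewall", {"src": "10.1.1.5", "dst": "203.0.113.9", "action": "deny"}),
        normalize("zeek_conn", {"id.orig_h": "10.1.1.5", "id.resp_h": "203.0.113.9",
                                "conn_state": "S0"}),
    ]
    # One filter now works across both sources.
    print([e["source_type"] for e in events if e["src_ip"] == "10.1.1.5"])
```

Note that this only aligns field names. The values still differ by source (for example, a firewall `deny` versus a Zeek `conn_state` code), so a full schema also needs value normalization.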
This Splunk query correlates failed logins with subsequent successful logins from a different IP -- a classic indicator of credential compromise:
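Independent of SPL syntax, the correlation logic itself can be sketched in Python over normalized authentication events. This is an illustration under assumptions, not the Splunk query: the event shape (`time`, `user`, `src_ip`, `action`), the 10-minute window, and the threshold of three failures are all hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, List

def find_suspicious_logins(events: List[Dict], window_seconds: int = 600,
                           min_failures: int = 3) -> List[Dict]:
    """Flag a success preceded by recent failures for the same user from a different IP."""
    failures = defaultdict(list)   # user -> list of (time, src_ip) for failed logins
    alerts = []
    for event in sorted(events, key=lambda e: e["time"]):
        user = event["user"]
        if event["action"] == "failure":
            failures[user].append((event["time"], event["src_ip"]))
        elif event["action"] == "success":
            recent = [ip for t, ip in failures[user]
                      if event["time"] - t <= window_seconds and ip != event["src_ip"]]
            if len(recent) >= min_failures:
                alerts.append({"user": user, "success_ip": event["src_ip"],
                               "failure_ips": sorted(set(recent)),
                               "time": event["time"]})
    return alerts

# Synthetic events: times are seconds from an arbitrary start.
SAMPLE_EVENTS = [
    {"time": 0, "user": "alice", "src_ip": "203.0.113.7", "action": "failure"},
    {"time": 10, "user": "alice", "src_ip": "203.0.113.7", "action": "failure"},
    {"time": 20, "user": "alice", "src_ip": "203.0.113.7", "action": "failure"},
    {"time": 60, "user": "alice", "src_ip": "198.51.100.4", "action": "success"},
    {"time": 30, "user": "bob", "src_ip": "10.0.0.8", "action": "failure"},
    {"time": 40, "user": "bob", "src_ip": "10.0.0.8", "action": "success"},
]

if __name__ == "__main__":
    for alert in find_suspicious_logins(SAMPLE_EVENTS):
        print(alert)
```

The sketch depends on normalized `user` and `src_ip` fields, which is why schema alignment has to come before cross-source correlation.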
SOC analysts must be able to read raw logs from different source technologies and pick out the critical fields for an investigation.
When you see a raw log for the first time: (1) Find the timestamp -- it anchors everything, (2) Find the source and destination -- who is talking to whom, (3) Find the action/result -- was it allowed, denied, or something else, (4) Find the unique identifier -- event ID, signature ID, rule name. These four fields let you write a basic SIEM query to find all related events.
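The four-step reading order above can be sketched as a small Python helper. The log line is synthetic and uses a simple `key=value` layout chosen for illustration, so the field names (`src`, `dst`, `action`, `rule`) are assumptions and real sources will need their own parsing.

```python
import re
from typing import Dict, Optional, Tuple

def triage(line: str, id_keys: Tuple[str, ...] = ("rule", "signature_id", "event_id")) -> Dict[str, Optional[str]]:
    """Extract timestamp, source/destination, action, and a unique identifier."""
    timestamp, _, rest = line.partition(" ")        # (1) leading timestamp
    fields = dict(re.findall(r"(\w+)=(\S+)", rest))
    identifier = next((fields[k] for k in id_keys if k in fields), None)
    return {
        "timestamp": timestamp,
        "src": fields.get("src"),                   # (2) who is talking...
        "dst": fields.get("dst"),                   #     ...to whom
        "action": fields.get("action"),             # (3) allowed, denied, or other
        "identifier": identifier,                   # (4) rule, signature, or event ID
    }

# Synthetic log line, not taken from any real product.
SAMPLE_LINE = "2024-05-01T12:00:00Z src=10.1.1.5 dst=203.0.113.9 action=deny rule=block-outbound-c2"

if __name__ == "__main__":
    print(triage(SAMPLE_LINE))
```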
1. Which Windows Event ID indicates a successful logon?
2. What is the primary benefit of log normalization in a SIEM?
3. Which tool is commonly used for network metadata analysis?
4. What protocol/port is used for secure syslog transmission?
5. Event ID 4672 indicates what type of activity?
6. What is a common blind spot in SIEM log coverage?
7. What does Splunk's CIM (Common Information Model) provide?