4.9 Domain 4 · Security Operations

Using Data Sources to Support an Investigation

Firewall, application, endpoint, OS, IDS/IPS, network, and metadata logs — pick the right source for the question you are trying to answer.

The Concept

Every investigation is a question: “Who authenticated at 02:14?” “Which external IP did the compromised host reach?” “What payload was actually exfiltrated?” Security+ 4.9 asks you to match the question to the data source. Different logs answer different questions, and picking the wrong source wastes time while evidence ages.

The mental model: firewall logs tell you allow/deny at the perimeter, endpoint logs (Sysmon, EDR) tell you what ran on the host, OS security logs (Windows 4624 etc., Linux auth.log) tell you who authenticated, NetFlow tells you who talked to whom, packet capture tells you exactly what was said, and metadata (email headers, EXIF, MAC times) tells you attribution and context. Match the question to the source before you start pulling data.

Log data. The primary raw material.

Firewall logs — allow/deny decisions, source and destination IP/port, timestamps, interface, rule id. Answers: “did this traffic cross the perimeter and which rule matched?”
Application logs — business transactions (order placed, transfer initiated), error stacks, authentication events at the app layer. Answers: “what did the user attempt within the app?” Structured JSON logs beat free-form strings every time.
Endpoint logs — process creation, file writes, registry changes, network connections, command line. Sysmon on Windows is the gold-standard augmentation. EDR telemetry is enriched endpoint log. Answers: “what ran on this host and what did it touch?”
OS-specific security logs — Windows Event Log (key IDs: 4624 successful logon, 4625 failed logon, 4672 special privileges assigned, 4648 explicit credential use, 4688 new process). Linux: /var/log/auth.log (Debian/Ubuntu), /var/log/secure (RHEL-family), auditd. Answers: “who authenticated, when, and from where?”
IDS/IPS logs — alerts with signature/rule id, packet payload snippet, classification. Answers: “did we match a known attack pattern?”
Network logs — NetFlow / IPFIX / sFlow (metadata), DNS queries (who resolved what), DHCP leases (who had which IP when), proxy logs. Answers: “which hosts talked to which external services, how much, when?”
Metadata — file MAC times (modified/accessed/changed or created, depending on the file system), email headers (received-from, authentication results), image EXIF (camera, GPS, time). Answers: “where did this artifact come from and when did it move?”
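
As a minimal sketch of pulling “who authenticated, when, and from where” out of an OS security log, the following parses OpenSSH-style lines as they commonly appear in /var/log/auth.log. The sample lines, host name, and addresses are invented for illustration, and the regular expression assumes the default sshd message wording; real deployments may format these lines differently.

```python
import re

# Matches the common sshd wording; an assumption, not a complete grammar.
SSHD_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\ssshd\[\d+\]:\s"
    r"(?P<result>Accepted|Failed)\s\S+\sfor\s(?:invalid user\s)?"
    r"(?P<user>\S+)\sfrom\s(?P<src>\S+)\sport\s\d+"
)

def parse_auth_line(line):
    """Return a dict for an sshd accept/fail line, or None for anything else."""
    m = SSHD_RE.match(line)
    return m.groupdict() if m else None

# Hypothetical sample lines (documentation-range IP addresses).
SAMPLE = [
    "Jan 10 02:14:07 web01 sshd[4121]: Accepted password for alice from 198.51.100.7 port 50022 ssh2",
    "Jan 10 02:14:31 web01 sshd[4130]: Failed password for invalid user admin from 203.0.113.5 port 51234 ssh2",
    "Jan 10 02:15:00 web01 CRON[4200]: pam_unix(cron:session): session opened for user root",
]

events = [e for e in map(parse_auth_line, SAMPLE) if e]
for e in events:
    print(e["ts"], e["result"], e["user"], e["src"])
```

The non-sshd line is skipped rather than guessed at, which mirrors the investigative habit of only trusting fields you can actually parse.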

Data sources (beyond raw logs).

Vulnerability scans — “what known weaknesses are present?” Useful to correlate a finding to a confirmed exploit.
Automated reports — daily/weekly posture summaries; context for what “normal” looked like before the incident.
Dashboards — the visualization layer on top of logs; quick anomaly detection by eye.
Packet captures — full fidelity. Wireshark / tcpdump / Zeek. Expensive to store, essential for payload reconstruction and deep protocol analysis.

Windows Event IDs to know cold.

4624 — An account was successfully logged on. Includes logon type (2 interactive, 3 network, 10 remote interactive).
4625 — An account failed to log on. Useful for detecting brute force or password spraying.
4672 — Special privileges assigned to new logon. Fires for admin-equivalent logons.
4648 — A logon was attempted using explicit credentials (RunAs patterns).
4688 — A new process has been created (with command line if command-line process auditing is enabled). Key for detecting LOLBin abuse.
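
To make the 4625 brute-force signal concrete, here is a rough sketch that flags accounts with a burst of failed logons. It assumes the events have already been exported into Python dicts; the field names (`event_id`, `account`, `time`) are hypothetical and are not the native Windows event schema, and the threshold and window are arbitrary illustrative values.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def failed_logon_bursts(events, threshold=5, window=timedelta(minutes=5)):
    """Return accounts with at least `threshold` 4625 events inside `window`."""
    by_account = defaultdict(list)
    for ev in events:
        if ev["event_id"] == 4625:
            by_account[ev["account"]].append(ev["time"])

    flagged = set()
    for account, times in by_account.items():
        times.sort()
        # Check each run of `threshold` consecutive failures.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.add(account)
                break
    return flagged

# Hypothetical sample: six failures in 100 seconds for one account.
start = datetime(2024, 1, 10, 2, 0, 0)
sample = [
    {"event_id": 4625, "account": "svc_backup", "time": start + timedelta(seconds=20 * i)}
    for i in range(6)
]
sample.append({"event_id": 4625, "account": "alice", "time": start})
sample.append({"event_id": 4624, "account": "alice", "time": start + timedelta(seconds=30)})

print(failed_logon_bursts(sample))
```

A single failure followed by a success, as with the `alice` entries, looks like a typo and is not flagged.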

Log integrity — centralization matters. An attacker with admin rights on a compromised host can clear or edit local logs. Centralized log shipping (to a SIEM or durable store) before an incident means you still have the record even if the host is wiped. Centralization is a control, not just a convenience.

Investigation Question	Best Log / Data Source	Backup Source
Who authenticated at 02:14 and from where?	Windows Event 4624 / Linux auth.log / IdP logs	VPN / SSO logs
Was there a brute-force attempt on this account?	Windows 4625 / auth.log failed entries / IdP failed login events	WAF logs if web-facing
Which external IPs did the compromised host contact?	Firewall logs + NetFlow + DNS queries	Proxy logs
What exact payload was exfiltrated?	Packet capture (PCAP)	Proxy logs if HTTP(S)-decrypted
What process ran at the time of the alert?	Endpoint logs / Sysmon / EDR (process creation, 4688)	Parent-child process tree in EDR
Did a known-signature attack fire?	IDS/IPS alert log	EDR detection events
Was a file modified/accessed at a specific time?	Endpoint FIM + MAC times + Sysmon file events	Backup timestamps for corroboration
Where did this email really come from?	Email headers (metadata)	Secure email gateway transit logs
Which DNS queries preceded the breach?	DNS logs / DNS filter logs / passive DNS	NetFlow for destination inference
What volumes of traffic flowed between subnets?	NetFlow / IPFIX	Firewall counters

Windows Event ID	Meaning	Investigative Value
4624	Successful logon	Who authenticated, when, how (logon type)
4625	Failed logon	Brute-force detection, typo signal
4672	Special privileges assigned	Admin-equivalent logon occurred
4648	Logon with explicit credentials	RunAs / credential theft signal
4688	New process created (with cmdline)	LOLBin detection, lateral movement

Key Takeaway

Question drives source. Auth questions → OS security log / IdP. External communication → firewall, DNS, NetFlow. Payload detail → PCAP. Host activity → endpoint/Sysmon. Attribution → metadata. Centralize early so compromised hosts cannot erase the evidence.

HR has referred a departing employee for investigation based on a tip: the person may have downloaded customer lists before giving notice. The IR team needs to reconstruct what happened over the last 30 days. Multiple data sources are available; the question is which to pull and in what order.

HR Partner: “Can you just tell me what they copied?”

IR Lead: “Yes, step by step. Step 1 — DLP logs: scan for any policy match on PII or customer-data keywords associated with their user account in the last 30 days. That is the bluntest, fastest signal. Step 2 — endpoint logs: USB device-insertion events, Sysmon file-creation events, cloud-sync app activity. Step 3 — proxy / web filter logs: uploads to personal cloud (Dropbox, Google Drive, WeTransfer, personal Gmail). Step 4 — email gateway logs: outbound attachments to personal addresses.”

HR Partner: “What about files they accessed on shares?”

IR Lead: “Good question. File server audit logs show access to sensitive shares, and if we have object-level auditing enabled we can see exactly which files they opened. We correlate the file-access timestamps against the USB-insertion timestamps and the personal-cloud upload events. That triangulation tells the story.”

HR Partner: “Will this hold up if we need to take legal action?”

IR Lead: “Yes — all sources are centrally shipped to the SIEM, timestamps are NTP-synchronized, chain of custody on any forensic image is documented from minute one, and logs are preserved under legal hold starting now so normal rotation does not destroy them. The logs were designed for exactly this investigation; we just have to pull them in the right order.”

Compensating Action

Start with the most specific log, widen to corroborate. DLP and endpoint tell you what was copied; proxy and email tell you where it went; file-server audit tells you what was touched. Triangulate across sources and NTP-synchronized timestamps. Preserve everything under legal hold before normal rotation deletes it.
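
The triangulation step can be sketched as a simple time-window join. This is an illustrative outline only: the event dicts, field names, file names, and the ten-minute window are all assumptions, and it presumes every source has already been normalized to the same NTP-synchronized clock.

```python
from datetime import datetime, timedelta

def correlate(file_events, transfer_events, window=timedelta(minutes=10)):
    """Pair each file access with transfer events that follow it within `window`."""
    pairs = []
    for f in sorted(file_events, key=lambda e: e["time"]):
        for t in sorted(transfer_events, key=lambda e: e["time"]):
            delta = t["time"] - f["time"]
            if timedelta(0) <= delta <= window:
                pairs.append((f["detail"], t["source"], t["detail"]))
    return pairs

# Hypothetical events from three different sources.
file_access = [
    {"time": datetime(2024, 3, 4, 17, 2), "source": "fileserver", "detail": "customers_q1.xlsx"},
    {"time": datetime(2024, 3, 5, 9, 0), "source": "fileserver", "detail": "lunch_menu.pdf"},
]
transfers = [
    {"time": datetime(2024, 3, 4, 17, 5), "source": "usb", "detail": "removable drive mounted"},
    {"time": datetime(2024, 3, 4, 17, 9), "source": "proxy", "detail": "upload to personal cloud storage"},
]

pairs = correlate(file_access, transfers)
for accessed, source, what in pairs:
    print(f"{accessed} -> {source}: {what}")
```

Proximity in time is corroboration, not proof; the pairs it produces are leads to confirm against DLP matches and the content of the transfer itself.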

Real Talk — Career Context

Investigations reward logging discipline done years earlier. The organization that lit up DLP, Sysmon, object-level file auditing, and proxy logging before any incident is the one that can answer HR’s question in hours. The one that did not will spend weeks guessing. Investing in logging is investing in future investigations.

On the exam: “what payload was exfiltrated?” → packet capture. “who authenticated?” → OS security log. “which domains did the host resolve?” → DNS logs.

An investigator needs to reconstruct the exact HTTP request body of a suspected data exfiltration from a compromised internal server to an external IP. NetFlow and PCAP are both available; PCAP captures cover the relevant time window for critical links. Which is the right primary source for THIS question?

Option A

NetFlow — faster to query, lower storage

Shows src/dst/port/bytes/flags over the time range. Efficient for volume questions.

Option B

Packet capture — full content for payload reconstruction

Reconstructs the exact HTTP request body, headers, and sequence of bytes sent.

Option B fits better — payload reconstruction requires PCAP

Option B: NetFlow contains metadata only; it tells you that a connection happened, not what was said. To see the actual HTTP request body (or any application-layer payload), you need the packet capture. For this specific question (what was exfiltrated), PCAP is the primary source, with decrypted proxy logs as the only comparable fallback.

Option A’s kernel of truth: For volume or pattern questions (“how much, how often, to which destinations”), NetFlow is the right first stop. Payload questions require PCAP.

On the exam: “reconstruct payload” / “exact content” / “exfil details” → PCAP. “volume” / “who talked to whom” → NetFlow.
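
A small sketch of the kind of question flow data can answer. The `FlowRecord` class below is a simplified stand-in invented for this example, not the actual NetFlow or IPFIX record layout, and the addresses and byte counts are made up.

```python
from collections import Counter
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class FlowRecord:
    # Simplified, hypothetical flow fields: metadata only, no payload.
    src: str
    dst: str
    dst_port: int
    bytes_sent: int

def bytes_by_destination(flows, src):
    """Total bytes sent from `src` to each destination."""
    totals = Counter()
    for f in flows:
        if f.src == src:
            totals[f.dst] += f.bytes_sent
    return totals

flows = [
    FlowRecord("10.0.5.20", "203.0.113.50", 443, 48_000_000),
    FlowRecord("10.0.5.20", "203.0.113.50", 443, 52_000_000),
    FlowRecord("10.0.5.20", "198.51.100.9", 53, 1_200),
    FlowRecord("10.0.5.21", "203.0.113.50", 443, 900),
]

totals = bytes_by_destination(flows, "10.0.5.20")
print(totals.most_common(1))                # top destination by volume
print([f.name for f in fields(FlowRecord)]) # note: nothing resembling a payload
```

The aggregation answers “how much, to which destination”; there is no field from which the request body could be recovered, which is why the payload question falls to PCAP.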

⚠ Logs are tamper-proof on the host

An attacker with admin rights can clear or edit local Windows Event Logs and Linux syslog. Centralize logs to a SIEM or write-once store so the compromised host cannot erase the trail.

Why it is tempting: Local logs look durable, but admin or root access makes them editable.

⚠ NetFlow = packet capture

NetFlow is metadata (src/dst/port/bytes/flags). PCAP is full content. Payload reconstruction needs PCAP; volume analysis is fine with NetFlow.

Why it is tempting: Both are “network logs.” Fidelity differs by orders of magnitude.

⚠ Email content proves origin

Email body can be forged. Email headers (metadata: Received-from chain, SPF/DKIM/DMARC results) are the reliable attribution source. Read the headers, not the body.

Why it is tempting: The body is what users see. Headers are what tell the truth.
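
Reading the headers can be sketched with Python's standard `email` package. The message below is fabricated for illustration (example domains, documentation-range IPs), and the `auth_results` helper is a deliberately naive, hypothetical parser that handles only this simple header shape.

```python
from email import message_from_string

# Fabricated message: the body claims a bank, the headers say otherwise.
RAW = """\
Received: from mx.recipient.example (mx.recipient.example [192.0.2.10])
\tby mail.recipient.example with ESMTP; Wed, 10 Jan 2024 02:14:09 +0000
Received: from unknown-host.example (unknown-host.example [203.0.113.77])
\tby mx.recipient.example with ESMTP; Wed, 10 Jan 2024 02:14:07 +0000
Authentication-Results: mx.recipient.example; spf=fail smtp.mailfrom=bank.example; dkim=none; dmarc=fail header.from=bank.example
From: "Your Bank" <alerts@bank.example>
Subject: Urgent account notice

Please verify your account.
"""

def auth_results(value):
    """Naive split of an Authentication-Results value into {method: result}."""
    results = {}
    for part in value.split(";")[1:]:  # first element is the reporting server
        method, _, rest = part.strip().partition("=")
        if rest:
            results[method] = rest.split()[0]
    return results

msg = message_from_string(RAW)
hops = msg.get_all("Received", [])
# Each relay prepends its own Received header, so the last one is the earliest hop.
origin_hop = " ".join(hops[-1].split())
auth = msg.get("Authentication-Results", "")

print(origin_hop)
print(auth_results(auth))
```

Only the hops stamped by infrastructure you control are trustworthy; anything below them was supplied by the sending side and can itself be forged.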

⚠ 4624 = any login

Windows Event 4624 is a successful logon and includes a logon type: 2 interactive (keyboard), 3 network (SMB, RPC), 10 remote interactive (RDP). Type matters for investigation — remote interactive at 03:00 on a server is suspicious in a way type-2 at a kiosk is not.

Why it is tempting: All 4624s look the same at a glance. The logon type tells the story.
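
One way to act on the logon type is a small triage filter. As before, this assumes events already exported as dicts with hypothetical field names, and the 00:00 to 05:00 quiet window is an arbitrary assumption that would need tuning to the organization's real working hours.

```python
from datetime import datetime

LOGON_TYPES = {2: "interactive", 3: "network", 10: "remote interactive"}

def is_suspicious(event, night_start=0, night_end=5):
    """Flag successful RDP (type 10) logons during an assumed quiet window."""
    return (
        event["event_id"] == 4624
        and event["logon_type"] == 10
        and night_start <= event["time"].hour < night_end
    )

# Hypothetical events: same event ID, very different stories.
events = [
    {"event_id": 4624, "logon_type": 10, "account": "admin2", "time": datetime(2024, 1, 10, 3, 0)},
    {"event_id": 4624, "logon_type": 2, "account": "kiosk", "time": datetime(2024, 1, 10, 3, 0)},
    {"event_id": 4624, "logon_type": 10, "account": "helpdesk", "time": datetime(2024, 1, 10, 14, 30)},
]

for ev in events:
    label = LOGON_TYPES.get(ev["logon_type"], "other")
    print(ev["account"], label, "REVIEW" if is_suspicious(ev) else "ok")
```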

⚠ EXIF or MAC times as gospel

File MAC times and EXIF metadata can be tampered with. Use them as corroborating evidence, not sole proof. Cross-check against server-side logs.

Why it is tempting: They feel definitive. Any user can run touch or a metadata editor.
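
How little effort the tampering takes can be shown with the standard library alone. This sketch creates a throwaway temporary file and backdates it with `os.utime`, which has roughly the same effect as `touch -d`; the file contents and chosen date are arbitrary.

```python
import os
import tempfile
from datetime import datetime, timezone

def mtime_utc(path):
    """Modification time of `path` as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"quarterly customer list")
    path = tmp.name

original = mtime_utc(path)

# Backdate access and modification times to 1 Jan 2020 UTC; no admin rights needed.
fake = datetime(2020, 1, 1, tzinfo=timezone.utc).timestamp()
os.utime(path, (fake, fake))
tampered = mtime_utc(path)

print("before:", original.isoformat())
print("after: ", tampered.isoformat())
os.remove(path)
```

The file owner changed the timestamp without elevated privileges, which is why server-side records of the same access carry more weight than the file's own metadata.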

Exam Signal

4.9 is question-to-source matching. Build a reflex map: “who authenticated” → OS security log / IdP; “which external IPs” → firewall + NetFlow + DNS; “what payload” → packet capture; “what ran on host” → endpoint / Sysmon / EDR / 4688; “where did email come from” → email headers. Centralize logs so admin-level compromise cannot erase them.
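
As a study aid, the reflex map can be written down as a lookup. The keyword lists here are an invented, deliberately crude illustration (plain substring matching), not an official mapping; the source names are the ones used in this lesson.

```python
# (keywords, source) pairs; first match wins. Keywords are illustrative assumptions.
REFLEX_MAP = [
    (("authenticated", "logon", "login"), "OS security log / IdP"),
    (("external ip", "contacted", "talked to"), "firewall + NetFlow + DNS"),
    (("payload", "exact content", "request body"), "packet capture"),
    (("process", "ran on", "command line"), "endpoint / Sysmon / EDR / 4688"),
    (("email", "sender"), "email headers"),
]

def best_source(question):
    """Return the first data source whose keywords appear in the question."""
    q = question.lower()
    for keywords, source in REFLEX_MAP:
        if any(k in q for k in keywords):
            return source
    return "no reflex match; reason it through"

for q in (
    "Who authenticated at 02:14?",
    "What payload was exfiltrated?",
    "Where did this email come from?",
):
    print(q, "->", best_source(q))
```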

Quick Check — 4.9 Q1

An investigator must identify which user account authenticated to a specific Windows server at 02:14 local time and determine the logon type. Which data source BEST answers this?

A. Firewall logs
B. Windows Event Log 4624 (successful logon) on the target server or the domain controller
C. EDR process-creation events
D. NetFlow records

Correct: B. 4624 is the canonical Windows “who authenticated” event and includes the logon type field. Firewall logs show traffic, EDR shows processes, NetFlow shows volumes.

Source: CompTIA SY0-701 Objectives v5.0 — 4.9

Quick Check — 4.9 Q2

An investigator needs to reconstruct the exact HTTP POST body of a suspected exfiltration. Which data source is REQUIRED?

A. NetFlow metadata
B. Full packet capture (or decrypted proxy logs) for the relevant link and time window
C. DHCP lease file
D. Windows 4625 events

Correct: B. Payload reconstruction requires full content — PCAP or a decrypted proxy log. NetFlow has no payload; DHCP and 4625 are unrelated to HTTP payload.

Source: CompTIA SY0-701 Objectives v5.0 — 4.9

Quick Check — 4.9 Q3

A host was compromised for 14 days and local Windows Event Logs were cleared by the attacker. Which design decision MOST effectively preserves the evidence?

A. Trust the local cleared logs
B. Centralized log forwarding (to a SIEM or write-once store) configured before the incident
C. Run chkdsk
D. Reimage and hope

Correct: B. Centralized forwarding means the off-host copy survives local tampering. The design decision must exist before the incident to have forwarded the data.

Source: CompTIA SY0-701 Objectives v5.0 — 4.9
