Security Monitoring and System Hardening involves protecting systems and networks through architectural design, reducing their attack surface, and continuously monitoring their behavior to identify security threats.
Let’s begin with the basic components of a network and system infrastructure:
| Component | Description |
|---|---|
| Servers | Centralized machines that host applications, databases, or services. |
| Endpoints | Devices like desktops, laptops, mobile phones, and IoT devices. |
| Firewalls | Security devices that filter incoming and outgoing traffic based on rules. |
| Routers | Devices that connect networks and direct traffic between them. |
| Switches | Devices that connect systems within the same network (Layer 2 communication). |
| Virtual Machines | Software-based systems emulating physical machines for resource efficiency. |
Imagine an office network where employees use computers (endpoints). All traffic from those endpoints is managed by switches (internal network), which send data to the router for external communication (e.g., accessing the internet). Firewalls sit at the boundary between the internal network and the internet, and often between internal segments, so that only authorized traffic passes.
Network segmentation is like dividing a large house into locked rooms, where only authorized people can access specific areas. It prevents lateral movement—where an attacker compromises one system and moves to others.
Security zones organize networks into different trust levels, each with specific rules. These zones reduce the risk of unauthorized access.
| Zone | Purpose | Example |
|---|---|---|
| DMZ | A Demilitarized Zone for public-facing services (e.g., web servers). | Hosting a company’s website. |
| Trusted Zone | Internal network with high security for trusted devices. | Office employee network. |
| Untrusted Zone | External networks, like the internet. | Internet traffic. |
| Secure Enclaves | Highly secure areas for sensitive systems (e.g., databases). | Financial records server. |
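The zone model above can be sketched as a default-deny allow-list. This is only an illustration: the zone names, the permitted flows, and the `is_allowed` helper are assumptions made for the example, not a recommended policy.

```python
# Hypothetical allow-list of (source zone, destination zone) pairs.
# Real policies are enforced by firewall rules, not application code.
ALLOWED_FLOWS = {
    ("untrusted", "dmz"),      # internet users reach public web servers
    ("trusted", "dmz"),        # employees reach the company's public services
    ("trusted", "untrusted"),  # employees browse the internet
}


def is_allowed(src_zone: str, dst_zone: str) -> bool:
    """Default-deny: a cross-zone flow is permitted only if explicitly listed."""
    if src_zone == dst_zone:
        # Assumption: traffic inside a single zone is not filtered here.
        return True
    return (src_zone, dst_zone) in ALLOWED_FLOWS
```

With this table, `is_allowed("untrusted", "enclave")` is `False`, which mirrors the idea that internet traffic should never reach a secure enclave directly.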
System and network hardening reduce vulnerabilities by limiting unnecessary features and applying security controls.
Access control ensures only authorized users can access systems or resources.
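Multi-factor authentication (MFA) requires two or more independent authentication factors (something you know, something you have, or something you are) to access systems:
Example: Logging into an email account requires a password (first factor) and a 6-digit one-time code from a mobile authenticator app (second factor).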
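Authenticator apps commonly derive such codes with the time-based one-time password (TOTP) algorithm from RFC 6238. The sketch below assumes HMAC-SHA1 and a 30-second time step, which are the RFC defaults; the function name and parameters are chosen for this example.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (RFC 6238, HMAC-SHA1) for the given Unix time."""
    counter = unix_time // step                      # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

The server and the app share `secret`, so both can compute the same code for the current time window without the code ever being transmitted in advance.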
System and Network Architecture involves:
Logs are records of activities that occur within a system, network, or application. Think of logs as diaries that systems maintain to track what happens—who accessed them, what actions were taken, and if anything suspicious occurred.
Logs are categorized based on the source of the information:
| Log Type | Description | Examples |
|---|---|---|
| System Logs | Records events related to system operations. | Boot events, crashes, errors. |
| Application Logs | Tracks activities within specific applications. | User logins, failed attempts. |
| Security Logs | Records events like logins, access attempts, and policy violations. | Authentication successes/failures. |
| Network Traffic Logs | Captures network activities like incoming/outgoing connections and traffic. | Firewall logs, packet captures. |
Log management involves collecting, storing, and analyzing logs to detect threats and troubleshoot issues. It consists of:
SIEM (Security Information and Event Management) tools automate log management and help detect anomalies. A SIEM collects logs from multiple sources, correlates events, and generates alerts for unusual activities.
Use regular expressions to filter relevant data from large log files.
Example:
Search for specific error codes or user accounts:
grep -E "Failed password|Invalid user" /var/log/auth.log
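The same filtering idea can be taken one step further by extracting fields from each match. The following sketch assumes OpenSSH-style `Failed password` entries in the auth log; the pattern and the `parse_failed_login` helper are illustrative and would need adjusting for other log formats.

```python
import re

# Matches lines such as:
#   "... sshd[1234]: Failed password for admin from 192.168.1.10 port 22 ssh2"
#   "... sshd[1234]: Failed password for invalid user bob from 10.0.0.7 port 4242 ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def parse_failed_login(line: str):
    """Return (user, source_ip) for a failed SSH login line, else None."""
    match = FAILED_LOGIN.search(line)
    if match is None:
        return None
    return match.group("user"), match.group("ip")
```

Named groups keep the extraction readable and make it easy to feed the results into later counting or correlation steps.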
When analyzing logs, focus on critical fields that provide insights:
| Field | Description | Example |
|---|---|---|
| Timestamp | Date and time of the event. | 2024-06-01T12:30:45 |
| Source IP | IP address from where the activity originated. | 192.168.1.100 |
| Destination IP | IP address of the targeted system. | 10.0.0.5 |
| User Account | User performing the action. | admin or guest |
| Event ID | Unique identifier for the type of event. | 4625 for failed login. |
| Action Taken | The activity recorded (e.g., login, file access). | File deleted, Login success |
Continuous Security Monitoring (CSM) is the ongoing process of monitoring systems, networks, and applications to detect and respond to security incidents quickly. Unlike periodic audits, CSM operates 24/7 to identify unusual activities and malicious behaviors in real time.
Different tools are used to monitor various parts of the IT infrastructure:
To effectively monitor systems, you need to track specific metrics that highlight abnormal behavior:
| Metric | Description | Example |
|---|---|---|
| CPU Utilization Spikes | High CPU usage over a short period could indicate malware activity. | CPU jumps to 100% unexpectedly. |
| Network Bandwidth Anomalies | Sudden surges or drops in network traffic could indicate a DoS attack or data theft. | Huge outbound traffic at 2 a.m. |
| Unauthorized Access Attempts | Repeated login failures, privilege escalations, or access outside working hours. | 50 failed logins from one IP. |
| New Processes or Files | Unexpected processes or files appearing on endpoints. | “cmd.exe” running unexpectedly. |
A cron job runs every 5 minutes to check system resource usage:
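#!/bin/bash
# Append (>>) a timestamped CPU summary so earlier samples are kept;
# a single > would overwrite the log on every run.
echo "$(date '+%Y-%m-%d %H:%M:%S') $(top -b -n1 | grep 'Cpu(s)')" >> /var/log/cpu_usage.log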
The security logs show failed login attempts:
Timestamp: 2024-06-12 01:00:30
User: admin
Source IP: 192.168.1.10
Action: Login Failed
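A single failed login is rarely meaningful; the count per source is what matters. As a rough sketch, assuming each log entry has already been parsed into a dictionary with `source_ip` and `action` keys (names chosen for this example), repeated failures can be flagged like this:

```python
from collections import Counter


def noisy_sources(events, threshold=5):
    """Return {source_ip: failure_count} for IPs with at least `threshold` failed logins."""
    failures = Counter(
        event["source_ip"] for event in events if event["action"] == "Login Failed"
    )
    return {ip: count for ip, count in failures.items() if count >= threshold}
```

The threshold of 5 is an arbitrary starting point; in practice it is tuned to the environment to balance missed attacks against false positives.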
Threat detection involves identifying malicious activities, while threat analysis determines the nature, scope, and impact of those threats. Together, they help security teams detect, understand, and respond to cyber threats effectively.
Indicators of Compromise (IoCs) are pieces of evidence that suggest a system has been compromised. Security analysts use IoCs to detect and respond to security incidents.
Detect changes to files or the presence of malicious files.
Common examples:
malware.exe.
Practical Example: If you download a suspicious file, the security team compares its SHA-256 hash to known malware hashes on VirusTotal. If it matches, the file is flagged as malicious.
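The hash comparison described in this example can be sketched locally with Python's standard library. The blocklist below is a hypothetical stand-in for a threat intelligence feed, and the single entry is simply the SHA-256 of empty input so the example can be checked without real malware.

```python
import hashlib

# Hypothetical blocklist; in practice this comes from a threat-intel source.
KNOWN_BAD_SHA256 = {
    # SHA-256 of the empty byte string, used only as a placeholder value.
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_bad(data: bytes) -> bool:
    """True if the content's hash appears in the blocklist."""
    return sha256_of(data) in KNOWN_BAD_SHA256
```

Hash matching only catches exact copies: changing a single byte of a file produces a completely different digest, which is why hashes are usually combined with behavioral indicators.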
Track suspicious activities across the network.
Examples:
Practical Example:
If network logs show outbound traffic to a known-malicious domain such as badwebsite.com, this may indicate command-and-control communication or data exfiltration.
Detect abnormal or unauthorized behavior within systems.
Examples:
cmd.exe.
Practical Example: If a normal application suddenly tries to modify system files or launch new processes, this could indicate malware.
Threat Intelligence provides contextual information about current and emerging cyber threats. It helps organizations identify attackers’ strategies, tools, and behaviors.
192.168.10.50).
When a threat is detected, analysts use various techniques to analyze it and determine its behavior.
Definition: Analyzing malware without executing it.
How It Works:
Tools:
strings: Extract readable text from binary files.
pefile: Inspect PE (Portable Executable) file headers.
Binwalk: Analyze firmware and binary images.
Practical Example:
Run strings malware.exe to identify hardcoded URLs or commands within a malware file.
Definition: Executing malware in a controlled environment (sandbox) to observe its behavior.
How It Works:
Tools:
Practical Example:
Run the malware.exe file in Cuckoo Sandbox and observe it creating outbound connections.
Definition: Observing malware behavior, such as:
Tools:
8080 for communication.
Threat Hunting is a proactive and iterative process of searching for threats or malicious activities within an organization’s systems, networks, or endpoints that may have bypassed existing security controls.
Proactive Threat Hunting involves actively searching for threats without waiting for automated alerts. It is driven by:
Definition: Start with a hypothesis based on known attacker behaviors, threat intelligence, or prior incidents.
Steps:
Example:
Definition: Start by analyzing logs, traffic flows, and endpoint data to identify anomalies without a specific hypothesis.
Steps:
Example:
Microsoft-Windows-PowerShell/Operational log).
Use a SIEM to filter for suspicious PowerShell commands:
EventID=4104 AND CommandLine=*Invoke-Command*
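Outside a SIEM, the same filter can be approximated over exported events. This sketch assumes each event is a dictionary with `event_id` and `script_block` keys (hypothetical field names), and the keyword list is illustrative rather than exhaustive.

```python
# Illustrative keywords often reviewed in PowerShell script block logs (Event ID 4104).
SUSPICIOUS_KEYWORDS = (
    "invoke-command",
    "invoke-expression",
    "downloadstring",
    "-encodedcommand",
)


def suspicious_script_blocks(events):
    """Return the 4104 events whose script text contains a suspicious keyword."""
    hits = []
    for event in events:
        if event.get("event_id") != 4104:
            continue
        text = event.get("script_block", "").lower()  # case-insensitive match
        if any(keyword in text for keyword in SUSPICIOUS_KEYWORDS):
            hits.append(event)
    return hits
```

Keyword matches are leads, not verdicts: administrators use `Invoke-Command` legitimately, so each hit still needs context such as the user, host, and time of day.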
Threat hunters use specific techniques to identify threats. Here are three common methods:
Definition: Identify unusual patterns or activities compared to baseline behaviors.
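One simple way to compare a value against a baseline is a z-score test. The sketch below is a minimal illustration, assuming the baseline is a list of earlier measurements (for example, CPU utilization percentages); the threshold of 3 standard deviations is a common convention, not a rule.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, z_threshold=3.0):
    """Flag `value` if it lies more than `z_threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)          # sample standard deviation; needs >= 2 points
    if sigma == 0:
        return value != mu           # a flat baseline makes any change notable
    return abs(value - mu) / sigma > z_threshold
```

For a baseline of `[20, 22, 19, 21, 23, 20, 22]`, a reading of 100 is flagged while a reading of 22 is not. Real baselines usually also account for time of day and day of week.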
Definition: Map threats to known attacker Tactics, Techniques, and Procedures (TTPs) using the MITRE ATT&CK Framework.
Definition: Use threat intelligence feeds to identify IoCs and hunt for threats.
192.0.2.10 is a known command-and-control server.
Objective: Identify unauthorized lateral movement across systems.
Steps:
Hypothesis: "An attacker may have compromised one system and moved to others using SMB (Server Message Block)."
Data Collection: Gather Windows logs and network traffic data.
Indicators:
PsExec or PowerShell for remote execution.
Use tools like Splunk to filter for:
EventID=4624 AND LogonType=3
Analyze SMB traffic in Wireshark for suspicious connections.
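The hunting steps above can be approximated in code by looking for a single source that logs on to many hosts over the network. This is a sketch under assumptions: the events are dictionaries with `event_id`, `logon_type`, `source_ip`, and `host` keys (names invented for this example), and the cut-off of three hosts is arbitrary.

```python
from collections import defaultdict


def fan_out_sources(logons, min_hosts=3):
    """Return {source_ip: [hosts]} for sources with network logons to many distinct hosts."""
    targets = defaultdict(set)
    for event in logons:
        # Event ID 4624 with logon type 3 is a successful network logon.
        if event["event_id"] == 4624 and event["logon_type"] == 3:
            targets[event["source_ip"]].add(event["host"])
    return {
        src: sorted(hosts) for src, hosts in targets.items() if len(hosts) >= min_hosts
    }
```

File servers and management tools also produce this pattern, so known administrative sources are normally excluded before the results are reviewed.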
Network-based tools focus on monitoring and securing network traffic. These tools help detect malicious activities like intrusions, data exfiltration, or denial-of-service attacks.
| Tool | Description | Use Case |
|---|---|---|
| Snort | Open-source IDS/IPS that analyzes packet traffic. | Detecting port scans, DDoS attacks. |
| Suricata | High-performance IDS/IPS with multi-threading. | Handling high-bandwidth networks. |
| Zeek (formerly Bro) | Network security monitor for traffic analysis. | Identifying unusual traffic flows. |
Packet analyzers capture and inspect network packets for suspicious activities.
| Tool | Description | Use Case |
|---|---|---|
| Wireshark | Open-source tool for analyzing captured network packets. | Identifying unauthorized traffic. |
| tcpdump | Command-line packet sniffer. | Capturing packets for analysis. |
NetFlow tools analyze network flow data to detect anomalies and suspicious patterns.
| Tool | Description | Use Case |
|---|---|---|
| SolarWinds NetFlow | Monitors and visualizes network flow data. | Identifying traffic anomalies. |
| Plixer Scrutinizer | Provides flow analysis and reporting. | Detecting large file transfers. |
Endpoint tools focus on securing devices like servers, desktops, laptops, and mobile devices. They detect and respond to threats targeting endpoints.
| Tool | Description | Use Case |
|---|---|---|
| CrowdStrike Falcon | Cloud-based EDR for real-time monitoring. | Detecting malware execution. |
| SentinelOne | AI-driven endpoint security and response platform. | Blocking ransomware attacks. |
| Microsoft Defender for Endpoint | Microsoft's EDR platform, integrated with Windows. | Monitoring Windows endpoints. |
| Tool | Description | Use Case |
|---|---|---|
| Malwarebytes | Anti-malware software for real-time protection. | Detecting and removing malware. |
| Kaspersky Endpoint | Antivirus tool with behavioral analysis. | Blocking viruses and spyware. |
Security Orchestration, Automation, and Response (SOAR) tools automate repetitive tasks, integrate security tools, and orchestrate incident response workflows.
| Tool | Description | Use Case |
|---|---|---|
| Palo Alto Cortex XSOAR | SOAR platform for automated incident response. | Automating phishing incident response. |
| Splunk SOAR (formerly Phantom) | Integrates SIEM and tools for security automation. | Automating alert triage. |
Security frameworks provide guidelines and best practices for implementing and managing cybersecurity processes. Organizations use these frameworks to ensure security controls are effective and aligned with industry standards.
Description: A knowledge base of adversary Tactics, Techniques, and Procedures (TTPs) used in cyberattacks.
Key Components:
Practical Use:
Description: A risk-based framework for managing and improving cybersecurity posture.
Core Components:
Practical Use:
Description: A prioritized set of 18 security controls to improve cybersecurity hygiene.
Examples of Controls:
Practical Use:
Advanced Persistent Threats (APTs) involve multi-stage, long-duration attacks often using stealthy techniques. Understanding how to trace an APT through logs is a core skill for security analysts.
Initial Access – Phishing, drive-by downloads, etc.
Execution – Running malicious code on victim systems.
Persistence – Registry modifications, scheduled tasks.
Privilege Escalation – Token manipulation, kernel exploits.
Lateral Movement – Pass-the-Hash, Remote Desktop Protocol (RDP) usage.
Data Collection and Exfiltration – ZIP files sent to external IPs.
Email logs show phishing delivery.
EDR logs capture suspicious PowerShell execution.
Windows Event Logs (4624, 4688) show abnormal logins and process creation.
Firewall logs reveal traffic to suspicious external IPs (data exfiltration).
Learning how to correlate events across these logs helps detect each step in the kill chain.
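A basic form of this correlation is merging events from several log sources into one time-ordered view for a single host. The sketch below assumes every event is a dictionary with an ISO 8601 `time` string and a `host` field; those field names and the `build_timeline` helper are hypothetical.

```python
from datetime import datetime


def build_timeline(host, *sources):
    """Merge events for `host` from several log sources and sort them by time."""
    merged = [event for source in sources for event in source if event["host"] == host]
    return sorted(merged, key=lambda event: datetime.fromisoformat(event["time"]))
```

Calling `build_timeline("ws-01", firewall_events, edr_events, email_events)` would place a phishing delivery, the PowerShell execution that followed, and the later outbound connection in sequence, which makes the progression through the kill chain easier to see.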
Modern ransomware campaigns often move laterally before detonating payloads.
SMB traffic from infected host to others (port 445).
Unauthorized use of credentials (Event ID 4624 with logon type 3).
Suspicious use of psexec or remote WMI commands.
Rapid file modifications or encryption patterns.
Practicing these case analyses using SIEM tools (e.g., Splunk, QRadar) builds detection and investigation capabilities.
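The "rapid file modifications" indicator can be expressed as a sliding-window count. This is a minimal sketch, assuming file modification times are available as Unix timestamps in seconds; the limits of 100 events per 10 seconds are placeholder values to be tuned per environment.

```python
from collections import deque


def has_write_burst(timestamps, max_events=100, window_seconds=10.0):
    """True if more than `max_events` modifications fall within any `window_seconds` span."""
    window = deque()
    for t in sorted(timestamps):
        window.append(t)
        # Drop events that are older than the window relative to the newest one.
        while t - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_events:
            return True
    return False
```

Backups and software updates also modify many files quickly, so a burst is usually combined with other signals, such as new file extensions or the SMB and credential indicators listed above.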
Security logs are critical for:
Incident investigations
Forensics
Compliance audits
If logs are purged too early, crucial evidence may be lost.
90 Days: Minimum for detection and initial investigation.
180 Days to 1 Year: Common for compliance and internal audits.
7 Years or more: Required in some regulated industries.
| Standard | Retention Requirement |
|---|---|
| PCI DSS | Retain audit logs for at least 1 year, with 3 months immediately available. |
| HIPAA | 6 years for policies and procedures (not logs specifically, but systems handling PHI must be auditable). |
| NIST 800-92 | Recommends logs be retained long enough for forensics, legal, and compliance purposes. |
Use centralized log management (e.g., SIEM).
Classify logs by criticality (firewall, auth logs, endpoint activity).
Use secure storage (WORM disks, encrypted archives).
Implement access control and tamper detection.
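One common approach to tamper detection is a hash chain, where each log entry's digest depends on the entry before it. The sketch below is an illustration only; the `GENESIS` starting value and function names are assumptions for this example, not a standard.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the first link in the chain


def chain_logs(lines):
    """Return [(line, digest)] where each digest covers the previous digest plus the line."""
    prev = GENESIS
    chained = []
    for line in lines:
        digest = hashlib.sha256((prev + line).encode("utf-8")).hexdigest()
        chained.append((line, digest))
        prev = digest
    return chained


def verify_chain(chained):
    """Recompute every digest; any edited, removed, or reordered line breaks the chain."""
    prev = GENESIS
    for line, digest in chained:
        expected = hashlib.sha256((prev + line).encode("utf-8")).hexdigest()
        if expected != digest:
            return False
        prev = digest
    return True
```

An attacker who can rewrite the whole file could recompute the chain, so the latest digest should be stored somewhere the log host cannot modify, such as write-once storage.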
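NIST SP 800-61 Rev. 2 defines a widely used four-phase incident response lifecycle (Preparation; Detection and Analysis; Containment, Eradication, and Recovery; Post-Incident Activity):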
Tools, policies, response teams, runbooks.
Training and simulation exercises.
Identifying indicators of compromise (IoCs).
Log review, correlation, threat intelligence usage.
Containment: Stop the spread (quarantine systems, block IPs).
Eradication: Remove malware, fix vulnerabilities.
Recovery: Restore operations, monitor for reinfection.
Review what worked and what didn’t.
Update IR plans and detection tools.
Document findings in final report.
Understanding this lifecycle is essential for anyone involved in SOC or blue team operations. Exams often test this framework directly.
Security Operations Centers (SOCs) are typically divided into tiers to handle alerts based on complexity.
| Tier | Role | Responsibilities |
|---|---|---|
| L1 (Tier 1) | Alert Analyst / Triage | Monitor dashboards, respond to alerts, escalate events. Basic log analysis. |
| L2 (Tier 2) | Incident Responder | Perform in-depth investigations, threat hunting, containment actions. Use threat intel. |
| L3 (Tier 3) | Threat Hunter / Forensics | Proactively hunt threats, perform root cause analysis, malware reverse engineering, develop SIEM rules. |
L1: Understanding log types, SIEM dashboards, false positive triage.
L2: Using EDR/SIEM correlation, writing detection queries, hands-on containment.
L3: Deep packet analysis, writing YARA/Sigma rules, building detection content.
L1: Splunk, ServiceNow, basic scripting.
L2: CrowdStrike, Sysinternals, VirusTotal, threat intel platforms.
L3: Zeek, Wireshark, Velociraptor, Volatility, custom scripts.
Understanding these role distinctions helps in both career development and exam readiness.
| Topic | Value |
|---|---|
| APT/Ransomware Case Analysis | Builds practical detection skills through real-world scenarios. |
| Log Retention & Compliance | Prepares for regulatory topics and audit readiness. |
| NIST Incident Lifecycle | Essential framework for structured response. |
| SOC Tiered Roles | Clarifies duties and expected skills for L1–L3 analysts. |
A security analyst receives multiple SIEM alerts indicating suspicious login attempts from different geographic regions. What is the most appropriate first step in the alert triage process?
Validate whether the alerts represent legitimate activity or false positives.
SOC triage begins with validation of alerts to determine if they represent real security events. Analysts typically review SIEM context such as user account activity, IP reputation, login history, and authentication logs. This prevents escalation of benign alerts such as legitimate VPN access or traveling users. Investigating severity or initiating containment before validation may waste resources and disrupt legitimate services. Alert validation ensures that only confirmed or highly suspicious events proceed to deeper investigation and response processes.
Demand Score: 91
Exam Relevance Score: 88
What security technology primarily aggregates logs from multiple systems and correlates events to identify potential security incidents?
Security Information and Event Management (SIEM).
SIEM platforms collect logs from endpoints, servers, network devices, and applications. They normalize the data and apply correlation rules to detect suspicious patterns such as repeated failed logins, privilege escalation attempts, or unusual network behavior. Unlike standalone security tools, SIEM systems centralize monitoring and provide analysts with dashboards, alerts, and investigative capabilities. In SOC environments, SIEM serves as the primary monitoring system that supports detection, investigation, and response activities.
Demand Score: 87
Exam Relevance Score: 92
During log analysis, a security analyst observes numerous outbound connections from a server to an unfamiliar external domain. Which step should the analyst perform first?
Verify the domain reputation and analyze associated network traffic.
Unexpected outbound communication often indicates potential command-and-control activity or data exfiltration. The analyst should first determine whether the domain is known malicious by checking threat intelligence feeds, DNS records, and reputation services. Reviewing packet logs, connection frequency, and associated processes helps determine whether the behavior is malicious or related to legitimate services. This verification step helps analysts determine whether escalation to incident response is required.
Demand Score: 84
Exam Relevance Score: 86
Why is log normalization important within a SIEM platform?
It standardizes logs from different sources so events can be correlated and analyzed consistently.
Security devices generate logs in different formats. Log normalization converts these varied formats into a consistent structure so correlation engines can compare fields such as timestamps, IP addresses, and event types. Without normalization, SIEM systems cannot reliably detect patterns across multiple devices. For example, failed authentication attempts across a firewall and an identity provider could not be correlated effectively without standardized fields.
Demand Score: 82
Exam Relevance Score: 89
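To make the normalization idea concrete, the sketch below maps two differently shaped records onto one schema and then correlates them. Both input formats are invented for illustration and do not correspond to any specific product.

```python
def normalize_firewall(rec: dict) -> dict:
    """Map a hypothetical firewall record onto the shared schema."""
    return {
        "timestamp": rec["ts"],
        "source_ip": rec["src"],
        "user": rec["usr"],
        "event_type": "auth_failure" if rec["msg"] == "AUTH_DENY" else "other",
    }


def normalize_idp(rec: dict) -> dict:
    """Map a hypothetical identity-provider record onto the shared schema."""
    return {
        "timestamp": rec["eventTime"],
        "source_ip": rec["client"]["ip"],
        "user": rec["actor"],
        "event_type": "auth_failure" if rec["outcome"] == "FAILURE" else "other",
    }


def failures_by_user(events):
    """Count authentication failures per user across all normalized sources."""
    counts = {}
    for event in events:
        if event["event_type"] == "auth_failure":
            counts[event["user"]] = counts.get(event["user"], 0) + 1
    return counts
```

Once both sources share the same field names, a single rule such as `failures_by_user` can count failures for one account across the firewall and the identity provider together.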
A SOC analyst identifies a repeated pattern of failed login attempts followed by a successful login. What type of attack does this likely indicate?
Brute-force authentication attack.
Brute-force attacks involve automated attempts to guess account passwords by repeatedly submitting login requests. In logs, this pattern appears as multiple failed authentication attempts from the same source followed by a successful login. Analysts must determine whether the account is compromised by checking login origin, session activity, and subsequent privilege changes. Immediate actions may include forcing password resets or temporarily disabling the account.
Demand Score: 80
Exam Relevance Score: 85