Filtering allows you to reduce the number of events being processed and focus only on the relevant data. Splunk provides multiple commands to help with this.
### search – Basic Filtering

The `search` command is a shorthand for applying `field=value` filters. It works well with indexed fields and simple conditions.
Example:
```spl
search status=200
```

This returns events where the field `status` equals `200`.
You can also write it without explicitly using `search`:

```spl
index=web status=200
```
This is identical in function, as Splunk treats bare field=value as an implicit search.
### where – Advanced Filtering

The `where` command supports expression-based conditions, including mathematical, logical, and string comparisons.
Syntax:
... | where <condition>
Example:
```spl
where duration > 5 AND uri="/home"
```
Here you can use operators such as:

- Comparison: `>`, `<`, `==`, `!=`
- Boolean: `AND`, `OR`, `NOT`
- Functions like `like()`, `match()`, `isnull()`
`where` is evaluated after field extraction and `eval` operations, so it supports more advanced logic than `search`.
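As a sketch (the field names here are illustrative, not from a specific dataset), several of these operators can be combined in a single clause:

```spl
... | where match(uri, "^/api/") AND NOT isnull(user) AND duration > 5
```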
### regex – Filtering with Regular Expressions

The `regex` command allows pattern-based filtering. It's useful when:
- Field values are not cleanly structured
- You need to match patterns, not exact values
Example:
```spl
regex uri="^/api/.*"
```

This filters events where the `uri` field starts with `/api/`.
Note: `regex` is not as efficient as an indexed search, so use it selectively, especially on large datasets.
After filtering, you may want to clean up, rename, or transform fields to make the data easier to work with or visualize.
### fields + and fields -

Use the `fields` command to control which fields are included in or excluded from the results.
Examples:

- `fields host, status, uri_path` ← include only these fields
- `fields - _raw, _time` ← exclude raw data and timestamps
This improves performance and simplifies the result table, especially in dashboards.
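For example, both forms can appear in the same pipeline (the index and field names here are assumed for illustration):

```spl
index=web status=200
| fields host, status, uri_path
| fields - _raw, _time
```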
### rename – Clarify Field Names

Use the `rename` command to make field names more user-friendly.
Example:
```spl
rename uri_path as URL
```
This is especially helpful when:

- Preparing data for dashboards
- Aligning with naming conventions
- Making raw fields more readable for non-technical users
### replace, eval – Modify Field Values

You can use the `eval` command with the `replace()` function to clean or transform data.
Example:
```spl
eval user=replace(user, "_", " ")
```
This replaces underscores in usernames with spaces.
Other transformations might include:

- Creating new fields
- Converting units
- Performing string formatting
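A few such transformations in one pipeline (the field names are illustrative, not from a specific dataset):

```spl
... | eval size_kb=round(bytes/1024, 1)
    | eval full_url=host.uri_path
    | eval status_text=if(status==200, "OK", "Other")
```

Here `.` concatenates strings, `round()` tidies the unit conversion, and `if()` formats a new label field.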
Let's combine the above techniques into a practical example. From a set of web logs, retrieve:

- Events with `status=200`
- Only the relevant fields
- Events filtered on `bytes > 1000`
- A conversion of `bytes` to megabytes (MB)
```spl
index=web status=200
| fields host, uri_path, bytes
| where bytes > 1000
| eval MB=round(bytes/1024/1024, 2)
```
- `index=web status=200`: retrieves successful web requests
- `fields host, uri_path, bytes`: limits output to three key fields
- `where bytes > 1000`: filters out small traffic
- `eval MB=...`: creates a new field showing size in megabytes, rounded to 2 decimal places
| Command | Purpose | Example |
|---|---|---|
| `search` | Basic field=value filtering | `search status=200` |
| `where` | Advanced logic-based filtering | `where duration > 5` |
| `regex` | Pattern-based filtering | `regex uri="^/api/"` |
| `fields +` | Include specific fields | `fields host, uri_path` |
| `fields -` | Exclude fields | `fields - _raw` |
| `rename` | Change field names | `rename uri_path as URL` |
| `eval` | Create or transform field values | `eval size_mb=bytes/1024/1024` |
| `replace()` | Modify string patterns within field values | `replace(user, "_", " ")` |
### regex vs Indexed Field Filtering: Performance Impact

Although `regex` is a flexible filtering tool, it's far less efficient than filtering on indexed fields. This distinction is critical in both optimization and exam scenarios.
`index=web status=500` ← indexed field, highly efficient

vs.

`index=web | regex status="5.."` ← non-indexed, slower

- The first search leverages tsidx metadata for fast filtering.
- The second evaluates the regex at the event level, causing a full scan of the data set.
Takeaway: Always prefer indexed field filtering when possible. `regex` is best reserved for unstructured fields or complex patterns that `field=value` filters cannot express.
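When a regex is genuinely needed, one common compromise is to narrow the event set with indexed filters first so the regex scans fewer events (the index and sourcetype names here are assumed):

```spl
index=web sourcetype=access_combined
| regex uri="^/api/v[0-9]+/"
```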
### eval Logic with Nested if and case Structures

While basic `eval` expressions are common, many exam-level questions test your ability to handle multi-branch logic using `case()` or nested `if()`.
Nested `if()`:

```spl
| eval size_type=if(bytes>1048576, "Large", "Small")
```

`case()` for multi-condition branching:

```spl
| eval traffic_class=case(
    bytes>10485760, "Very High",
    bytes>1048576, "High",
    bytes>102400, "Moderate",
    true(), "Low")
```
- `case()` allows multi-condition evaluation, similar to switch-case in programming.
- `true()` is used as a default (fallback) match.
This structure improves readability and is often preferred in dashboards or summary panels.
### lookup with Filters

When enriching data via lookups, it's common to follow with `where` for selective filtering.
```spl
index=network_traffic
| lookup threat_list ip_address as src_ip OUTPUT threat_type
| where isnotnull(threat_type)
```
Explanation:

- `lookup` adds a threat classification based on an IP match.
- `where` ensures only events with matched threats are retained.
This combination is typical in security use cases (e.g., blacklist/whitelist filtering) and often appears in practical Splunk interview or certification scenarios.
### where, like, and isnull

A well-constructed `where` clause can use string matching, null detection, and logical combinations.
```spl
| where like(user, "admin%") AND isnull(department)
```
This filters events where:

- `user` starts with "admin"
- the `department` field is missing or null
This is useful in scenarios such as:

- Identifying privileged users without a department assignment
- Detecting partial or broken onboarding data
- Filtering incomplete audit records
Pro Tip: `isnull()` only detects true null values. If a field is present but empty (`""`), use:

```spl
| where isnull(department) OR department=""
```
| Area | Example | Purpose |
|---|---|---|
| Performance-aware filtering | `status=500` vs `regex status="5.."` | Prioritize indexed fields for speed |
| Multi-branch logic | `eval traffic_class=case(...)` | Conditionally classify data |
| Lookup + filter | `lookup ... \| where isnotnull(...)` | Retain only enriched, matched events |
| Compound conditions | `where like(...) AND isnull(...)` | Complex business logic filtering |
**Why is `bin` often used before reporting commands?**
Because it groups continuous values like time or numbers into buckets that are easier to aggregate consistently.
Without binning, similar values may remain too granular for meaningful charting or summaries. This is especially important for time-based reporting, where the analysis should occur at defined intervals rather than on raw timestamps. On the exam, if the problem mentions grouping events into ranges or intervals before counts or charts, `bin` is a likely answer. A common mistake is using `stats` directly on highly granular values and getting fragmented results.
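A minimal sketch of this pattern (the index name is assumed): bucket events into one-hour intervals, then count per bucket:

```spl
index=web
| bin _time span=1h
| stats count by _time
```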
Demand Score: 61
Exam Relevance Score: 90
**What kind of result set is `xyseries` designed to create?**
It reshapes row-based data into a matrix-like structure with x-axis, series, and value fields.
That makes it useful when preparing data for charts or pivot-style visual output. The exam usually tests whether you recognize data-reshaping needs rather than memorizing every argument. If the scenario says "convert rows into a chart-friendly table with one column per series," `xyseries` is a strong fit. A common error is using `stats` alone when a matrix-oriented output is required.
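A hedged example (index and field names assumed): aggregate first, then pivot into one column per `status` value:

```spl
index=web
| stats count by host, status
| xyseries host status count
```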
Demand Score: 52
Exam Relevance Score: 86
**When is `untable` useful?**

`untable` is useful when you need to reverse a pivoted or wide format back into row-oriented records.
This often happens when chart-ready or report-style data must be normalized again for later filtering, aggregation, or export. The exam uses it to test understanding of two-way reshaping, not only one-way formatting. If the data is spread across many columns and you need it back in key-value rows, `untable` is conceptually appropriate.
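For instance (index and field names assumed), a chart-shaped result can be flattened back into key-value rows:

```spl
index=web
| chart count over host by status
| untable host status count
```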
Demand Score: 47
Exam Relevance Score: 83
**Why does `foreach` matter in practical SPL manipulation?**
It lets you apply repeated logic across multiple fields without writing the same expression over and over.
That makes searches easier to maintain when many similarly named fields need the same normalization, replacement, or calculation. The exam significance is the efficiency and maintainability of SPL, not search runtime performance. If the scenario mentions "apply the same operation to multiple fields," `foreach` is often the intended answer. A common mistake is writing many repetitive `eval` statements when the requirement clearly suggests iteration.
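A sketch of that iteration (the `bytes_*` field naming is an assumption for illustration):

```spl
... | foreach bytes_* [eval <<FIELD>>=round(<<FIELD>>/1024, 2)]
```

Here `<<FIELD>>` is the `foreach` template token that expands to each matching field name, so the same rounding logic is applied to every `bytes_*` field without repeated `eval` statements.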
Demand Score: 44
Exam Relevance Score: 85