This topic covers how to use transforming commands in Splunk to process raw event data, summarize it, and create structured outputs that can be visualized.
Transforming commands take raw event data and transform it into structured summary data. This structured data is essential for creating visualizations, reports, and dashboards in Splunk. Unlike filtering commands (e.g., search, where), transforming commands change the shape and structure of the data.
count, sum, avg, etc.).stats CommandThe stats command is one of the most versatile transforming commands in Splunk. It allows users to aggregate data using various statistical functions.
stats <function>(<field>) AS <new_field_name> BY <grouping_field>
count: Counts the number of events.sum: Calculates the total of a numeric field.avg: Computes the average value of a numeric field.min: Finds the smallest value in a field.max: Finds the largest value in a field.Count the total number of events in an index:
index=web_logs | stats count
Result: Displays the total number of events.
Calculate the total revenue by product:
index=sales | stats sum(price) AS TotalRevenue BY product
Result: A table showing the total revenue for each product.
Find the average page load time by URL:
index=web_logs | stats avg(load_time) AS AvgLoadTime BY url
Result: Displays the average load time for each URL.
chart CommandThe chart command is specifically designed for generating data that can be visualized in charts like bar charts and pie charts. It is similar to stats but has more structured outputs for visualization.
chart <function>(<field>) BY <field_1>, <field_2>
BY fields.Count events grouped by status code:
index=web_logs | chart count BY status_code
Result: A table showing the count of events for each status_code.
Sum of sales grouped by region and product:
index=sales | chart sum(price) BY region, product
Result: A table showing the total sales for each product in each region.
timechart CommandThe timechart command is tailored for time-series data. It allows you to aggregate data over time intervals, making it ideal for trend analysis.
timechart span=<interval> <function>(<field>) BY <field>
span argument defines the time interval (e.g., 1 minute, 1 hour, 1 day)._time as the x-axis in visualizations.Count events per hour:
index=web_logs | timechart span=1h count
Result: A time series showing the count of events per hour.
Calculate average sales per day grouped by product:
index=sales | timechart span=1d avg(price) BY product
Result: A time series showing the daily average price for each product.
Transforming commands prepare data for various types of visualizations. Each visualization type is suited for specific use cases:
chart count BY status_codechart sum(price) BY producttimechart span=1d counttimechart span=1d sum(price)Use Filters Before Transformations
Always filter out irrelevant data before applying transforming commands to reduce the processing load.
Example:
index=web_logs status_code=200 | stats count BY url
Define Time Intervals in timechart
Use the span argument to specify appropriate time intervals for large datasets.
Example:
timechart span=1h count
Optimize for Performance
stats for simple aggregations instead of transaction, which is resource-intensive.chart and stats.Leverage Visualizations Wisely
You can use multiple aggregation functions within a single stats, chart, or timechart command to generate comprehensive summaries.
index=sales | stats count AS TotalTransactions, avg(price) AS AvgPrice BY product
Result: A table showing the total number of transactions and the average price for each product.
index=web_logs | stats count BY status_code, http_method
Result: Counts grouped by both status_code and http_method.
Conditional aggregation lets you calculate metrics for specific subsets of data.
index=sales | stats count(eval(price > 100)) AS HighValueTransactions BY region
Result: Counts the number of transactions with a price greater than 100, grouped by region.
evalindex=web_logs | stats count AS Total, count(eval(status_code=404)) AS NotFound BY user
Result: Counts the total requests and the 404 errors for each user.
If a field contains multiple values (e.g., a field representing tags or categories), you can manipulate these using transforming commands.
index=products | stats count BY mvexpand(tags)
Result: Counts events for each unique value in the tags multivalue field.
Transforming commands in Splunk are commonly used to generate visualizations. However, each chart type comes with specific constraints that users must understand to properly use them in dashboards or reports.
timechart Command – Limitation on X-AxisThe timechart command is specifically designed to produce time-series visualizations.
X-axis is fixed: It always uses the _time field as the X-axis.
You cannot use other fields (e.g., user, host, url) as the X-axis in a timechart.
Example:
index=web_logs | timechart count BY status_code
_time is the X-axis and status_code values define the series.chart Command – Limitation on BY ClauseThe chart command is flexible but has a maximum of two BY fields:
One is used for X-axis, and
The second (if present) determines the series grouping (legend).
Example:
index=sales | chart sum(price) BY product, region
This generates a matrix of product (X-axis) and region (series).
If you try using more than two BY fields, Splunk will return an error.
stats Command – No Limit on Grouping FieldsThe stats command does not have a limit on the number of BY fields.
You can group your aggregated data across multiple dimensions.
Example:
index=orders | stats sum(quantity) BY category, brand, region
| Command | X-axis Flexibility | Grouping Limit | Visualization Type |
|---|---|---|---|
| timechart | Only _time |
1 BY field |
Line chart, column chart |
| chart | Any field | Up to 2 | Column, pie, bar, area |
| stats | Any field | Unlimited | Table or raw stats summary |
In real-world scenarios — and on the exam — you may be given raw log data without predefined fields. You'll need to extract fields manually and then perform aggregation using transforming commands like stats, chart, or timechart.
If fields are not automatically recognized (i.e., not present in the left Field Sidebar), you can extract them at search time using:
rex: Extract using regular expressions.
extract or kv: Automatically extract key-value pairs from the event data.
rex for Inline Field ExtractionThe rex command allows you to extract fields from unstructured text using regex patterns.
Example:
index=logs | rex field=_raw "user_id=(?<user_id>\d+)"
| stats count BY user_id
user_id field and then count the number of events per user.extract for Key-Value PairsThe extract command (or its alias kv) can parse key=value formatted data.
Example:
index=logs | extract pairdelim=" " kvdelim="="
| stats avg(duration) BY user
user=alice duration=30, the extract command will recognize and extract both fields.Prompt Example:
You are given this raw log line:
event="login" user_id=admin status=success duration=5
Expected SPL to extract and summarize:
index=security_logs
| rex field=_raw "user_id=(?<user_id>\w+)\sstatus=(?<status>\w+)\sduration=(?<duration>\d+)"
| stats avg(duration) BY user_id, status
user_id, status, and duration, then calculates the average duration per user and status.Understand the visualization limitations of timechart, chart, and stats.
Be proficient in using rex or extract when dealing with unstructured logs.
Practice translating raw log lines into fields, then aggregating and visualizing the results.
What advantage does timechart provide when generating time-based visualizations in Splunk dashboards?
It automatically bins events into consistent time intervals and aggregates them.
timechart simplifies time-series reporting by automatically dividing events into defined time buckets such as seconds, minutes, hours, or days. It then applies statistical functions such as count, sum, or avg to each bucket. This automatic binning ensures consistent spacing in visualizations and reduces manual query complexity. Without timechart, users must manually bin _time using the bin command and then apply a transforming command like chart or stats. The automatic behavior of timechart makes it the preferred command for dashboards displaying trends over time. Misusing stats or chart for this purpose often results in inconsistent intervals or overly granular results.
Demand Score: 65
Exam Relevance Score: 80
Why might a chart command fail to produce the expected results when visualizing trends over time in Splunk?
Because chart does not automatically group events by time intervals.
The chart command aggregates results by specified fields but does not treat _time specially. If a user attempts to visualize time trends using chart count by _time, the results often appear fragmented because each event timestamp is treated as a unique value. Time-based analysis requires events to be grouped into consistent buckets such as minutes or hours. Without binning, the visualization becomes noisy and unusable. Users typically resolve this by running bin _time span=... before the chart command or by using timechart, which automatically creates evenly spaced time buckets. The common mistake is assuming that _time is automatically bucketed when using chart.
Demand Score: 64
Exam Relevance Score: 82
When should a Splunk user choose the timechart command instead of chart when building a visualization?
Use timechart when the analysis is based on time-series data and requires automatic time-based aggregation.
timechart is specifically designed for time-series analysis. It automatically groups events into time buckets using the _time field and applies statistical functions such as count, avg, or sum. This removes the need for manually specifying time-based grouping. In contrast, chart performs aggregation across categorical fields but does not automatically manage time buckets. To analyze trends over minutes, hours, or days, timechart simplifies the process and ensures consistent time intervals. Using chart for time analysis typically requires manual binning with commands like bin _time. A common mistake is attempting to build time-series visualizations with chart, which often results in incorrect or uneven time grouping.
Demand Score: 68
Exam Relevance Score: 84