ITSI stands for IT Service Intelligence, a premium application built by Splunk. Its goal is to help organizations monitor and understand their IT systems in a smarter, more connected way.
In a modern company, many systems work together to provide services. For example:
A website might depend on a web server, a database, and a payment service.
If one part (like the database) slows down or crashes, the whole website might break.
Traditional monitoring tools often look at just one thing at a time:
CPU usage
Memory usage
Error logs
But they don’t connect the dots to show how everything affects the actual business service. That’s where ITSI is different.
Think of ITSI as a "central brain" that:
Collects data from many systems
Analyzes it in real time
Finds out if something is wrong
Shows how problems in technical parts affect entire business services like "Online Store" or "Email Platform"
ITSI doesn’t just show numbers or graphs. It shows how services are doing.
For example:
This helps teams:
Focus on what matters most
Fix problems faster
Avoid business downtime
ITSI doesn’t wait hours to analyze logs. It:
Collects data in real time (as it happens)
Updates dashboards and alerts immediately
Finds patterns or problems instantly
This helps your IT team react quickly, sometimes even before users notice a problem.
ITSI can use machine learning to:
Learn what “normal” performance looks like
Detect anomalies (sudden strange behavior)
Warn teams before a system fully fails
This makes ITSI not just a monitor but also a predictive tool.
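The idea of learning a "normal" baseline and flagging departures from it can be illustrated with a very small sketch. This is not ITSI's actual algorithm: it assumes a plain z-score rule, and the function name, the sample values, and the cutoff of 3 standard deviations are all hypothetical.

```python
from statistics import mean, stdev

def is_anomaly(history, new_value, z_threshold=3.0):
    """Return True if new_value is far from the baseline learned from history.

    Assumption: a simple z-score test stands in for whatever model a real
    product would use. history is a list of past KPI values.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is unusual.
        return new_value != baseline
    z = abs(new_value - baseline) / spread
    return z > z_threshold

# Hypothetical response times in seconds.
normal_history = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9]
print(is_anomaly(normal_history, 1.08))  # close to the usual range
print(is_anomaly(normal_history, 3.1))   # far outside the usual range
```

A rule like this only works once enough history exists to describe normal behavior, which is why baseline-driven detection usually needs a learning period.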
ITSI is made of several main parts that work together. These parts are:
Services
KPIs (Key Performance Indicators)
Glass Tables
Notable Events
Deep Dives
In everyday life, a service is something that provides a function or result.
For example:
A food delivery service delivers meals.
A banking service lets you manage your money.
In ITSI, a service is a collection of IT components (like servers, databases, apps) that work together to provide a business function. Examples include:
“Login System”
“Search Feature”
“Checkout Process”
“Email Notification Service”
These aren’t just servers or tools—they represent the whole function, from end to end.
Let’s say your website goes down. A simple monitoring tool might say:
But ITSI goes further. It tells you:
So, ITSI helps you:
Focus on services, not just raw numbers.
Understand why something is broken.
Prioritize what to fix first based on business impact.
Each service in ITSI has:
KPIs (Key Performance Indicators) – to measure how the service is doing.
Thresholds – to define what is "Normal", "Warning", or "Critical".
Dependencies – connections to other services it relies on.
Health Score – a number between 0 and 100 that shows how healthy the service is overall.
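How a problem in one dependency can surface in the service that relies on it can be sketched as a small graph walk. The service names, the statuses, and the "worst status wins" rule below are assumptions made for illustration, not a description of how ITSI computes health.

```python
# Severity order used for comparison (later in the list = worse).
SEVERITY_ORDER = ["Normal", "Warning", "Critical"]

# Hypothetical dependency map: service -> services it relies on.
# Assumed to contain no cycles.
DEPENDENCIES = {
    "Online Store": ["Web Server", "Database", "Payment Service"],
    "Web Server": [],
    "Database": [],
    "Payment Service": [],
}

def effective_status(service, own_status, dependencies=DEPENDENCIES):
    """Return the worst status among a service and everything it depends on."""
    worst = own_status.get(service, "Normal")
    for dep in dependencies.get(service, []):
        dep_status = effective_status(dep, own_status, dependencies)
        if SEVERITY_ORDER.index(dep_status) > SEVERITY_ORDER.index(worst):
            worst = dep_status
    return worst

# Only the database reports a problem, yet the store is affected.
statuses = {"Database": "Critical"}
print(effective_status("Online Store", statuses))  # Critical
```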
Service: “Order Checkout Service”
| Component | KPI Measured | Threshold |
|---|---|---|
| Payment API | Response time | Critical if > 2 sec |
| Database | Query success rate | Warning if < 98% |
| App Server | Error rate | Critical if > 5% |
If any of these KPIs cross their thresholds, the health score of the Checkout Service drops—and you get alerted!
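The threshold checks in the table above can be written out as a short sketch. The rule structure, the function name, and the sample readings are hypothetical; only the three limits come from the table.

```python
import operator

# Limits copied from the table: (KPI label, comparison, limit, resulting state).
CHECKOUT_RULES = {
    "Payment API": ("Response time (sec)", operator.gt, 2.0, "Critical"),
    "Database": ("Query success rate (%)", operator.lt, 98.0, "Warning"),
    "App Server": ("Error rate (%)", operator.gt, 5.0, "Critical"),
}

def find_breaches(measurements, rules=CHECKOUT_RULES):
    """Return {component: state} for every KPI that crossed its threshold."""
    breaches = {}
    for component, (_kpi, compare, limit, state) in rules.items():
        if compare(measurements[component], limit):
            breaches[component] = state
    return breaches

# Hypothetical current readings.
readings = {"Payment API": 2.4, "Database": 99.2, "App Server": 1.0}
print(find_breaches(readings))  # {'Payment API': 'Critical'}
```

Note that the comparison direction differs per KPI: response time and error rate are bad when high, while success rate is bad when low.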
A service is a group of KPIs that represent a real business or technical function.
ITSI uses services to monitor what matters, not just machines.
Each service gives you a real-time health score based on its KPIs.
This makes problem-solving faster and more focused.
KPI stands for Key Performance Indicator.
In everyday language, a KPI is a measurement that tells you how something is doing.
For example:
In school, your GPA is a KPI for your academic performance.
In business, “monthly sales” might be a KPI for success.
In ITSI, a KPI is a metric that tells you how healthy or unhealthy a part of a service is.
Remember the “Order Checkout Service” example? To know if that service is healthy, you need to measure things like:
How fast the payment API responds
Whether database queries are successful
If there are errors when people try to check out
Each of these measurements is a KPI.
KPIs help you:
Quantify performance
Detect problems as they happen
Trigger alerts when something goes wrong
Calculate service health scores
KPIs in ITSI are built using searches—specifically Splunk searches (SPL) that gather data from logs, metrics, or other sources.
Example: If you want to measure how fast your payment API responds, your KPI search might look like:
`index=payments sourcetype=api_logs | stats avg(response_time) as avg_response`
This search will return the average response time for your API.
You can then say:
If avg_response < 1.5 seconds → Normal
If 1.5–2 seconds → Warning
If > 2 seconds → Critical
This is called setting thresholds, and we’ll go deeper into that later.
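As a rough illustration of what the search and the three bands do together, the sketch below averages a few response times and maps the result to a state. The sample values and the function name are hypothetical, and treating exactly 1.5 and 2.0 seconds as Warning is an assumption about the boundaries.

```python
def classify_response_time(avg_response):
    """Map an average response time in seconds to a KPI state."""
    if avg_response < 1.5:
        return "Normal"
    if avg_response <= 2.0:
        return "Warning"
    return "Critical"

# Hypothetical response_time values (seconds) from a handful of events.
response_times = [1.2, 1.9, 2.3, 1.6]
avg_response = sum(response_times) / len(response_times)
print(round(avg_response, 2), classify_response_time(avg_response))
```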
Base Search – The main search that pulls in the data (e.g., error count, load time).
Thresholds – Rules that define KPI states like Normal, High, Critical.
Split-By Fields – Let you track the same KPI across multiple servers or applications. For example: one KPI for "CPU usage", but split by host.
Time Policies – Use different thresholds for different times (e.g., more relaxed at night).
Importance Weight – KPIs can be more or less important when calculating a service's health score.
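To make the role of importance concrete, here is a minimal sketch of a weighted health score. ITSI's real calculation is more involved; the per-state scores, the weights, and the plain weighted average below are all assumptions chosen to keep the arithmetic easy to follow.

```python
# Hypothetical score per KPI state (100 = fully healthy).
STATE_SCORE = {"Normal": 100, "Warning": 50, "Critical": 0}

def health_score(kpis):
    """Weighted average of KPI state scores.

    kpis is a list of (state, importance) pairs; importance is a
    non-negative weight. This simple average is an illustration only.
    """
    total_weight = sum(weight for _, weight in kpis)
    if total_weight == 0:
        raise ValueError("at least one KPI needs a non-zero importance")
    weighted = sum(STATE_SCORE[state] * weight for state, weight in kpis)
    return weighted / total_weight

checkout_kpis = [
    ("Critical", 8),  # Payment API response time, high importance
    ("Normal", 5),    # Database query success rate
    ("Normal", 3),    # App server error rate
]
print(health_score(checkout_kpis))  # 50.0
```

With equal weights the same three states would score about 66.7, so raising the importance of the failing KPI pulls the service score down further.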
| KPI Field | Example Setting |
|---|---|
| KPI Name | "Login Error Rate" |
| Base Search | `index=auth sourcetype=login_logs \| …` |
| Thresholds | Normal: 0–50, Warning: 51–100, Critical: >100 |
| Importance | High (because login problems affect all users) |
| Schedule | Every 5 minutes |
A KPI is a measurement that monitors part of a service.
KPIs use scheduled searches to gather data in near real time.
They are the building blocks for alerts, health scores, and visual dashboards.
Every service in ITSI is made up of multiple KPIs.
A Glass Table in ITSI is a custom-built dashboard. It shows live data using shapes, colors, icons, and animations—not just charts or graphs.
Think of it like a control center screen in a movie:
A map that shows system health
Icons that light up if something goes wrong
Color-coded indicators for performance
Real-time movement and changes
In short: It helps you visualize your services and their health in a way that's easy to understand at a glance.
Imagine you have a glass tabletop, and under that glass is a picture of your IT infrastructure—servers, applications, data flows.
Then, on the glass, you place live performance data, indicators, and alerts right on top of those systems.
That’s the idea: a transparent, real-time, layered view of your environment.
Shapes and Images – You can draw boxes, circles, arrows, or upload your own background (like a network diagram or data center layout).
Icons and Text – Add symbols (like server icons) and text labels for easy identification.
KPI Widgets – Link any shape or icon to a KPI. The shape will change color based on the KPI's status: green = Normal, yellow = Warning, red = Critical.
Animations – Arrows or lines can blink or move to show active traffic or issues.
Drilldowns – Click on an element (like a red server icon), and it can open a Deep Dive, show a dashboard, or run a search.
Imagine your Glass Table shows the following:
Background Image: A map of your network
Web Server Icon: Linked to KPI "Web Server CPU"
Database Icon: Linked to KPI "DB Query Success Rate"
API Gateway Icon: Linked to KPI "Error Rate"
When something breaks:
The affected icon turns red
You can click it to investigate the problem
The whole dashboard updates live, with no need to refresh
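The link between icons, KPIs, and colors in this example can be sketched as two lookups. This is only a mental model of the behavior described above, not how Glass Tables are implemented; the dictionary and function names are hypothetical.

```python
STATUS_COLOR = {"Normal": "green", "Warning": "yellow", "Critical": "red"}

# Icon -> linked KPI, as in the example above.
ICON_TO_KPI = {
    "Web Server Icon": "Web Server CPU",
    "Database Icon": "DB Query Success Rate",
    "API Gateway Icon": "Error Rate",
}

def icon_colors(kpi_status):
    """Return the color each icon should show for the given KPI statuses.

    KPIs missing from kpi_status are assumed to be Normal.
    """
    return {
        icon: STATUS_COLOR[kpi_status.get(kpi, "Normal")]
        for icon, kpi in ICON_TO_KPI.items()
    }

# Hypothetical statuses: only the database KPI is in trouble.
print(icon_colors({"DB Query Success Rate": "Critical"}))
```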
IT Operations Teams: To monitor infrastructure in real time
NOC (Network Operations Center): For big-screen displays
Executives: To get a high-level view of business service status
You can create different Glass Tables for different audiences. For example:
Technical version: more details, logs, and metrics
Executive version: only high-level health indicators and impact
Start simple – Don’t try to draw your entire system at once.
Use colors carefully – Green/yellow/red works best for quick understanding.
Test your links – Make sure each icon actually connects to a working KPI.
Keep it clean – Too many elements can be confusing.
Glass Tables let you build custom dashboards using real-time data.
You can display data in interactive, visual ways—beyond normal charts.
They help teams spot problems quickly and understand how they affect services.
You can click into problems to investigate right away.
A Notable Event is an alert created by ITSI when something significant happens—usually when a KPI crosses a threshold like “Critical” or “High”.
But it’s more than just a normal alert.
A Notable Event in ITSI is:
Smart: It includes context, such as which service or server was affected.
Actionable: You can assign it, acknowledge it, or even use it to trigger a script or create a ticket.
Grouped and Prioritized: ITSI can group similar events and assign them a severity level.
Think of it as your to-do list for incidents.
There are two main ways:
From a KPI Threshold Breach
Example: If a KPI like “API Error Rate” goes above its critical threshold, a Notable Event is generated.
From a Correlation Search
You can create special searches that detect patterns (like “high CPU + high memory = possible server overload”) and generate an event when that pattern appears.
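The "high CPU + high memory" pattern can be expressed as a small check over per-host readings. In ITSI this logic would live in a search; the Python below is only a sketch, and the host names, the 90% limits, and the function name are hypothetical.

```python
def detect_overload(samples, cpu_limit=90.0, mem_limit=90.0):
    """Return hosts where CPU and memory are both above their limits.

    samples maps host -> {"cpu": percent, "mem": percent}. The 90% limits
    are hypothetical defaults.
    """
    return sorted(
        host
        for host, usage in samples.items()
        if usage["cpu"] > cpu_limit and usage["mem"] > mem_limit
    )

samples = {
    "web-01": {"cpu": 95.0, "mem": 93.5},  # both high: matches the pattern
    "web-02": {"cpu": 97.0, "mem": 40.0},  # only CPU high: no match
    "db-01": {"cpu": 35.0, "mem": 60.0},
}
for host in detect_overload(samples):
    print(f"Notable event: possible server overload on {host}")
```

Requiring both conditions at once is what separates a correlation from two independent threshold alerts.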
Each Notable Event includes important information such as:
Time of the event
Service or KPI that triggered it
Severity level (Info, Normal, Low, Medium, High, Critical)
Affected entities (e.g., hostnames, applications)
Custom fields (like region, environment, team)
Event status: e.g., new, acknowledged, resolved
Actions taken or assigned users
This makes it much easier for teams to understand what happened, where, and what to do next.
You can manage events in the Episode Review Dashboard in ITSI. This is like a control panel where you can:
Group related events together
Suppress events during known downtime or maintenance
Assign ownership to team members
Tag and filter events for faster searching
Run automated actions, like sending a Slack message or opening a ServiceNow ticket
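Grouping related events can be pictured with a short sketch. In ITSI the grouping rules are configured rather than coded; the rule used here (same service, no more than ten minutes between consecutive events) and all field names are assumptions for illustration.

```python
from collections import defaultdict

def group_into_episodes(events, max_gap=600):
    """Group events by service; a gap longer than max_gap seconds starts a new group.

    events is a list of dicts with "time" (epoch seconds), "service", "title".
    """
    by_service = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        episodes = by_service[event["service"]]
        if episodes and event["time"] - episodes[-1][-1]["time"] <= max_gap:
            episodes[-1].append(event)   # close enough: same group
        else:
            episodes.append([event])     # start a new group
    return dict(by_service)

events = [
    {"time": 0, "service": "Checkout", "title": "API response time critical"},
    {"time": 120, "service": "Checkout", "title": "High error rate"},
    {"time": 5000, "service": "Checkout", "title": "API response time critical"},
    {"time": 60, "service": "Login", "title": "Login error rate high"},
]
episodes = group_into_episodes(events)
print({service: [len(ep) for ep in eps] for service, eps in episodes.items()})
```

The first two Checkout events land in one group, so a responder sees one incident to work on instead of two separate alerts.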
Let’s say your “Checkout Service” has a KPI called “Payment API Response Time”.
Its thresholds are:
Normal: < 1.5 sec
Warning: 1.5–2.5 sec
Critical: > 2.5 sec
Suddenly, the KPI goes to 3.1 seconds. ITSI will:
Mark the KPI as Critical
Trigger a Notable Event
Display it in the dashboard
Possibly group it with similar events (like “high error rate”)
Let your team act on it right away
Let’s say you have scheduled maintenance every Sunday.
You don’t want alerts firing during this time.
You can set up a suppression rule (in ITSI this is typically done with a maintenance window) to tell ITSI:
“If this KPI is critical during Sunday from 2 AM to 4 AM, ignore it.”
This helps reduce alert noise and false positives.
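The Sunday 2 AM to 4 AM rule amounts to a simple time check. The sketch below uses hypothetical function names and treats the window as including 2:00 and excluding 4:00, which is an assumption.

```python
from datetime import datetime

def in_maintenance_window(when, weekday=6, start_hour=2, end_hour=4):
    """True if `when` falls on the given weekday between start and end hour.

    weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    return when.weekday() == weekday and start_hour <= when.hour < end_hour

def should_alert(kpi_state, when):
    """Alert on Critical KPIs unless they occur inside the maintenance window."""
    return kpi_state == "Critical" and not in_maintenance_window(when)

print(should_alert("Critical", datetime(2024, 1, 7, 3, 0)))  # Sunday 3 AM: suppressed
print(should_alert("Critical", datetime(2024, 1, 8, 3, 0)))  # Monday 3 AM: alerts
```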
Notable Events are intelligent alerts created when a KPI crosses a threshold or a correlation search matches a pattern.
They are detailed, grouped, and actionable.
You can manage them from a central dashboard and assign them to team members.
They help IT teams respond faster and reduce downtime.
A Deep Dive in ITSI is an interactive, time-based dashboard that lets you:
Visually explore how KPIs behaved over time
Correlate multiple KPI trends together
Investigate incidents
Pinpoint root causes
Think of it as your IT microscope—a focused space to explore what went wrong, when it started, and why.
Let’s say you received a Notable Event that says:
“Checkout Service is in a Critical State – API Error Rate is high.”
You’re not sure if the API is the problem, or if something else is causing the issue. So you:
Open a Deep Dive for the Checkout Service.
Add relevant KPIs like:
API Error Rate
Database Query Time
Web Server Response Time
Look at how each one behaved over the past 60 minutes.
Suddenly you see:
All KPIs looked normal until 2:15 PM
Then the Database Query Time spiked
A minute later, API errors began
That tells you the real problem started in the database, and it caused the API to fail. Problem diagnosed!
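The reasoning in this walkthrough (find which KPI went wrong first) can be sketched as a search for each KPI's first threshold breach. The sample values and thresholds below are invented to match the 2:15 PM story, and the function name is hypothetical.

```python
def first_breach_times(series, thresholds):
    """Return [(kpi, time)] sorted by when each KPI first exceeded its threshold.

    series maps KPI name -> list of (time_label, value) in chronological
    order. Time labels are zero-padded "HH:MM" strings so they sort correctly.
    """
    breaches = []
    for kpi, points in series.items():
        for time_label, value in points:
            if value > thresholds[kpi]:
                breaches.append((kpi, time_label))
                break  # only the first breach matters
    return sorted(breaches, key=lambda item: item[1])

# Hypothetical values around the 2:15 PM incident.
series = {
    "API Error Rate": [("14:14", 1.0), ("14:15", 1.2), ("14:16", 9.5)],
    "Database Query Time": [("14:14", 0.2), ("14:15", 3.8), ("14:16", 4.1)],
    "Web Server Response Time": [("14:14", 0.4), ("14:15", 0.5), ("14:16", 0.6)],
}
thresholds = {
    "API Error Rate": 5.0,
    "Database Query Time": 1.0,
    "Web Server Response Time": 2.0,
}
print(first_breach_times(series, thresholds))
```

The KPI at the head of the list is the earliest to misbehave, which makes it the first candidate for the root cause, though ordering alone does not prove causation.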
Time-Series Graphs – Each KPI appears as a chart over time. You can see when the value started to change, how severe the change was, and whether other KPIs changed at the same time.
Event Timeline Panel – This section shows Notable Events and their timestamps, so you can match them to KPI spikes.
Zoom & Pan Tools – You can zoom in on a 5-minute window or scroll across hours to look at the full context.
Layout Editor – You can drag, resize, or arrange KPIs in a way that makes sense for your investigation.
Custom Time Ranges – Deep Dives are not limited to "right now". You can look back one hour, to yesterday during an outage, or to last week during a known spike.
NOC (Network Operations Center) teams use them to investigate real-time alerts.
SREs (Site Reliability Engineers) use them for post-incident analysis.
App owners use them to monitor patterns or test new deployments.
Imagine this scenario:
A mobile app team deploys a new version of the login module.
30 minutes later, users start reporting login failures.
You open a Deep Dive and add KPIs:
Login Success Rate
App Server CPU
Authentication API Errors
From the Deep Dive, you notice:
CPU load spiked during deployment
Error rates increased right after
Success rate dropped sharply
You now have a clear timeline of what went wrong, which is perfect for:
Fixing the problem fast
Documenting the incident
Preventing it in the future
Deep Dives help you investigate problems over time using KPI data.
They show how, when, and where an issue started.
You can compare multiple KPIs, view events, and find the root cause.
They are essential tools for real-time troubleshooting and post-incident reviews.
In the SPLK-3002 exam, questions commonly test whether you know that ITSI is a separate offering rather than part of a default Splunk installation. Clarifying this helps prevent confusion.
Splunk IT Service Intelligence (ITSI) is a premium app, not included by default with a standard Splunk installation.
It requires separate licensing and installation, typically sourced from Splunkbase or through enterprise agreements.
This distinction is critical in enterprise environments and may appear in exam questions that differentiate between core Splunk features and premium modules.
KPI sources are evolving. Modern observability strategies go beyond logs and metrics to include traces, especially with growing adoption of OpenTelemetry.
KPIs in ITSI can be calculated from various data sources, including:
Logs: Traditional Splunk log data (e.g., from application logs, system logs)
Metrics: Time-series numerical data (e.g., CPU usage, memory, request counts)
Traces: Distributed tracing data, especially from modern microservice environments
You can ingest OpenTelemetry-formatted data into Splunk Observability Cloud, or send it to the Splunk platform through an OpenTelemetry Collector, and then build trace-based KPIs on that data in ITSI. This integration supports full-stack monitoring.
Understanding the variety of KPI input types prepares you to work in hybrid or cloud-native environments.
Glass Tables are often associated with real-time dashboards, but they are capable of much more. Many users don’t realize they can also visualize historical trends for deep-dive analysis.
Glass Tables in ITSI support both real-time and historical data binding, offering flexible visualization for service health, KPIs, and entity states.
You can:
Bind panels to real-time search results for live monitoring
Use historical KPI data for retrospective analysis
Switch between time windows (e.g., last 5 minutes, past 24 hours)
This dual-mode visualization makes Glass Tables suitable for both active troubleshooting and long-term trend analysis.
What is the primary purpose of Splunk IT Service Intelligence?
To monitor and analyze the health and performance of IT services.
Splunk IT Service Intelligence extends the Splunk platform by focusing on service-level monitoring rather than individual infrastructure metrics. It aggregates data from multiple sources and evaluates key performance indicators associated with services. By analyzing these KPIs, ITSI calculates service health scores that help organizations understand how system performance impacts business operations. This service-centric monitoring approach provides better visibility into complex IT environments than traditional infrastructure monitoring.
Demand Score: 72
Exam Relevance Score: 86
Which core ITSI feature provides a high-level overview of service health across multiple services?
Service Analyzer.
Service Analyzer is a dashboard that provides an aggregated view of service health across the entire IT environment. It displays services along with their health scores, allowing administrators to quickly identify services experiencing performance degradation or failures. From this overview, operators can drill down into detailed dashboards such as Deep Dive views to investigate underlying issues. Service Analyzer therefore acts as the primary entry point for monitoring overall service health in ITSI.
Demand Score: 68
Exam Relevance Score: 90
How does ITSI differ from traditional infrastructure monitoring?
It focuses on service-level monitoring instead of individual system metrics.
Traditional monitoring tools typically track metrics such as CPU utilization, disk usage, or network performance for individual devices. ITSI aggregates these metrics into service-level KPIs that represent the overall health of business services. By correlating multiple metrics and analyzing them together, ITSI provides insight into how infrastructure performance affects application availability and business operations. This service-centric approach enables organizations to prioritize issues based on business impact rather than isolated technical metrics.
Demand Score: 66
Exam Relevance Score: 85