SPLK-3002 Troubleshooting ITSI

Troubleshooting ITSI Detailed Explanation

1. Purpose of Troubleshooting ITSI

The goal of troubleshooting in ITSI is to identify and fix issues that impact the platform’s performance or accuracy, such as:

  • Misconfigured thresholds

  • Slow or failing searches

  • Missing or incomplete data

  • Errors in dashboards or visualizations

By regularly checking for these issues, you ensure your service health scores and alerts are accurate and trustworthy.

2. Common Troubleshooting Areas

a. Search Performance

ITSI runs many scheduled searches to calculate KPIs. If searches are slow or failing, KPIs won’t update correctly.

Tips:
  • Use Search Inspector to analyze individual searches for:

    • Execution time

    • Search phase delays (e.g., dispatch, reduce)

  • Check scheduler logs for:

    • Skipped searches

    • Concurrency issues (too many searches running at once)
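
For example, a search like the following (using the standard scheduler log fields status and reason) surfaces skipped searches and why they were skipped:

index=_internal sourcetype=scheduler status=skipped | stats count by savedsearch_name, reason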

b. Missing Data

If a KPI shows “No Result” or isn’t updating, the underlying data might be missing or misrouted.

Tips:
  • Check index settings to make sure data is going to the correct location.

  • Review your base searches for typos or incorrect filters.

  • Use ITSI Data Audit dashboards to:

    • Spot stale data

    • Identify gaps in search results

    • Monitor data latency or inconsistencies
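
To check latency directly, compare index time against event time (replace your_index with the index your KPI searches read from; _indextime and _time are standard Splunk fields):

index=your_index | eval latency=_indextime-_time | stats avg(latency) max(latency) by sourcetype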

c. Threshold Misconfigurations

Improper thresholds can result in false alerts or missed incidents.

Tips:
  • Review KPI thresholds to ensure they are:

    • Not too sensitive (causing too many alerts)

    • Not too loose (missing real issues)

  • Verify that Time Policies are applied correctly (e.g., different rules during business hours vs. off-hours)

d. Notable Event Issues

Sometimes, Notable Events are not being created, grouped, or escalated as expected.

Tips:
  • Review Aggregation Policy logs for errors or misapplied conditions.

  • Make sure Correlation Searches are:

    • Enabled

    • Returning the expected results

    • Properly tagged and categorized
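
A quick way to confirm a correlation search is enabled and scheduled is the saved-searches REST endpoint (substitute your correlation search's actual title):

| rest /servicesNS/-/-/saved/searches | search title="Your Correlation Search" | table title, disabled, cron_schedule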

e. Glass Table or Deep Dive Errors

If visualizations aren’t working:

  • KPIs may not be correctly linked to their data source

  • There could be permission issues or invalid tokens used in the Glass Table

Tips:
  • Check if each KPI used in a visualization has:

    • Valid search results

    • Correct entity bindings

  • Review access controls for users who are unable to view tables or interact with Deep Dives

3. Tools for Troubleshooting

ITSI provides several tools and utilities to help you diagnose and resolve issues:

a. itsi_troubleshooting_toolkit (Add-On)

  • Optional app that provides:

    • Troubleshooting dashboards

    • Health check reports

    • Environment diagnostics

Great for larger deployments or complex issues.

b. _internal Logs

  • Use Splunk’s internal logs to:

    • Track error messages

    • Find search timeouts or skipped searches

    • Identify system-level warnings

Example search:

index=_internal sourcetype=scheduler OR sourcetype=itsi*

c. KPI Search Logs

  • Each KPI search can log:

    • Errors in SPL

    • Long runtimes

    • No-result conditions

Check these logs when a KPI is not displaying any data.
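
Filtering the internal logs on log_level narrows the output to errors only (assuming your ITSI sourcetypes match the itsi* pattern, as in the earlier example search):

index=_internal sourcetype=itsi* log_level=ERROR | stats count by component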

4. Best Practices for Troubleshooting ITSI

Monitor Search Concurrency and Load

  • Avoid scheduling too many KPI searches at the same time.

  • Spread out search schedules using cron expressions to balance system load.
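
For instance, two KPI searches with the same five-minute frequency can be offset so they never dispatch at the same moment (illustrative cron values):

*/5 * * * *      runs at :00, :05, :10, …
2-59/5 * * * *   same frequency, offset by two minutes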

Document and Version Control Services

  • Keep track of changes to KPIs, thresholds, and services.

  • Use version control tools or export/import methods to back up configurations.

Work with Splunk Support for Complex Issues

  • For problems that persist or are hard to diagnose, engage Splunk Support.

  • Provide them with:

    • Logs

    • Configuration snapshots

    • Environment details

This speeds up resolution and avoids guesswork.

Summary: What to Remember About Troubleshooting ITSI

  • Troubleshooting ITSI is about ensuring data accuracy, search performance, and alert reliability.

  • Focus on search health, data flow, thresholds, event policies, and dashboards.

  • Use built-in tools like Search Inspector, Audit Dashboards, and internal logs.

  • Follow best practices to keep your ITSI environment stable, efficient, and trustworthy.

Troubleshooting ITSI (Additional Content)

1. Identifying KPI Anomalies: Stale, Invalid, or No-Result States

In ITSI, a KPI may appear abnormal even if no visible error occurs. Three common problematic states are:

  • Stale: The KPI is not receiving updated data within its expected schedule.

    • Common causes: Skipped searches, index delays, search concurrency overload.
  • Invalid: The base search returns non-numeric or incorrectly formatted data.

    • Common causes: SPL syntax issues, wrong eval logic, or missing fields.
  • No Result: The search runs but returns zero matching events.

    • Common causes: Field typos (e.g., hostnme instead of hostname), time range misalignment, or insufficient permissions.

Tip: Use Search Inspector, and temporarily modify the base SPL with | head 10 or | stats count to validate live data retrieval.

Being able to recognize and differentiate these conditions is essential to determine whether the issue lies in the search, the data, or ITSI configuration.
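
As the tip above suggests, appending a lightweight command to the KPI's base search is the fastest sanity check (placeholder index and sourcetype shown):

index=your_index sourcetype=your_sourcetype | stats count

A zero count points to a data or filter problem; a nonzero count suggests the issue lies further along in the KPI configuration.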

2. Debugging Complex Correlation Searches

When a correlation search fails to trigger expected Notable Events even though known conditions are met, the following steps are recommended:

a. Narrow the Query Scope

Use lightweight SPL for testing, such as:

| tstats count where index=itsi_summary by host, _time

Or leverage:

| datamodel itsi_summary.kpi search 

This helps confirm whether the expected data is available before running the full logic.

b. Enable Debug Logging

Increase log verbosity for itsi or the correlation search scheduler. This can reveal:

  • Syntax parsing issues

  • Skipped searches due to role-based permissions

  • Aggregation timeouts or misaligned tokens

Always test correlation searches in isolation with known conditions and minimal filters before scaling to full production logic.

3. Using itsi_summary and itsi_notable_archive for Troubleshooting

These two indexes are vital for verifying ITSI output flows:

  • itsi_summary:

    • Stores raw KPI results (aggregated or per-entity)

    • Useful for checking if KPI values exist, are timely, and match thresholds

    • Example SPL:

      index=itsi_summary kpi="CPU Usage" | timechart avg(alert_value) by entity_title
      
  • itsi_notable_archive:

    • Stores archived Notable Events for review, audit, or forensic analysis

    • Helps determine if a correlation search ever fired an event

    • Example SPL:

      index=itsi_notable_archive rule_title="High CPU and Memory" | stats count by service_name, severity
      

Together, these indexes offer a full trace of detection logic → alert generation → event archival, making them essential for root cause analysis in ITSI environments.
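
For example, each side of that trace can be checked with one search per index (alert_severity and rule_title are assumed field names here; verify them against your environment):

index=itsi_summary alert_severity=critical | stats count by kpi

index=itsi_notable_archive | stats count by rule_title

If the first search shows critical KPI values but the second returns no matching events, the gap lies in the correlation search or the aggregation policy.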

Summary

  • Understand special KPI states: Learn how to identify and resolve stale, invalid, and no-result conditions.

  • Use smart debugging techniques for correlation searches: narrow queries, validate datasets, and increase logging.

  • Leverage internal indexes: Query itsi_summary and itsi_notable_archive to cross-check KPI values and Notable Event history.

  • Apply validation best practices: Use lightweight SPL commands like | stats count, and always run base searches interactively before embedding them in KPIs or alerts.

Frequently Asked Questions

What is the purpose of maintenance mode in ITSI?

Answer:

To temporarily suppress alerts and service health changes during planned maintenance.

Explanation:

Maintenance mode allows administrators to prevent monitoring alerts from triggering during scheduled system maintenance or upgrades. When maintenance mode is enabled for a service, KPI evaluations and alert generation are temporarily suppressed. This prevents false incidents from being created while systems are intentionally offline or undergoing changes. Maintenance mode therefore helps maintain accurate incident records and reduces unnecessary alert noise during operational maintenance activities.

Demand Score: 84

Exam Relevance Score: 90

What is a common reason a KPI search returns no results?

Answer:

The search query does not match any indexed data.

Explanation:

KPI searches rely on Splunk Search Processing Language (SPL) queries to retrieve operational metrics from indexed data. If the search query is incorrectly written, references the wrong index, or uses filters that exclude relevant events, the KPI evaluation may return no results. When this occurs, the KPI may display missing values or remain in an unknown state. Administrators typically troubleshoot this issue by running the search manually in the Splunk search interface to verify that the query returns expected results.
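
In practice, that manual verification usually means removing filters one at a time until events appear (placeholder index and field values shown):

index=web_logs status=500 host=app01
index=web_logs status=500
index=web_logs

The step at which results return identifies the offending filter, or confirms that the index itself has no data.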

Demand Score: 90

Exam Relevance Score: 89

Where are ITSI service and KPI configurations primarily stored?

Answer:

In KV Store collections.

Explanation:

ITSI stores most configuration objects—including services, KPIs, entities, and dependency definitions—within KV Store collections rather than standard Splunk indexes. Because these configurations are stored in KV Store, administrators must ensure that KV Store is functioning properly and included in backup procedures. When restoring an ITSI deployment, KV Store data must also be restored to recover service configurations. Understanding where configuration data resides is therefore essential for troubleshooting and disaster recovery planning.

Demand Score: 82

Exam Relevance Score: 91

Why might KPI severity not update even though the KPI search returns results?

Answer:

Because threshold evaluation or KPI scheduling is misconfigured.

Explanation:

Even when KPI searches successfully retrieve data, severity states may not update if thresholds are missing or evaluation intervals are misconfigured. KPI severity depends on comparing search results against defined thresholds. If thresholds are not configured correctly or the KPI evaluation schedule is disabled, the system cannot determine severity states. Administrators should verify threshold settings, evaluation schedules, and KPI status to ensure that KPI results translate into correct severity values.

Demand Score: 86

Exam Relevance Score: 88

What is an essential step when restoring an ITSI deployment after a failure?

Answer:

Restoring KV Store data and ITSI configuration indexes.

Explanation:

When recovering an ITSI environment after system failure, administrators must restore both the KV Store collections and relevant Splunk indexes containing ITSI operational data. KV Store holds configuration objects such as services, KPIs, and dependencies, while indexes store notable events and KPI evaluation results. If KV Store is not restored, service definitions and monitoring configurations may be lost even if indexed data remains intact. Therefore, disaster recovery procedures must include both configuration and operational data restoration.

Demand Score: 80

Exam Relevance Score: 87
