The KV Store (Key-Value Store) is a built-in, NoSQL-style database in Splunk used to store and manage structured, dynamic, and application-driven data. It plays a critical role in supporting interactive apps, dashboards, and advanced correlation use cases.
This topic explains what the KV Store is, how it is used, how it is managed and maintained, and what to watch for in terms of performance and troubleshooting.
The KV Store is a key-value database integrated into Splunk’s platform. It stores structured data (JSON-style documents) and allows apps to store, retrieve, and manipulate that data for a variety of operational needs.
Built on MongoDB, but fully embedded and managed by Splunk.
Stores collections, which are similar to tables in relational databases.
Supports dynamic fields and schema flexibility (no rigid column types).
Can be used by Splunk apps, dashboards, and custom scripts.
KV Store is designed to support more interactive and app-specific functionality than traditional CSV-based lookups.
Dynamic Form Options
Populate dropdown menus in dashboards with live or context-specific data.
Update options based on user actions or data flows.
Lookup-based Correlation
Store enrichment data like IP-to-location mappings or device status.
Use the KV Store as a lookup source for real-time correlation in searches.
Session State or App Logic Storage
Keep track of user session information.
Store intermediate data for long-running apps or dashboards.
Maintain workflow state or user-specific configurations.
collections.conf
Defines the structure of a KV Store collection.
Used to declare field names, types, and collection behavior.
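A minimal sketch of a collection definition, using a hypothetical device_inventory collection (the field names are illustrative); the transforms.conf stanza below it exposes the collection as a KV Store lookup:

```ini
# collections.conf -- hypothetical example collection
[device_inventory]
field.ip = string
field.hostname = string
field.last_seen = time

# transforms.conf -- expose the collection as a KV Store lookup
[device_inventory_lookup]
external_type = kvstore
collection = device_inventory
fields_list = _key, ip, hostname, last_seen
```

With the lookup defined, searches can read the collection with inputlookup and write to it with outputlookup, just like a file-based lookup.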
REST API
Provides powerful methods for inserting, updating, retrieving, and deleting KV Store records.
Common endpoint:
/servicesNS/<user>/<app>/storage/collections/data/<collection_name>
Example operations:
GET: Retrieve records
POST: Add new records
DELETE: Remove entries
PUT: Update existing records
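The operations above can be sketched in Python. The host, user, app, and collection names below are placeholders, and the actual network call is shown only as a comment so the sketch stays self-contained:

```python
import json

# Hypothetical values -- substitute your own Splunk host, user, app, and collection.
BASE = "https://localhost:8089"

def collection_url(user, app, collection, key=None):
    """Build the KV Store data endpoint; append a record _key for record-level operations."""
    url = f"{BASE}/servicesNS/{user}/{app}/storage/collections/data/{collection}"
    return f"{url}/{key}" if key else url

# GET     collection_url(...)              -> retrieve records
# POST    collection_url(...) + JSON body  -> add a new record
# POST/PUT collection_url(..., key=k)      -> update the record with _key k
# DELETE  collection_url(..., key=k)       -> remove that record
record_body = json.dumps({"ip": "10.0.0.5", "hostname": "fw01", "status": "up"})

# Example call (not executed here; requires the requests package and a live search head):
# requests.post(collection_url("nobody", "search", "device_inventory"),
#               headers={"Content-Type": "application/json"},
#               data=record_body, auth=("admin", "changeme"), verify=False)
```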
Use an export script such as kvstore-to-json.py (where your deployment provides one) or the REST API to back up collection data into JSON format.
This is especially useful before:
Upgrades
Cluster reconfiguration
Moving collections between environments
Tip: Always document app and collection ownership to avoid permission issues during restores.
Access to KV Store collections can be controlled through:
App context: Limit which app can read/write to the collection.
transforms.conf: If the KV Store is exposed as a lookup, you can control search access.
Over time, KV Store collections can grow significantly, especially if used for logs or tracking.
Excessive growth can impact:
Search head performance
Disk usage
Replication in Search Head Clusters
Best Practices:
Regularly audit collections for unused or outdated records.
Set TTL (time-to-live) settings in app logic if applicable.
Periodically remove stale or deprecated collections via:
REST API
App management
Manual deletion from collections.conf and storage paths
Note: Always back up before deleting production collections.
If the KV Store becomes corrupted or unresponsive, common symptoms include:
Dashboards or apps fail to load.
Lookup errors or form fields not populating.
Search head memory or CPU spikes.
Recommended Actions:
Restart the search head:
splunk restart
Resynchronize the KV Store on the affected member:
splunk resync kvstore
If resync does not recover it, splunk clean kvstore --local wipes the local KV Store data so it can be re-replicated from the cluster (back up first).
Review logs:
mongod.log in $SPLUNK_HOME/var/log/splunk/
splunkd.log for collection access errors
The KV Store (Key-Value Store) in Splunk is a built-in, JSON-style, schema-flexible database designed to support dynamic application data, interactive dashboards, and real-time correlation. Unlike static lookups such as CSV files, KV Store collections are fully writable and queryable.
Each KV Store record is stored as a document with special reserved fields that affect functionality and security.
_key:
Unique primary key generated for each document (unless explicitly set).
Used for record-level updates or deletes.
_user:
Indicates which Splunk user created the record.
Helps restrict visibility in multi-user environments.
_app:
Ties the record to a specific app context.
Access to records can vary based on current user and app scope.
Note: When querying or inserting KV Store data via REST, these fields may affect the returned results or permissions.
KV Store supports accelerated fields (index-like structures) via accelerated_fields.<name> settings in collections.conf. Each setting takes a JSON document mapping field names to a sort order, for example:
[device_inventory]
accelerated_fields.ip_index = {"ip": 1}
accelerated_fields.host_status_index = {"hostname": 1, "status": 1}
Benefits:
Acts like indexing in traditional databases.
Greatly improves lookup and query speed on high-frequency fields (e.g., IP addresses, usernames).
Especially useful for large-scale lookups or filters in dashboards.
KV Store synchronization behaves differently from file-based configurations.
Only the Captain node manages write and replication control for KV Store data across SHC members.
Non-captain members may lag behind on updates and may not accept write requests.
Best Practice:
Always route REST API writes to the Captain node.
Use this command to determine the current captain:
splunk show shcluster-status
To ensure proper behavior of KV Store in clustered environments, Splunk 8+ introduces an explicit KV Store replication role.
Setting (in server.conf):
[kvstore]
kvstore_replication_role = captain
Without this, replication may stall or fail silently in SHC environments.
Must be set on every SHC member for consistent behavior.
Example Symptom:
One SHC member displays "stale bundle" warnings.
Lookup or app errors show missing KV Store data.
Diagnosis Tool:
| rest /services/kvstore/status
This shows the replication, sync status, and health of each KV Store instance.
To assess replica health and data availability across nodes, use:
| rest /services/admin/kvstore/status
This REST endpoint provides a cluster-wide view of replica states, consistency, and version mismatches.
KV Store does not natively support TTL (Time-To-Live) per record. However, administrators can simulate TTL logic by:
Creating a last_updated or expiration field in each record.
Running a scheduled search that rewrites the collection with only the records that have not yet expired (everything older than the cutoff is dropped):
| inputlookup my_collection_lookup
| where last_updated >= relative_time(now(), "-30d")
| outputlookup my_collection_lookup append=false
Or use a REST call in a script to delete records based on time.
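The REST variant can be sketched in Python by building a DELETE URL with a MongoDB-style query. The host, app, and collection names are placeholders, and "last_updated" stands in for whatever epoch-time field your app writes into each record; the network call itself is shown only as a comment:

```python
import json
import time
from urllib.parse import urlencode

# Hypothetical host/app/collection; KV Store has no per-record TTL, so the
# cutoff is computed client-side against an app-maintained time field.
BASE = "https://localhost:8089/servicesNS/nobody/search/storage/collections/data"

def expired_delete_url(collection, field, max_age_seconds):
    """Build a DELETE URL whose query matches records older than the cutoff."""
    cutoff = time.time() - max_age_seconds
    query = json.dumps({field: {"$lt": cutoff}})
    return f"{BASE}/{collection}?" + urlencode({"query": query})

# Example (not executed): requests.delete(expired_delete_url("my_collection",
#     "last_updated", 30 * 86400), auth=("admin", "changeme"), verify=False)
```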
Export:
Use an export script such as kvstore-to-json.py (or the REST API) to export data in .json format.
Each entry includes the _key field, ensuring record identity preservation.
Restore Notes:
Ensure the target environment:
Has the collection created in collections.conf.
Has matching field definitions (mismatches can cause silent failures).
Use POST or PUT requests to upload records back via REST API.
Helpful REST endpoint:
/servicesNS/<user>/<app>/storage/collections/data/<collection_name>
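The export/restore round trip can be sketched as follows. In-memory sample records stand in for a live REST fetch, and the point is that each record carries its _key through the backup file and back:

```python
import json
import os
import tempfile

# Sample records in place of a live REST export (values are hypothetical).
records = [
    {"_key": "5f1a", "ip": "10.0.0.5", "hostname": "fw01"},
    {"_key": "5f1b", "ip": "10.0.0.6", "hostname": "fw02"},
]

backup_path = os.path.join(tempfile.mkdtemp(), "device_inventory.json")
with open(backup_path, "w") as f:
    json.dump(records, f)      # export: write collection data to JSON

with open(backup_path) as f:
    restored = json.load(f)    # restore: each record would be POSTed back via REST

# Record identity is preserved because _key travels with each record.
assert [r["_key"] for r in restored] == ["5f1a", "5f1b"]
```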
KV Store is a powerful tool for managing structured, application-specific data in Splunk. Its integration into SHC and ability to support dynamic updates makes it a foundational component for modern dashboards and advanced apps.
| Topic | Key Notes |
|---|---|
| Built-in Fields | _key, _user, _app control access and identity |
| Indexed Fields | accelerated_fields improves performance |
| SHC Behavior | Only Captain manages KV writes and replication |
| Version-Specific Settings | Use kvstore_replication_role in 8.x+ |
| Troubleshooting | Use REST API to check stale bundles and sync status |
| TTL Workaround | Use last_updated + scheduled clean-up search |
| Backup & Restore | Export via kvstore-to-json.py, maintain _key |
What is the KV Store in Splunk and when should it be used?
KV Store is a key-value database used to store structured data that can be accessed and updated dynamically by Splunk searches and apps.
KV Store provides a NoSQL database capability inside Splunk. It stores data as key-value pairs and allows applications and searches to perform operations such as insert, update, and delete.
It is commonly used when:
lookup data must be updated dynamically
applications require structured storage
datasets are too large or frequently updated for CSV lookups
For example, Splunk Enterprise Security uses KV Store to maintain threat intelligence and asset information. Compared to CSV lookups, KV Store provides better scalability and supports programmatic data updates.
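As a small illustration of a dynamic update, a search can write records into a KV Store-backed lookup on the fly (the lookup name threat_list_lookup here is hypothetical and would be defined in transforms.conf):

```spl
| makeresults
| eval indicator="198.51.100.7", source="feed_x", last_updated=now()
| fields - _time
| outputlookup threat_list_lookup append=true
```

With append=true, the new record is added without overwriting the rest of the collection, which is not possible with a static CSV file short of rewriting it.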
Demand Score: 88
Exam Relevance Score: 92
What is the difference between KV Store lookups and CSV lookups in Splunk?
CSV lookups store static data in files, while KV Store lookups store dynamic data in a database.
Both CSV and KV Store lookups allow searches to enrich events with external data, but they differ significantly in how they store and manage information.
CSV lookups
stored as flat files
best for static reference data
require manual updates or file replacements
KV Store lookups
stored in a built-in database
allow dynamic data updates through searches or APIs
support larger datasets and structured records
Architects typically choose KV Store when lookup data must change frequently or support application logic.
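The search-time syntax is identical for both lookup types; only the transforms.conf definition differs. Assuming a hypothetical device_inventory_lookup backed by KV Store, enrichment looks the same as with a CSV lookup:

```spl
index=netops sourcetype=syslog
| lookup device_inventory_lookup ip OUTPUT hostname status
| stats count by hostname, status
```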
Demand Score: 82
Exam Relevance Score: 91
How is KV Store data replicated in a Search Head Cluster?
KV Store data is replicated automatically across search head cluster members.
In a Search Head Cluster, KV Store operates in a distributed mode where data stored in collections is replicated across cluster members.
Replication ensures that:
all search heads have consistent KV Store data
applications using KV Store operate reliably across the cluster
failover between search heads does not cause data loss
The cluster coordinates KV Store synchronization to maintain consistency across nodes. If a member fails, the remaining nodes continue serving KV Store data without interruption.
Demand Score: 76
Exam Relevance Score: 93
Why is KV Store commonly used in Splunk Enterprise Security (ES)?
Because ES requires dynamic storage for threat intelligence, assets, and investigation data.
Splunk Enterprise Security uses KV Store extensively to manage operational datasets required for security analytics.
Examples include:
asset inventories
identity information
threat intelligence feeds
investigation workflows
These datasets change frequently and must be updated dynamically by the application. KV Store provides the flexible database structure required to support these operations, making it a core component of many advanced Splunk apps.
Demand Score: 71
Exam Relevance Score: 90