In Splunk, "deployment" refers to how the system is installed and organized across one or more machines. It determines how data flows through the system, how it's stored, and how users access and analyze it. There are two main types of deployment: standalone and distributed.
A standalone deployment means all Splunk functions are installed on a single machine. This machine handles:
Collecting data (data input)
Parsing and indexing data (storing it for search)
Running searches and reports
Displaying results in dashboards and visualizations
This type of setup is simple and quick to install.
When is standalone deployment suitable?
Learning and training environments
Personal testing or development use
Small businesses with limited data volume
Advantages of standalone deployment:
Easy to set up and manage
No need for network configuration between components
Fewer system resources required
Disadvantages of standalone deployment:
Not designed for large data volumes
Limited performance and scalability
Not suitable for enterprise production use
A distributed deployment separates the different functions of Splunk across multiple machines. Each machine has a specific role, and they work together as a system.
This model is used in most production environments because it supports scalability, performance, and fault tolerance.
Main roles in a distributed deployment:
Forwarders – These are lightweight agents installed on data sources (e.g., servers, network devices). They collect and send data to indexers.
Indexers – These systems receive, parse, and store the data. They also respond to search requests by providing results.
Search Heads – These machines let users run searches, build dashboards, set alerts, and view reports. They act as the interface for users.
Why use a distributed setup?
To handle larger data volumes
To improve reliability and performance
To isolate and scale specific functions as needed
When is distributed deployment suitable?
Medium to large businesses
Environments with high data ingestion rates
Teams with multiple users querying data
| Category | Standalone Deployment | Distributed Deployment |
|---|---|---|
| Number of machines | One | Multiple |
| Complexity | Low | Medium to high |
| Performance | Limited | High |
| Scalability | Poor | Excellent |
| Use case | Testing, learning | Production, large-scale environments |
| Data handling | Basic | Handles high volume and concurrency |
In a distributed Splunk environment, different components (or roles) are installed on different machines. Each component has a specific responsibility. Understanding these roles is essential before you attempt any Splunk deployment or configuration.
A Universal Forwarder is a lightweight Splunk agent installed on data source systems (such as application servers, web servers, or databases). Its only job is to collect and forward raw data to a central Splunk system (usually to Indexers).
Key characteristics:
Small footprint with minimal resource usage
Cannot parse or transform data
No web interface or dashboards
Typically installed on many servers across an organization
Use case: Best suited for sending logs or metrics from servers in a production environment to Indexers for further processing.
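A minimal sketch (hostnames, paths, index, and sourcetype are hypothetical) of the two files a Universal Forwarder typically carries: inputs.conf defines what to collect, and outputs.conf points at an Indexer's receiving port (9997 by convention).
inputs.conf:
[monitor:///var/log/nginx/access.log]
index = web
sourcetype = nginx:access
outputs.conf:
[tcpout:primary_indexers]
server = idx1.example.com:9997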
A Heavy Forwarder is a full Splunk instance used to collect, parse, and forward data. Unlike the Universal Forwarder, the Heavy Forwarder can:
Perform data transformation using configuration files (props.conf and transforms.conf)
Filter or route data to different destinations
Support inputs that need scripting or parsing
Use case: Useful when the data needs to be modified or routed before it reaches the Indexer. For example, masking sensitive data or splitting events.
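For illustration (the sourcetype and pattern are hypothetical), masking card-like numbers before indexing can be done with a SEDCMD rule in props.conf on the Heavy Forwarder, since SEDCMD is applied during parsing:
props.conf:
[app:payment_logs]
SEDCMD-mask_card = s/\d{4}-\d{4}-\d{4}-\d{4}/XXXX-XXXX-XXXX-XXXX/g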
The Deployment Server is a special role used to manage and distribute configuration files and apps to multiple Universal Forwarders (and optionally Heavy Forwarders).
Functions include:
Central management of input configurations (inputs.conf, etc.)
Grouping forwarders into "server classes" based on IP, hostnames, etc.
Automatically deploying changes when needed
Use case: When you have hundreds or thousands of forwarders and you want to update or control them from a single place.
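A minimal serverclass.conf sketch (class name, IP pattern, and app name are hypothetical) that groups forwarders into a server class and assigns them an app:
serverclass.conf:
[serverClass:linux_servers]
whitelist.0 = 10.0.1.*
[serverClass:linux_servers:app:TA_custom_inputs]
restartSplunkd = true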
An Indexer is the core of the Splunk backend. It performs several vital functions:
Receives data from Forwarders or inputs
Parses and processes the raw data
Creates and stores event indexes for search
Responds to search requests from Search Heads
Key role: It stores both the raw data and the index metadata that allows fast searching later.
Use case: All production environments use one or more Indexers depending on data volume and redundancy requirements.
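For example, receiving from forwarders is enabled on an Indexer with a single command (9997 is the conventional receiving port; credentials are placeholders):
splunk enable listen 9997 -auth admin:<password>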
A Search Head provides the user interface to Splunk. It is where users:
Write and run searches using SPL (Search Processing Language)
Create dashboards, reports, and alerts
View search results and visualizations
The Search Head does not store data. It simply queries Indexers (or a cluster of Indexers) to retrieve results.
Use case: In large environments, Search Heads are separated from Indexers so user searches don't slow down indexing.
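As a sketch (hostname and credentials are placeholders), a standalone Search Head is pointed at an Indexer by adding it as a search peer:
splunk add search-server https://idx1.example.com:8089 -auth admin:<password> -remoteUsername admin -remotePassword <remote_password>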
The Cluster Master, also called the Manager Node, is used when you deploy Indexer Clustering. It performs administrative tasks such as:
Managing cluster configurations
Monitoring the health of peer nodes (Indexers)
Coordinating data replication
It does not index or search data.
Use case: Required in environments that use clustered Indexers for high availability and data redundancy.
A Deployer is used when you set up a Search Head Cluster. Its job is to:
Push configuration bundles (apps, settings) to all Search Head cluster members
Ensure consistency across the cluster
Use case: Ensures all Search Heads in a cluster share the same configurations and apps, avoiding manual updates to each member.
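For example (the target URL and credentials are placeholders), a configuration bundle is pushed from the Deployer with:
splunk apply shcluster-bundle -target https://sh1.example.com:8089 -auth admin:<password>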
The License Master manages Splunk’s license usage across all components in a deployment.
Responsibilities:
Tracks how much data is ingested per day
Issues license warnings or violations if limits are exceeded
Allocates license volumes across environments or departments
Use case: In medium to large environments, a centralized License Master ensures compliance and better monitoring of license usage.
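A minimal server.conf sketch (hostname hypothetical) pointing an instance at a central License Master; note that newer Splunk releases name this setting manager_uri:
server.conf:
[license]
master_uri = https://lm.example.com:8089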
Splunk Enterprise can be installed on various platforms, including Linux, Windows, Docker containers, and Cloud platforms. Each method has its own advantages and is used in different environments depending on scale, automation needs, or operating system preferences.
Linux is the most common platform for deploying Splunk in production. It provides better performance and flexibility for configuration.
Installation formats available:
.tgz (tarball file): Best for manual installation and custom directory paths.
.rpm (Red Hat-based systems): For package-managed installations using yum or dnf.
Installation steps using tarball (.tgz):
Download Splunk from the official site.
Move the file to your target directory.
Extract using: tar -xvzf splunk-<version>-<build>-Linux-x86_64.tgz -C /opt
Change directory: cd /opt/splunk/bin
Start Splunk for the first time, accept the license, and set the admin password: ./splunk start --accept-license
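Optionally (assuming a dedicated splunk user exists on the host), configure Splunk to start automatically at boot:
./splunk enable boot-start -user splunk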
Advantages:
Offers full control over the installation path and files.
Widely used in production environments.
Easier to integrate with enterprise monitoring tools.
Splunk provides a simple graphical installer for Windows environments.
Installation steps:
Download the .msi or .exe installer from the Splunk website.
Run the installer and follow the GUI wizard.
Choose the installation path and set admin credentials.
After installation, access Splunk Web at: http://localhost:8000
Windows-specific features:
Native integration with Event Logs, WMI, Active Directory.
Can collect data using prebuilt Windows apps and add-ons.
Use cases: Small-scale deployments and environments that rely on native Windows data sources such as Event Logs, WMI, and Active Directory.
This method is used for containerized or cloud-native deployments. It is common in CI/CD pipelines or when you want to automate deployment with infrastructure as code.
Docker Installation:
Pull the official image: docker pull splunk/splunk:latest
Run a container with environment variables: docker run -d -p 8000:8000 -e SPLUNK_START_ARGS="--accept-license" -e SPLUNK_PASSWORD="changeme" splunk/splunk
Kubernetes Installation:
Use Helm charts provided by Splunk (see the sketch after this list).
Requires knowledge of Kubernetes and configuration files.
Enables rapid scaling and orchestration.
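As an illustrative sketch only (the repository URL and chart name may differ by version; verify against Splunk's current documentation), installing the Splunk Operator via Helm follows the usual pattern:
helm repo add splunk-operator https://splunk.github.io/splunk-operator
helm install splunk-operator splunk-operator/splunk-operator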
Use cases:
Development, testing, and cloud-native production environments.
Ideal for organizations using microservices or DevOps.
Splunk Cloud is a fully managed Splunk service hosted by Splunk or on major cloud providers (AWS, GCP, Azure). It eliminates the need for infrastructure management.
Key features:
No installation required.
Managed scaling, backups, and updates by Splunk.
Connect to Splunk Cloud using Forwarders or the HTTP Event Collector (HEC); see the example below.
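For example (hostname, port, and token are placeholders; a Splunk Cloud stack exposes its own HEC URL in its settings), sending a test event to the HEC endpoint:
curl -k https://<splunk-host>:8088/services/collector/event -H "Authorization: Splunk <hec_token>" -d '{"event": "hello from HEC"}'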
Use cases:
Organizations that want to use Splunk without managing hardware.
Fast onboarding for analytics teams without DevOps involvement.
Limitations:
Less control over backend configurations.
Certain custom apps or configurations may not be supported.
| Method | Platform | Best for | Notes |
|---|---|---|---|
| Linux (.tgz/.rpm) | Linux servers | Production, performance | Most flexible and powerful |
| Windows (.exe/.msi) | Windows Server | Small-scale, Windows integration | Easy GUI installation |
| Docker | Any (container) | Testing, DevOps, automation | Requires Docker knowledge |
| Kubernetes | Cloud-native | Enterprise DevOps teams | Scalable, complex to manage |
| Splunk Cloud | SaaS (AWS, GCP, Azure) | Zero-maintenance, rapid deployment | Managed by Splunk; fast onboarding but less control |
By default, Splunk stores all of its data and configurations under the main installation directory ($SPLUNK_HOME). However, in production environments, it is a best practice to:
Store indexed data (hot, warm, cold buckets) on high-performance storage with sufficient capacity (see the sketch after this list).
Place log files (such as _internal logs) on a separate disk if possible, to avoid performance impact.
Separate configuration files (in the etc/ directory) from data directories to improve clarity and backups.
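A minimal indexes.conf sketch (index name and mount points are hypothetical) placing hot/warm buckets on fast storage and cold buckets on cheaper storage:
indexes.conf:
[web]
homePath = /fast_storage/splunk/web/db
coldPath = /bulk_storage/splunk/web/colddb
thawedPath = /bulk_storage/splunk/web/thaweddb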
Benefits:
Easier disaster recovery
Better performance tuning
Safer during upgrades or migrations
etc/system/local and etc/apps/ for Configuration Hierarchy
Splunk has a layered configuration system, meaning configuration files can exist in different locations, each with a specific priority. Two key directories are:
etc/system/local: Has the highest precedence; reserve it for critical local overrides only.
etc/apps/<your_app>: Preferred location for most custom configurations and knowledge objects (dashboards, searches, inputs, etc.)
Why use apps instead of system/local?
Apps are modular and portable
Easy to version-control and deploy across environments
Compatible with Deployment Server, Search Head Clustering, and Deployer
Example:
If you want to configure a custom log input, create an app like TA_custom_inputs, and place inputs.conf inside etc/apps/TA_custom_inputs/local/.
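As a sketch (the app name, monitored path, index, and sourcetype are placeholders), the resulting file would look like:
etc/apps/TA_custom_inputs/local/inputs.conf:
[monitor:///var/log/myapp/app.log]
index = main
sourcetype = myapp:log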
Time consistency is critical in Splunk deployments because:
Timestamps are used to organize, index, and search data.
Searches over time ranges can miss data if the time is not aligned across Indexers, Forwarders, and Search Heads.
Clustering (Indexer/Search Head) relies on synchronized logs for replication and troubleshooting.
Best practice: Configure all systems to use a reliable Network Time Protocol (NTP) server.
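For example, on systemd-based Linux hosts (one common approach; environments using chrony or ntpd configure this differently):
timedatectl set-ntp true
timedatectl status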
In production environments, you should scale horizontally, not vertically.
Horizontal scaling: Add more Indexers or Search Heads as data volume or user demand increases.
Vertical scaling: Adding more CPU/RAM to a single instance (helpful but has limits).
Examples:
If indexing becomes slow: Add more Indexers.
If search performance drops during peak hours: Add more Search Heads.
Use load balancing between Universal Forwarders and Indexers for better distribution (see the sketch below).
Splunk supports both scaling methods, but horizontal scaling is better for long-term growth.
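A minimal outputs.conf sketch (hostnames hypothetical) for the load balancing mentioned above: the forwarder rotates between the listed Indexers, switching targets every autoLBFrequency seconds.
outputs.conf:
[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
autoLBFrequency = 30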
Use version control (like Git) to track and manage configuration changes across teams.
Test changes in staging or dev environments before applying them to production.
Keep a change log for all manual edits made to .conf files.
Validate your configuration changes using Splunk's btool command to debug configuration merging:
splunk btool inputs list --debug
Organize apps clearly:
Use prefixes like TA_ for Technology Add-ons (data inputs only)
Use SA_ for shared libraries or utilities
Use app_ or business names for user-facing apps
| Practice Area | Best Practice Summary |
|---|---|
| Directory Organization | Separate logs, configs, and indexed data |
| Configuration Hierarchy | Use etc/apps/ instead of system/local |
| Time Management | Sync all nodes with NTP servers |
| Scaling | Add Indexers/Search Heads as needed for performance |
| Versioning and Testing | Use Git, test in dev, and document all changes |
| Validation Tools | Use btool and the Monitoring Console to check config status |
Search Head Clustering (SHC) and Indexer Clustering serve fundamentally different purposes, though both are deployed in distributed Splunk architectures.
SHC is a high-availability solution designed to support UI-level redundancy, load balancing, and search distribution.
It focuses on coordinating scheduled searches, replicating knowledge objects, and ensuring that multiple search heads appear as one logical interface to the user.
SHC uses a Captain to coordinate scheduled searches and a Deployer to distribute configuration and apps.
Indexer Clustering is primarily focused on data-level redundancy and high availability of indexed data.
It ensures that multiple copies of raw data and index files are maintained across a cluster of peer indexers, governed by a Cluster Manager.
Key parameters such as Replication Factor (RF) and Search Factor (SF) control the number of data copies and their searchability.
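A minimal server.conf sketch for the Cluster Manager (values illustrative; older releases use mode = master):
server.conf:
[clustering]
mode = manager
replication_factor = 3
search_factor = 2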
| Feature | SHC | Indexer Clustering |
|---|---|---|
| Focus | Search interface and metadata | Raw data redundancy |
| Key Coordinator | Captain | Cluster Manager (Master Node) |
| Configuration Tool | Deployer | Configuration files (via CLI or UI) |
| Data Stored | None (search head only) | Raw data and indexes |
| Primary Benefit | UI redundancy and search scaling | Data resiliency and durability |
While the Deployment Server (DS) is an effective tool to manage configurations for Universal Forwarders (UFs), security and control over which clients are allowed to connect are critical, especially in large-scale environments.
The following parameters can be configured in serverclass.conf on the Deployment Server to restrict or permit UF access:
whitelist.<n>: Allows matching clients based on clientName, host, or IP address.
blacklist.<n>: Explicitly denies access from specified clients.
Examples:
[serverClass:windows_clients]
whitelist.0 = win*
[serverClass:restricted]
blacklist.0 = 10.0.0.50
This access control helps prevent unauthorized forwarders from enrolling and receiving apps or configurations.
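After editing serverclass.conf, apply the changes on the Deployment Server with:
splunk reload deploy-server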
The Splunk License Master is responsible for monitoring the volume of daily indexed data and enforcing license limits.
If your Splunk environment exceeds the licensed daily indexing volume on five or more days within a rolling 30-day window, it enters a "License Violation" state.
In violation mode:
User searches (non-admin) are disabled.
Only admin-level accounts can run searches for remediation.
Violations do not delete data, but will halt scheduled reports and alerting for regular users.
Use the Monitoring Console (MC) under License Usage to visualize indexed volume per sourcetype, index, or forwarder.
Alternatively, monitor the log file:
$SPLUNK_HOME/var/log/splunk/license_usage.log
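For example, a common search over that log breaks down daily indexed volume by index (type=Usage events report bytes in field b and the index in field idx):
index=_internal source=*license_usage.log type=Usage | timechart span=1d sum(b) AS bytes BY idx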
While Splunk Cloud provides a scalable and managed solution, it introduces restrictions compared to on-prem Splunk Enterprise.
No direct access to shell or file system, which restricts:
Running custom Python scripts not validated by Splunk.
Using btool for deep configuration inspection.
Limited access to certain apps or add-ons, especially those requiring:
File system interaction
Shell scripting or custom binaries
Apps deployed must go through App Vetting to be accepted in Splunk Cloud.
Questions may ask “Which feature is not available in Splunk Cloud?”, so remember that low-level debugging and certain scripted inputs are not supported.
A common exam trap is the misuse of Deployment Server in Search Head Clustering.
Deployment Server is NOT supported for SHC members.
You must use a Deployer to push configuration bundles (apps, dashboards, saved searches) to SHC nodes.
If you try to manage SHC members using a Deployment Server:
Configuration may become inconsistent across the cluster.
Captain election and knowledge object replication may fail.
Supportability and upgrade paths may be broken.
Know the functional separation of SHC and Indexer Clustering within distributed deployment models.
Understand access control for Universal Forwarders via serverclass.conf using whitelist/blacklist entries.
Remember the licensing consequences, especially the five-warnings-in-30-days violation rule and the admin-only search fallback.
Know the Splunk Cloud limitations, especially around btool, unsupported apps, and scripting.
Remember that the Deployment Server is incompatible with SHC members; configurations must be pushed with a Deployer.
When should a Splunk deployment transition from a standalone instance to a distributed architecture with indexers and search heads?
A Splunk deployment should transition to a distributed architecture when data ingestion volume, user concurrency, or search workloads exceed the performance limits of a single instance.
Standalone deployments are suitable for small environments, typically for development or low-volume workloads. As data ingestion grows or multiple users run concurrent searches, CPU, memory, and disk I/O contention increases. Distributed architecture separates responsibilities: indexers handle data ingestion and storage, while search heads handle query processing. This separation improves scalability and performance. It also enables high availability features such as indexer clustering and search head clustering. A common scaling pattern is first separating the search head from the indexer role, followed by introducing multiple indexers with clustering to support higher ingestion rates and redundancy. Failure to separate roles in larger environments often leads to search delays and indexing bottlenecks.
In a distributed Splunk deployment, why should the Cluster Manager not run on the same host as the Search Head?
The Cluster Manager should run on a dedicated host because it manages cluster operations and must remain independent from search workloads.
The Cluster Manager is responsible for coordinating indexer clustering tasks such as bucket replication, fix-ups, and configuration bundle distribution. Search heads, on the other hand, process user searches and dashboards, which can generate unpredictable resource spikes. If both roles run on the same host, search activity can impact cluster management operations. This may delay replication factor enforcement or cluster recovery tasks. Operational separation ensures that cluster management remains stable even during heavy search usage. Best practice architecture therefore assigns the Cluster Manager to its own instance to maintain cluster health and reliability while allowing search heads to scale independently.
What architectural benefit does separating search heads and indexers provide in a growing Splunk environment?
Separating search heads and indexers improves scalability by isolating search workloads from indexing operations.
Indexers focus on ingesting, parsing, and storing incoming data streams. These tasks require sustained disk throughput and CPU resources. Search heads execute search queries, perform knowledge object processing, and coordinate distributed search. If both roles run on the same system, heavy search activity can slow indexing pipelines, potentially causing ingestion delays. By separating these tiers, indexing performance remains stable while search capacity can scale independently by adding additional search heads. This architecture also supports features such as search head clustering and distributed search, enabling organizations to support larger user bases and higher data volumes without impacting ingestion reliability.