SPLK-2002 Infrastructure Planning: Resource Planning

Infrastructure Planning: Resource Planning Detailed Explanation

This topic focuses on how to choose and size the hardware and infrastructure that will run Splunk efficiently. Planning the right resources is critical for ensuring good performance, scalability, and reliability.

1. Key Infrastructure Components

Splunk is a distributed system, and different components (like Search Heads and Indexers) have different resource needs. Let’s look at the key hardware and infrastructure considerations.

a. CPU & Memory

Search Heads: CPU-Bound
  • Search Heads are mainly responsible for running searches and dashboards.

  • This involves parsing search requests, merging results returned by the indexers, computing statistics, and rendering dashboards and visualizations.

  • Therefore, they rely heavily on CPU power.

Best Practice: Use servers with multiple cores (8+), especially if you expect high search concurrency.

Indexers: I/O-Bound and Memory-Intensive
  • Indexers handle the storage of data and respond to search queries.

  • They read/write a lot of data to disk and need fast I/O (disk performance).

  • Also, they use significant RAM for caching and indexing.

Best Practice: Provide plenty of memory (16–64 GB+) and high-speed disks (SSDs) to ensure smooth performance.

b. Storage

Splunk indexes consume a lot of disk space, and storage performance directly affects system speed. As data ages, Splunk moves it through hot, warm, cold, and frozen buckets, and each stage has different storage needs (a rough capacity-sizing sketch appears at the end of this subsection).

Hot/Warm Buckets
  • Active and recent data.

  • Must support fast read/write operations.

  • SSDs (solid-state drives) are recommended for speed.

Cold Buckets
  • Older data, searched less often.

  • Slower disks (HDD) are acceptable to save cost.

Frozen Buckets
  • Data is either deleted (the default) or archived to external storage (such as AWS S3 or an archive filesystem).

  • Not stored on the main Splunk system.

IOPS Planning
  • IOPS = Input/Output Operations Per Second, a key measure of disk performance.

  • Splunk's reference hardware guidance calls for at least 800 IOPS on indexer storage; high-volume environments (hundreds of GB/day and up) typically need more.
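
To make these tiers concrete, here is a minimal sizing sketch (in Python) that turns daily ingest, an assumed ~50% on-disk compression ratio, a replication factor, and per-tier retention targets into rough capacity figures. Every number is an illustrative assumption to replace with your own.

# Rough per-tier storage sizing sketch -- every figure below is an assumption.
DAILY_INGEST_GB = 900       # raw data ingested per day
COMPRESSION_RATIO = 0.5     # rule of thumb: indexed data occupies ~50% of raw size
REPLICATION_FACTOR = 3      # total copies kept across the indexer cluster (1 = standalone)

HOT_WARM_DAYS = 30          # assumed retention on fast (SSD) storage
COLD_DAYS = 335             # assumed retention on cheaper (HDD) storage

def tier_size_gb(days: int) -> float:
    """Cluster-wide disk needed to keep `days` worth of data in one tier."""
    return DAILY_INGEST_GB * COMPRESSION_RATIO * days * REPLICATION_FACTOR

hot_warm_gb = tier_size_gb(HOT_WARM_DAYS)
cold_gb = tier_size_gb(COLD_DAYS)

print(f"Hot/warm tier (SSD): {hot_warm_gb / 1024:.1f} TB")
print(f"Cold tier (HDD):     {cold_gb / 1024:.1f} TB")
print(f"Total searchable:    {(hot_warm_gb + cold_gb) / 1024:.1f} TB")

Dividing the per-tier totals by the number of indexers gives a first estimate of local disk per node. This is a simplification: non-searchable replica copies omit index files, so actual usage is usually somewhat lower.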

c. Network Bandwidth

Splunk components talk to each other over the network — this includes:

  • Forwarders sending data to Indexers.

  • Indexers sharing data in a cluster.

  • Search Heads querying Indexers.

Best Practices:
  • Use low-latency, high-throughput networks — at least 1 Gbps, preferably 10 Gbps for large deployments.

  • For Indexer Clustering, replicating data to meet the replication factor (RF) and search factor (SF) consumes significant bandwidth (a rough estimate is sketched below).

  • If using Multi-site Clustering, ensure strong bandwidth between sites.
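
To get a feel for the numbers, the sketch below converts a daily ingest volume into sustained network throughput and adds the extra traffic generated by cluster replication. The ingest figure, replication factor, and peak-to-average ratio are assumptions, and real replication traffic depends on how well the raw data compresses.

# Back-of-the-envelope bandwidth estimate -- all figures are illustrative assumptions.
DAILY_INGEST_GB = 900
REPLICATION_FACTOR = 3        # each bucket is stored on RF peers in the cluster
PEAK_TO_AVERAGE = 3           # assume peak ingest runs ~3x the daily average

SECONDS_PER_DAY = 24 * 60 * 60

# Average rate at which raw data arrives at the indexing tier (megabits per second)
avg_ingest_mbps = DAILY_INGEST_GB * 8 * 1000 / SECONDS_PER_DAY

# Replication sends roughly (RF - 1) additional copies between indexers
replication_mbps = avg_ingest_mbps * (REPLICATION_FACTOR - 1)

total_avg_mbps = avg_ingest_mbps + replication_mbps
total_peak_mbps = total_avg_mbps * PEAK_TO_AVERAGE

print(f"Average ingest:      {avg_ingest_mbps:.0f} Mbps")
print(f"Replication traffic: {replication_mbps:.0f} Mbps")
print(f"Peak estimate:       {total_peak_mbps:.0f} Mbps")

Even at 900 GB/day, the peak estimate reaches several hundred Mbps before any search traffic is counted, which is why 1 Gbps is a floor and 10 Gbps is preferred for large or multi-site deployments.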

2. Resource Sizing

Sizing means figuring out how many and what type of servers you'll need based on your expected data volume, user activity, and architecture complexity.

a. Indexer Sizing

Rule of Thumb:
  • 1 indexer per 100–300 GB/day of incoming data.

The actual number depends on:

  • Type of data (some formats are heavier).

  • Search load (how often and how complex are the searches).

  • Whether you're using replication (each piece of data is stored multiple times).

Example:
  • If you expect to ingest 900 GB/day, plan for roughly 3 to 9 indexers (a quick calculation is sketched after this list).

  • Use more if:

    • Searches are complex.

    • You want faster response times.

    • You use higher replication factors.
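
A minimal sketch of the arithmetic behind this example, assuming the 100–300 GB/day-per-indexer rule of thumb above:

# Indexer-count sketch based on the 100-300 GB/day-per-indexer rule of thumb.
import math

DAILY_INGEST_GB = 900

GB_PER_INDEXER_LIGHT_LOAD = 300   # simple searches, low replication
GB_PER_INDEXER_HEAVY_LOAD = 100   # complex searches, higher replication factors

min_indexers = math.ceil(DAILY_INGEST_GB / GB_PER_INDEXER_LIGHT_LOAD)
max_indexers = math.ceil(DAILY_INGEST_GB / GB_PER_INDEXER_HEAVY_LOAD)

print(f"Plan for roughly {min_indexers} to {max_indexers} indexers")   # 3 to 9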

b. Search Head Sizing

Search Heads run searches, dashboards, alerts, and reports. Their size depends mostly on how many users are running simultaneous searches.

Rule of Thumb:
  • 1 Search Head per 8–10 concurrent users.

If you expect 30 users to run heavy searches at the same time, plan for 3–4 Search Heads (a similar calculation is sketched below).

For high availability and load balancing across multiple Search Heads, use a Search Head Cluster (SHC).
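
The same kind of arithmetic applies to search heads, assuming the 8–10 concurrent users per Search Head rule of thumb above:

# Search-head count sketch based on the 8-10 concurrent users per search head rule.
import math

CONCURRENT_USERS = 30

USERS_PER_SH_HEAVY = 8     # heavy dashboards and ad-hoc searching
USERS_PER_SH_LIGHT = 10    # lighter, mostly scheduled workloads

min_sh = math.ceil(CONCURRENT_USERS / USERS_PER_SH_LIGHT)   # 3
max_sh = math.ceil(CONCURRENT_USERS / USERS_PER_SH_HEAVY)   # 4

print(f"Plan for roughly {min_sh} to {max_sh} search heads")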

c. Cluster Master, Deployment Server (DS), and License Master

These components don't handle large amounts of data directly, so they don’t need heavy hardware in small or test environments.

Best Practice:
  • Use lightweight VMs (2–4 vCPU, 8–16 GB RAM) for:

    • Cluster Master (manages indexer cluster)

    • License Master (tracks data volume)

    • Deployment Server (manages forwarders)

However, in production environments, you should:

  • Add redundancy (high availability/failover).

  • Monitor these nodes for latency or failures.

  • Place them on separate machines from data-processing roles.

Infrastructure Planning: Resource Planning (Additional Content)

1. SmartStore and Its Impact on Resource Planning

SmartStore is a data storage architecture in Splunk that decouples indexer compute from long-term storage: hot buckets remain on the indexers' local disks, while warm and cold buckets live in remote object storage (e.g., AWS S3, GCP GCS, Azure Blob Storage) and are fetched into a local cache when searches need them.

Key Implications:
  • Reduced local disk pressure on indexers:

    • Because warm and cold buckets are offloaded to object storage, indexers only need local space for hot buckets and the cache, greatly reducing the volume of high-speed local storage required.

  • Increased network and throughput requirements:

    • Searches over data that is not already in the local cache must fetch buckets on demand from remote object storage.

    • This introduces potential latency and requires high-throughput network connectivity (often 10 Gbps+).

Best Practice:

When deploying SmartStore, increase network bandwidth provisioning and consider potential search latency impacts due to object storage read delays.

Why it matters:
SmartStore shifts the architectural bottleneck from disk I/O to network I/O, especially for long-term search workloads. Resource planning must account for this change.
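
To illustrate how this changes local disk planning, here is a minimal cache-sizing sketch. The ingest volume, compression ratio, indexer count, and the assumption that most searches look back about 30 days are all placeholders to adjust for your environment.

# SmartStore local-cache sizing sketch -- all figures are illustrative assumptions.
DAILY_INGEST_GB = 900
COMPRESSION_RATIO = 0.5        # rule of thumb: indexed size ~= 50% of raw
INDEXER_COUNT = 6
CACHE_RETENTION_DAYS = 30      # how far back most searches typically reach

# With SmartStore, local disk mainly holds hot buckets plus a cache of the
# warm buckets that recent searches have touched, not the full retention.
cluster_cache_gb = DAILY_INGEST_GB * COMPRESSION_RATIO * CACHE_RETENTION_DAYS
per_indexer_cache_gb = cluster_cache_gb / INDEXER_COUNT

print(f"Cluster-wide cache target:  {cluster_cache_gb / 1024:.1f} TB")
print(f"Per-indexer local storage:  {per_indexer_cache_gb:.0f} GB (plus hot buckets)")

Anything searched outside that cached window must be pulled over the network from the object store, which is exactly the bandwidth cost described above.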

2. Virtualization vs. Bare Metal for Core Roles

While Splunk supports running on virtual machines (VMs), careful consideration is needed for production deployments—especially for performance-sensitive components like indexers and search heads.

Comparison Overview:

  • Virtual Machines: suitable for test/dev and low-throughput environments. Flexible, but IOPS and CPU consistency are limited.

  • Bare Metal / High-IOPS Cloud Hosts: recommended for indexers and Search Heads in production. Provides consistent performance, especially under load.

Best Practice:

For production, avoid placing indexers and search heads on virtual machines unless I/O performance (IOPS) and CPU cycles are guaranteed by the underlying platform.

Why it matters:
Indexers perform continuous write operations and heavy disk I/O; Search Heads coordinate large distributed searches that are CPU- and memory-intensive. Virtualization overhead can undermine both performance consistency and reliability.

3. Monitoring Console as a Resource Planning Tool

The Monitoring Console (MC) is a built-in Splunk app used for real-time performance visibility across the entire deployment.

Key Roles:
  • Visualizes:

    • CPU usage, disk I/O, search concurrency, memory pressure
  • Alerts on:

    • Indexer or search head overload

    • Delays in scheduled search execution

  • Assists in:

    • Capacity planning

    • Identifying bottlenecks before they cause system impact

Best Practice:

Enable and configure the Monitoring Console immediately after deployment. Use it to regularly assess system health and make informed resource scaling decisions.
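
The Monitoring Console's dashboards are themselves built from searches over Splunk's internal indexes, and the same data can be pulled programmatically for periodic capacity reports. Below is a hedged sketch that runs a oneshot search against the _introspection index through the Splunk REST search API; the hostname, credentials, and exact field names are assumptions to verify against your own deployment.

# Hedged sketch: pull host-level resource usage from the _introspection index
# via the Splunk REST search API (POST /services/search/jobs, oneshot mode).
# The host, credentials, and field names below are assumptions -- verify them
# against your own deployment before relying on the output.
import requests

SPLUNK_MGMT = "https://splunk-mc.example.com:8089"   # management port (hypothetical host)
AUTH = ("admin", "changeme")                          # placeholder credentials

SPL = (
    "search index=_introspection sourcetype=splunk_resource_usage "
    "component=Hostwide earliest=-24h "
    "| stats avg(data.cpu_system_pct) AS cpu_sys avg(data.mem_used) AS mem_used by host"
)

resp = requests.post(
    f"{SPLUNK_MGMT}/services/search/jobs",
    data={"search": SPL, "exec_mode": "oneshot", "output_mode": "json"},
    auth=AUTH,
    verify=False,   # lab convenience only; keep TLS verification on in production
)
resp.raise_for_status()

for row in resp.json().get("results", []):
    print(row.get("host"), row.get("cpu_sys"), row.get("mem_used"))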

Why it matters:
Ongoing visibility into resource utilization is critical for identifying scaling needs, especially as data volume, user count, and search concurrency grow over time.

Frequently Asked Questions

What are the primary hardware considerations when sizing Splunk indexers?

Answer:

CPU capacity, memory, disk performance, and daily data ingestion volume.

Explanation:

Indexer sizing must account for the workload associated with indexing and searching data. The most important factors include:

  • CPU resources, which support parsing, indexing, and search execution

  • Memory, used for caching and search operations

  • Disk I/O performance, which directly affects indexing throughput and search speed

  • Daily data ingestion volume, which determines overall processing demand

For high-volume environments, administrators typically deploy multiple indexers in clusters to distribute the indexing workload. Proper sizing ensures that the system can handle current ingestion rates while allowing for future growth.

Demand Score: 90

Exam Relevance Score: 94

Why is disk I/O performance critical in Splunk deployments?

Answer:

Because indexing and searching operations rely heavily on disk read and write performance.

Explanation:

Splunk continuously writes incoming data to disk while simultaneously performing search operations that read indexed data. Poor disk performance can therefore create significant bottlenecks.

High-performance storage systems such as SSDs are often recommended because they provide:

  • faster indexing throughput

  • improved search performance

  • reduced latency for bucket operations

Organizations deploying large Splunk environments typically use dedicated storage arrays or high-performance SSD disks to support indexing workloads.

Demand Score: 82

Exam Relevance Score: 92

How does deploying Splunk Enterprise Security (ES) impact infrastructure sizing?

Answer:

It increases resource requirements due to additional correlation searches and security analytics.

Explanation:

Splunk Enterprise Security performs advanced analytics such as correlation searches, threat detection, and security event monitoring. These operations generate additional workloads for both search heads and indexers.

As a result, deployments running ES typically require:

  • more CPU resources for analytics processing

  • additional memory for search operations

  • increased storage capacity for security event data

Architects must account for these requirements when planning infrastructure for ES deployments to ensure that performance remains consistent under heavy workloads.

Demand Score: 71

Exam Relevance Score: 90

Why should architects plan for future data growth when designing Splunk infrastructure?

Answer:

Because data ingestion volumes typically increase over time.

Explanation:

Organizations often experience rapid growth in log data due to:

  • additional applications

  • increased infrastructure monitoring

  • expanded security logging requirements

If infrastructure is designed only for current workloads, the system may quickly become under-sized. Architects therefore include capacity buffers and scaling strategies such as:

  • adding additional indexers

  • expanding storage capacity

  • increasing cluster size

Planning for growth ensures that the Splunk deployment can continue operating efficiently as data volumes increase (a simple growth projection is sketched below).

Demand Score: 73

Exam Relevance Score: 91
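
As a simple illustration of the capacity-buffer idea in the answer above, here is a hedged sketch that projects ingest growth over a few years and flags when an assumed indexer pool would need to grow. The 30% growth rate, indexer count, and per-indexer capacity are hypothetical figures, not recommendations.

# Growth-projection sketch -- growth rate and capacity figures are assumptions.
import math

current_ingest_gb = 900              # today's daily ingest
annual_growth = 0.30                 # assume ~30% more data each year
indexer_count = 6                    # indexers deployed today
gb_per_indexer = 200                 # assumed sustainable daily load per indexer

for year in range(1, 6):
    ingest = current_ingest_gb * (1 + annual_growth) ** year
    needed = math.ceil(ingest / gb_per_indexer)
    flag = "  <-- add indexers" if needed > indexer_count else ""
    print(f"Year {year}: ~{ingest:,.0f} GB/day -> {needed} indexers{flag}")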
