This is the first and most important step in designing any server solution. Before you choose any hardware or software, you need to fully understand what the customer needs, both technically and from a business perspective.
You must ask the right questions and listen carefully to your customer. You’re not just asking what technology they want, but why they need it.
These are non-technical goals that affect the server design:
Cost Optimization
Can the customer afford the solution?
Can we reduce total cost over time?
Think about CAPEX (upfront cost) and OPEX (ongoing costs).
Compliance Requirements
For example:
GDPR in Europe (data privacy)
HIPAA in the US (healthcare data)
You may need to store data within certain countries or use encryption.
Availability / Uptime
How much downtime is acceptable? (Usually, the answer is: very little!)
This affects choices like redundant power, failover, and high-availability clusters.
Security and Data Protection
Is data encrypted at rest and in transit?
Do they need backup and recovery?
Can we detect unauthorized access?
These are the actual performance and capacity needs of the system.
Compute Needs
Number of CPU cores, threads, and clock speed.
How much RAM is required?
Use for web servers, databases, etc.
Storage Requirements
Capacity: How much space do they need?
IOPS (Input/Output Operations per Second): How fast do they need to read/write data?
Throughput: How much data moves per second (MB/s or GB/s)?
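IOPS and throughput are linked by the I/O block size: the same number of operations per second moves far more data when each operation is larger. A minimal sketch (the figures are illustrative, not benchmarks):

```python
def throughput_mb_s(iops: float, block_size_kb: float) -> float:
    """Throughput (MB/s) implied by an IOPS figure at a given block size."""
    return iops * block_size_kb / 1024  # 1024 KB per MB

# Example: 20,000 IOPS at an 8 KB block size (common for OLTP databases)
print(throughput_mb_s(20_000, 8))   # 156.25 MB/s
# The same IOPS at a 64 KB block size moves much more data per second:
print(throughput_mb_s(20_000, 64))  # 1250.0 MB/s
```

This is why capacity planning should state the workload's block size alongside its IOPS target.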
Network Bandwidth
How fast does the server need to send/receive data?
1GbE is basic; 10GbE, 25GbE, or more for large workloads.
High Availability (HA) and Disaster Recovery (DR)
If one server fails, is another ready to take over?
DR means data and apps can be recovered if the whole site goes down.
Virtualization Support
Are they using VMware, Hyper-V, or KVM?
This affects hardware compatibility and sizing.
When planning a server solution, who you talk to matters. Not everyone has the same goals.
You need to talk to all the key people (called stakeholders):
IT Administrators
Know the current infrastructure and technical requirements.
Will be responsible for managing the servers.
Application Owners
Understand how their apps use resources (CPU, memory, I/O).
Will define how critical performance and uptime are.
Finance/Procurement Teams
Handle budgeting and purchasing.
Must understand the return on investment (ROI).
End Users
May not be technical, but their experience matters.
Example: “The system is slow” may point to IOPS or CPU issues.
As a designer, your job is to balance the needs of all these people.
Constraints are things that limit your options.
Rack space: Not all solutions fit in all racks.
Power: Does the building or data center have enough power?
Cooling: Do they have proper airflow and HVAC?
Budget Limitations: May force you to find the most cost-effective solution.
TCO (Total Cost of Ownership): What will the server cost over its lifetime (energy, support)?
ROI: Will the investment pay off by saving time, increasing productivity, or reducing errors?
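The TCO comparison behind these questions can be sketched in a few lines. The prices below are hypothetical, purely to show why the cheapest purchase is not always the cheapest owner:

```python
def tco(capex: float, annual_opex: float, years: int) -> float:
    """Total cost of ownership: upfront cost plus ongoing costs over the lifetime."""
    return capex + annual_opex * years

# Hypothetical comparison: a cheaper server with higher power/support costs
server_a = tco(capex=12_000, annual_opex=3_000, years=5)   # 27000
server_b = tco(capex=16_000, annual_opex=1_800, years=5)   # 25000
print(server_a, server_b)  # the pricier server is cheaper over 5 years
```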
Data residency: Must data be stored within a certain country?
Encryption: Is hardware or software encryption required?
Auditability: Are logs needed for compliance reviews?
Once you’ve gathered the customer’s requirements, the next step is to figure out how big and powerful the solution needs to be. This process is called sizing — and it includes performance planning, redundancy, and making sure the system can grow.
Before you choose any servers or storage, you must understand what kinds of workloads the customer runs. Each type of workload behaves differently.
Web Services
Lightweight but sensitive to latency.
Needs stable networking and low CPU overhead.
DBMS (Database Management Systems)
Heavy on I/O and memory.
Requires fast storage and sometimes HA clustering.
File/Print Servers
High storage needs.
Moderate CPU and memory.
Virtualization (VMware, Hyper-V, KVM)
Multiple virtual machines (VMs) share physical resources.
You must size for total combined needs: CPU, RAM, storage, bandwidth.
AI/ML and HPC (High Performance Computing)
Require GPUs or many CPU cores.
Need high memory bandwidth and parallel compute power.
To size a system properly, track these five key metrics:
CPU Utilization (%)
How heavily the processor is used.
Important for apps that do calculations or run scripts.
Memory Consumption (GB)
How much RAM the workload needs at peak usage.
Avoid oversubscription to prevent crashes or slowness.
Storage IOPS
IOPS = Input/Output Operations Per Second.
Crucial for databases and real-time applications.
Storage Latency and Throughput
Latency = time delay in storage responses (measured in ms or µs).
Throughput = volume of data transferred over time (MB/s or GB/s).
Network Throughput
Amount of data flowing through the NICs.
High-throughput apps need 10GbE or higher.
HPE provides special tools to help you size solutions accurately.
HPE ProLiant Sizer for VMware vSphere
Helps plan ProLiant server configurations for virtualized environments.
You enter the number of VMs and their workloads, and the tool recommends CPU, memory, storage, and networking configurations.
HPE Right Mix Advisor
Helps choose between on-prem and cloud workloads.
Evaluates performance, cost, and risk.
These are simple models or benchmarks to help estimate performance needs:
Rule of Thumb Sizing
Uses standard estimates like:
4 vCPUs per app server
8GB RAM per database
Good for quick estimates, but not precise.
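Rule-of-thumb sizing can be expressed as a simple aggregation. The per-workload figures below come from the text (4 vCPUs per app server, 8 GB RAM per database); the database vCPU count is an assumption for illustration — use the HPE sizer tools for real designs:

```python
# Rule-of-thumb estimates from the text; the database vCPU figure is assumed.
RULES = {
    "app_server": {"vcpus": 4, "ram_gb": 8},
    "database":   {"vcpus": 8, "ram_gb": 8},
}

def estimate(workloads: dict) -> dict:
    """Sum vCPU and RAM estimates for a mix of workloads."""
    total = {"vcpus": 0, "ram_gb": 0}
    for name, count in workloads.items():
        total["vcpus"] += RULES[name]["vcpus"] * count
        total["ram_gb"] += RULES[name]["ram_gb"] * count
    return total

print(estimate({"app_server": 10, "database": 2}))
# {'vcpus': 56, 'ram_gb': 96}
```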
Benchmark Standards
SPEC CPU: Measures processor performance.
PassMark: Common online CPU and memory benchmarks.
TPC (Transaction Processing Performance Council): For databases and transactional systems.
These benchmarks help compare different hardware options.
Your design should be resilient. That means it must keep working even if part of it fails. Here's how to plan for that.
N+1
One additional component (power supply, fan, server) is available as a backup.
Example: If you need 4 servers to run the system, have 5 (1 is a spare).
N+N
Fully mirrored systems — if one half fails, the other keeps working.
More expensive, but more reliable.
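The hardware cost of the two models can be sketched directly. This is a trivial illustration of the trade-off, not a sizing tool:

```python
def servers_needed(required: int, model: str) -> int:
    """Physical servers to deploy for a given redundancy model."""
    if model == "N+1":
        return required + 1   # one spare covers any single failure
    if model == "N+N":
        return required * 2   # full mirror: a whole second set of servers
    return required           # no redundancy

print(servers_needed(4, "N+1"))  # 5
print(servers_needed(4, "N+N"))  # 8
```

N+N doubles the hardware bill, which is why it is usually reserved for the most critical systems.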
Dual Power Supplies
Each server has two power inputs.
Connected to different power sources (PDU A and PDU B).
Dual Network Paths
Each server has two NICs connected to different switches.
Ensures network availability even if one switch fails.
RAID levels protect data against disk failure:
RAID 1 – Mirroring
RAID 5 – Striping with parity
RAID 6 – Double parity
RAID 10 – Mirror + Stripe
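Each RAID level trades usable capacity for protection differently. A minimal sketch, assuming equal-size disks:

```python
def usable_capacity(raid: str, disks: int, disk_tb: float) -> float:
    """Usable capacity (TB) for common RAID levels; assumes equal-size disks."""
    if raid == "RAID1":
        return disks * disk_tb / 2     # everything is mirrored
    if raid == "RAID5":
        return (disks - 1) * disk_tb   # one disk's worth of parity
    if raid == "RAID6":
        return (disks - 2) * disk_tb   # two disks' worth of parity
    if raid == "RAID10":
        return disks * disk_tb / 2     # mirrored pairs, then striped
    raise ValueError(f"unknown RAID level: {raid}")

for raid, n in [("RAID1", 2), ("RAID5", 5), ("RAID6", 6), ("RAID10", 8)]:
    print(raid, usable_capacity(raid, n, 2.0), "TB usable")
```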
Clustering solutions ensure services continue running even if a whole server fails:
HPE Serviceguard
VMware HA (High Availability)
Windows Failover Clustering
Once you've identified the requirements and planned the system size, it’s time to design the architecture — this means choosing the right servers, storage, and network configurations. This section helps you build a system that performs well, is scalable, and easy to maintain.
Choosing the right server form factor and features is critical for meeting the needs of your design.
Rack Servers
Example: HPE ProLiant DL380
Fit in standard data center racks (usually 1U or 2U).
Best for general-purpose computing and virtualization.
Great for medium to large deployments.
Tower Servers
Example: HPE ProLiant ML350
Look like desktop towers.
Quiet and easy to deploy in office environments.
Ideal for small offices or branch locations.
Blade Servers
Example: HPE BladeSystem
Installed into enclosures like HPE c7000.
Space-saving and power-efficient.
Ideal for data centers with high-density requirements.
Choose based on performance needs:
Intel Xeon or AMD EPYC processors.
Some servers support dual-socket or quad-socket CPUs for more cores.
Higher core count = better for virtualization or parallel processing tasks.
Calculate based on:
Number of virtual machines.
Database size.
Application memory usage.
Some servers support NVDIMM or NVDIMM-P:
Non-Volatile DIMMs — they keep data even after power loss.
Great for databases and mission-critical apps.
Allows you to add:
GPUs (for AI/ML)
Network cards (1/10/25/40/100GbE)
RAID controllers
Fibre Channel HBAs
Important for customizing the server to match application needs.
Hot-plug components (disks, PSUs, fans) can be replaced without shutting down the server.
Very useful in environments where uptime is important.
A good design includes enough storage capacity, performance, and redundancy.
SAS: Enterprise-grade, reliable, good for performance.
SATA: Lower cost, good for archive or low-IO workloads.
NVMe: Extremely fast; used for caching or high-performance apps.
M.2 drives: Small form factor SSDs often used for boot drives.
SAN (Storage Area Network):
High-performance block-level storage.
Uses Fibre Channel or iSCSI.
NAS (Network Attached Storage):
File-level storage accessed over TCP/IP.
Simpler to manage but usually slower than SAN.
HPE Primera and Nimble:
HPE’s enterprise-grade storage arrays.
Offer high availability, InfoSight integration, deduplication, and more.
vSAN (by VMware):
Uses local disks to create shared storage in a VMware cluster.
Cost-effective alternative to SAN.
HPE SimpliVity storage:
Built into HPE hyperconverged systems.
Includes backup and deduplication features.
Servers need strong, fast, and redundant network connections — especially for virtualization and storage access.
1GbE: Basic standard; fine for management or light workloads.
10GbE: Common in virtualization and storage networks.
25/40/100GbE: Used in high-performance computing, large-scale data transfers.
NIC Teaming / Bonding:
Combines two or more network interfaces for:
Increased bandwidth
Failover protection
LACP (Link Aggregation Control Protocol):
A standards-based protocol (IEEE 802.1AX, formerly 802.3ad) that negotiates link aggregation between the server and the switch.
Redundant Uplinks:
Connect uplinks to two different switches so a single switch failure does not cut off connectivity.
Virtual Connect:
A module in HPE BladeSystem or Synergy.
Simplifies network management by abstracting physical connections.
FlexFabric Modules:
Provide high-speed, converged networking (LAN + SAN).
Used in Synergy to manage compute and storage connectivity in one place.
Modern IT environments rely heavily on virtualization (running multiple virtual machines on a single physical server) and cloud services. Your server design must fully support these technologies — both for current use and future scalability.
A hypervisor is software that runs virtual machines (VMs) on a physical server. HPE servers must be compatible and optimized for the hypervisors your customer is using.
VMware ESXi
Industry standard for enterprise virtualization.
ProLiant servers are certified and optimized for VMware.
Supports vCenter, vMotion, vSAN, DRS, HA.
Microsoft Hyper-V
Integrated into Windows Server.
Great for Windows-centric environments.
Supports clustering, live migration, and SCVMM integration.
Linux KVM
Open-source, cost-effective.
Common in cloud and DevOps environments.
Nutanix (on HPE ProLiant DX Series)
Hyperconverged platform combining compute, storage, and virtualization.
Uses AHV (Acropolis Hypervisor) or VMware.
Preloaded on certified ProLiant DX servers.
Design Tip: Always check the hypervisor compatibility list for the chosen HPE server model (via VMware HCL, Microsoft HCL, etc.).
When designing for a virtualized environment, there are specific features and requirements you must plan for:
What it is: Move a VM from one physical host to another without downtime.
Why it's important: Ensures availability during maintenance or failure.
Automatically moves VMs between hosts for:
Load balancing
Energy savings (power off unused hosts)
If one host fails, affected VMs are automatically restarted on another host.
Requires shared storage or a hyperconverged setup like SimpliVity or vSAN.
Plan for:
vSwitch or Distributed vSwitch setup
VLAN tagging and isolation
NIC teaming (for redundancy)
Storage vMotion: Move a VM’s disk to a different datastore without downtime.
vVols (Virtual Volumes):
HPE storage (e.g., Nimble, Primera) supports vVols.
Allows per-VM storage management, better snapshots and backups.
A hybrid cloud uses a mix of on-premises infrastructure and public cloud. HPE offers several tools to help customers integrate with the cloud.
HPE GreenLake
Cloud experience delivered on-prem.
Pay-per-use pricing.
Scales like the cloud but runs in the customer’s data center.
Supports workloads like SAP HANA, VMware, VDI, AI/ML, etc.
HPE Cloud Volumes
Cloud-based block storage.
Replicate or back up data to the cloud (AWS, Azure, GCP).
Easily move volumes between cloud and on-prem.
Backup as a Service: Automatically back up virtual machines and files to the cloud.
Disaster Recovery as a Service: Spin up systems in the cloud if your site goes down.
Design Tip: When planning for hybrid cloud, you must consider:
Security and compliance for offsite data.
Bandwidth and latency.
Integration with identity and access control (e.g., Active Directory, IAM).
A well-designed server solution isn’t just about performance — it must be secure, easy to manage, and ready for automation. This section focuses on tools and technologies that help manage infrastructure and protect it from threats.
Efficient server management is key to reducing manual work, increasing uptime, and ensuring firmware compliance. HPE provides powerful tools to manage servers from setup to retirement.
HPE OneView
What is it?
Infrastructure management software that provides a single interface for provisioning, monitoring, and updating HPE servers, storage, and networking.
Key Features:
Role-Based Access Control (RBAC)
Assign roles like admin, operator, viewer.
Controls what users can see or change.
Template-Based Provisioning
Create Server Profiles that include:
BIOS settings
Network/storage connections
Firmware baseline
Apply the profile to any server to configure it automatically.
Reduces errors and speeds up deployment.
Firmware Compliance
Monitor and enforce that all servers meet the same firmware standard.
Detects outdated or inconsistent versions.
Monitoring and Alerts
Get real-time system health updates.
Integrates with tools like SNMP, email alerts, or SIEM systems.
HPE iLO (Integrated Lights-Out)
What is it?
A management processor embedded in every ProLiant server that allows remote monitoring and control, independent of the operating system.
Key Capabilities:
Remote Console
View and control the server remotely (like a virtual keyboard/mouse).
Access even if the OS isn’t running.
Secure Boot
Verifies firmware integrity during power-up.
Prevents boot if the firmware is tampered with.
Hardware Health Monitoring
Intelligent Power Management
Monitor and optimize power usage.
Helps with cooling planning and cost control.
Two-Factor Authentication and Directory Integration
Security is a major concern for all IT systems. HPE servers offer built-in security features that help protect data, firmware, and user access.
Silicon Root of Trust
A hardware-level fingerprint that verifies the integrity of server firmware at boot time.
If tampered firmware is detected, the system halts the boot process.
Prevents firmware-based attacks (e.g., ransomware injected into BIOS).
When a server is decommissioned or repurposed, you can perform a secure erase of drives.
Ensures that no residual data is left on HDDs, SSDs, or NVMe drives.
Complies with data protection regulations (e.g., GDPR).
Use directory-based accounts (e.g., Active Directory) or local iLO users.
Set password policies and expiration rules.
Enable two-factor authentication (2FA) for extra protection.
iLO and OneView can validate firmware signatures.
If an update fails or causes issues:
Rollback to the last known good firmware version.
Avoids server downtime due to bad updates.
Best Practices Summary:
| Area | Best Practice |
|---|---|
| Access Control | Use RBAC and directory integration (LDAP/AD) |
| Patch Management | Maintain firmware baseline across all servers |
| Physical Security | Rack locks, secure rooms |
| Network Security | Use separate VLANs for iLO and production traffic |
| Data Protection | Use secure erase before disposal or reuse of drives |
Even the best-designed server solution is incomplete without documentation. This step ensures that everything you've planned can be clearly communicated to technical teams, procurement departments, and support engineers.
This is a visual blueprint of your solution. It should include both logical and physical elements.
Shows how components interact.
Include:
Virtual machines and their roles (e.g., DB, web, app)
Network segmentation (management, production, storage)
Storage architecture (vSAN, SAN, NAS)
Shows actual server and cable placement in racks.
Include:
Server names and models
IP addresses
Cable runs (to switches, storage, power)
Rack elevation (server position in rack)
Numbering for PDU connections
A well-documented layout helps:
Avoid cabling errors
Speed up deployment
Support troubleshooting
A BOM lists everything needed to build and deploy your solution. This goes to procurement teams and ensures the project is fully scoped.
Server Models and SKUs
Add-ons
CPUs (exact models and quantity)
RAM (type, speed, quantity)
Storage (type: SSD/HDD, interface: SAS/SATA/NVMe)
Network cards or GPUs
RAID controllers
Power supplies (standard or redundant)
Rail kits and mounting kits
Software Licenses
Management & Support
iLO Advanced license
HPE Insight Remote Support
HPE InfoSight integration (for supported platforms)
Services
HPE Pointnext (installation, setup, and lifecycle services)
On-site support, 4-hour or next-business-day SLA
Tip: Every item in the BOM must include a part number (SKU) and description, so the procurement team can easily place orders.
A good design should also show the cost across the entire server lifecycle, not just the purchase price.
Upfront cost of:
Hardware
Licenses
Installation and setup
Ongoing costs like:
Electricity and cooling
Support contracts
Staff to manage servers
Licensing renewals
Example: VMware often licenses per CPU socket, so dual-CPU servers may cost more in licenses.
Microsoft Windows Server may be licensed per core.
Be aware of OEM licenses, subscription models, and bundled support.
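The two licensing models can be compared with a quick sketch. The prices are hypothetical list prices for illustration only, and the per-core sketch ignores real-world minimums (for example, Windows Server requires licensing at least 16 cores per server):

```python
def per_socket_license(sockets_per_server: int, servers: int,
                       price_per_socket: float) -> float:
    """Per-socket licensing (the model VMware has traditionally used)."""
    return sockets_per_server * servers * price_per_socket

def per_core_license(cores_per_server: int, servers: int,
                     price_per_core: float) -> float:
    """Per-core licensing (Windows Server model); real SKUs add minimums."""
    return cores_per_server * servers * price_per_core

# Hypothetical prices -- illustration only
print(per_socket_license(2, 4, 4_000))  # 32000 for 4 dual-socket hosts
print(per_core_license(32, 4, 100))     # 12800 for 4 hosts of 32 cores
```

Note how core-dense CPUs shift the balance: doubling cores per socket doubles per-core costs but leaves per-socket costs unchanged.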
| Deliverable | Purpose |
|---|---|
| Architecture Diagrams | Help IT and cabling teams understand how everything fits |
| BOM List | Helps procurement buy exactly the right parts |
| Cost Estimate | Helps decision-makers understand upfront and long-term costs |
| Security & Management Plan | Ensures the system is secure, compliant, and easy to run |
Even with the best planning tools, poor assumptions or oversight can lead to costly performance issues, downtime, or underutilized infrastructure. Below is a breakdown of frequent design errors, the impact, and how to avoid them.
Design does not include redundant power supplies or fails to consider rack-level cooling limits.
Unexpected shutdowns
Component damage due to thermal stress
Limited uptime during utility or HVAC failures
High risk of system failure during power interruptions or cooling degradation — especially in high-density deployments or hot climates.
Always design with dual PSUs, each connected to separate PDUs or UPS circuits (PDU A / B)
Use iLO or OneView to monitor power draw and thermal margins
Validate rack airflow direction (front-to-back) and avoid mixing airflow patterns
Factor in ASHRAE environmental standards (18°C–27°C temp, 40–60% humidity)
Sizing based on total vCPU or memory requirement, without considering oversubscription limits.
Slow VM performance
Contention for CPU or RAM
Host crashes under peak load
Performance bottlenecks in production environments, especially under load surges or during VM migrations.
Use realistic consolidation ratios (e.g., 4:1 vCPU to pCPU for general workloads, lower for DB or high IOPS systems)
Plan for headroom (20–30%) to accommodate spikes or failover
Avoid memory oversubscription unless using memory ballooning or swapping features wisely
Use HPE ProLiant Sizer for VMware vSphere to validate VM density and host capacity
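The consolidation-ratio-plus-headroom rule above can be sketched as a host-count estimate. The 4:1 ratio and 25% headroom are the illustrative figures from the text, not validated sizing data:

```python
import math

def hosts_needed(total_vcpus: int, pcores_per_host: int,
                 ratio: float = 4.0, headroom: float = 0.25) -> int:
    """Hosts required at a given vCPU:pCPU consolidation ratio,
    reserving spare capacity for spikes and failover."""
    effective_vcpus_per_host = pcores_per_host * ratio * (1 - headroom)
    return math.ceil(total_vcpus / effective_vcpus_per_host)

# 400 vCPUs on 32-core hosts at 4:1 with 25% headroom
print(hosts_needed(400, 32))  # 5
```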
Designing with too few NICs or ports, assuming minimal network throughput.
Backup jobs fail or run slowly
vMotion or iSCSI traffic congests production VLANs
No redundancy in case of NIC or switch failure
Traffic collisions, latency spikes, and single points of failure for key workloads (e.g., storage, backup, live migration)
Use dedicated NICs or vNICs (via FlexFabric or Virtual Connect) for:
Management
Backup
VM traffic
Storage (iSCSI/NFS/Fibre Channel over Ethernet)
Implement NIC teaming (Windows) or bonding (Linux) for redundancy and load balancing
Provision minimum 10GbE uplinks for virtualization clusters or storage-heavy apps
Sizing only for current needs, without capacity for growth or change.
Need to replace hardware after a short period
Project delays due to procurement or datacenter constraints
Wasted investment and higher TCO due to repeated upgrades or premature obsolescence.
Include a growth buffer (20–30%) for CPU, RAM, disk, and NIC usage
Use modular platforms (e.g., HPE Synergy, DL380 Gen11 with PCIe expansion) to allow scaling without forklift upgrades
Document scaling limits per rack, chassis, or platform
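The growth-buffer rule can be turned into a simple provisioning estimate. The growth rate and buffer below are illustrative assumptions:

```python
def with_growth(current: float, annual_growth: float, years: float,
                buffer: float = 0.25) -> float:
    """Capacity to provision: compound growth plus a design buffer."""
    return current * (1 + annual_growth) ** years * (1 + buffer)

# 40 TB today, growing 20% per year, sized for a 2-year horizon
print(round(with_growth(40, 0.20, 2), 1))  # 72.0 TB
```

Sizing only for today's 40 TB would leave the array full well before a typical refresh cycle.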
Selecting storage based on capacity only, not performance.
Slow database queries
High latency for VMs or apps
Long RAID rebuild times
I/O bottlenecks that hurt critical workloads, especially for databases or virtual desktops
Estimate IOPS and latency requirements per workload
Choose RAID levels wisely:
RAID 10 for write-heavy workloads
RAID 6 for large-capacity arrays
Consider NVMe SSDs or HPE Smart Cache for hot data acceleration
Use HPE SSA or Intelligent Provisioning to configure RAID with appropriate stripe size and cache policies
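The RAID choice matters because each level multiplies write I/O on the back end. A sketch using the classic RAID write-penalty figures:

```python
# Classic RAID write-penalty factors: back-end I/Os per logical write.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}

def backend_iops(read_iops: float, write_iops: float, raid: str) -> float:
    """Back-end disk IOPS the array must deliver for a front-end workload."""
    return read_iops + write_iops * WRITE_PENALTY[raid]

# The same 5,000-read / 5,000-write workload on two RAID levels:
print(backend_iops(5_000, 5_000, "RAID10"))  # 15000
print(backend_iops(5_000, 5_000, "RAID5"))   # 25000
```

This is why the text recommends RAID 10 for write-heavy workloads: at the same front-end load, RAID 5 demands far more from the disks.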
Choosing hardware or firmware without checking OS, hypervisor, or application compatibility.
Unsupported drivers
Firmware-induced crashes
Failed OS installs
Project delays, support calls, and possible security exposure
Check HPE Server QuickSpecs and VMware/Microsoft HCLs
Use SPP (Service Pack for ProLiant) for validated firmware bundles
Always test with a pilot system before scaling deployment
| Pitfall | Preventive Action |
|---|---|
| Power/cooling ignored | Dual PSUs, thermal design, iLO sensor checks |
| Virtualization overcommitment | Realistic VM-to-host ratio, capacity headroom |
| Too few NICs | Separate traffic types, enable teaming/bonding |
| No growth margin | Design for 12–24 months expansion |
| Storage design flawed | Match IOPS to RAID type, use SSD where needed |
| Compatibility issues | Cross-check firmware/OS/hypervisor matrix |
What is the minimum number of disks required for RAID 10 in an HPE server?
Four disks.
RAID 10 combines RAID 1 (mirroring) and RAID 0 (striping). Because data must first be mirrored and then striped across mirrored pairs, the minimum configuration requires two mirrored sets. Each mirror requires two disks, so at least four disks are necessary. RAID 10 is widely recommended for enterprise workloads because it offers strong read/write performance and high fault tolerance. Even if one disk fails in each mirrored pair, the array can continue operating. For exam scenarios, RAID 10 is typically selected when the workload requires high I/O performance and strong reliability, such as databases or virtualization environments.
Demand Score: 88
Exam Relevance Score: 96
Why might RAID 10 be preferred over RAID 5 for virtualization workloads?
Because RAID 10 provides better write performance and faster rebuild times.
Virtualization workloads generate many simultaneous disk writes from multiple virtual machines. RAID 5 requires parity calculations for every write operation, which increases latency and reduces write performance. RAID 10 does not use parity and instead mirrors data directly, allowing faster writes and more predictable performance. Additionally, RAID 10 rebuilds only involve copying data from the surviving mirror disk, while RAID 5 rebuilds require reconstructing parity across all disks. This makes RAID 10 faster and safer during disk failure scenarios. As a result, RAID 10 is often recommended for enterprise virtualization hosts and high-performance applications.
Demand Score: 86
Exam Relevance Score: 94
When designing an HPE server for virtualization workloads, what resource typically becomes the most critical sizing factor?
Memory (RAM).
Virtualization hosts run multiple virtual machines simultaneously, and each VM requires dedicated memory resources. While CPU resources can be overcommitted to some extent using hypervisor scheduling, memory overcommitment is more limited and can cause severe performance degradation if not managed carefully. Insufficient RAM leads to memory swapping or ballooning, which dramatically reduces VM performance. For this reason, server architects usually prioritize large memory capacities when designing virtualization servers. In certification scenarios, if a question describes a system hosting many virtual machines, selecting configurations with higher memory capacity is typically the correct design decision.
Demand Score: 84
Exam Relevance Score: 92
Why is it common practice to place the operating system on RAID 1 in HPE server deployments?
RAID 1 provides redundancy with minimal disk requirements and sufficient performance for OS workloads.
Operating systems generally require relatively small amounts of storage but must remain available for the server to boot and function. RAID 1 mirrors data across two disks, ensuring that the system can continue operating even if one disk fails. This configuration provides redundancy while using only two disks, making it cost-effective and simple to manage. Because operating systems typically generate moderate disk activity compared to application data, RAID 1 performance is usually sufficient. In enterprise server designs, RAID 1 is commonly used for OS partitions, while higher-performance RAID levels such as RAID 10 are used for application or database storage.
Demand Score: 80
Exam Relevance Score: 90
When designing storage for a database workload, which RAID level is most commonly recommended?
RAID 10.
Database workloads generate heavy random read and write operations. RAID 10 provides excellent performance because it stripes data across mirrored disks, enabling parallel reads and writes without the overhead of parity calculations. This results in lower latency and higher throughput compared to RAID 5 or RAID 6. Additionally, RAID 10 offers strong fault tolerance and faster rebuild times, which reduces the risk of performance degradation during disk failures. For mission-critical database systems, these advantages often outweigh the reduced usable storage capacity caused by mirroring. In certification exams, RAID 10 is frequently the preferred design choice for high-performance database environments.
Demand Score: 82
Exam Relevance Score: 93