This knowledge area covers the practical, hands-on skills required to deploy, configure, and operate a VMware-based infrastructure. Even in an operations-focused exam, you must understand not only what to configure but also why, and how the components work together.
Before installing ESXi or building a VMware platform, you must confirm:
Server model is supported
Check VMware’s Hardware Compatibility Guide (HCL/VCG).
NICs, storage controllers, and disks are supported
Incompatible components may cause boot failures, performance issues, or data loss.
Firmware and drivers
Must match VMware-supported versions.
Firmware mismatches can cause a PSOD (Purple Screen of Death) or general instability.
vSphere Lifecycle Manager (vLCM) can automate firmware compliance (depending on vendor add-ons).
Before deployment, verify:
VLAN availability
Management network
vMotion network
vSAN network
Storage network (NFS/iSCSI)
IP ranges
Each ESXi host needs multiple VMkernel IPs depending on features.
vCenter, NSX Managers, and Edge nodes need static IPs.
Routing
Management plane networks must be routable to vCenter.
NSX overlay transport network requires an underlay with proper routing.
MTU settings
vSAN, vMotion, and NSX overlay networks often require jumbo frames (MTU 9000).
Mismatched MTU causes packet drops and performance degradation; a quick validation sketch follows this list.
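A quick end-to-end check for jumbo frames is to ping a peer VMkernel IP with the don't-fragment flag and a payload sized for the target MTU. The sketch below is a minimal example, assuming it runs in an ESXi shell where vmkping is available; the interface name and peer address are placeholders.

```python
import subprocess

def check_jumbo_mtu(vmk_nic: str, peer_ip: str, mtu: int = 9000) -> bool:
    """Ping a peer VMkernel IP with a don't-fragment packet sized for the MTU.

    Payload = MTU - 20 (IP header) - 8 (ICMP header), i.e. 8972 for MTU 9000.
    Intended to run in an ESXi shell, where vmkping is available.
    """
    payload = mtu - 28
    cmd = ["vmkping", "-I", vmk_nic, "-d", "-s", str(payload), "-c", "3", peer_ip]
    return subprocess.run(cmd, capture_output=True, text=True).returncode == 0

# Hypothetical example: validate the vSAN VMkernel path to a peer host
if check_jumbo_mtu("vmk2", "192.168.30.12"):
    print("jumbo frames OK end to end")
else:
    print("MTU mismatch somewhere on the path")
```

If a default-size ping succeeds but this one fails, some device between the two VMkernel ports is not configured for MTU 9000.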
These foundational services must be correct and consistent:
DNS
Forward and reverse lookups must resolve for every node (a smoke-test sketch follows this checklist).
Incorrect DNS breaks vCenter installation and host joining.
NTP
Time must be synchronized across ESXi, vCenter, NSX, and other systems.
Time drift affects authentication, logging, and cluster stability.
Certificates
Required for secure communication between components.
vCenter uses the VMware Certificate Authority (VMCA) and can integrate with enterprise CAs.
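Forward and reverse DNS can be smoke-tested before deployment with nothing more than the Python standard library. This is a minimal sketch; the inventory names and addresses are placeholders for your own environment.

```python
import socket

# Hypothetical inventory: FQDN -> expected management IP
NODES = {
    "vcsa.lab.example.com": "10.0.10.5",
    "esx01.lab.example.com": "10.0.10.11",
}

def dns_ok(fqdn: str, expected_ip: str) -> bool:
    """Check the A record and the matching PTR record for one node."""
    try:
        forward = socket.gethostbyname(fqdn)
        reverse = socket.gethostbyaddr(expected_ip)[0]
    except (socket.gaierror, socket.herror):
        return False
    return forward == expected_ip and reverse.rstrip(".").lower() == fqdn.lower()

for fqdn, ip in NODES.items():
    print(f"{fqdn:<28} {'OK' if dns_ok(fqdn, ip) else 'MISMATCH'}")
```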
Common ESXi installation methods:
ISO installation — manual installation using physical or virtual media.
PXE boot / scripted install — scalable automated deployments.
Auto Deploy
Stateless ESXi
Hosts boot from network and load configuration profiles.
Ideal for large-scale environments.
Local installation
ESXi installed on local disk, SD card, or boot device.
Persistent configuration.
Stateless installation (Auto Deploy)
ESXi boots fresh each time; config delivered via Host Profiles.
Easier for mass configuration but requires reliable network boot.
vCenter Server Appliance (vCSA):
Deploy OVA/installer
Configure management IP
Specify storage size
Initialize SSO domain
Register ESXi hosts
Sizing depends on number of hosts/VMs:
Tiny / Small / Medium / Large profiles
Memory, vCPU, and disk size vary accordingly
In older versions, an external Platform Services Controller (PSC) supported multi-site deployments.
PSC functionality is now embedded in the appliance, simplifying the architecture.
vCenter can integrate with:
Active Directory (Integrated Windows Authentication)
LDAP directories
SAML/OIDC identity providers
MFA-enabled enterprise IdPs
This enables centralized RBAC and compliance.
Steps include:
Deploy NSX Manager nodes (1 or 3 for redundancy).
Form a management cluster.
Register NSX with vCenter.
Prepare ESXi hosts as transport nodes.
Deploy Edge nodes for routing, NAT, VPN, load balancing.
Steps:
Verify disk controller and disk compatibility.
Create disk groups (cache + capacity).
Enable vSAN on the cluster.
Run performance and health checks.
Apply storage policies and verify compliance.
Add hosts into vCenter inventory.
Apply licensing.
Set up host profiles for standardization.
A cluster enables:
HA (High Availability) — VM restart after host failure.
DRS — load balancing across hosts.
EVC (Enhanced vMotion Compatibility)
Ensures CPU compatibility across hosts.
Required for vMotion across different CPU generations.
Capture configuration from a reference host.
Apply profile to maintain consistency (especially useful with Auto Deploy).
VSS (vSphere Standard Switch)
Configured per-host.
Good for small environments.
VDS (vSphere Distributed Switch)
Centralized across hosts.
Required for advanced features (NSX, Network I/O Control).
Each ESXi host needs VMkernel ports for:
Management traffic
vMotion
vSAN
iSCSI/NFS storage
Fault Tolerance logging
Replication traffic (vSphere Replication)
Each VMkernel port may require specific VLANs, MTU, and NIC teaming.
NIC teaming provides redundancy and load balancing:
Failover order
Active / standby NIC configuration.
Load balancing policies
Route based on originating port
Route based on IP hash (requires a static port channel on the physical switch)
Route based on physical NIC load (LBT)
Proper teaming reduces the risk of outages; a conceptual sketch of IP-hash uplink selection follows.
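The "route based on IP hash" policy can be pictured as a deterministic hash of the source/destination pair onto the uplink set. The sketch below is a conceptual model only, not VMware's actual hashing code; it shows why a given flow always lands on the same uplink.

```python
import ipaddress

def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Pick an uplink index from a hash of the source/destination pair.

    Conceptual model only: the real vSwitch hash differs in detail, but the
    behavior is the same -- a given src/dst pair always maps to the same
    uplink, which is why the physical switch must bundle the uplinks into
    a single static port channel.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % uplink_count

# One VM talking to two peers can use two different uplinks:
print(ip_hash_uplink("10.0.0.5", "10.0.0.50", 2))  # -> 1
print(ip_hash_uplink("10.0.0.5", "10.0.0.51", 2))  # -> 0
```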
SAN (FC/iSCSI)
Zoning (FC)
Initiator/target configuration (iSCSI)
NAS (NFS)
Mount NFS shares as VM datastores.
Ensure proper MTU and multipathing.
Datastores are created using:
VMFS (block)
NFS (file)
vSAN (object-based)
Enable cluster-wide vSAN.
Create disk groups.
Define storage policies (RAID, FTT).
Validate health and baseline performance.
Define which hosts participate in overlay networks.
Map VDS uplinks to transport nodes.
NSX provides:
Segments (L2 networks)
T1 Gateways for distributed routing
T0 Gateways for uplink connectivity
Edge clusters for advanced services
Create base security rules: “deny-all” or “allow-required”.
Apply micro-segmentation using VM tags, groups, services.
Templates standardize VM creation.
Content libraries allow:
Versioned templates
ISO storage
Automatic synchronization across vCenters
Cloning creates identical VM copies.
Snapshots save VM state; they are temporary and not backups (a creation sketch follows this list).
Customization specs allow unique OS settings (hostname, IP, SID reset).
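For reference, creating a snapshot through the vSphere API looks roughly like the pyVmomi sketch below; the vCenter address, credentials, and VM name are placeholders, and quiesce=True asks VMware Tools for a file-system-consistent image.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; validate certs in production
si = SmartConnect(host="vcsa.lab.example.com",         # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "app01")   # placeholder VM name

# memory=False skips the RAM dump; quiesce=True asks VMware Tools to
# quiesce the guest file system for a consistent point-in-time image.
WaitForTask(vm.CreateSnapshot_Task(name="pre-patch",
                                   description="before monthly patching",
                                   memory=False, quiesce=True))
view.Destroy()
Disconnect(si)
```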
Administering virtual machines includes:
Power on/off/reset
VMware Tools upgrades
Guest OS patches
Monitoring performance
Create custom roles for least-privilege access.
Assign permissions at VM, folder, cluster, or datacenter level.
Use:
Active Directory
LDAP
SAML/OIDC providers
This supports enterprise RBAC and MFA.
Review logs for unusual access.
Track permission changes.
Ensure compliance with security standards.
Used when:
Patching hosts
Replacing hardware
Troubleshooting physical issues
DRS automatically evacuates running VMs (if enabled); an API-driven maintenance-mode sketch follows.
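Entering maintenance mode can also be driven through the vSphere API. Below is a minimal pyVmomi sketch with placeholder vCenter and host names; on a DRS cluster in fully automated mode the task completes only after the VMs have been evacuated.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcsa.lab.example.com",         # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx01.lab.example.com")

# timeout=0 waits indefinitely; on a DRS cluster in fully automated mode
# the task completes only once running VMs have been vMotioned away.
WaitForTask(host.EnterMaintenanceMode_Task(timeout=0))
print(f"{host.name} inMaintenanceMode={host.runtime.inMaintenanceMode}")
# host.ExitMaintenanceMode_Task(timeout=0) reverses the operation.

view.Destroy()
Disconnect(si)
```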
Performed through:
vSphere Lifecycle Manager
Vendor-specific firmware integration
Rolling maintenance cycles
Upgrade workflow:
Upgrade one host
Validate
Proceed to next host
Maintain cluster availability throughout
Back up:
VM images
vCenter configuration
NSX configuration
vSAN metadata (implicitly via cluster redundancy)
Verify:
File-level recovery
Full VM recovery
Application-consistent recovery
vCenter/NSX restore procedures
Defines:
ESXi version
Vendor drivers
Firmware levels (if vendor add-on supports it)
Image-based management ensures consistency across hosts.
Includes:
Pre-checks (compatibility, hardware health)
Staged/rolling updates
Remediation per-host or per-cluster
Must ensure compatibility between:
vCenter
ESXi
NSX
vSAN
Hardware firmware/drivers
VMware publishes a compatibility matrix.
Recommended order:
vCenter
ESXi hosts
NSX Manager / Edge nodes
vSAN components and disk format upgrades
Upgrading in the wrong order may cause cluster outages.
Verify:
DNS and NTP
Cluster health
vSAN health
Capacity
Certificates
Backups
Staged: Download updates first → apply later.
Immediate: Download and apply in one step.
If the upgrade fails:
Roll back snapshots (for management VMs)
Restore vCenter backups
Use ESXi bootbank rollback (boot the previous image from the alternate bootbank)
Revert NSX upgrades if supported
Securing ESXi hosts ensures the integrity of the virtual infrastructure and reduces attack surface. Hardening must balance security with operational usability.
Secure Boot configuration
Secure Boot validates all ESXi boot components, drivers, and kernel modules.
The firmware checks the signatures of ESXi binaries, ensuring only trusted images are loaded.
If unsigned or tampered modules are present, ESXi will refuse to boot.
Secure Boot must be supported by the server's UEFI firmware and enabled at both the firmware and ESXi levels.
Lockdown Mode (Disabled, Normal, Strict)
Lockdown Mode restricts direct access to ESXi hosts.
Disabled mode allows full access including SSH and DCUI.
Normal mode allows DCUI access for recovery but restricts SSH and direct host connections.
Strict mode disables DCUI entirely, permitting only vCenter-mediated access.
Lockdown Mode enforces controlled administrative boundaries and helps achieve compliance.
Host firewall rule management
The ESXi built-in firewall regulates access to management agents such as hostd, vpxa, and NTP services.
Administrators can modify allowed IP ranges or enable only the necessary rules.
Firewall misconfiguration can block vMotion, vSAN, or other operational functions, making rule management a critical part of hardening; an audit/toggle sketch follows.
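Rulesets can be audited and toggled through the host firewall API. This is a hedged pyVmomi sketch with placeholder connection details; ruleset keys such as sshServer can vary by build, so list before you toggle.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcsa.lab.example.com",         # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx01.lab.example.com")

fw = host.configManager.firewallSystem
for rs in fw.firewallInfo.ruleset:       # audit before toggling anything
    print(f"{rs.key:<24} enabled={rs.enabled}")

fw.DisableRuleset(id="sshServer")        # disable an unneeded ruleset

view.Destroy()
Disconnect(si)
```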
Certificate replacement workflow
ESXi uses machine certificates to authenticate with vCenter and other services.
Certificate replacement may be required for security compliance or expiration handling.
Replacement must be done using the ESXi certificate management tools or via vCenter’s certificate authority (VMCA) to avoid trust failures.
ESXi services lifecycle management (SSH, ESXi Shell, CIM)
SSH and ESXi Shell provide powerful troubleshooting capabilities but should remain disabled unless needed.
CIM services expose hardware monitoring; they should run only if required by monitoring tools.
Service lifecycle management keeps the attack surface minimal while preserving operational flexibility; a sketch that stops SSH follows.
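The same API surface exposes host services. A minimal sketch with placeholder credentials follows; the TSM-SSH and TSM service keys are those used for SSH and the ESXi Shell on current builds, so verify them against the listing before changing policies.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcsa.lab.example.com",         # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx01.lab.example.com")

svc = host.configManager.serviceSystem
for s in svc.serviceInfo.service:
    print(f"{s.key:<12} running={s.running} policy={s.policy}")

svc.StopService(id="TSM-SSH")                        # stop SSH now
svc.UpdateServicePolicy(id="TSM-SSH", policy="off")  # and keep it off at boot

view.Destroy()
Disconnect(si)
```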
Multipathing provides storage resiliency and improves throughput across storage fabrics.
Native Multipathing (NMP) vs vendor plug-ins
NMP is VMware’s default multipathing framework, supporting Storage Array Type Plug-ins (SATPs) and Path Selection Policies (PSPs).
Vendor plug-ins provide array-specific logic, advanced failover behavior, and optimized performance. Examples include PowerPath/VE or custom SATPs.
Path Selection Policies (Fixed, MRU, Round Robin)
Fixed policy uses a preferred path; failover occurs when it becomes unavailable.
MRU (Most Recently Used) uses the last active path but does not automatically revert to the original.
Round Robin distributes I/O across paths to improve performance, especially for active-active arrays; a sketch that inventories each device's current policy follows.
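The active PSP per device can be inventoried via pyVmomi, as in the sketch below (placeholder connection details). Policy strings such as VMW_PSP_RR, VMW_PSP_MRU, and VMW_PSP_FIXED are NMP's identifiers for the three policies above.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcsa.lab.example.com",         # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx01.lab.example.com")

for lun in host.config.storageDevice.multipathInfo.lun:
    # policy.policy carries NMP's PSP identifier, e.g. VMW_PSP_RR
    print(f"{lun.id}  psp={lun.policy.policy}  paths={len(lun.path)}")

view.Destroy()
Disconnect(si)
```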
Path failover behavior
Failover occurs when the active path experiences errors or becomes unreachable.
Failover timing depends on array response, SATP behavior, and detection mechanisms such as SCSI sense codes.
Storage I/O troubleshooting fundamentals
Key troubleshooting indicators include:
Latency (device, kernel, and guest levels)
Queue depth saturation
Path flapping events
Storage array controller performance
Tools such as vCenter performance charts and esxtop provide detailed insight.
DRS ensures balanced resource usage and respects placement policies across clusters.
Resource Pools operational guidelines
Resource pools should be used for organizational grouping or workload prioritization, not as folders.
Improper nesting and unbalanced reservations can lead to unexpected VM throttling.
CPU and memory reservation impact on cluster behavior
Reservations guarantee minimum resources but reduce overall flexible capacity.
Large reservations constrain DRS placement decisions and reduce HA admission capacity.
VM and Host affinity/anti-affinity considerations
Affinity rules keep VMs together; anti-affinity keeps redundant VMs apart.
Host affinity rules can restrict DRS flexibility and must be used carefully to avoid fragmentation.
Maintenance mode evacuation logic with DRS
When a host enters maintenance mode, DRS attempts to relocate all VMs while honoring reservations and affinity rules.
Misconfigured rules or insufficient resources can prevent full evacuation.
Proactive monitoring ensures continuous performance and availability.
Key performance metrics (CPU Ready, Co-Stop, Memory latency, Storage latency)
CPU Ready indicates CPU contention (a percentage-conversion sketch follows this list).
Co-Stop reflects scheduling delays for multi-vCPU VMs.
Memory latency signals memory pressure, ballooning, or compression.
Storage latency measures response time at device, kernel, and VM levels.
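vCenter's real-time charts report CPU Ready as a millisecond summation over a 20-second sample, so it is usually converted to a percentage before comparing against thresholds. The helper below follows the widely cited conversion formula; dividing by vCPU count is a common convention for judging per-vCPU contention.

```python
def cpu_ready_percent(ready_ms: float, interval_s: int = 20, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation value (milliseconds) to a percentage.

    interval_s: chart sample interval (20 s for vCenter real-time charts).
    vcpus: divide by vCPU count to judge per-vCPU contention.
    """
    return (ready_ms / (interval_s * 1000.0)) * 100.0 / vcpus

# 2000 ms of ready time in one 20 s sample on a 4-vCPU VM:
print(f"{cpu_ready_percent(2000, 20, 4):.1f}% per vCPU")  # 2.5%
```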
Custom alarm creation
Custom alarms allow alerting on thresholds such as unusual VM behavior, network drops, or storage anomalies.
Proper thresholds and scoping ensure actionable and non-noisy alerting.
vCenter health alarms behavior
vCenter monitors internal services and dependencies, including its database, certificates, and appliance health.
Health alarms help identify configuration drift and failures early.
Syslog and remote log collector integration
Centralized logging enables correlation across ESXi, vCenter, NSX, and storage systems.
Remote collectors support audit and compliance requirements and simplify troubleshooting.
Operational networking tasks in NSX support troubleshooting and visibility.
Traceflow
Traceflow injects synthetic packets through NSX to show the exact path and identify where packets are dropped or allowed.
Useful for validating firewall and routing behavior.
Port Mirroring
Port mirroring sends traffic copies to analyzer tools for packet inspection.
Supports both local and remote mirroring for deeper traffic visibility.
NSX Upgrade Coordinator workflows
Upgrade Coordinator automates upgrade sequencing across NSX components, including Manager clusters, Edge nodes, and transport nodes.
It validates dependencies, performs pre-checks, and orchestrates rolling upgrades.
Transport Node troubleshooting
Common issues include VTEP misconfiguration, MTU mismatches, and host preparation failures.
Transport Node status must be validated across NSX Manager, ESXi, and the physical network.
Distributed Firewall rule conflict analysis
Rules are processed top-down, and conflicts or shadowing may occur.
Administrators must analyze rule hit counts, section hierarchy, and group membership to identify conflicts; a conceptual shadow-detection sketch follows.
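Shadowing is easiest to reason about with a toy model of top-down evaluation. The sketch below is purely conceptual and is not the NSX API: a later rule is flagged when an earlier rule already matches a superset of its traffic.

```python
from dataclasses import dataclass

ANY = "any"

@dataclass
class Rule:
    name: str
    src: str       # group name, or ANY
    dst: str
    service: str   # e.g. "tcp/443", or ANY
    action: str    # "allow" / "drop"

def covers(broad: str, narrow: str) -> bool:
    return broad == ANY or broad == narrow

def find_shadowed(rules: list[Rule]) -> list[tuple[str, str]]:
    """Flag rules that can never match because an earlier rule already
    matches a superset of their traffic (simplified model)."""
    hits = []
    for i, later in enumerate(rules):
        for earlier in rules[:i]:
            if (covers(earlier.src, later.src)
                    and covers(earlier.dst, later.dst)
                    and covers(earlier.service, later.service)):
                hits.append((later.name, earlier.name))
                break
    return hits

rules = [
    Rule("block-web", ANY, "web-tier", "tcp/443", "drop"),
    Rule("allow-lb-to-web", "lb-tier", "web-tier", "tcp/443", "allow"),
]
for victim, culprit in find_shadowed(rules):
    print(f"rule '{victim}' is shadowed by '{culprit}'")  # the allow never fires
```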
vSAN operations revolve around storage health, data placement, and lifecycle management.
Disk replacement workflows
Replacing cache or capacity disks triggers object rebuilds.
Proper disk evacuation prevents data loss and ensures seamless replacement.
vSAN Object Repair Timer
The object repair timer (60 minutes by default) dictates when vSAN begins to repair absent objects after transient failures.
This prevents unnecessary rebuilds during short network or host outages.
Proactive rebalance
Rebalancing redistributes components across disk groups to reduce hot spots or asymmetric storage usage.
vSAN performance service dashboards
Dashboards provide insight into:
Disk group latency
IOPS per node and per object
Congestion metrics
These metrics guide capacity planning and troubleshooting.
vSAN Encryption and KMS integration
vSAN encrypts data at the cluster level, using KMS-managed keys.
Encryption is performed at the storage-device boundary, providing consistent behavior across nodes.
VCF introduces an additional layer of lifecycle and infrastructure management.
SDDC Manager password rotation workflows
Password rotation ensures all infrastructure credentials remain compliant with security policy.
SDDC Manager synchronizes password updates across vCenter, NSX, ESXi, and internal services.
Certificate rotation workflows
Certificates must be rotated periodically to maintain trust and security.
VCF orchestrates rotation across components while preserving service continuity.
Workload Domain creation and expansion
Workload Domains provide isolated compute and lifecycle boundaries.
Expansion adds hosts or creates new clusters based on network pool and storage configuration.
VCF LCM error handling and log collection
Lifecycle Manager logs and bring-up logs provide insight into upgrade failures, version mismatches, or configuration drift.
Proper log collection speeds troubleshooting and reduces remediation time.
Network Pool configuration and updates
Network Pools define the VLANs and IP ranges used for host TEPs, Edge Node connectivity, and host commissioning.
Updating Network Pools requires careful dependency consideration.
VCF system backup and restore
System backups include SDDC Manager configuration, vCenter Server, NSX Manager cluster, and vSAN metadata (implicitly).
Restore procedures must follow strict sequencing to prevent inconsistencies.
Reliable backup and restore strategies ensure recoverability across platform and application layers.
vCenter restore sequencing (Stage 1 and Stage 2)
Stage 1 deploys the vCenter appliance with basic configuration.
Stage 2 restores data from the backup into the appliance.
Incorrect sequencing leads to partial or failed recovery.
NSX Manager cluster restore requirements
All nodes must be restored from backups taken at the same time.
Federation and Edge clusters require particular ordering and validations to avoid inconsistent state.
Application-consistent vs crash-consistent restore selection
Application-consistent restores require in-guest quiescing and ensure transactional integrity.
Crash-consistent restores behave like power-on after power loss and may be acceptable for stateless workloads.
Snapshot chain management and consolidation tasks
Long snapshot chains degrade performance and risk corruption.
Consolidation merges snapshot deltas back into the base disk.
Troubleshooting consolidation failures requires reviewing disk locks, stale snapshot state, and storage latency; a sketch that consolidates flagged VMs follows.
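The vSphere API flags VMs that need consolidation via runtime.consolidationNeeded. Below is a minimal pyVmomi sketch with placeholder credentials that consolidates every flagged VM.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcsa.lab.example.com",         # placeholder vCenter
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)

for vm in view.view:
    # consolidationNeeded is set when snapshot deltas were removed from the
    # snapshot manager but never merged back into the base disks.
    if vm.runtime.consolidationNeeded:
        print(f"consolidating {vm.name} ...")
        WaitForTask(vm.ConsolidateVMDisks_Task())

view.Destroy()
Disconnect(si)
```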
What prerequisite validation does Cloud Builder perform before deploying the VCF management domain?
Cloud Builder validates network configuration, host compatibility, DNS resolution, and NTP synchronization.
Before initiating deployment, Cloud Builder performs a comprehensive validation of the environment to ensure the VCF stack can be successfully deployed. These checks verify that ESXi hosts meet hardware compatibility requirements and that network settings such as VLANs, IP pools, and gateways are correctly configured. It also validates external dependencies including DNS and NTP because these services are required for communication between deployed components. If any prerequisite fails validation, the deployment process stops before infrastructure is created. This validation stage prevents partially deployed environments that could be difficult to troubleshoot later.
What process is used to add new ESXi hosts to an existing workload domain in VMware Cloud Foundation?
Hosts must first be commissioned in VCF Operations management and then assigned to the workload domain.
Host commissioning is the process of preparing ESXi hosts so they can be used within VMware Cloud Foundation. Administrators add hosts to the VCF Operations management inventory where the system validates compatibility, networking configuration, and firmware levels. Once commissioned, the hosts can be assigned to workload domains where they join clusters managed by vCenter. This automated process ensures that hosts meet VCF requirements before becoming part of production clusters. Directly adding hosts to vCenter without commissioning bypasses VCF lifecycle management and is not supported. Commissioning ensures the platform can properly manage updates and lifecycle operations for those hosts.
Why should administrators perform lifecycle updates through VCF Operations instead of updating components individually?
Because VCF Operations ensures version compatibility and orchestrates the correct upgrade sequence across the entire stack.
Updating individual components such as vCenter, NSX, or ESXi outside of VCF lifecycle workflows can introduce version mismatches that break compatibility within the software-defined data center stack. VCF Operations maintains a validated bill of materials for the platform and orchestrates upgrades in the correct order. The system also performs pre-checks to verify cluster health, resource availability, and configuration compliance before beginning upgrades. This automation significantly reduces the risk of operational failures during maintenance windows. In enterprise environments, using the centralized lifecycle management system is considered best practice and ensures that the infrastructure remains supported by VMware.
What is the purpose of the management domain in VMware Cloud Foundation?
The management domain hosts the infrastructure components required to operate and manage the VCF platform.
The management domain is the first domain deployed during the VCF bring-up process. It contains core services such as vCenter Server, NSX management components, and VCF Operations management tools. These services are responsible for monitoring, lifecycle management, and orchestration of the entire environment. By isolating management components in a dedicated domain, VMware ensures operational stability and reduces the risk that application workloads could interfere with platform management services. Workload domains are then created to host tenant or application workloads. This architecture improves reliability and simplifies operational management of the software-defined data center.
What happens if pre-checks fail during a lifecycle upgrade in VMware Cloud Foundation?
The upgrade process halts and reports the failed validation so administrators can resolve the issue before continuing.
Lifecycle upgrades in VCF include automated validation checks to ensure the environment is ready for upgrades. These pre-checks evaluate cluster health, host connectivity, resource availability, and component compatibility. If a validation fails, the upgrade workflow stops and generates an error message identifying the issue. Administrators must correct the problem—such as resolving host connectivity issues or freeing cluster resources—before reattempting the upgrade. This safeguard prevents upgrades from proceeding in unstable environments, which could cause service outages or corrupted infrastructure states.
Why are cluster health checks important before performing infrastructure changes in VCF?
Cluster health checks ensure the infrastructure is stable and capable of handling maintenance operations without service disruption.
Before performing upgrades, host maintenance, or configuration changes, administrators must verify that clusters are healthy and have sufficient capacity. Health checks examine factors such as host connectivity, datastore accessibility, network stability, and vSAN status. If clusters are already experiencing degraded performance or hardware failures, performing additional maintenance could worsen the situation and potentially cause outages. VCF lifecycle workflows include automated health validations for this reason. Ensuring a healthy cluster state helps maintain workload availability during maintenance windows and reduces operational risk.