These are the core ideas behind how the Avi Load Balancer works. Think of them as the "rules" that shaped how the system was built.
In traditional systems, everything is managed in one place, which can lead to bottlenecks and failures.
But Avi uses a distributed architecture, meaning:
The Control Plane (the brain) is separated from the Data Plane (the part that moves the traffic).
This separation has big advantages:
Better performance
Easier scaling
More reliability
Think of it like an airport:
The tower (control plane) gives instructions.
The planes (data plane) actually move passengers (data).
They work together, but do different jobs.
Avi is a software-defined load balancer.
This means:
It’s controlled entirely by software, not hardware appliances.
You can configure everything using APIs or a web interface.
For example:
Want to add a new virtual IP? Use an API call.
Want to scale out traffic? The controller can do that automatically.
This is part of the trend toward infrastructure as code.
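As a sketch of what "configure everything through an API" looks like in practice, the snippet below builds the JSON body for creating a virtual service. The endpoint path and field names follow Avi's REST style but should be treated as illustrative, not an exact schema:

```python
import json

# Hypothetical payload for creating a virtual service via the Controller's
# REST API (POST /api/virtualservice). Field names are illustrative.
def build_virtualservice_payload(name, vip, port, pool_ref):
    return {
        "name": name,
        "vip": [{"ip_address": {"addr": vip, "type": "V4"}}],
        "services": [{"port": port}],
        "pool_ref": pool_ref,
    }

payload = build_virtualservice_payload(
    "app-vs", "10.1.1.100", 443, "/api/pool?name=app-pool"
)
# The request itself would go out with any HTTP client, e.g.:
#   requests.post("https://controller/api/virtualservice", json=payload)
body = json.dumps(payload)
```

Because the whole configuration is expressed as data like this, it can be version-controlled and automated, which is exactly the infrastructure-as-code idea.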
Let’s say your application suddenly gets a lot of traffic.
In older systems, it might crash or slow down.
In Avi, the system can automatically add more capacity (more Service Engines) to handle the load.
And when traffic is low? Avi can remove the extra Service Engines again.
This is called elastic scaling, and it makes Avi very efficient.
Avi can run:
In your private data center
On VMware vSphere
In public clouds like:
AWS (Amazon)
Microsoft Azure
Google Cloud Platform (GCP)
This means you can have one system to manage all your application traffic — no matter where it's hosted.
Think of the Avi Controller as the brain of the load balancer.
Its jobs:
Manage configuration: Set up how things work.
Manage policies: Control security, behavior, routing.
Collect and analyze metrics: Track latency, errors, etc.
Create and manage Service Engines (we’ll explain these soon).
Provide API and UI interfaces: You interact with the system through the controller.
Important: the Controller never handles application traffic itself; data always flows through the Service Engines.
Now let’s look at the Service Engines (SEs).
These are the parts that actually handle traffic.
Their job is to:
Accept traffic from users (clients).
Load balance that traffic to backend servers.
Handle encryption (SSL/TLS), routing, compression, and more.
In simple terms:
Avi Controllers decide what should happen.
Service Engines do what the Controller says.
This separation makes the system:
More reliable
Easier to scale
Easier to manage
In this part, we will look more closely at the three most important parts of the Avi Load Balancer system:
Avi Controller – The central brain
Service Engines (SEs) – The workers that handle real traffic
Virtual Services (VS) – The services that users connect to
The Controller is a virtual machine (VM) that runs the control logic.
It is deployed as a cluster of 3 nodes (for high availability).
Think of the Controller as the central command center.
It handles:
Policy Management
Defines how traffic should be handled.
Example: "Send traffic from the US to Server Group A, and traffic from Europe to Server Group B."
Configuration Management
Stores the desired configuration and pushes it out to the Service Engines.
Lifecycle of SEs (Service Engines)
Creates, scales, upgrades, and removes Service Engines as needed.
Monitoring and Analytics
Collects performance data, health metrics, logs.
Shows graphs and alerts in the UI.
User Interfaces
You can manage Avi using:
A Web UI (very user-friendly)
A REST API (for automation and scripting)
Usually deployed in VMware vSphere or a public cloud.
Minimum of 3 nodes is recommended to ensure:
High availability (HA)
No single point of failure
Service Engines (SEs) are the components that actually handle the data traffic.
Lightweight VMs or containers.
Created and controlled by the Controller.
These are the "hands and feet" of the system.
SEs perform the core traffic functions:
Receive incoming requests
Make smart decisions
Deliver responses
Do extra work like:
SSL termination (decrypt HTTPS)
Compression
Caching
Traffic shaping
SEs can be deployed in different high availability (HA) setups:
| Mode | Description |
|---|---|
| Active/Standby | One SE is active; one is backup. If the active fails, the standby takes over. |
| Active/Active | Multiple SEs handle traffic at the same time (load is shared). |
| Elastic HA | A pool of SEs that can grow or shrink depending on the traffic needs. |
SEs are organized into SE Groups.
Each group can have its own:
Resource limits
Scaling rules
Fault domain settings
This is the interface that clients/users see and connect to.
A Virtual Service (VS) is a logical object that:
Has a virtual IP (VIP).
Accepts traffic from clients.
Forwards that traffic to one or more pools of backend servers.
You can think of it like a front door:
It listens on an IP and port (e.g., 10.1.1.100:443)
When traffic comes in, it decides where to send it.
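The "front door" idea can be sketched as a tiny lookup: a Virtual Service is keyed by its VIP and port, and an incoming connection is matched to it before being forwarded to a backend pool. All names and addresses below are hypothetical:

```python
# Minimal sketch of Virtual Services as "front doors": each one listens on
# a (VIP, port) pair and owns a pool of backend servers.
virtual_services = {
    ("10.1.1.100", 443): {"name": "web-vs", "pool": ["10.0.0.11", "10.0.0.12"]},
    ("10.1.1.101", 53):  {"name": "dns-vs", "pool": ["10.0.0.21"]},
}

def match_virtual_service(dst_ip, dst_port):
    """Return the VS that listens on this VIP:port, or None if nothing does."""
    return virtual_services.get((dst_ip, dst_port))

vs = match_virtual_service("10.1.1.100", 443)
```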
Pools
A pool is a group of backend servers.
Each pool has:
Server IPs
Ports
Health monitors
SSL Profiles
To handle HTTPS connections:
Certificates
Supported TLS versions
Cipher lists
Health Monitors
Periodic checks that verify each backend server is up and responding (covered in detail later).
Policies and Rules
Control traffic based on headers, source IP, cookies, etc.
For example:
Redirect HTTP to HTTPS
Block access from certain countries
Analytics Profiles
Control how much metric and log data is collected for the Virtual Service.
| Type | Use Case Example |
|---|---|
| L4 (TCP/UDP) | Load balancing raw TCP or UDP traffic (e.g., MySQL) |
| L7 (HTTP/HTTPS) | Load balancing web traffic with application awareness |
This section explains the technical features of Avi — how it handles different kinds of network traffic, how it decides which server to use, how it handles encryption, and how it checks if servers are working properly.
Avi supports multiple network protocols, which are like "languages" computers use to talk to each other.
These are low-level protocols used for moving raw data.
TCP (Transmission Control Protocol)
Reliable, ordered delivery
Used for things like HTTP, SSH, FTP, database connections
Avi can load balance TCP traffic directly.
UDP (User Datagram Protocol)
Faster, but less reliable than TCP
Used for streaming, DNS, VoIP
Avi can balance UDP traffic too.
FTP (File Transfer Protocol)
Used for transferring files between systems
Avi can balance FTP sessions, which require special handling because FTP uses separate control and data connections.
These are protocols for specific applications, like websites and secure connections.
HTTP (HyperText Transfer Protocol)
Basic web traffic (e.g., http://example.com)
Avi can read and act on HTTP headers and URLs.
HTTPS (HTTP Secure)
Encrypted web traffic (e.g., https://example.com)
Avi can terminate SSL, inspect the content, and forward it securely.
SSL Offloading
Avi can take over encryption and decryption so backend servers don't spend CPU on it.
Avi also supports modern and specialized protocols:
WebSockets
Used for live updates in web apps (e.g., chat, dashboards)
Keeps a connection open between client and server.
gRPC
A high-performance remote procedure call (RPC) system
Common in microservices environments.
HTTP/2
Faster and more efficient version of HTTP
Avi fully supports HTTP/2 with multiplexing.
DNS
Domain name lookups (e.g., resolving www.google.com to an IP)
Avi can be used to load balance DNS traffic.
SIP (Session Initiation Protocol)
Used in VoIP and video conferencing systems
Avi has basic support.
Avi can intelligently choose which backend server to send a request to, based on many possible strategies.
Here are the most common:
Round Robin
Sends each new request to the next server in a list.
Example: A → B → C → A → B...
Best for: When all servers are equally powerful.
Least Connections
Sends traffic to the server with the fewest active connections.
Best for: Long-lived connections like streaming or database access.
Least Load
Sends traffic to the server with the lowest CPU or memory usage.
Requires server metrics to be available.
Consistent Hash
Uses a unique value (like client IP or session ID) to choose the server.
Helps with session persistence — keeps the same client going to the same server.
Fastest Response
Sends traffic to the server that is currently answering quickest.
Best for: Optimizing user experience.
Custom (DataScripts)
You can write custom rules using Lua scripts.
Example: "Send traffic from Europe to Pool A, others to Pool B."
This gives you flexible, programmable traffic control.
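Two of the strategies above, round robin and hash-based selection, can be sketched in a few lines. This is a toy model of the selection logic, not Avi's implementation:

```python
import hashlib
import itertools

servers = ["A", "B", "C"]

# Round robin: cycle through the server list in order.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# Hash-based: a stable value (e.g. the client IP) always maps to the same
# server, which is what gives session persistence.
def pick_by_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

order = [round_robin() for _ in range(4)]                    # A, B, C, A
same = pick_by_hash("203.0.113.7") == pick_by_hash("203.0.113.7")
```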
Modern apps must secure their traffic. Avi provides powerful tools for managing SSL (also called TLS).
SSL Termination (Offloading)
Avi decrypts HTTPS traffic so backend servers don’t have to.
Improves performance of backend applications.
SSL Passthrough
Avi doesn’t decrypt the traffic — it passes it directly to the backend.
Used when full end-to-end encryption is needed.
End-to-End SSL (Re-encryption)
Avi decrypts and inspects the traffic, then re-encrypts it to the backend.
Best for: Security and visibility together.
Avi stores and manages SSL certificates:
Upload your own certs
Generate CSRs
Build certificate chains
Useful for HTTPS and SNI configurations.
Perfect Forward Secrecy (PFS)
A modern security feature: it prevents an attacker from decrypting past traffic even if they later obtain the certificate's private key.
Supported with modern TLS configurations.
TLS 1.3
The latest version of TLS — faster and more secure.
Avi supports TLS 1.3, including configuration of cipher suites.
SNI (Server Name Indication)
Allows one IP address to host multiple HTTPS websites with different SSL certs.
Avi uses SNI to direct traffic based on hostname (e.g., site1.com vs. site2.com).
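SNI-based selection boils down to a hostname-to-certificate lookup. The sketch below uses hypothetical hostnames and certificate names to show the idea:

```python
# Sketch of SNI-based certificate selection: the hostname the client sends
# in the TLS ClientHello picks the certificate. Names are hypothetical.
certs = {
    "site1.com": "cert-for-site1",
    "site2.com": "cert-for-site2",
}
DEFAULT_CERT = "wildcard-cert"   # fallback when no exact match exists

def select_cert(sni_hostname):
    return certs.get(sni_hostname, DEFAULT_CERT)
```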
Avi constantly checks if your backend servers are healthy and reachable.
ICMP (Ping): Is the server online?
TCP: Can a connection be opened?
HTTP/HTTPS: Can the server return a 200 OK page?
DNS: Is the DNS server responding?
LDAP/RADIUS: For directory or authentication server checks.
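The HTTP/HTTPS monitor above can be sketched as a simple classification rule: a server is "up" only if it answers within the monitor's timeout with a success status. The threshold values here are illustrative, not Avi defaults:

```python
# Sketch of how an HTTP health monitor might classify a backend server.
def classify_http_health(status_code, response_seconds, timeout_seconds=4.0):
    """Healthy only if the server answered in time with a 2xx response."""
    if response_seconds > timeout_seconds:
        return "down"          # no answer within the monitor's timeout
    if 200 <= status_code < 300:
        return "up"            # e.g. the 200 OK check described above
    return "down"              # 4xx/5xx counts as a failed check
```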
Avi watches real traffic and reacts if it sees:
Many timeouts
Error codes
Connection failures
Avi isn’t just a load balancer — it’s also an observability platform. That means it helps you see what’s happening in real time — in your apps, your network, your performance — and troubleshoot problems faster.
Avi gives deep visibility into your application’s performance and health. This means it can track every transaction, from the user’s device to your backend server.
Here’s what Avi can show you in real time:
Measures how long it takes for a user to get a response.
Avi breaks this into:
Client to SE latency (network delay)
SE processing time
SE to server latency (how fast the backend replies)
This helps you identify where the slowness is happening.
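The breakdown above is just a sum of three segments, and finding the bottleneck means finding the largest one. A minimal sketch (values in milliseconds, chosen for illustration):

```python
# End-to-end response time split into the three segments described above.
def total_latency(client_to_se, se_processing, se_to_server):
    return client_to_se + se_processing + se_to_server

def slowest_segment(client_to_se, se_processing, se_to_server):
    segments = {
        "client_to_se": client_to_se,
        "se_processing": se_processing,
        "se_to_server": se_to_server,
    }
    return max(segments, key=segments.get)

# Example: a slow backend dominates the total.
where = slowest_segment(12, 1, 85)
```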
Shows how much data is flowing through a Virtual Service.
Example: 500 Mbps of HTTPS traffic to app.mycompany.com.
Tracks application and connection errors, such as:
HTTP 4xx and 5xx errors
TCP resets
SSL handshake failures
You can drill down to find:
Which user had the error
Which server caused it
What time it happened
Tracks:
How many secure connections are made
How long the TLS handshakes take
What TLS versions and ciphers are being used
Avi can analyze HTTP headers, cookies, and URLs.
Useful for debugging:
Session cookies
Custom headers
User-agent strings (browser info)
These insights help you:
Troubleshoot faster
Identify performance issues
Understand user behavior
Improve reliability
Avi collects both logs and time-series metrics.
Every request can be logged live, in real time.
You can stream logs to:
Syslog servers
Kafka clusters
Elasticsearch (ELK stack)
vRealize Log Insight
You can search logs using:
IP address
URL path
HTTP status code
SSL cipher
Client country (Geo-IP)
The Controller UI includes visual dashboards showing:
Traffic graphs
Error trends
Top clients or URLs
Health scores over time
You can also build your own dashboards or export the data into tools like Grafana or Datadog.
Avi supports integration with:
Syslog for log forwarding
Kafka for large-scale log ingestion
Elasticsearch for log search and visualization
vRealize Log Insight for VMware-native environments
Avi includes built-in APM features — no third-party agents needed.
FlightPath is a special tool in the Avi UI that lets you trace a user's request through the entire path:
Client → VIP (Virtual IP)
VIP → SE (Service Engine)
SE → Backend Pool
Response from backend → SE → Client
You can see each hop, latency at each step, and identify where things go wrong — like:
500 error from backend
SSL handshake failure
DNS issue
High latency in network path
Each application gets a Health Score (0 to 100), calculated from:
Latency
Error rates
Server availability
SSL issues
Health scores let you:
See which apps are degraded
Spot issues before users complain
Set alerts (e.g., "alert me if score drops below 80")
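In the same spirit, a health score can be modeled as "start at 100 and subtract penalties." The weights below are made up for illustration; Avi's actual scoring formula is not shown in this document:

```python
# Illustrative 0-100 health score built from the inputs listed above.
def health_score(error_rate_pct, avg_latency_ms, servers_up, servers_total):
    score = 100.0
    score -= min(error_rate_pct * 2, 40)              # error-rate penalty
    score -= min(avg_latency_ms / 50, 20)             # latency penalty
    score -= 30 * (1 - servers_up / servers_total)    # availability penalty
    return max(round(score), 0)

def should_alert(score, threshold=80):
    """Matches the example alert rule: fire if the score drops below 80."""
    return score < threshold
```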
Avi automatically detects:
Sudden spikes in latency or errors
Unusual patterns in user behavior
Failing backend servers
It tries to explain the cause:
“X% of requests are failing due to TCP resets from backend pool X.”
“SSL errors increased after a certificate change.”
This saves time compared to manually checking logs.
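A toy version of spike detection makes the idea concrete: flag a sample that sits far above the recent baseline. Avi's real analysis is more sophisticated; the factor here is arbitrary:

```python
# Flag a sample that exceeds `factor` times the recent average.
def is_spike(history, sample, factor=3.0):
    baseline = sum(history) / len(history)
    return sample > factor * baseline

latencies = [20, 22, 19, 21, 20]   # ms, a steady baseline
```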
In this section, you’ll learn how Avi ensures reliability, handles failures, and scales automatically to meet traffic demand.
These capabilities are critical for running modern, always-on applications that serve real users.
The Controller manages:
Configurations
SE provisioning
Analytics
API/UI access
If it fails, your traffic will still flow, but you can’t make changes, collect analytics, or scale.
So you want the Controller to be highly available.
Avi recommends deploying 3 Controller nodes in an HA cluster.
These nodes share configuration and data.
If one fails, the others take over automatically.
They use a quorum-based decision system to avoid split-brain problems.
Quorum: The system needs a majority of nodes (2 out of 3) to be active to function.
No single point of failure
Seamless failover if one Controller goes down
High reliability for production environments
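The quorum rule above is a one-line majority check: with 3 nodes, at least 2 must be healthy for the cluster to keep functioning.

```python
# Quorum: writes and leader election require a strict majority of nodes.
def has_quorum(healthy_nodes, total_nodes=3):
    return healthy_nodes > total_nodes // 2
```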
Now let’s talk about Service Engines (SEs) — the traffic workers — and how they stay available.
Avi offers multiple SE HA modes to suit different application needs.
One SE handles all the traffic.
A second SE is on standby, watching the first.
If the active SE fails, the standby immediately takes over.
Pros: Simple, reliable
Cons: Half your resources are idle most of the time
Two or more SEs handle traffic together.
Load is distributed across them.
If one fails, the others pick up the load.
Pros: All SEs are used efficiently
Cons: Needs careful planning to avoid overloading
This is the most advanced and flexible mode.
You have N active SEs (handling traffic)
You have M spare SEs (available as backups)
If any active SE fails, the system automatically replaces it with a spare
Example:
N = 4 SEs handle traffic
M = 2 backup SEs
If 1 SE fails, a spare SE takes its place — no manual work needed
Pros:
Fully automated
Scalable
No downtime
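The N+M failover described above can be sketched as a pair of lists: active SEs serve traffic, and a spare is promoted whenever an active one fails. The SE names are hypothetical:

```python
# N+M buffer sketch: N = 4 active SEs, M = 2 spares.
active = ["se1", "se2", "se3", "se4"]
spares = ["se5", "se6"]

def handle_failure(failed_se):
    """Replace a failed active SE with a spare, if one is available."""
    active.remove(failed_se)
    if spares:
        active.append(spares.pop(0))

handle_failure("se2")   # se5 is promoted; capacity stays at N = 4
```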
Avi supports both manual and automatic scaling of Service Engines and traffic capacity.
Increase resources on an existing SE:
More CPUs
More memory
Useful if your app needs more power, but the number of connections doesn’t change much.
Add more SEs to a group.
Traffic gets split across more engines.
Best for:
Applications with growing traffic
Environments with spikes or unpredictable loads
Avi can scale SEs automatically, based on:
CPU usage
Memory usage
Network throughput
Number of connections
Here’s how it works:
If traffic increases and SEs are overloaded → Avi adds more SEs.
If traffic drops → Avi removes unneeded SEs.
Auto-scaling is:
Automatic
Policy-driven
Resource-efficient
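A policy-driven scale decision like the one described above can be sketched as a simple rule over the monitored metrics. The thresholds below are illustrative, not Avi defaults:

```python
# Decide whether an SE group should grow, shrink, or stay as-is.
def scale_decision(cpu_pct, conns, max_conns_per_se, se_count):
    if cpu_pct > 80 or conns > max_conns_per_se * se_count:
        return "scale_out"     # overloaded: add an SE
    if cpu_pct < 20 and se_count > 1:
        return "scale_in"      # idle: remove an unneeded SE
    return "steady"
```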
A single Virtual Service (VS) can be:
Handled by one SE (small apps)
Split across multiple SEs (large-scale apps)
This is called multi-SE VS placement.
Avi automatically handles:
Load distribution
Session persistence
Failover
One of the biggest strengths of the Avi Load Balancer (especially since VMware acquired Avi Networks) is how well it integrates with other VMware products.
This makes Avi a natural choice if you're already using VMware technologies like vSphere, NSX-T, or vRealize.
If you're running your infrastructure on VMware vSphere, Avi can seamlessly integrate with it.
Automatic SE deployment:
Avi Controller can talk to vCenter.
It automatically spins up Service Engines (SEs) as VMs in the correct cluster, datastore, and network.
VM lifecycle management: the Controller creates, resizes, and removes SE VMs through vCenter as demand changes.
No manual VM setup:
It reduces human error.
Saves time in scaling and operations.
Makes Avi feel like a native part of your vSphere environment.
NSX-T is VMware’s network virtualization and security platform. Avi integrates with it in several ways.
You use NSX-T for Layer 3 and Layer 4 (routing, firewall), and use Avi for Layer 7 (application-level load balancing and security).
North-South Load Balancing:
Avi can serve as the external load balancer in NSX-T environments.
Clients connect through NSX-T Tier-0 and Tier-1 gateways → traffic goes to Avi → Avi forwards it to apps.
L7 Application Awareness:
Avi can inspect and make decisions based on:
HTTP headers
Cookies
SSL information
Centralized Management:
NSX-T Manager can integrate Avi as a service.
You can orchestrate load balancer operations from within NSX workflows.
Offloads Layer 7 tasks from NSX-T
More flexible load balancing policies
Simplifies complex network setups
VMware’s vRealize Suite is used for operations, logging, automation, and analytics.
Avi integrates with several vRealize products:
Avi exports metrics like:
Traffic patterns
CPU/memory usage
Health scores
These can be visualized in custom dashboards inside vROps.
Avi can stream real-time logs to Log Insight.
Includes:
Access logs
System events
Health monitor alerts
This helps with:
Centralized logging
Troubleshooting
Compliance tracking
Automate the provisioning of:
Virtual Services
Pools
Certificates
Build self-service load balancing portals.
Let’s summarize the major advantages of these integrations:
| VMware Product | Avi Integration Benefit |
|---|---|
| vSphere/vCenter | Seamless SE deployment and VM lifecycle automation |
| NSX-T | Offloads Layer 7 load balancing; integrates with Tiered Gateway |
| vRealize Ops | Deep visibility into app and infrastructure metrics |
| vRealize Log Insight | Centralized log collection and analysis |
| vRealize Automation | Full automation of load balancer services for self-service |
Understanding how the Controller and Service Engines (SEs) communicate is crucial for secure deployment, troubleshooting, and network design.
HTTPS (TCP 443): Primary protocol for UI/API access to the Controller.
TCP 5054: Used internally for Controller ↔ SE communication, primarily for configuration sync, control messaging, and status updates.
UDP 123: For NTP (time synchronization).
UDP/TCP 53: DNS resolution for FQDN-based pool members or external lookups.
UDP 514 (Syslog): If integrated with external logging systems.
SEs do not require direct internet access unless:
You're deploying in a cloud environment (AWS/Azure) with public endpoints.
You're pulling external certificates, updates, or telemetry.
NAT traversal may be necessary in restricted environments. Ensure SEs can initiate outbound connectivity to Controllers.
Open necessary TCP/UDP ports between Controllers and SEs.
Allow east-west connectivity between Controller nodes (if clustered).
Ensure return paths for NAT if Controller is behind it.
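The port requirements above can be collected as data that a firewall-rule review could be scripted against. The entries simply restate the list above:

```python
# Controller <-> SE flows to allow, per the port list above.
required_flows = [
    {"proto": "TCP", "port": 443,  "purpose": "UI/API access to Controller"},
    {"proto": "TCP", "port": 5054, "purpose": "Controller <-> SE channel"},
    {"proto": "UDP", "port": 123,  "purpose": "NTP time sync"},
    {"proto": "UDP", "port": 53,   "purpose": "DNS resolution"},
    {"proto": "TCP", "port": 53,   "purpose": "DNS resolution (TCP fallback)"},
    {"proto": "UDP", "port": 514,  "purpose": "Syslog export"},
]

def ports_for(proto):
    return sorted(f["port"] for f in required_flows if f["proto"] == proto)
```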
This is a frequently tested scenario in both exams and real-world operations.
Data plane is unaffected:
SEs continue to handle and forward live traffic.
Existing Virtual Services (VSs) and SSL sessions remain active.
Control plane is affected:
No new configuration changes can be applied.
Analytics, scaling, object creation, and failover do not occur.
UI/API access becomes unavailable unless it's a multi-node cluster with quorum.
To ensure HA and data consistency, the Controller cluster operates using a distributed consensus mechanism.
A 3-node cluster is the recommended minimum.
Uses an algorithm similar to Raft for leader election and state replication.
One node becomes the leader, while others act as followers.
The cluster uses quorum voting (majority must be active).
2 out of 3 nodes must be healthy for:
Write operations
Leader election
If quorum is lost, the system enters read-only mode.
Resource planning ensures stable, scalable deployments that pass real-world tests and meet exam criteria.
CPU: 8–16 vCPUs (production)
RAM: 32–64 GB
Disk: 250–500 GB SSD recommended
Max concurrent connections: 500,000–2 million (based on SE size)
QPS: Up to 100,000 or more (depends on CPU & SSL offload)
Each SE Group can host multiple SEs, automatically scaled.
One Controller can manage:
~500 SEs
~10,000 Virtual Services (theoretical)
Recommended soft limits are lower for operational efficiency.
Allocate resources (VS count, analytics retention, SE groups) per tenant.
Use tenant-level role mapping to isolate operational responsibilities.
In the exam, scenario-based questions require business-to-architecture reasoning. Here's how to tackle them:
Understand the business need (e.g., multi-cloud, availability, compliance).
Translate it into architecture goals (HA mode, placement, scaling).
Select the right component mix (Controllers, SE Groups, Virtual Services).
Evaluate and eliminate the options that miss a stated requirement.
Scenario: The company wants to deploy Avi across AWS and on-prem vSphere.
Correct design: Use Hybrid Cloud architecture with separate SE Groups per cloud, and Controller access via public IP or VPN.
To troubleshoot and design Avi networks, it's essential to understand flow segmentation:
SE ↔ Controller (management network; client traffic never reaches the Controller)
Used only for:
Configuration sync
Health checks
Stats/analytics export
Client → VIP → SE → Pool Member → SE → Client
Controller is not in the data path.
If SSL termination is enabled:
Happens at the SE.
SE handles handshake and re-encrypts to backend if needed.
HTTP/2, WebSocket: SEs support these protocols natively.
TCP Proxy: Avi can perform full proxying (not just NAT).
SSL Termination: SE decrypts and analyzes traffic (if configured).
This topic often appears in "Why Avi?" exam questions or customer migration discussions.
| Aspect | VMware NSX ALB (Avi) | Traditional LB (e.g., F5) |
|---|---|---|
| Model | Software-defined | Hardware appliance |
| Deployment | Multi-cloud, elastic | Static, on-prem |
| Scaling | Automatic (N+M model) | Manual or license-based |
| Management | Central Controller, UI/API | Device-centric CLI/UI |
| Analytics | Real-time, per-app | Limited, add-on tools |
| Automation | Full REST API, SDK, Terraform | Basic scripting (iControl) |
| HA | Distributed SE model | Usually Active/Standby |
What are the primary components of VMware Avi Load Balancer architecture?
The primary components are the Controller cluster, Service Engines, and Avi Cloud integration layer.
The Controller cluster provides the control plane. It manages configuration, orchestration, analytics, and policy decisions. Controllers typically run in a 3-node cluster for high availability.
Service Engines (SEs) are the distributed data-plane load balancers. They process application traffic such as HTTP/HTTPS, TCP, and UDP flows.
The Cloud integration layer connects Avi to environments like VMware vSphere, NSX-T, Kubernetes, or public clouds. It enables automation of Service Engine deployment and scaling.
A common exam trap is confusing controllers with traffic processing. Controllers do not process application traffic — Service Engines do.
What is the role of a Service Engine in VMware Avi Load Balancer?
Service Engines act as the data-plane load balancers that process application traffic.
Service Engines are lightweight virtual machines deployed by the Avi Controller. They handle client connections, SSL termination, L7 routing, and health monitoring.
Unlike traditional load balancers where hardware appliances handle traffic, Avi distributes traffic across multiple Service Engines. This architecture allows elastic scaling by simply deploying additional SEs.
If traffic increases, the Controller automatically spins up more Service Engines or redistributes virtual services across existing ones.
A typical exam clue: if the question mentions processing traffic, SSL termination, or packet forwarding, the correct answer usually involves Service Engines rather than Controllers.
Why are Service Engine Groups used in Avi Load Balancer architecture?
Service Engine Groups allow administrators to organize and manage Service Engines with shared configuration and resource policies.
A Service Engine Group defines settings such as:
CPU and memory allocation
placement policies
HA mode
scaling limits
All Service Engines within a group inherit these parameters. This simplifies operational management when multiple applications or tenants require different load balancing policies.
For example, one SE Group may be configured for high-performance production workloads, while another may support development environments with fewer resources.
This grouping mechanism allows Avi to support multi-tenancy and environment segmentation efficiently.
Why does VMware Avi Load Balancer use a distributed architecture instead of a traditional appliance model?
Because a distributed architecture enables elastic scalability, resilience, and cloud automation.
Traditional load balancers rely on fixed hardware appliances with limited scaling capacity. In contrast, Avi separates the control plane (controllers) from the data plane (Service Engines).
This separation allows traffic processing capacity to scale horizontally. When demand increases, additional Service Engines can be deployed automatically.
This architecture also improves resilience. If a Service Engine fails, the Controller redistributes traffic to other active engines.
Another benefit is cloud integration, allowing load balancing services to be dynamically deployed in virtualized environments such as vSphere, Kubernetes, or public clouds.
Exam questions often test understanding of control plane vs data plane separation, which is central to Avi’s design.