2.4 Explain the key concepts of high availability for servers.

📘CompTIA Server+ (SK0-005)

High availability (HA) is about keeping servers and services running with minimal interruption, even when failures or maintenance happen. In IT environments, downtime can be very costly, so HA is designed to let users, applications, and other servers continue to function reliably.

One of the main ways to achieve HA is through clustering.

Clustering

A cluster is a group of servers (also called nodes) that work together as a single system to provide services. If one server fails, another can take over, minimizing downtime.

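The sections below cover the two main cluster types (active-active and active-passive), followed by the related concepts of failover, failback, patching, and heartbeats: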

1. Active-Active Cluster

In an active-active cluster, all servers are running and handling requests at the same time.
Each server in the cluster shares the workload, which improves performance as well as availability.
If one server fails, the other active servers continue to handle the workload without downtime, provided they have enough spare capacity to absorb the failed server's share.

Example in IT:

A web server farm with two servers running a website. Both servers process requests at the same time. If Server A fails, Server B continues serving all users without interruption.
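The request distribution described above can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular load balancer is implemented: the `ActiveActivePool` class, its methods, and the node names `server-a` and `server-b` are all hypothetical, and round-robin is assumed as the distribution policy.

```python
class ActiveActivePool:
    """Toy load balancer that spreads requests across all healthy nodes."""

    def __init__(self, nodes):
        self._order = list(nodes)                  # fixed rotation order
        self.healthy = {name: True for name in nodes}
        self._next = 0                             # index of the next node to try

    def mark_failed(self, name):
        """Take a node out of rotation after it fails."""
        self.healthy[name] = False

    def route(self):
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self._order)):
            name = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self.healthy[name]:
                return name
        raise RuntimeError("no healthy nodes available")


pool = ActiveActivePool(["server-a", "server-b"])
before = [pool.route() for _ in range(4)]   # both nodes share the load
pool.mark_failed("server-a")
after = [pool.route() for _ in range(4)]    # the survivor takes every request
print(before)
print(after)
```

While both nodes are healthy, the requests alternate between them. After `server-a` is marked as failed, every request goes to `server-b`, which is why the surviving node needs enough capacity for the full load.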

Pros:

Maximum resource usage (no idle servers)
High performance and load distribution

Cons:

More complex to set up and maintain
Requires good network and storage synchronization

2. Active-Passive Cluster

In an active-passive cluster, one server is active and handles all requests, while the other server(s) remain idle (passive) until the active server fails.
When the active server fails, the passive server takes over automatically. This process is called failover.

Example in IT:

A database server where Server A is active and handles queries. Server B is passive and only starts working if Server A fails.
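As a rough illustration of this arrangement, the sketch below models an active node, a list of standby nodes, and a promotion step. The `ActivePassiveCluster` class and its method names are assumptions made for this example and do not correspond to any real clustering product.

```python
class ActivePassiveCluster:
    """Toy model: one node serves requests, the others wait as standbys."""

    def __init__(self, active, standbys):
        self.active = active
        self.standbys = list(standbys)

    def handle(self, query):
        """Only the active node ever handles a request."""
        return f"{self.active} handled {query}"

    def fail_over(self):
        """Promote the first standby and return the name of the failed node."""
        if not self.standbys:
            raise RuntimeError("no standby available")
        failed = self.active
        self.active = self.standbys.pop(0)
        return failed


cluster = ActivePassiveCluster("server-a", ["server-b"])
first = cluster.handle("SELECT 1")     # served by the active node
failed = cluster.fail_over()           # server-a fails, server-b is promoted
second = cluster.handle("SELECT 1")    # now served by the former standby
print(first)
print(second)
```

Note that the standby does no work until `fail_over()` runs, which mirrors the resource cost of an active-passive design.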

Pros:

Simpler to set up than active-active
Failover ensures continuity

Cons:

Passive servers are idle most of the time (less efficient use of resources)
Slight delay during failover

3. Failover

Failover is the process where a standby (or surviving) server takes over services from a failed server.

It is usually automatic, but can sometimes be manual.
Critical for high availability.

Example in IT:

If a file server fails, a secondary server takes over serving files so users experience little or no downtime.

4. Failback

Failback is the process of returning services to the original server once it is repaired or restored.

Ensures that the system returns to its normal configuration after an issue is resolved.

Example in IT:

Server A failed and Server B took over. Once Server A is fixed, services can be moved back from Server B to Server A.
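The sequence in this example can be sketched as a small state model. Everything here is hypothetical and for illustration only: the `ServiceOwner` class, its attributes, and the rule that failback is refused until the original node is marked as repaired are assumptions of this sketch, not behavior of a specific product.

```python
class ServiceOwner:
    """Tracks which node hosts a service and which node is the preferred home."""

    def __init__(self, preferred, standby):
        self.preferred = preferred
        self.standby = standby
        self.current = preferred
        self.preferred_healthy = True

    def fail_over(self):
        """The preferred node failed: move the service to the standby."""
        self.preferred_healthy = False
        self.current = self.standby

    def mark_repaired(self):
        """Record that the preferred node has been fixed (service does not move yet)."""
        self.preferred_healthy = True

    def fail_back(self):
        """Return the service to the preferred node, but only once it is healthy."""
        if not self.preferred_healthy:
            raise RuntimeError("preferred node is not healthy yet")
        self.current = self.preferred


owner = ServiceOwner(preferred="server-a", standby="server-b")
owner.fail_over()
after_failover = owner.current      # running on the standby
owner.mark_repaired()
after_repair = owner.current        # repair alone does not move the service
owner.fail_back()
after_failback = owner.current      # back on the original server
print(after_failover, after_repair, after_failback)
```

Keeping repair and failback as separate steps reflects the wording above: services "can be moved back" once Server A is fixed, so the move is a deliberate action rather than a side effect of the repair.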

5. Proper Patching Procedures

In high availability setups, patching must be handled carefully to avoid downtime:

Patch one server at a time in a cluster.
Make sure other servers remain active to handle requests.
Verify functionality before patching the next server.

Example in IT:

In an active-passive database cluster, patch the passive server first and verify it. Then fail over services to it. Finally, patch the original active server.
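The one-node-at-a-time procedure can be sketched as a simple loop. This is a minimal illustration under stated assumptions: `rolling_patch`, `apply_patch`, and `verify` are hypothetical names, the version strings are made up, and a real patch run would also involve draining connections and failing over services.

```python
def rolling_patch(nodes, apply_patch, verify):
    """Patch nodes one at a time; stop if a patched node fails verification.

    Returns (patched_nodes, failed_node); failed_node is None on success.
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to patch without downtime")
    patched = []
    for node in nodes:
        # Only this node is out of service; the others keep handling requests.
        apply_patch(node)
        if not verify(node):
            return patched, node    # halt before touching the next node
        patched.append(node)
    return patched, None


# Hypothetical version table standing in for real servers.
versions = {"server-a": "1.0", "server-b": "1.0"}

def apply_patch(node):
    versions[node] = "1.1"

def verify(node):
    return versions[node] == "1.1"

# Passive node (server-b) first, then the original active node (server-a).
patched, failed = rolling_patch(["server-b", "server-a"], apply_patch, verify)
print(patched, failed)
```

Stopping at the first failed verification is the key design choice: a bad patch then affects only one node, and the unpatched nodes continue to serve requests.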

6. Heartbeat

A heartbeat is a regular signal exchanged between cluster nodes to check if servers are alive and healthy.

If a node's heartbeats stop arriving for longer than a configured timeout, the cluster assumes the node has failed and triggers failover.

Example in IT:

Two web servers in a cluster send heartbeat signals every few seconds. If Server A stops sending a heartbeat, Server B automatically takes over.
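Heartbeat-based failure detection can be sketched as a timeout check on the last signal received from each node. The `HeartbeatMonitor` class, the 5-second timeout, and the 2-second heartbeat interval below are assumptions chosen for illustration. Timestamps are passed in explicitly so the example is deterministic; a real monitor would read a clock.

```python
class HeartbeatMonitor:
    """Flags a node as failed when its last heartbeat is older than the timeout."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}          # node name -> time of most recent heartbeat

    def record(self, node, now):
        """Store the arrival time of a heartbeat from a node."""
        self.last_seen[node] = now

    def failed_nodes(self, now):
        """Return nodes whose most recent heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout_seconds=5)

# Both servers send heartbeats at t=0 and t=2.
for t in (0, 2):
    monitor.record("server-a", now=t)
    monitor.record("server-b", now=t)
early = monitor.failed_nodes(now=4)    # nothing has timed out yet

# server-a goes silent; server-b keeps sending every 2 seconds.
for t in (4, 6, 8):
    monitor.record("server-b", now=t)
late = monitor.failed_nodes(now=8)     # server-a was last seen at t=2

print(early)
print(late)
```

In a cluster, a non-empty result from a check like `failed_nodes` would be the trigger for failover to the surviving node.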

Summary Table

Concept	What It Means	IT Example
Active-Active	All servers active and share workload	Web server farm serving users simultaneously
Active-Passive	One server active, others idle until failover	Database primary server and standby server
Failover	Switch to a standby server when the primary fails (usually automatic)	Standby file server takes over if the main server fails
Failback	Return services to the repaired original server	Services moved back to original database server after repair
Proper Patching	Updating servers without causing downtime	Patch passive node first, fail over, then patch original node
Heartbeat	Regular signals between servers to monitor health	Web servers exchange heartbeat to detect failures

Exam Tip

Know the differences between active-active and active-passive clusters.
Understand failover vs. failback.
Remember the role of heartbeats in detecting server failure.
Be able to describe proper patching procedures in a clustered environment.