2.4 Explain the key concepts of high availability for servers.
📘CompTIA Server+ (SK0-005)
High availability (HA) is about keeping servers and services running with minimal interruption, even when failures or maintenance occur. In IT environments, downtime can be very costly, so HA is designed to let users, applications, and other servers continue to function reliably.
One of the main ways to achieve HA is through clustering.
Clustering
A cluster is a group of servers (also called nodes) that work together as a single system to provide services. If one server fails, another can take over, minimizing downtime.
There are different types of clustering setups:
1. Active-Active Cluster
- In an active-active cluster, all servers are running and handling requests at the same time.
- Each server in the cluster shares the workload, which improves performance as well as availability.
- If one server fails, the remaining active servers absorb its share of the workload, provided they have enough spare capacity.
Example in IT:
- A web server farm with two servers running a website. Both servers process requests at the same time. If Server A fails, Server B continues serving all users without interruption.
Pros:
- Maximum resource usage (no idle servers)
- High performance and load distribution
Cons:
- More complex to set up and maintain
- Requires good network and storage synchronization
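The load-sharing behavior of an active-active cluster can be illustrated with a minimal round-robin sketch. This is a simplified model, not how any particular load balancer works; the `distribute` function and the node names `server-a` and `server-b` are hypothetical and chosen only for illustration.

```python
from itertools import cycle

def distribute(requests, nodes, failed=frozenset()):
    """Assign each request to the next healthy node in round-robin order."""
    healthy = [n for n in nodes if n not in failed]
    if not healthy:
        raise RuntimeError("no healthy nodes left in the cluster")
    rotation = cycle(healthy)
    return {req: next(rotation) for req in requests}

requests = ["r1", "r2", "r3", "r4"]
nodes = ["server-a", "server-b"]

# Both nodes healthy: requests alternate between them.
normal = distribute(requests, nodes)

# server-a has failed: server-b handles every request on its own.
degraded = distribute(requests, nodes, failed={"server-a"})
```

In the degraded case the surviving node carries the full load, which is why capacity planning matters in active-active designs.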
2. Active-Passive Cluster
- In an active-passive cluster, one server is active and handles all requests, while the other server(s) remain idle (passive) until the active server fails.
- When the active server fails, the passive server takes over automatically. This process is called failover.
Example in IT:
- A database server where Server A is active and handles queries. Server B is passive and only starts working if Server A fails.
Pros:
- Simpler to set up than active-active
- Failover ensures continuity
Cons:
- Passive servers are idle most of the time (less efficient use of resources)
- Slight delay during failover
3. Failover
Failover is the process where a standby server takes over services from a failed server.
- It is usually automatic, but can sometimes be manual.
- Critical for high availability.
Example in IT:
- If a file server fails, a secondary server immediately begins serving files so users don’t experience downtime.
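As a rough illustration of failover in a two-node active-passive pair, the sketch below swaps roles when the active node is reported as failed. The class `ActivePassiveCluster` and its method names are assumptions made for this example; real cluster software also handles storage, IP addresses, and service state.

```python
class ActivePassiveCluster:
    """Tracks which of two nodes currently serves requests."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.healthy = {primary: True, standby: True}

    def report_failure(self, node):
        """Record a node failure; fail over if the failed node was active."""
        self.healthy[node] = False
        if node == self.active:
            self._failover()

    def _failover(self):
        """Promote the standby node, but only if it is healthy."""
        if not self.healthy[self.standby]:
            raise RuntimeError("no healthy standby available")
        self.active, self.standby = self.standby, self.active

cluster = ActivePassiveCluster("server-a", "server-b")
cluster.report_failure("server-a")  # server-b is promoted to active
```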
4. Failback
Failback is the process of returning services to the original server once it is repaired or restored.
- Ensures that the system returns to its normal configuration after an issue is resolved.
Example in IT:
- Server A failed and Server B took over. Once Server A is fixed, services can be moved back from Server B to Server A.
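Failback can be sketched as a guarded role swap: services return to the original server only once it is healthy again. The `failback` function and the layout of the `state` dictionary are hypothetical, used here only to make the rule concrete.

```python
def failback(state, original):
    """Return services to `original` if it is healthy and not already active."""
    if not state["healthy"].get(original, False):
        return state  # original is still under repair; keep the stand-in active
    if state["active"] == original:
        return state  # nothing to do
    return {**state, "active": original, "standby": state["active"]}

# After a failover: server-b is active, server-a is still broken.
state = {
    "active": "server-b",
    "standby": "server-a",
    "healthy": {"server-a": False, "server-b": True},
}
unchanged = failback(state, "server-a")

# Once server-a is repaired, services can move back to it.
repaired = {**state, "healthy": {"server-a": True, "server-b": True}}
restored = failback(repaired, "server-a")
```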
5. Proper Patching Procedures
In high availability setups, patching must be handled carefully to avoid downtime:
- Patch one server at a time in a cluster.
- Make sure other servers remain active to handle requests.
- Verify functionality before patching the next server.
Example in IT:
- In an active-passive database cluster, patch the passive server first. Then fail over services to it. Finally, patch the original active server.
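The one-node-at-a-time procedure can be sketched as a loop that patches a node, verifies it, and stops if verification fails. The `rolling_patch` function and its `patch` and `verify` callbacks are placeholders assumed for this example; in practice they would wrap whatever patching and health-check tooling the environment uses.

```python
def rolling_patch(nodes, patch, verify):
    """Patch nodes one at a time, verifying each before moving on."""
    patched = []
    for node in nodes:
        # At least one other node must stay in service while this one is patched.
        in_service = [n for n in nodes if n != node]
        if not in_service:
            raise RuntimeError("cannot patch: no other node left to serve requests")
        patch(node)
        if not verify(node):
            raise RuntimeError(f"{node} failed verification; stopping rollout")
        patched.append(node)
    return patched

log = []
done = rolling_patch(
    ["server-b", "server-a"],  # passive node first, then the original active node
    patch=lambda node: log.append(f"patched {node}"),
    verify=lambda node: True,  # stand-in for a real functionality check
)
```

Stopping at the first failed verification keeps a bad patch from reaching every node in the cluster.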
6. Heartbeat
A heartbeat is a regular signal exchanged between cluster nodes to check if servers are alive and healthy.
- If a node stops sending heartbeats for longer than the configured timeout, the cluster assumes it has failed and triggers failover.
Example in IT:
- Two web servers in a cluster send heartbeat signals every few seconds. If Server A stops sending a heartbeat, Server B automatically takes over.
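Heartbeat-based detection comes down to comparing the time since each node's last heartbeat against a timeout. The sketch below assumes a 2-second interval and a limit of 3 missed heartbeats; both values, and the `failed_nodes` function itself, are illustrative choices rather than settings from any specific cluster product.

```python
def failed_nodes(last_seen, now, interval=2.0, missed_limit=3):
    """Return nodes whose last heartbeat is older than the allowed timeout."""
    timeout = interval * missed_limit  # 6 seconds with the defaults above
    return sorted(node for node, ts in last_seen.items() if now - ts > timeout)

# Timestamps (in seconds) of the most recent heartbeat from each node.
last_seen = {"server-a": 100.0, "server-b": 109.0}

# At t=110, server-a has been silent for 10 s and server-b for 1 s.
down = failed_nodes(last_seen, now=110.0)
```

Requiring several missed heartbeats before declaring a failure helps avoid an unnecessary failover caused by a single delayed packet.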
Summary Table
| Concept | What It Means | IT Example |
|---|---|---|
| Active-Active | All servers active and share workload | Web server farm serving users simultaneously |
| Active-Passive | One server active, others idle until failover | Database primary server and standby server |
| Failover | Switch to backup server when primary fails (usually automatic) | File server backup takes over if main server fails |
| Failback | Return services to the repaired original server | Services moved back to original database server after repair |
| Proper Patching | Updating servers without causing downtime | Patch passive node first, fail over, then patch original node |
| Heartbeat | Regular signals between servers to monitor health | Web servers exchange heartbeats to detect failures |
Exam Tip
- Know the differences between active-active and active-passive clusters.
- Understand failover vs. failback.
- Remember the role of heartbeats in detecting server failure.
- Be able to describe proper patching procedures in a clustered environment.
