Core Components
Pacemaker
Pacemaker is a high-availability cluster resource manager. At its core, Pacemaker is a distributed finite state machine capable of coordinating the startup and recovery of interrelated services across a set of machines.
Pacemaker supports a number of resource agent standards (LSB init scripts, OCF resource agents, systemd unit files, etc.) to manage any service, and can model complex relationships among them (colocation, ordering, etc.).
Pacemaker supports advanced service configurations such as groups of dependent resources, cloned resources that must be active on multiple machines, resources that can switch between two different roles, and containerized services.
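As a flavor of how relationships like colocation and ordering look in practice, here is an illustrative fragment of Pacemaker's XML configuration (the CIB); the resource names VirtualIP and WebServer are hypothetical:

```xml
<!-- Keep WebServer on the same node as VirtualIP, and start
     VirtualIP before WebServer. Resource names are placeholders. -->
<constraints>
  <rsc_colocation id="web-with-ip" rsc="WebServer"
                  with-rsc="VirtualIP" score="INFINITY"/>
  <rsc_order id="ip-before-web" first="VirtualIP"
             then="WebServer" kind="Mandatory"/>
</constraints>
```

In day-to-day use this XML is normally generated by one of the configuration tools described below rather than written by hand.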
Corosync
Corosync APIs provide membership (a list of peers), messaging (the ability to talk to processes on those peers), and quorum (do we have a majority) capabilities to projects such as Pacemaker that need to be cluster-aware.
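For illustration, a minimal corosync.conf sketch for a two-node cluster might look like the following; the cluster name, node names, and addresses are placeholders:

```
totem {
    version: 2
    cluster_name: example
    transport: knet
}

nodelist {
    node {
        ring0_addr: 192.168.122.101
        name: node1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.122.102
        name: node2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}
```

The quorum section is what gives Pacemaker its "do we have a majority" answer; two_node relaxes the majority rule for two-node clusters (which would otherwise lose quorum whenever either node failed).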
libQB
libqb is a library with the primary purpose of providing high-performance, reusable features for client/server applications, including high-performance logging, tracing, IPC, and polling.
Resource Agents
Resource agents are the abstraction that allows Pacemaker to manage services it knows nothing about. They contain the logic for what to do when the cluster wishes to start, stop or check the health of a service.
This particular set of agents conforms to the Open Cluster Framework (OCF) specification. A guide to writing agents is also available.
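The contract is simple: the cluster invokes the agent with a single action argument (start, stop, monitor, etc.) and interprets its exit code. The following is a minimal sketch of that dispatch loop as a shell function; the agent name and state file are hypothetical, and a real OCF agent would live under /usr/lib/ocf/resource.d/&lt;provider&gt;/ and also implement a meta-data action that prints an XML description of itself:

```shell
#!/bin/sh
# Hypothetical OCF-style agent sketch: "running" is modeled by the
# existence of a state file, standing in for a real service check.
STATE_FILE="${TMPDIR:-/tmp}/sketch-agent.state"

sketch_agent() {
    case "$1" in
        start)   touch "$STATE_FILE"; return 0 ;;   # 0 = OCF_SUCCESS
        stop)    rm -f "$STATE_FILE"; return 0 ;;   # stop must be idempotent
        monitor) test -f "$STATE_FILE" && return 0 || return 7 ;;  # 7 = OCF_NOT_RUNNING
        *)       return 3 ;;                        # 3 = OCF_ERR_UNIMPLEMENTED
    esac
}

# Walk the lifecycle once and show the monitor result at each stage.
rm -f "$STATE_FILE"
rc=0; sketch_agent monitor || rc=$?
echo "monitor before start: rc=$rc"
sketch_agent start
rc=0; sketch_agent monitor || rc=$?
echo "monitor after start: rc=$rc"
sketch_agent stop
rc=0; sketch_agent monitor || rc=$?
echo "monitor after stop: rc=$rc"
```

The exit codes are what matter to Pacemaker: a monitor returning 7 on a resource that should be running is what triggers recovery.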
Fence Agents
Fence agents are the abstraction that allows Pacemaker to isolate badly behaving nodes, by either powering off the node or disabling its access to common resources. The fence-agents project provides fence agents for commonly used fence devices, including intelligent power and network switches, IPMI, popular cloud services, virtualization hosts, and shared storage access.
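Fence devices are configured as cluster resources like any other. As an illustrative sketch, an IPMI fence device for one node might appear in the CIB as follows (the address, credentials, and node name are placeholders):

```xml
<!-- Hypothetical CIB fragment: an IPMI-based fence device for node1. -->
<primitive id="fence-node1" class="stonith" type="fence_ipmilan">
  <instance_attributes id="fence-node1-params">
    <nvpair id="fence-node1-ip" name="ip" value="10.0.0.101"/>
    <nvpair id="fence-node1-user" name="username" value="admin"/>
    <nvpair id="fence-node1-hosts" name="pcmk_host_list" value="node1"/>
  </instance_attributes>
</primitive>
```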
OCF specification
The Open Cluster Framework specification is a set of standards for cluster components. Currently, only the resource agent standard is in use.
Configuration Tools
Pacemaker's internal configuration format is XML, which is great for machines but terrible for humans.
The community's best minds have created command-line and graphical interfaces to hide the XML and allow the configuration to be viewed and updated in a more human-friendly format.
crm shell
The original configuration shell for Pacemaker. Written and actively maintained by SUSE, it may be used as an interactive shell with tab completion, for single commands directly on the shell's command line, or as a batch-mode scripting tool.
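For example, a simple IP-address resource might be configured from the command line like this (the resource name, address, and netmask are illustrative):

```
# crm configure primitive VirtualIP ocf:heartbeat:IPaddr2 \
      params ip=192.168.122.120 cidr_netmask=24 \
      op monitor interval=30s
# crm configure show VirtualIP
```

Behind the scenes, the shell translates this into the equivalent CIB XML.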
Hawk
Hawk is a web-based GUI for managing and monitoring Pacemaker HA clusters. It is generally intended to be run on every node in the cluster, so that you can point your web browser at any node to access it. There is a usage guide at hawk-guide.readthedocs.io, and it is documented as part of the SUSE Linux Enterprise High Availability Extension documentation.
LCMC
The Linux Cluster Management Console (LCMC) is a GUI with an innovative approach for representing the status of and relationships between cluster services. It uses SSH to let you install, configure, and manage clusters from your desktop.
pcs
pcs provides both a command-line tool and Web-based GUI for managing the complete life cycle of all cluster components, including Pacemaker, Corosync, QDevice, SBD, and Booth.
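As an illustrative transcript, a cluster might be assembled and given its first resource like this (the cluster name, node names, and address are placeholders):

```
# pcs cluster setup mycluster node1 node2
# pcs cluster start --all
# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 \
      ip=192.168.122.120 cidr_netmask=24 op monitor interval=30s
# pcs status
```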
pygui
The original GUI for Pacemaker, written in Python by IBM China. It is no longer actively developed.
Striker
Striker is the user interface for the Anvil! (virtual) server platform and the ScanCore autonomous self-defence and alert system.
Other Add-ons
booth
The Booth cluster ticket manager extends Pacemaker to support geographically distributed clustering. It does this by managing the granting and revoking of 'tickets', each of which authorizes one of the (potentially geographically dispersed) cluster sites to run certain resources.
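A booth.conf sketch for two sites and an arbitrator might look like the following; the addresses, ticket name, and expiry value are illustrative placeholders:

```
transport = UDP
port = 9929
arbitrator = 192.168.100.10
site = 192.168.101.10
site = 192.168.102.10
ticket = "web-services"
    expire = 600
```

A ticket must be renewed before it expires; if the holding site becomes unreachable, the ticket can be granted to another site, which then takes over the resources that depend on it.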
sbd
SBD provides a node fencing mechanism through the exchange of messages via shared block storage, such as a SAN accessed over iSCSI or FCoE. This isolates the fencing mechanism from changes in firmware version and from dependencies on specific firmware controllers, and it can be used as a STONITH mechanism in any configuration that has reliable shared storage. It can also be used as a pure watchdog-based fencing mechanism.
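On most distributions SBD is configured through a sysconfig-style file; a minimal sketch might look like the following, where the file path and the disk ID are placeholders that vary by distribution and hardware:

```
# /etc/sysconfig/sbd (path varies by distribution)
SBD_DEVICE="/dev/disk/by-id/scsi-EXAMPLE-shared-disk"
SBD_WATCHDOG_DEV="/dev/watchdog"
SBD_DELAY_START="no"
```

With SBD_DEVICE unset, SBD falls back to the pure watchdog-based mode mentioned above, where a node that loses quorum or stops servicing its watchdog self-fences.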