Skip to content

Latest commit

 

History

History
383 lines (272 loc) · 38.6 KB

File metadata and controls

383 lines (272 loc) · 38.6 KB
title SQL Server availability basics for Linux deployments | Microsoft Docs
description
author MikeRayMSFT
ms.author mikeray
manager jhubbard
ms.date 11/27/2017
ms.topic article
ms.prod sql-non-specified
ms.prod_service database-engine
ms.service
ms.component linux
ms.suite sql
ms.custom
ms.technology database-engine
ms.workload On Demand

SQL Server availability basics for Linux deployments

[!INCLUDEtsql-appliesto-sslinux-only]

Starting with [!INCLUDEsssql17-md], [!INCLUDEssnoversion-md] is supported on both Linux and Windows. Like Windows-based [!INCLUDEssnoversion-md] deployments, [!INCLUDEssnoversion-md] databases and instances need to be highly available under Linux. This article covers the technical aspects of planning and deploying highly available Linux-based [!INCLUDEssnoversion-md] databases and instances, as well as some of the differences from Windows-based installations. Because [!INCLUDEssnoversion-md] may be new for Linux professionals, and Linux may be new for [!INCLUDEssnoversion-md] professionals, the article at times introduces concepts that may be familiar to some and unfamiliar to others.

[!INCLUDEssnoversion-md] availability options for Linux deployments

Besides backup and restore, the same three availability features are available on Linux as for Windows-based deployments:

  • Always On Availability Groups (AGs)
  • Always On Failover Cluster Instances (FCIs)
  • Log Shipping

On Windows, FCIs always require an underlying Windows Server failover cluster (WSFC). Depending on the deployment scenario, an AG usually requires an underlying WSFC, with the exception being the new None variant in [!INCLUDEsssql17-md]. A WSFC does not exist in Linux. Clustering implementation in Linux is discussed below in Pacemaker for Always On Availability Groups and failover cluster instances on Linux.

A quick Linux primer

While some Linux installations may be installed with an interface, most are not, meaning that nearly everything at the operating system layer is done via command line. The common term for this command line in the Linux world is a bash shell.

In Linux, many commands need to be executed with elevated privileges, much like many things need to be done in Windows Server as an administrator. There are two main methods to execute with elevated privileges:

  1. Run in the context of the proper user. To change to a different user, use the command su. If su is executed on its own without a username, as long as you know the password, you will now be in a shell as root.

  2. The more common and security conscious way to run things is to use sudo before executing anything. Many of the examples in this article use sudo.

Some common commands, each of which have various switches and options that can be researched online:

  • cd – change the directory
  • chmod – change the permissions of a file or directory
  • chown – change the ownership of a file or directory
  • ls – show the contents of a directory
  • mkdir – create a folder (directory) on a drive
  • mv – move a file from one location to another
  • ps – show all of the working processes
  • rm – delete a file locally on a server
  • rmdir – delete a folder (directory)
  • systemctl – start, stop, or enable services
  • Text editor commands. On Linux, there are various text editor options, such as vi and emacs.

Common tasks for availability configurations of [!INCLUDEssnoversion-md] on Linux

This section covers tasks that are common to all Linux-based [!INCLUDEssnoversion-md] deployments.

Ensure that files can be copied

One thing that anyone using [!INCLUDEssnoversion-md] on Linux should be able to do is copy files from one server to another. This task is very important for AG configurations.

Things like permission issues can exist on Linux as well as on Windows-based installations. However, those familiar with how to copy from server to server on Windows may not be familiar with how it is done on Linux. A common method is to use the command-line utility scp, which stands for secure copy. Behind the scenes, scp uses OpenSSH. SSH stands for secure shell. Depending on the Linux distribution, OpenSSH itself may not be installed. If it is not, OpenSSH will need to be installed first. For more information on configuring OpenSSH, see the information at the following links for each distribution:

When using scp, you must provide the credentials of the server if it is not the source or destination. For example, using

scp MyAGCert.cer username@servername:/folder/subfolder

copies the file MyAGCert.cer to the folder specified on the other server. Note that you must have permissions – and possibly ownership – of the file to copy it, so chown may also need to be employed before copying. Similarly, on the receiving side, the right user needs access to manipulate the file. For example, to restore that certificate file, the mssql user must be able to access it.

Samba, which is the Linux variant of server message block (SMB), can also be used to create shares accessed by UNC paths such as \\SERVERNAME\SHARE. For more information on configuring Samba, see the information at the following links for each distribution:

Windows-based SMB shares can also be used; SMB shares do not need to be Linux-based, as long as the client portion of Samba is configured properly on the Linux server hosting [!INCLUDEssnoversion-md] and the share has the right access. For those in a mixed environment, this would be one way to leverage existing infrastructure for Linux-based [!INCLUDEssnoversion-md] deployments.

One thing that is important is that the version of Samba deployed should be SMB 3.0 compliant. When SMB support was added in [!INCLUDEsssql11-md], it required all shares to support SMB 3.0. If using Samba for the share and not Windows Server, the Samba-based share should be using Samba 4.0 or later, and ideally 4.3 or later, which supports SMB 3.1.1. A good source of information on SMB and Linux is SMB3 in Samba.

Finally, using a network file system (NFS) share is an option. Using NFS is not an option on Windows-based deployments of [!INCLUDEssnoversion-md], and can only be used for Linux-based deployments.

Configure the firewall

Similar to Windows, Linux distributions have a built-in firewall. If your company is using an external firewall to the servers, disabling the firewalls in Linux may be acceptable. However, regardless of where the firewall is enabled, ports need to be opened. The following table documents the common ports needed for highly available [!INCLUDEssnoversion-md] deployments on Linux.

Port Number Type Description
111 TCP/UDP NFS – rpcbind/sunrpc
135 TCP Samba (if used) – End Point Mapper
137 UDP Samba (if used) – NetBIOS Name Service
138 UDP Samba (if used) – NetBIOS Datagram
139 TCP Samba (if used) – NetBIOS Session
445 TCP Samba (if used) – SMB over TCP
1433 TCP [!INCLUDEssnoversion-md] – default port; if desired, can change with mssql-conf set network.tcpport <portnumber>
2049 TCP, UDP NFS (if used)
2224 TCP Pacemaker – used by pcsd
3121 TCP Pacemaker – Required if there are Pacemaker Remote nodes
3260 TCP iSCSI Initiator (if used) – Can be altered in /etc/iscsi/iscsid.config (RHEL), but should match port of iSCSI Target
5022 TCP [!INCLUDEssnoversion-md] - default port used for an AG endpoint; can be changed when creating the endpoint
5403 TCP Pacemaker
5404 UDP Pacemaker – Required by Corosync if using multicast UDP
5405 UDP Pacemaker – Required by Corosync
21064 TCP Pacemaker – Required by resources using DLM
Variable TCP AG endpoint port; default is 5022
Variable TCP NFS – port for LOCKD_TCPPORT (found in /etc/sysconfig/nfs on RHEL)
Variable UDP NFS – port for LOCKD_UDPPORT (found in /etc/sysconfig/nfs on RHEL)
Variable TCP/UDP NFS – port for MOUNTD_PORT (found in /etc/sysconfig/nfs on RHEL)
Variable TCP/UDP NFS – port for STATD_PORT (found in /etc/sysconfig/nfs on RHEL)

For additional ports that may be used by Samba, refer to Samba Port Usage.

Conversely, the name of the service under Linux can also be added as an exception instead of the port; for example, high-availability for Pacemaker. Refer to your distribution for the names if this is the direction you wish to pursue. For example, on RHEL the command to add in Pacemaker is

sudo firewall-cmd --permanent --add-service=high-availability

Firewall documentation:

Install [!INCLUDEssnoversion-md] packages for availability

On a Windows-based [!INCLUDEssnoversion-md] installation, some components are installed even in a basic engine install, while others are not. Under Linux, only the [!INCLUDEssnoversion-md] engine is installed as part of the installation process. Everything else is optional. For highly available [!INCLUDEssnoversion-md] instances under Linux, two packages should be installed at the same time as [!INCLUDEssnoversion-md]: [!INCLUDEssnoversion-md] Agent (mssql-server-agent) and the high availability (HA) package (mssql-server-ha). While [!INCLUDEssnoversion-md] Agent is technically optional, it is [!INCLUDEssnoversion-md]’s scheduler for jobs and is required by log shipping, so installation is recommended. On Windows-based installations, [!INCLUDEssnoversion-md] Agent is not optional.

Note

For those new to [!INCLUDEssnoversion-md], [!INCLUDEssnoversion-md] Agent is [!INCLUDEssnoversion-md]’s built-in job scheduler. It is a common way for DBAs to schedule things like backups and other [!INCLUDEssnoversion-md] maintenance. Unlike a Windows-based installation of [!INCLUDEssnoversion-md] where [!INCLUDEssnoversion-md] Agent is a completely different service, on Linux, [!INCLUDEssnoversion-md] Agent runs in context of [!INCLUDEssnoversion-md] itself.

When AGs or FCIs are configured on a Windows-based configuration, they are cluster-aware. Cluster awareness means that [!INCLUDEssnoversion-md] has specific resource DLLs that a WSFC knows about (sqagtres.dll and sqsrvres.dll for FCIs, hadrres.dll for AGs) and are used by the WSFC to ensure that the [!INCLUDEssnoversion-md] clustered functionality is up, running, and functioning properly. Because clustering is external not only to [!INCLUDEssnoversion-md] but Linux itself, Microsoft had to code the equivalent of a resource DLL for Linux-based AG and FCI deployments. This is the mssql-server-ha package, also known as the [!INCLUDEssnoversion-md] resource agent for Pacemaker.

Use the commands below to install the HA package and [!INCLUDEssnoversion-md] Agent if they are not installed already. Installing the HA package after installing [!INCLUDEssnoversion-md] requires a restart of [!INCLUDEssnoversion-md] for it to be used by [!INCLUDEssnoversion-md]. These instructions assume that the repositories for the Microsoft packages have already been set up, since [!INCLUDEssnoversion-md] should be installed at this point.

Note

If you will not use [!INCLUDEssnoversion-md] Agent for log shipping or any other use, it does not have to be installed, so package mssql-server-agent can be skipped.

RHEL

sudo yum install mssql-server-ha mssql-server-agent
sudo systemctl restart mssql-server

SLES

sudo zypper install mssql-server-ha mssql-server-agent
sudo systemctl restart mssql-server

Ubuntu

sudo apt-get install mssql-server-ha mssql-server-agent
sudo systemctl restart mssql-server

The other optional packages for [!INCLUDEssnoversion-md] on Linux, [!INCLUDEssnoversion-md] Full-Text Search (mssql-server-fts) and [!INCLUDEssnoversion-md] Integration Services (mssql-server-is), are not required for high availability, either for an FCI or an AG.

Pacemaker for Always On Availability Groups and failover cluster instances on Linux

As noted above, the only clustering mechanism currently supported by Microsoft for AGs and FCIs is Pacemaker with Corosync. This section covers the basic information to understand the solution, as well as how to plan and deploy it for [!INCLUDEssnoversion-md] configurations.

HA add-on/extension basics

All of the currently supported distributions ship a high availability add-on/extension, which is based on the Pacemaker clustering stack. This stack incorporates two key components: Pacemaker and Corosync. All the components of the stack are:

  • Pacemaker – The core clustering component, that does things like coordinate across the clustered machines.
  • Corosync – A framework and set of APIs that provides things like quorum, the ability to restart failed processes, and so on.
  • libQB – Provides things like logging.
  • Resource agent – Specific functionality provided so that an application can integrate with Pacemaker.
  • Fence agent – Scripts/functionality that assist in isolating nodes and deal with them if they are having issues.

Note

The cluster stack is commonly referred to as Pacemaker in the Linux world.

This solution is in some ways similar to, but in many ways different from deploying clustered configurations using Windows. In Windows, the availability form of clustering, called a Windows Server failover cluster (WSFC), is built into the operating system, and the feature that enables the creation of a WSFC, failover clustering, is disabled by default. In Windows, AGs and FCIs are built on top of a WSFC, and share tight integration because of the specific resource DLL that is provided by [!INCLUDEssnoversion-md]. This tightly coupled solution is possible by and large because it is all from one vendor.

On Linux, while each supported distribution has Pacemaker available, each distribution can customize and have slightly different implementations and versions. Some of the differences will be reflected in the instructions in this article. The clustering layer is open source, so even though it ships with the distributions, it is not tightly integrated in the same way a WSFC is under Windows. This is why Microsoft provides mssql-server-ha, so that [!INCLUDEssnoversion-md] and the Pacemaker stack can provide close to, but not exactly the same, experience for AGs and FCIs as under Windows.

For full documentation on Pacemaker, including a more in-depth explanation of what everything is with full reference information, for RHEL and SLES:

Ubuntu does not have a guide for availability.

For more information about the whole stack, also see the official Pacemaker documentation page on the Clusterlabs site.

Pacemaker concepts and terminology

This section documents the common concepts and terminology for a Pacemaker implementation.

Node

A node is a server participating in the cluster. A Pacemaker cluster natively supports up to 16 nodes. This number can be exceeded if Corosync is not running on additional nodes, but Corosync is required for [!INCLUDEssnoversion-md]. Therefore, the maximum number of nodes a cluster can have for any [!INCLUDEssnoversion-md]-based configuration is 16; this is the Pacemaker limit, and has nothing to do with maximum limitations for AGs or FCIs imposed by [!INCLUDEssnoversion-md].

Resource

Both a WSFC and a Pacemaker cluster have the concept of a resource. A resource is specific functionality that runs in context of the cluster, such as a disk or an IP address. For example, under Pacemaker both FCI and AG resources can get created. This is not dissimilar to what is done in a WSFC, where you see a [!INCLUDEssnoversion-md] resource for either an FCI or an AG resource when configuring an AG, but is not exactly the same due to the underlying differences in how [!INCLUDEssnoversion-md] integrates with Pacemaker.

Pacemaker has standard and clone resources. Clone resources are ones that run simultaneously on all nodes. An example would be an IP address that runs on multiple nodes for load balancing purposes. Any resource that gets created for FCIs uses a standard resource, since only one node can host a FCI at any given time.

When an AG is created, it requires a specialized form of a clone resource called a multi-state resource. While an AG only has one primary replica, the AG itself is running across all nodes that it is configured to work on, and can potentially allow things such as read-only access. Because this is a “live” use of the node, the resources have the concept of two states: master and slave. For more information, see Multi-state resources: Resources that have multiple modes.

Resource groups/sets

Similar to roles in a WSFC, a Pacemaker cluster has the concept of a resource group. A resource group (called a set in SLES) is a collection of resources that function together and can fail over from one node to another as a single unit. Resource groups cannot contain resources that are configured as master/slave; thus, they cannot be used for AGs. While a resource group can be used for FCIs, it is not generally a recommended configuration.

Constraints

WSFCs have various parameters for resources as well as things like dependencies, which tell the WSFC of a parent/child relationship between two different resources. A dependency is just a rule telling the WSFC which resource needs to be online first.

A Pacemaker cluster does not have the concept of dependencies, but there are constraints. There are three kinds of constraints: colocation, location, and ordering.

  • A colocation constraint enforces whether or not two resources should be running on the same node.
  • A location constraint tells the Pacemaker cluster where a resource can (or cannot) run.
  • An ordering constraint tells the Pacemaker cluster the order in which the resources should start.

Note

Colocation constraints are not required for resources in a resource group, since all of those are seen as a single unit.

Quorum, fence agents, and STONITH

Quorum under Pacemaker is somewhat similar to a WSFC in concept. The whole purpose of a cluster’s quorum mechanism is to ensure that the cluster stays up and running. Both a WSFC and the HA add-ons for the Linux distributions have the concept of voting, where each node counts towards quorum. You want a majority of the votes up, otherwise, in a worst case scenario, the cluster will be shut down.

Unlike a WSFC, there is no witness resource to work with quorum. Like a WSFC, the goal is to keep the number of voters odd. Quorum configuration has different considerations for AGs than FCIs.

WSFCs monitor the status of the nodes participating and handle them when a problem occurs. Later versions of WSFCs offer such features as quarantining a node that is misbehaving or unavailable (node is not on, network communication is down, etc.). On the Linux side, this type of functionality is provided by a fence agent. The concept is sometimes referred to as fencing. However, these fence agents are generally specific to the deployment, and often provided by hardware vendors and some software vendors, such as those who provide hypervisors. For example, VMware provides a fence agent that can be used for Linux VMs virtualized using vSphere.

Quorum and fencing ties into another concept called STONITH, or Shoot the Other Node in the Head. STONITH is required to have a supported Pacemaker cluster on all Linux distributions. For more information, see Fencing in a Red Hat High Availability Cluster (RHEL).

corosync.conf

The corosync.conf file contains the configuration of the cluster. It is located in /etc/corosync. In the course of normal day-to-day operations, this file should never have to be edited if the cluster is set up properly.

Cluster log location

Log locations for Pacemaker clusters differ depending on the distribution.

  • RHEL and SLES - /var/log/cluster/corosync.log
  • Ubuntu – /var/log/corosync/corosync.log

To change the default logging location, modify corosync.conf.

Plan Pacemaker clusters for [!INCLUDEssnoversion-md]

This section discusses the important planning points for a Pacemaker cluster.

Virtualizing Linux-based Pacemaker clusters for [!INCLUDEssnoversion-md]

Using virtual machines to deploy Linux-based [!INCLUDEssnoversion-md] deployments for AGs and FCIs is covered by the same rules as for their Windows-based counterparts. There is a base set of rules for supportability of virtualized [!INCLUDEssnoversion-md] deployments provided by Microsoft in Microsoft Support KB 956893. Different hypervisors such as Microsoft’s Hyper-V and VMware’s ESXi may have different variances on top of that, due to differences in the platforms themselves.

When it comes to AGs and FCIs under virtualization, ensure that anti-affinity is set for the nodes of a given Pacemaker cluster. When configured for high availability in an AG or FCI configuration, the VMs hosting [!INCLUDEssnoversion-md] should never be running on the same hypervisor host. For example, if a two-node FCI is deployed, there would need to be at least three hypervisor hosts so that there is somewhere for one of the VMs hosting a node to go in the event of a host failure, especially if using features like Live Migration or vMotion.

For more specifics, consult:

Note

RHEL with a Pacemaker cluster with STONITH is not yet supported by Hyper-V. Until that is supported, for more information and updates, consult Support Policies for RHEL High Availability Clusters.

Networking

Unlike a WSFC, Pacemaker does not require a dedicated name or at least one dedicated IP address for the Pacemaker cluster itself. AGs and FCIs will require IP addresses (see the documentation for each for more information), but not names, since there is no network name resource. SLES does allow the configuration of an IP address for administration purposes, but it is not required, as can be seen below in Create the Pacemaker Cluster.

Like a WSFC, Pacemaker would prefer redundant networking, meaning distinct network cards (NICs or pNICs for physical) having individual IP addresses. In terms of the cluster configuration, each IP address would have what is known as its own ring. However, as with WSFCs today, many implementations are virtualized or in the public cloud where there is really only a single virtualized NIC (vNIC) presented to the server. If all pNICs and vNICs are connected to the same physical or virtual switch, there is no true redundancy at the network layer, so configuring multiple NICs is a bit of an illusion to the virtual machine. Network redundancy is usually built into the hypervisor for virtualized deployments, and is definitely built into the public cloud.

One difference with multiple NICs and Pacemaker versus a WSFC is that Pacemaker allows multiple IP addresses on the same subnet, whereas a WSFC does not.

Quorum and STONITH

Quorum configuration and requirements are related to AG or FCI-specific deployments of [!INCLUDEssnoversion-md].

STONITH is required for a supported Pacemaker cluster. Use the documentation from the distribution to configure STONITH. An example is at Storage-based Fencing for SLES. There is also a STONITH agent for VMware vCenter for ESXI-based solutions. For more information, see [needs link].

Note

As of the writing of this article, Hyper-V does not have a solution for STONITH. This is true for on premises deployments and also impacts Azure-based Pacemaker deployments using certain distributions such as RHEL.

Interoperability

This section documents how a Linux-based cluster can interact with a WSFC or with other distributions of Linux.

WSFC

Currently, there is no direct way for a WSFC and a Pacemaker cluster to work together. This means that there is no way to create an AG or FCI that works across a WSFC and Pacemaker. However, there are two interoperability solutions, both of which are designed for AGs. The only way an FCI can participate in a cross-platform configuration is if it is participating as an instance in one of these two scenarios:

  • An AG with a cluster type of None. For more information, see the Windows availability groups documentation.
  • A distributed AG, which is a special type of availability group that allows two different AGs to be configured as their own availability group. For more information on distributed AGs, see the documentation Distributed Availability Groups.

Other Linux distributions

On Linux, all nodes of a Pacemaker cluster must be on the same distribution. For example, this means that a RHEL node cannot be part of a Pacemaker cluster that has a SLES node. The main reason for this was stated above: the distributions may have different versions and functionality, so things could not work properly. Mixing distributions has the same story as mixing WSFCs and Linux: use None or distributed AGs.

Deploy a Pacemaker cluster

This section shows how to configure the underlying Pacemaker cluster used by an AG or FCI. Unlike a traditional AG or FCI on Windows, the cluster portion on Linux can be done before or after the installation of [!INCLUDEssnoversion-md], as well as the configuration of the AG. There is no set order. This is different since it is not the tightly coupled Windows Server/[!INCLUDEssnoversion-md] stack. The integration and configuration of resources for Pacemaker portion of an AG or FCI deployment is done after the cluster is configured.

Important

An AG with a cluster type of None does not require a Pacemaker cluster, nor can it be managed by Pacemaker.

Common tasks

This section documents the tasks that must be done to configure both [!INCLUDEssnoversion-md] and Pacemaker to be able to deploy AGs or FCIs.

Install the HA add-on

Use the syntax below to install the packages for each distribution of Linux that make up the HA add-on. On SLES, the HA add-on gets initialized when the cluster is created.

RHEL

  1. Register the server using the following syntax. You will be prompted for a valid username and password.

    sudo subscription-manager register

  2. List the available pools for registration.

    sudo subscription-manager list --available

  3. Run the following command to associate RHEL high availability with the subscription.

    sudo subscription-manager attach --pool=<PoolID>

    where PoolId is the pool ID for the high availability subscription from the previous step.

  4. Enable the repository to be able to use the high availability add-on.

    sudo subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms

  5. Install Pacemaker.

    sudo yum install pacemaker pcs fence-agents-all resource-agents

Ubuntu

sudo apt-get install pacemaker pcs fence-agents resource-agents

SLES

Install the High Availability pattern in YaST or do it as part of the main installation of the server. This can be done with an ISO/DVD as a source or getting it from online.

Prepare the nodes for Pacemaker (RHEL and Ubuntu only)

Pacemaker itself uses a user created on the distribution named hacluster. This gets created when the HA add-on is installed on RHEL and Ubuntu.

  1. On each server that will serve as a node of the Pacemaker cluster, create the password for a user that will be used by the cluster. The name used in the examples will be hacluster, but any name can be used. The name and password must be the same on all nodes participating in the Pacemaker cluster.

    sudo passwd hacluster

  2. On each node that will be part of the Pacemaker cluster, enable and start the pcsd service with the following commands (RHEL and Ubuntu):

    sudo systemctl enable pcsd
    sudo systemctl start pcsd

    Then execute

    sudo systemctl status pcsd

    to ensure that pcsd is started.

  3. Enable the Pacemaker service on each possible node of the Pacemaker cluster.

    sudo systemctl start pacemaker

    On Ubuntu, you will see an error

    pacemaker Default-Start contains no runlevels, aborting.

    This is a known issue. Despite the error, enabling the Pacemaker service is successful, and this is a bug that will be fixed at some point in the future.

  4. Next, create and start the Pacemaker cluster. There is one difference between RHEL and Ubuntu at this step. While on both distributions, installing pcs will configure a default configuration file for the Pacemaker cluster, on RHEL, executing this command will destroy any existing configuration and create a new cluster.

Create the Pacemaker cluster

This section documents how to create the cluster for each distribution of Linux.

RHEL

These instructions show how to configure a Pacemaker cluster on RHEL.

  1. Authorize the nodes.

    sudo pcs cluster auth <Node1 Node2 … NodeN> -u hacluster

    where NodeX is the name of the node.

  2. Create the cluster

    sudo pcs cluster setup --name <PMClusterName Nodelist> --start --all --enable

    where PMClusterName is the name assigned to the Pacemaker cluster and Nodelist is the list of names of the nodes separated by a space.

Ubuntu

Configuring Ubuntu is similar to RHEL. However, there is one major difference: when the Pacemaker packages are installed, it creates a base configuration for the cluster and enables and starts pcsd. If you try to configure the Pacemaker cluster by following the RHEL instructions exactly, you will get an error. To fix this problem, perform the following steps:

  1. Remove the default Pacemaker configuration from each node.

    sudo pcs cluster destroy

  2. Follow the steps in the RHEL section for creating the Pacemaker cluster.

SLES

The process for creating a Pacemaker cluster is completely different on SLES than it is on RHEL and Ubuntu. The steps below document how to create a cluster with SLES.

  1. Start the cluster configuration process by running sudo ha-cluster-init on one of the nodes. You may be prompted that NTP is not configured and that no watchdog device is found. That is fine for getting things up and running. Watchdog is related to STONITH if you use SLES’s built-in fencing that is storage-based. NTP and watchdog can be configured later.
  2. You will be prompted to configure Corosync. You will be asked for the network address to bind to, as well as the multicast address and port. The network address is the subnet that you are using; for example, 192.191.190.0. You can accept the defaults and click Enter at every prompt, or change if necessary.
  3. Next, you will be asked if you want to configure SBD, which is the disk-based fencing. This can be done later if desired. If it is not configured, unlike on RHEL and Ubuntu, stonith-enabled will by default be set to false.
  4. Finally, you will be asked if you want to configure an IP address for administration. This IP address is optional, but functions similar to the IP address for a WSFC in the sense that it creates an IP address in the cluster to be used for connecting to it via HA Web Konsole (HAWK). This, too, is optional.
  5. Ensure that the cluster is up and running by issuing sudo crm status.
  6. Change the hacluster password with sudo passwd hacluster.
  7. If you configured an IP address for administration, you can test it in a browser, which also tests the password change for hacluster.
  8. On another SLES server that will be a node of the cluster, run sudo ha-cluster-join
  9. When prompted, enter the name or IP address of the server that was configured as the first node of the cluster in the previous steps. The server is added as a node to the existing cluster.
  10. Verify the node was added by issuing sudo crm status.
  11. Change the hacluster password with sudo passwd hacluster
  12. Repeat Steps 8-11 for all other servers to be added to the cluster.