Prerequisites
Before starting, ensure you meet these requirements:
ntp or chrony).Step 1: Install SQL Server on All Nodes
Run these commands on every node to install SQL Server:
# Import Microsoft GPG key
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
# Add SQL Server repository (replace "22.04" with your Ubuntu version, e.g., "20.04")
sudo add-apt-repository "$(wget -qO- https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list)"
# Update package list and install SQL Server
sudo apt-get update
sudo apt-get install -y mssql-server
After installation, set the SA password and verify the service status:
# Set SA password (replace 'YourStrongPassword' with a secure password)
sudo /opt/mssql/bin/mssql-conf set-sa-password
# Start and enable SQL Server service
sudo systemctl start mssql-server
sudo systemctl enable mssql-server
# Check service status
sudo systemctl status mssql-server
Ensure the service is active (running) before proceeding.
Step 2: Configure SQL Server for Always On
Enable Always On features and create a database endpoint:
# Connect to SQL Server using sqlcmd
sqlcmd -S localhost -U SA -P 'YourStrongPassword'
# Enable Always On (run in sqlcmd)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'hadr', 1;
RECONFIGURE;
# Create an availability group endpoint (replace 'Node1', 'Node2' with node names/IPs)
CREATE ENDPOINT [Hadr_endpoint]
STATE=STARTED
AS TCP (
LISTENER_PORT=5022,
LISTENER_IP=(0.0.0.0) -- Allow connections from all nodes
)
FOR DATABASE_MIRRORING (
ROLE=ALL,
AUTHENTICATION=CERTIFICATE server_cert,
ENCRYPTION=REQUIRED ALGORITHM AES
);
-- Exit sqlcmd
GO
QUIT
This endpoint enables secure communication between replicas.
Step 3: Install Pacemaker and Corosync
Pacemaker manages cluster resources; Corosync handles node communication.
Install Packages:
sudo apt-get install -y pacemaker pcs fence-agents resource-agents mssql-server-ha
Set hacluster Password:
All nodes must have the same hacluster password:
sudo passwd hacluster
Destroy Existing Clusters (if any):
Clean up residual configurations to avoid conflicts:
sudo pcs cluster destroy --all
Create a New Cluster:
Run on the primary node (replace ClusterName and Node1 Node2 with your cluster name and node names/IPs):
sudo pcs cluster setup --name ClusterName Node1 Node2 --start --all --enable
Verify Cluster Status:
Ensure all nodes are online:
sudo pcs status
You should see all nodes in a running state.
Step 4: Configure Pacemaker for SQL Server AG
Integrate SQL Server with Pacemaker to manage failover.
Create SQL Server Login for Pacemaker:
This login allows Pacemaker to manage the AG:
USE [master];
GO
CREATE LOGIN [pacemaker_login] WITH PASSWORD = N'YourPacemakerPassword';
ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemaker_login];
GO
Save the credentials securely on all nodes:
echo 'pacemaker_login' >> ~/pacemaker-passwd
echo 'YourPacemakerPassword' >> ~/pacemaker-passwd
sudo mv ~/pacemaker-passwd /var/opt/mssql/secrets/passwd
sudo chown root:root /var/opt/mssql/secrets/passwd
sudo chmod 400 /var/opt/mssql/secrets/passwd
Create AG Resource:
Use the ocf:mssql:ag agent to manage the AG:
sudo pcs resource create ag_cluster ocf:mssql:ag ag_name=MyAG meta notify=true
Create Virtual IP (VIP) Resource:
The VIP allows clients to connect to the AG via a single IP:
sudo pcs resource create virtualip ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24
Replace 192.168.1.100 with your desired VIP.
Configure Resource Dependencies:
Ensure the VIP starts only after the AG is running, and the AG promotes to primary on failure:
sudo pcs constraint colocation add virtualip ag_cluster-master INFINITY with-rsc-role=Master
sudo pcs constraint order promote ag_cluster-master then start virtualip
Verify Configuration:
Check the cluster status to confirm all resources are running:
sudo pcs status
Look for the ag_cluster in master state and the VIP assigned.
Step 5: Test Failover
Validate that the AG fails over automatically or manually:
Simulate Primary Failure:
Stop SQL Server on the primary node:
sudo systemctl stop mssql-server
Check Cluster Status:
Run sudo pcs status on any node. The AG should promote a secondary replica to master.
Recover Primary Node:
Restart SQL Server and verify it rejoins the AG as a secondary:
sudo systemctl start mssql-server
The cluster should automatically rebalance roles.
Step 6: Connect to the AG
Clients connect to the virtual IP (e.g., 192.168.1.100) instead of individual nodes. For example, in SQL Server Management Studio (SSMS), use the VIP as the server name.
Optionally, create a DNS record (e.g., sqlag.example.com) pointing to the VIP for easier access.
Troubleshooting Tips
ntp or chrony to keep nodes synchronized./var/log/syslog) and SQL Server logs (/var/opt/mssql/log/errorlog) for errors.pacemaker_login has sufficient permissions (sysadmin role is recommended for simplicity).