
High Availability Configuration for Kubernetes on Ubuntu

小樊
2025-09-27 15:10:47
Column: Intelligent Operations

To achieve high availability (HA) for a Kubernetes cluster on Ubuntu, you need to address three core components: control plane redundancy, etcd clustering, and API server load balancing. Below is a structured guide covering environment preparation, critical configurations, and validation steps.


1. Environment Preparation

Before deploying the cluster, ensure all nodes (masters and workers) meet the following requirements:

  • System Requirements: Use Ubuntu 20.04/22.04 LTS (64-bit). Master nodes require at least 2 vCPUs, 4GB RAM, and 20GB disk space; worker nodes need 2 vCPUs and 4GB RAM (adjust based on workloads).
  • Network Configuration: Assign static IPs to all nodes. Configure /etc/hosts on every node to resolve hostnames (e.g., 192.168.1.10 k8s-master1, 192.168.1.11 k8s-master2, 192.168.1.20 k8s-worker1).
  • Time Synchronization: Install chrony (recommended) or ntp to ensure all nodes have synchronized clocks. Run sudo apt install chrony && sudo systemctl enable --now chrony on each node.
  • Disable Swap: Kubernetes requires swap to be disabled. Execute sudo swapoff -a and comment out the swap line in /etc/fstab to make this permanent.
  • Install Container Runtime: Use containerd as the default runtime. Install it via sudo apt install containerd, generate a default configuration with containerd config default | sudo tee /etc/containerd/config.toml, and set SystemdCgroup = true in the runc options section so containerd and the kubelet use the same cgroup driver. Restart and enable the service with sudo systemctl restart containerd && sudo systemctl enable containerd. A consolidated snippet covering these preparation steps follows this list.
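The exact commands vary slightly between Ubuntu releases, but a minimal preparation sketch, run on every node, might look like the following (the sed patterns assume a typical /etc/fstab swap entry and the stock containerd config.toml, so verify them against your files):

# Disable swap now and across reboots
sudo swapoff -a
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab

# Kernel modules and sysctl settings required by kubeadm and most CNI plugins
echo -e "overlay\nbr_netfilter" | sudo tee /etc/modules-load.d/k8s.conf
sudo modprobe overlay && sudo modprobe br_netfilter
cat <<'EOF' | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system

# containerd with the systemd cgroup driver
sudo apt install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd && sudo systemctl enable containerd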

2. Install Kubernetes Components

On all nodes, add the Kubernetes repository and install kubelet, kubeadm, and kubectl:

sudo apt update && sudo apt install -y apt-transport-https ca-certificates curl
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /usr/share/keyrings/kubernetes-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update && sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl  # Prevent accidental upgrades

These commands install the necessary tools to initialize and manage the Kubernetes cluster.
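Before moving on, it is worth confirming that every node reports the same component versions:

kubeadm version -o short
kubelet --version
kubectl version --client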


3. Deploy etcd Cluster (Control Plane Data Store)

etcd is the distributed key-value store that holds the entire Kubernetes cluster state. For HA, run an odd number of etcd members (3 or 5) so the cluster keeps quorum when a member fails. The simplest option is kubeadm's stacked etcd topology, in which every control-plane node runs a local etcd member:

  • During initialization of the first master (see Step 4), include the --upload-certs flag so kubeadm uploads the control-plane certificates (including the etcd CA) to the cluster; each additional master that joins with --control-plane then downloads them and is added as an etcd member automatically.
  • Verify etcd cluster health on any master node:
    kubectl get pods -n kube-system -l component=etcd
    kubectl -n kube-system exec etcd-k8s-master1 -- etcdctl \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key --endpoints=https://127.0.0.1:2379 endpoint health
    
    Ensure every member reports “healthy”. The pod is named after its host (etcd-k8s-master1 here), and kubeadm's etcd only serves TLS, which is why the certificate flags are required.
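To confirm membership explicitly (for example, that all three masters have joined), the same etcdctl binary can print the member list; the pod name is again whatever kubeadm gave the static pod on that host:

kubectl -n kube-system exec etcd-k8s-master1 -- etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key --endpoints=https://127.0.0.1:2379 \
  member list -w table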

4. Initialize the Control Plane with kubeadm

Set up the load balancer endpoint first (see Step 5), because the kubeconfig files kubeadm generates point at that address and the later init phases use them. Then pick one master node as the primary and initialize the control plane:

# k8s-vip:6443 is the load balancer VIP/DNS from Step 5; the Pod CIDR must match your CNI plugin
sudo kubeadm init \
  --control-plane-endpoint "k8s-vip:6443" \
  --pod-network-cidr=10.244.0.0/16 \
  --upload-certs

After initialization, follow the on-screen instructions to configure kubectl:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

This sets up the primary master node. To add additional masters (for HA), run the kubeadm join command provided in the output (with --control-plane flag) on each secondary master:

sudo kubeadm join k8s-vip:6443 --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<HASH> \
  --control-plane --certificate-key <CERTIFICATE_KEY>

Repeat for all secondary masters.
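Worker nodes join with the same command minus the control-plane flags; the token and hash come from the kubeadm init output and can be regenerated later with kubeadm token create --print-join-command:

# Run on each worker node
sudo kubeadm join k8s-vip:6443 --token <TOKEN> \
  --discovery-token-ca-cert-hash sha256:<HASH>

# Back on a master, confirm every node has registered
kubectl get nodes -o wide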


5. Set Up Load Balancing for API Servers

To ensure the Kubernetes API server remains accessible even if a master node fails, use a load balancer (e.g., HAProxy) with a virtual IP (VIP). Here’s how to configure HAProxy:

  • Install HAProxy: sudo apt install haproxy.
  • Edit the configuration file (/etc/haproxy/haproxy.cfg) to include:
    frontend kubernetes
      bind *:6443
      mode tcp
      option tcplog
      default_backend kubernetes-master-nodes
    
    backend kubernetes-master-nodes
      mode tcp
      balance roundrobin
      option tcp-check
      server k8s-master1 192.168.1.10:6443 check fall 3 rise 2
      server k8s-master2 192.168.1.11:6443 check fall 3 rise 2
    
    Replace the IPs with your master nodes’ addresses and add a server line for every master. If HAProxy runs on the master nodes themselves rather than on dedicated load balancer hosts, bind the frontend to a different port (e.g., 8443) so it does not collide with the local kube-apiserver, and use that port in --control-plane-endpoint.
  • Restart HAProxy: sudo systemctl restart haproxy && sudo systemctl enable haproxy.
    Note that HAProxy by itself does not provide a virtual IP: either point the k8s-vip DNS name at the HAProxy host, or run two HAProxy nodes with keepalived floating a VIP between them so the load balancer is not itself a single point of failure. Traffic to that address is then routed only to healthy master nodes.
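If you go the keepalived route, a minimal sketch for the primary HAProxy node could look like this; the interface name eth0, the VIP 192.168.1.100, and the shared secret are placeholders to adapt to your environment:

sudo apt install -y keepalived
cat <<'EOF' | sudo tee /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state MASTER                 # use BACKUP on the second HAProxy node
    interface eth0               # NIC that should carry the VIP (placeholder)
    virtual_router_id 51
    priority 101                 # give the BACKUP node a lower value, e.g. 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass CHANGE_ME      # placeholder shared secret
    }
    virtual_ipaddress {
        192.168.1.100/24         # the VIP that k8s-vip resolves to (placeholder)
    }
}
EOF
sudo systemctl restart keepalived && sudo systemctl enable keepalived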

6. Deploy a Pod Network Plugin

A pod network plugin enables communication between Pods across nodes. Popular choices include Calico (recommended for HA) and Flannel. Install Calico with:

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Verify the plugin is running:

kubectl get pods -n kube-system -l k8s-app=calico-node

Ensure all pods are in “Running” state.
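As a further sanity check that cross-node pod networking and cluster DNS are working, verify CoreDNS and run a throwaway lookup pod (busybox:1.36 is just an arbitrary small image):

kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl run net-test --image=busybox:1.36 --restart=Never --rm -it -- nslookup kubernetes.default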


7. Validate High Availability

Test your HA setup to ensure resilience:

  • Check Node Status: Run kubectl get nodes to confirm all nodes (masters and workers) are “Ready”.
  • Simulate Master Failure: Stop the control plane on one master, for example by stopping the kubelet and container runtime (sudo systemctl stop kubelet containerd) or powering the node off, then verify that kubectl still works through the VIP and that the remaining masters keep the cluster available. Cordoning and draining (kubectl cordon / kubectl drain --ignore-daemonsets) is useful for planned maintenance, but it does not stop the static API server and etcd pods, so it does not exercise control-plane failover. Restart the services (or power the node back on) after testing.
  • Check etcd Health: Re-run the etcdctl endpoint health command from Step 3 (including the --cacert/--cert/--key flags) against a surviving master to confirm the remaining members are still healthy and have quorum.
  • Deploy a Test Application: Create a sample deployment (e.g., nginx) with multiple replicas to verify load distribution and failover; a brief sketch follows this list.
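A rough version of that last check (the deployment name nginx-ha and the image tag are arbitrary):

kubectl create deployment nginx-ha --image=nginx:1.27 --replicas=3
kubectl expose deployment nginx-ha --port=80 --type=NodePort
kubectl get pods -l app=nginx-ha -o wide      # replicas should spread across workers
# Stop or drain a node hosting a replica, then watch the Deployment reschedule it:
kubectl get pods -l app=nginx-ha -o wide --watch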

By following these steps, you’ll deploy a highly available Kubernetes cluster on Ubuntu with redundant control planes, a clustered etcd datastore, and load-balanced API servers. This setup ensures minimal downtime during node failures and supports production workloads.
