A Practical Guide to Kubernetes Resource Allocation on CentOS

1. Prerequisites and System Tuning
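- Disable swap: run swapoff -a and comment out the swap line in /etc/fstab; set the kernel parameter vm.swappiness=0 so that swap does not interfere with kubelet scheduling and OOM behavior.
- Firewall and SELinux: systemctl disable --now firewalld; SELinux can be set to SELINUX=permissive or disabled (evaluate your security policy before doing this in production).
- Bridge networking: run modprobe br_netfilter, then add the following to /etc/sysctl.d/k8s.conf:
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-ip6tables = 1
  Apply the settings with sysctl -p /etc/sysctl.d/k8s.conf.
- Time synchronization: timedatectl set-ntp true, to avoid certificate validation and scheduling anomalies.
- Start kubelet: systemctl enable --now kubelet.
- Initialize the control plane (the example uses Flannel, hence this Pod CIDR): kubeadm init --pod-network-cidr=10.244.0.0/16
- Install the network add-on: kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
- Configure kubectl for the current user: mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config

2. Core Mechanisms and Key Concepts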
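- Requests and limits: the four quantities involved are requests.cpu, requests.memory, limits.cpu, and limits.memory. The scheduler places Pods according to their requests; limits cap what a container may consume at runtime.
- Topology spreading: use podTopologySpreadConstraints to distribute the replicas of critical services evenly.
- Extended resources (for example nvidia.com/gpu): request them in the Pod under limits.

3. Configuration Examples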
Pod with resource requests and limits:
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
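Key points: set requests and limits on every container. CPU is a compressible resource, so a container that reaches its CPU limit is throttled. Memory is incompressible, so a container that exceeds its memory limit is OOM-killed, which makes memory the riskier of the two.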
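To make the units in the Pod example concrete, here is a minimal Python sketch that converts Kubernetes quantity strings into plain numbers. The helper names cpu_to_millicores and memory_to_bytes are hypothetical (they are not part of any Kubernetes client library), and the sketch only handles the common suffixes, not every form the API accepts.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal suffixes commonly used for memory.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" or "1G" to bytes."""
    for suffix, factor in {**_BINARY, **_DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: a plain number of bytes


# Values taken from the Pod example: requests, then limits.
print(cpu_to_millicores("250m"), cpu_to_millicores("500m"))  # 250 500
print(memory_to_bytes("128Mi"), memory_to_bytes("256Mi"))    # 134217728 268435456
```

In other words, the container is guaranteed a quarter of a core and 128 MiB for scheduling purposes, and may burst to half a core and 256 MiB before it is throttled or killed.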
ResourceQuota capping total requests and limits in the team-a namespace:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
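As a rough capacity check, the sketch below estimates how many Pods shaped like the earlier nginx example would fit under team-a-quota. It assumes those Pods are the only workload in the namespace and that they use exactly the requests and limits from the Pod example; max_pods is a hypothetical helper, not a Kubernetes API.

```python
# Per-Pod values from the Pod example (CPU in millicores, memory in MiB).
POD = {
    "requests.cpu": 250,
    "requests.memory": 128,
    "limits.cpu": 500,
    "limits.memory": 256,
}

# Hard limits from team-a-quota, converted to the same units.
QUOTA = {
    "requests.cpu": 4 * 1000,
    "requests.memory": 8 * 1024,
    "limits.cpu": 8 * 1000,
    "limits.memory": 16 * 1024,
}


def max_pods(pod: dict, quota: dict) -> tuple:
    """Return how many identical Pods fit and the quota key that runs out first."""
    per_key = {key: quota[key] // pod[key] for key in quota}
    binding = min(per_key, key=per_key.get)
    return per_key[binding], binding


print(max_pods(POD, QUOTA))  # (16, 'requests.cpu')
```

With these numbers CPU is the binding dimension: the quota is exhausted at 16 Pods, while the memory quota would allow 64.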
HorizontalPodAutoscaler targeting 60% average CPU utilization for the app Deployment:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
  namespace: team-a
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
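The scaling decision behind this HPA can be sketched with the documented formula desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), where utilization is measured as a percentage of the containers' CPU requests. The function below is a simplified illustration, not the controller's actual code: it uses the default 10% tolerance, takes its defaults from app-hpa, and ignores details such as stabilization windows and not-yet-ready Pods.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float = 60.0,
                     min_replicas: int = 2, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a CPU utilization target."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the minReplicas/maxReplicas bounds.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(4, 90))    # 6  (4 * 90/60 = 6.0)
print(desired_replicas(3, 90))    # 5  (3 * 1.5 = 4.5, rounded up)
print(desired_replicas(4, 62))    # 4  (within tolerance, no change)
print(desired_replicas(10, 120))  # 10 (capped at maxReplicas)
print(desired_replicas(4, 10))    # 2  (floored at minReplicas)
```

Because utilization is relative to requests, setting requests.cpu too low makes the HPA scale out earlier than expected, and omitting it prevents a Utilization target from being computed at all.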
Pod requesting one GPU:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base
    resources:
      limits:
        nvidia.com/gpu: 1 # request 1 GPU
Prerequisite: deploy the NVIDIA Device Plugin (for example, kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.13.0/nvidia-device-plugin.yml).

4. Capacity Planning and Tuning Recommendations
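- Kubelet eviction: tune the kubelet eviction thresholds via --eviction-hard (for example, the memory.available threshold that signals memory pressure).

5. Common Verification and Troubleshooting Commands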
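- Node resources: kubectl describe node <node-name> (check Capacity, Allocatable, and Allocated resources, plus taints and conditions).
- Live usage: kubectl top nodes / kubectl top pods -A.
- Pod status and events: kubectl describe pod <pod-name> -n <ns> (to investigate OOMKilled, FailedScheduling, or Throttled).
- Quotas and limit ranges: kubectl get resourcequota -n <ns> -o yaml and kubectl get limitrange -n <ns> -o yaml.
- Autoscaling: kubectl get hpa <hpa-name> -n <ns>, to watch how the metrics and replica count change.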