Practical Kubernetes Resource Scheduling on Ubuntu
1 Fundamentals and Capacity Planning
2 Pod Resource Configuration and Requests
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "2Gi"
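To make the requests and limits in the manifest concrete, here is a minimal Python sketch that converts the quantities to plain numbers and derives the QoS class Kubernetes would assign. It is an illustration only: the helper names (`cpu_to_millicores`, `memory_to_bytes`, `qos_class`) are hypothetical, only the common suffixes are handled, and the QoS rule is simplified to a single-container Pod.

```python
from decimal import Decimal

# Multipliers for the memory suffixes handled by this sketch.
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "500m" or "1" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "1Gi" or "512M" to bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # no suffix: plain bytes


def qos_class(resources: dict) -> str:
    """Classify a single-container Pod (simplified rule)."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    if not requests and not limits:
        return "BestEffort"
    parsers = {"cpu": cpu_to_millicores, "memory": memory_to_bytes}
    for name, parse in parsers.items():
        if name not in limits:
            return "Burstable"
        # A missing request defaults to the limit.
        request = requests.get(name, limits[name])
        if parse(request) != parse(limits[name]):
            return "Burstable"
    return "Guaranteed"


# Values copied from the demo Pod.
demo = {
    "requests": {"cpu": "500m", "memory": "1Gi"},
    "limits": {"cpu": "1", "memory": "2Gi"},
}
print(cpu_to_millicores(demo["limits"]["cpu"]), qos_class(demo))
```

Because the requests are lower than the limits, the demo Pod is classified as Burstable. The scheduler places it using the requests (500 millicores, 1 GiB), while the limits only cap what the container may consume at runtime.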
3 Common Scheduling Strategies with Examples
# nodeSelector: schedule only onto nodes labeled disktype=ssd
spec:
  nodeSelector:
    disktype: ssd
# Node affinity: require amd64 nodes, prefer nodes labeled disktype=ssd
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch   # stable label; beta.kubernetes.io/arch is deprecated
            operator: In
            values: [amd64]
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values: [ssd]
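The difference between the required and preferred rules can be sketched in a few lines of Python. This is an illustrative model, not the scheduler's code: the function names and the three example nodes are hypothetical, and only the `In`, `NotIn`, `Exists`, and `DoesNotExist` operators are covered.

```python
from typing import Dict, List


def expression_matches(labels: Dict[str, str], expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {op}")


def term_matches(labels: Dict[str, str], term: dict) -> bool:
    """All expressions in one term must match (AND); an empty term matches nothing."""
    exprs = term.get("matchExpressions", [])
    return bool(exprs) and all(expression_matches(labels, e) for e in exprs)


def passes_required(labels: Dict[str, str], terms: List[dict]) -> bool:
    """Terms are alternatives: one matching term is enough (OR)."""
    return any(term_matches(labels, t) for t in terms)


def preferred_score(labels: Dict[str, str], preferred: List[dict]) -> int:
    """Sum the weights of every preference the node satisfies."""
    return sum(p["weight"] for p in preferred if term_matches(labels, p["preference"]))


required = [{"matchExpressions": [
    {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]}]}]
preferred = [{"weight": 1, "preference": {"matchExpressions": [
    {"key": "disktype", "operator": "In", "values": ["ssd"]}]}}]

# Hypothetical nodes and labels, for illustration only.
nodes = {
    "node-a": {"kubernetes.io/arch": "amd64", "disktype": "ssd"},
    "node-b": {"kubernetes.io/arch": "amd64"},
    "node-c": {"kubernetes.io/arch": "arm64", "disktype": "ssd"},
}

for name, labels in nodes.items():
    print(name, passes_required(labels, required), preferred_score(labels, preferred))
```

In this model node-c is filtered out by the required rule even though it has an SSD, while node-a and node-b both remain candidates and node-a simply scores higher.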
# Pod anti-affinity: keep this Pod off any node that already runs an app=web Pod
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: [web]
        topologyKey: kubernetes.io/hostname
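As a rough illustration of what the required anti-affinity rule filters out, the Python sketch below removes every node whose hostname domain already hosts a Pod labeled `app=web`. The function name, node names, and running Pods are hypothetical, and the model assumes every node carries the topology label.

```python
from typing import Dict, List


def eligible_nodes(
    node_labels: Dict[str, Dict[str, str]],
    running_pods: List[dict],
    selector_values: List[str],
    topology_key: str = "kubernetes.io/hostname",
) -> List[str]:
    """Return nodes whose topology domain holds no Pod matching app In selector_values."""
    # Topology domains (here: hostnames) that already hold a matching Pod.
    blocked = set()
    for pod in running_pods:
        if pod["labels"].get("app") in selector_values:
            blocked.add(node_labels[pod["node"]][topology_key])
    return sorted(
        name for name, labels in node_labels.items()
        if labels[topology_key] not in blocked
    )


# Hypothetical cluster state, for illustration only.
node_labels = {
    "node-1": {"kubernetes.io/hostname": "node-1"},
    "node-2": {"kubernetes.io/hostname": "node-2"},
    "node-3": {"kubernetes.io/hostname": "node-3"},
}
running_pods = [
    {"name": "web-0", "node": "node-1", "labels": {"app": "web"}},
    {"name": "cache-0", "node": "node-2", "labels": {"app": "cache"}},
]

print(eligible_nodes(node_labels, running_pods, ["web"]))
```

Swapping the topology key for a zone label would widen the exclusion from a single node to every node in the same zone.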
# Taint the node
kubectl taint nodes gpu-node-1 hardware=gpu:NoSchedule
# Pod toleration
spec:
  tolerations:
  - key: "hardware"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
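How a toleration is matched against a taint can be sketched as follows. This is a simplified Python model of the matching rules, with hypothetical function names; it treats `NoSchedule` and `NoExecute` as hard constraints and ignores `PreferNoSchedule`, which only affects scoring.

```python
from typing import List


def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches every effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")
    if not key:
        # An empty key is only meaningful with Exists and matches every taint key.
        return operator == "Exists"
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(tolerations: List[dict], taints: List[dict]) -> bool:
    """Every hard taint on the node must be tolerated by at least one toleration."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


# Taint and toleration from the example.
gpu_taint = {"key": "hardware", "value": "gpu", "effect": "NoSchedule"}
gpu_toleration = {"key": "hardware", "operator": "Equal",
                  "value": "gpu", "effect": "NoSchedule"}

print(can_schedule([gpu_toleration], [gpu_taint]))  # Pod with the toleration
print(can_schedule([], [gpu_taint]))                # Pod without it
```

Note that a toleration only permits scheduling onto the tainted node; it does not attract the Pod there. Steering GPU workloads onto `gpu-node-1` still needs a nodeSelector or node affinity rule.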
# Topology spread: keep app=my-app Pods within a skew of 1 across zones
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: my-app
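The effect of `maxSkew` with `DoNotSchedule` is easiest to see with numbers. The Python sketch below is a simplified model of the check: a zone is acceptable if placing the new Pod there keeps its count of matching Pods within `maxSkew` of the least-loaded zone. The function name and the per-zone counts are hypothetical.

```python
from typing import Dict, List


def allowed_zones(matching_pods_per_zone: Dict[str, int], max_skew: int) -> List[str]:
    """Return zones where adding one more matching Pod respects max_skew."""
    # Global minimum of matching Pods across all zones, before placement.
    global_min = min(matching_pods_per_zone.values())
    return sorted(
        zone for zone, count in matching_pods_per_zone.items()
        # Skew after placement = (count + 1) - global minimum.
        if count + 1 - global_min <= max_skew
    )


# Hypothetical distribution of app=my-app Pods across three zones.
counts = {"zone-a": 2, "zone-b": 1, "zone-c": 1}
print(allowed_zones(counts, max_skew=1))
```

With these counts, zone-a is rejected because a third Pod there would give a skew of 2 against the zones that hold one Pod each. With `whenUnsatisfiable: ScheduleAnyway` the same calculation would only lower the zone's score instead of excluding it.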
# PriorityClass, followed by the Pod spec field that references it
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
---
spec:
  priorityClassName: high-priority
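A small Python sketch shows how priority influences the order in which pending Pods are considered: higher priority first, and earlier arrival first among equals. This is a simplified model of the scheduling queue; the Pod names, the `queued_at` field, and the priority of 0 for Pods without a PriorityClass (assuming no class is marked `globalDefault`) are assumptions for illustration.

```python
from typing import List


def scheduling_order(pending_pods: List[dict]) -> List[str]:
    """Order pending Pods by priority (descending), then by queue time (ascending)."""
    ordered = sorted(pending_pods, key=lambda p: (-p["priority"], p["queued_at"]))
    return [p["name"] for p in ordered]


# Hypothetical pending Pods; "api" and "web" use the high-priority class (1000000).
pending = [
    {"name": "batch", "priority": 0, "queued_at": 1},
    {"name": "api", "priority": 1000000, "queued_at": 3},
    {"name": "web", "priority": 1000000, "queued_at": 2},
]

print(scheduling_order(pending))
```

Even though `batch` was queued first, both high-priority Pods are considered before it. When no node has room, a high-priority Pod may also preempt lower-priority Pods, which this sketch does not model.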
kubectl cordon <node-name>    # mark the node unschedulable
kubectl uncordon <node-name>  # make the node schedulable again
# nodeName: bind the Pod directly to a node, bypassing the scheduler
spec:
  nodeName: node-frontend-1
These strategies can be combined to cover the full scheduling loop: targeting, isolation, distribution, and prioritization.
4 Scheduler Configuration and Extension
5 Operations and Troubleshooting Essentials
# Show Capacity and Allocatable together with the resource lines beneath them
kubectl describe node <node> | grep -E -A 6 'Capacity|Allocatable'
kubectl describe pod <pod> | grep -A 10 Events
kubectl get events --sort-by=.metadata.creationTimestamp
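When the events report an unschedulable Pod because of insufficient CPU or memory, it helps to remember that the scheduler compares the sum of requests against the node's Allocatable value, not against live usage. The Python sketch below models that check under simplifying assumptions: the function name, resource keys, and all numbers are hypothetical, with CPU in millicores and memory in MiB.

```python
from typing import Dict, List


def fits(
    allocatable: Dict[str, int],
    existing_requests: List[Dict[str, int]],
    pod_request: Dict[str, int],
) -> bool:
    """Return True if the Pod's requests fit into what is left of Allocatable."""
    for resource, capacity in allocatable.items():
        # Requests already committed on the node, regardless of actual usage.
        committed = sum(r.get(resource, 0) for r in existing_requests)
        if committed + pod_request.get(resource, 0) > capacity:
            return False
    return True


# Hypothetical node: 3900m CPU and 7600 MiB allocatable.
allocatable = {"cpu_m": 3900, "memory_mib": 7600}
existing = [
    {"cpu_m": 3000, "memory_mib": 4096},
    {"cpu_m": 500, "memory_mib": 1024},
]

print(fits(allocatable, existing, {"cpu_m": 500, "memory_mib": 1024}))
print(fits(allocatable, existing, {"cpu_m": 250, "memory_mib": 1024}))
```

In the first call the CPU requests would add up to 4000m against 3900m allocatable, so the Pod does not fit even if the node is mostly idle. Lowering the request to 250m, or freeing committed requests, lets it through.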
kubectl cordon <node>    # cordon first so no new Pods land on the node
# After troubleshooting or maintenance is complete
kubectl uncordon <node>