k8s☞17-2 Scheduling: Affinity and Anti-Affinity


Basics

The simplest scheduling mechanism is nodeSelector: just add it to the Pod spec, and the Pod will be scheduled onto a node that carries the corresponding labels. For example:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
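
For the nodeSelector above to match anything, a node must actually carry the disktype=ssd label. A minimal sketch of adding it, assuming a node named node-1 (hypothetical name):

# Label the node so the nodeSelector disktype: ssd can match it
kubectl label nodes node-1 disktype=ssd
# Verify which labels each node carries
kubectl get nodes --show-labels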

This approach is rather rigid: if no node satisfies the condition, the Pod gets stuck in Pending and cannot be scheduled.


Hence the affinity and anti-affinity policies.

For nodes, the affinity policy is nodeAffinity: if a node matching the condition exists, the Pod is scheduled onto that node.
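
A minimal sketch of what nodeAffinity looks like in a Pod spec, here in its soft (preferred) form; the disktype=ssd label is only illustrative, and the hard (required) form is shown in the case at the end of this post:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-node-affinity
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution: # soft rule: prefer ssd nodes, but fall back to any node
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent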

For Pods, affinity and anti-affinity are podAffinity and podAntiAffinity respectively: if a node already runs a Pod matching the condition, the new Pod is scheduled onto that node (anti-affinity: is not scheduled onto it).

Pod affinity and anti-affinity also involve a concept called `topologyKey`. A `topologyKey` partitions the nodes into topology domains, and the rules then mean:

Affinity: within each topology domain, if a Pod matching the condition already exists, this Pod may be scheduled into that domain.

Anti-affinity: within each topology domain, if a Pod matching the condition already exists, this Pod must not be scheduled into that domain.
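
For example, switching topologyKey from kubernetes.io/hostname (one domain per node) to topology.kubernetes.io/zone makes each availability zone a domain. A sketch, assuming the nodes carry the standard zone label:

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "topology.kubernetes.io/zone" # at most one app=store Pod per zone, instead of per node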

An Example

Goal:

Each node (the topology domain defined by topologyKey) runs exactly one web-server Pod and one redis Pod.

The redis configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity: # anti-affinity: topology domains are split by node name; if a Pod with app=store already exists in a domain, this Pod must not be scheduled into that domain
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine

The web-server configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity: # anti-affinity: topology domains are split by node name; if a Pod with app=web-store already exists in a domain, this Pod must not be scheduled into that domain
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
        podAffinity: # affinity: topology domains are split by node name; if a Pod with app=store already exists in a domain, this Pod may be scheduled into that domain
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.12-alpine
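
Once both Deployments are applied, the spread can be checked with kubectl; on a three-node cluster each node should end up with one redis-cache Pod and one web-server Pod:

# Show which node each Pod landed on
kubectl get pods -o wide -l 'app in (store, web-store)'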

💛 If there are more Pods than nodes and they should still be spread as evenly as possible, the Pod anti-affinity should use preferredDuringSchedulingIgnoredDuringExecution instead. This ensures that even after every node already runs one such Pod, additional Pods can still be placed.

    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - web-store
              topologyKey: "kubernetes.io/hostname"

The Four Scheduling Policy Types

Policies currently supported by nodeAffinity, podAffinity and podAntiAffinity:

requiredDuringSchedulingIgnoredDuringExecution (hard: the rule must hold at scheduling time; Pods already running are left alone if it later breaks)

preferredDuringSchedulingIgnoredDuringExecution (soft: the scheduler tries to satisfy the rule but schedules the Pod anyway if it cannot)

💖 The policies can be combined (a combined sketch follows this list)

Policies that may be supported in the future (the *RequiredDuringExecution variants, which would also evict running Pods once the rule breaks):

requiredDuringSchedulingRequiredDuringExecution

preferredDuringSchedulingRequiredDuringExecution
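
A minimal sketch of combining a hard rule with a soft rule in a single nodeAffinity block (the kubernetes.io/os and disktype labels are only illustrative): the Pod may only land on a Linux node, and among those the scheduler prefers nodes with disktype=ssd:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution: # hard: only Linux nodes are candidates
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
          preferredDuringSchedulingIgnoredDuringExecution: # soft: among the candidates, prefer ssd nodes
          - weight: 50
            preference:
              matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd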

Case

Dedicated nodes for the production service group

# Add a taint so that non-prod workloads cannot be scheduled onto this node
kubectl taint nodes k8s001 dedicated=prod:NoSchedule
# Add the matching label to the node
kubectl label nodes k8s001 dedicated=prod

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    dedicated: prod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution: # requiredDuringSchedulingRequiredDuringExecution is not implemented in Kubernetes, so the supported Ignored variant is used
        nodeSelectorTerms:
        - matchExpressions:
          - key: dedicated
            operator: In
            values: 
            - prod
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "prod"
    effect: "NoSchedule"
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent

tolerations ensures that the nginx Pod may be scheduled onto nodes carrying the dedicated=prod:NoSchedule taint, and k8s001 carries that taint.

affinity.nodeAffinity ensures that the Pod must be scheduled onto a node carrying the dedicated=prod label, and k8s001 carries that label. Because the rule is IgnoredDuringExecution, it only applies at scheduling time: if a node's dedicated label later stops being prod, Pods already running on it are not evicted or rescheduled.
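
The taint and label on k8s001 can be double-checked before applying the Pod:

# Confirm the taint and the label are both in place on k8s001
kubectl describe node k8s001 | grep -i taints
kubectl get node k8s001 --show-labels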