Update replicas with the K8s HorizontalPodAutoscaler, and set the determination interval and increase/decrease limits
The Kubernetes HorizontalPodAutoscaler (HPA) updates the replicas of the scaleTargetRef resource based on metrics.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: testapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: testapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
If replicas is included in the Deployment’s manifest, it may be overwritten when the manifest is applied, so it is better not to include it. However, note that if you delete replicas and apply again, replicas will be updated to the default value 1 by the merge patch with kubectl.kubernetes.io/last-applied-configuration.
If you want to use resource metrics such as cpu, you need to install the Metrics Server.
// $ kubectl top node
// error: Metrics API not available
cluster.addHelmChart('metrics-server', {
  chart: 'metrics-server',
  release: 'metrics-server',
  repository: 'https://kubernetes-sigs.github.io/metrics-server',
  namespace: 'kube-system',
  createNamespace: false,
  wait: true
})
Utilization is calculated from the total container usage divided by the requested resource amount, and replicas is determined by the following formula. For example, if averageUtilization: 70 and the average CPU utilization is 140%, replicas will be set to double the current value.
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
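The formula above can be sketched in TypeScript as follows (the function is a hypothetical helper written for illustration, not part of any HPA API):

```typescript
// Sketch of the HPA replica calculation.
function desiredReplicas(
  currentReplicas: number,
  currentMetricValue: number, // e.g. average CPU utilization (%)
  desiredMetricValue: number  // e.g. the averageUtilization target (%)
): number {
  return Math.ceil(currentReplicas * (currentMetricValue / desiredMetricValue))
}

// averageUtilization: 70 with an average usage of 140% doubles the replicas
console.log(desiredReplicas(3, 140, 70)) // 6
```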
If you use instance types across multiple generations, CPU usage can vary among Pods, and scaling on the average may not work well. In that case, you can refer to external metrics provided by KEDA, etc.
If the load comes from requests, it would also be good to change the routing algorithm of the ALB to least outstanding requests (LOR).
Install AWS Load Balancer Controller on EKS cluster and set up ALB Ingress - sambaiz-net
If there are multiple metrics, the largest number of desired replicas among them will be used, and if there is a metric that can’t be obtained, scale down won’t occur.
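A rough sketch of how multiple metrics combine, reusing the formula from above (the function and its shape are hypothetical, for illustration only):

```typescript
// Sketch: with multiple metrics, the HPA computes a desired replica
// count per metric and applies the largest one.
function replicasFromMetrics(
  currentReplicas: number,
  metrics: { current: number; target: number }[]
): number {
  const proposals = metrics.map((m) =>
    Math.ceil(currentReplicas * (m.current / m.target))
  )
  return Math.max(...proposals)
}

// CPU would allow halving, but memory wants to double: the maximum wins
console.log(
  replicasFromMetrics(4, [
    { current: 35, target: 70 },  // proposes 2
    { current: 140, target: 70 }  // proposes 8
  ])
) // 8
```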
The determination interval is the --horizontal-pod-autoscaler-sync-period of kube-controller-manager, and the default is 15 seconds. To make scaling react earlier, I would like to make it as short as possible, but at present it can’t be changed on EKS, where the control plane is not accessible. In any case, if there are not enough resources, pods will not start, so it is also necessary to make it easier for pods to acquire resources using PriorityClass, and to speed up node scaling.
Priority of K8s pods and preemption - sambaiz-net
Install Karpenter on an EKS cluster with CDK to auto-scale flexibility and quickly - sambaiz-net
You can set upper limits on the rate of increase/decrease in behavior, and the defaults are the following settings. periodSeconds: 15 and value: 100 in the Percent policy mean that the number of pods will increase/decrease by at most 100% of the current number of pods every 15 seconds. If there are multiple policies, selectPolicy selects which value will be applied, and the default is Max. stabilizationWindowSeconds is a value to avoid flapping, where the number of pods becomes unstable due to frequent metric updates: the maximum value during this period will be used.
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
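The effect of the default scaleUp policies can be sketched as follows, assuming the default values above (the function is hypothetical, written only to show how selectPolicy: Max combines the two policies):

```typescript
// Sketch: per 15-second period, the Percent policy allows +100% and the
// Pods policy allows +4; selectPolicy: Max picks the more permissive cap.
function scaleUpLimit(currentReplicas: number): number {
  const byPercent = currentReplicas * 2 // Percent, value: 100 -> at most double
  const byPods = currentReplicas + 4    // Pods, value: 4 -> at most +4 pods
  return Math.max(byPercent, byPods)
}

// With few pods the Pods policy dominates; with many, the Percent policy does
console.log(scaleUpLimit(2))  // 6
console.log(scaleUpLimit(10)) // 20
```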