Exported From Confluence (Fri, 29 Mar 2024)
The CaaS sub-system provides a single performance management solution, exposed to applications through the HPA (Horizontal Pod Autoscaler) API of the Kubernetes platform. Applications can use both core metrics and custom metrics to scale themselves horizontally. The former are based on CPU and memory usage, while the latter can use practically any metric the developer exposes to the API Aggregator via an HTTP server.
The following sections give a short overview of the components of the performance management and elasticity sub-system of CaaS.
The key difference between core and custom metrics is that core metrics cover only CPU and memory, whereas custom metrics can cover practically any kind of metric. In the first case Kubernetes offers the metrics out of the box; in the second case users have to implement the metrics provider HTTP server themselves.
Note that in both solutions the database behind the performance management system is not persistent; metric values are stored in an in-memory time-series database.
~]$ kubectl api-versions
...
custom.metrics.k8s.io/v1beta1
...
metrics.k8s.io/v1beta1
...
~]$ kubectl top node
NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
172.24.16.104   1248m        62%    5710Mi          74%
172.24.16.105   1268m        63%    5423Mi          71%
172.24.16.107   1215m        60%    5191Mi          68%
172.24.16.112   253m         6%     846Mi           11%
The printout shows the node names (here the IP addresses of the nodes), CPU usage both in millicores and as a percentage of the node's CPU capacity (two CPUs for the first nodes in the example), and memory usage both in Mi (MiB) and as a percentage.
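The relationship between the millicore readings and the percentages above can be verified with a short calculation. This is only a sketch; the 2-CPU (2000m) capacity is taken from the example above, and the helper names are illustrative:

```python
# Convert kubectl-style CPU readings and check them against node capacity.
# Assumes a 2-CPU node (2000m), as described for the example above.

def millicores_to_cores(value: str) -> float:
    """Parse a CPU reading such as '1248m' into cores."""
    return int(value.rstrip("m")) / 1000.0

def usage_percent(used: float, capacity: float) -> int:
    """Usage as a whole percentage, as `kubectl top` reports it."""
    return round(used / capacity * 100)

cpu = millicores_to_cores("1248m")   # 1.248 cores
print(usage_percent(cpu, 2.0))       # 62, matching CPU% in the printout
```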
~]$ kubectl top pod --namespace=kube-system | grep elasticsearch
NAME                     CPU(cores)   MEMORY(bytes)
elasticsearch-data-0     71m          1106Mi
elasticsearch-data-1     65m          1114Mi
elasticsearch-data-2     75m          1104Mi
elasticsearch-master-0   4m           1068Mi
elasticsearch-master-1   7m           1076Mi
elasticsearch-master-2   3m           1075Mi
Console output shows pod names and their CPU and memory consumption in t= he same format.
When using custom metrics, the developer has to expose the metrics from the application in Prometheus format. Prometheus client libraries are available for Go, Python and other languages for creating the HTTP server and the metrics for this purpose.
from prometheus_client import start_http_server, Histogram
import random
import time

function_exec = Histogram('function_exec_time',
                          'Time spent processing a function',
                          ['func_name'])

def func():
    if random.random() < 0.02:
        time.sleep(2)
        return
    time.sleep(0.2)

start_http_server(9100)
while True:
    start_time = time.time()
    func()
    function_exec.labels(func_name="func").observe(time.time() - start_time)
This application imports start_http_server and the Histogram metric type from the Prometheus client library and measures the execution time of the func() function. Prometheus can scrape the resulting metrics from port 9100.
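The exposition format itself is plain text, so the same idea can be sketched with only the Python standard library. The metric names below are illustrative, not part of the example above; real applications should prefer the prometheus_client library:

```python
# Minimal sketch of the Prometheus text exposition format using only
# the Python standard library -- illustrative, not a replacement for
# the prometheus_client library used in the example above.
from http.server import BaseHTTPRequestHandler, HTTPServer

def render_metrics(requests_total: int, uptime_seconds: float) -> str:
    """Render a counter and a gauge in the Prometheus text format."""
    return (
        "# HELP app_requests_total Total HTTP requests served.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {requests_total}\n"
        "# HELP app_uptime_seconds Seconds since the process started.\n"
        "# TYPE app_uptime_seconds gauge\n"
        f"app_uptime_seconds {uptime_seconds:.1f}\n"
    )

class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the rendered metrics on any GET request."""
    def do_GET(self):
        body = render_metrics(1, 0.0).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

# To expose the endpoint on port 9100, as in the example above:
#   HTTPServer(("", 9100), MetricsHandler).serve_forever()
```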
~]$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "pods/go_memstats_heap_released_bytes",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "jobs.batch/http_requests",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
The command lists the custom metrics available in the system; each metric can then be queried individually for more details:
~]$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/*/http_requests" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/%2A/http_requests"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "kube-system",
        "name": "podinfo-bd494c88d-lmt2j",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2019-02-14T10:21:19Z",
      "value": "898m"
    },
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "kube-system",
        "name": "podinfo-bd494c88d-lxng7",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2019-02-14T10:21:19Z",
      "value": "898m"
    }
  ]
}

~]$ curl http://$(kubectl get service podinfo --namespace=kube-system -o jsonpath='{ .spec.clusterIP }'):9898/metrics
…
http_request_duration_seconds_bucket{method="GET",path="healthz",status="200",le="0.005"} 2040
http_request_duration_seconds_bucket{method="GET",path="healthz",status="200",le="0.01"} 2040
http_request_duration_seconds_bucket{method="GET",path="healthz",status="200",le="0.025"} 2040
http_request_duration_seconds_bucket{method="GET",path="healthz",status="200",le="0.05"} 2072
http_request_duration_seconds_bucket{method="GET",path="healthz",status="200",le="0.1"} 2072
http_request_duration_seconds_bucket{method="GET",path="healthz",status="200",le="0.25"} 2072
…
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{status="200"} 5593
…
The cURL command is a plain HTTP request; it shows the custom metrics exposed by the HTTP server of an application running in a Kubernetes pod.
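The value "898m" returned by the custom metrics API is a Kubernetes milli-quantity, i.e. roughly 0.9 (here likely requests per second, as the metric is a rate). A sketch of decoding such values (the function name is illustrative, and only the plain and "m" forms are handled; the full Kubernetes quantity grammar also has suffixes such as Ki, Mi, k, M):

```python
# Decode milli-suffixed quantity strings as returned by the custom
# metrics API, e.g. "898m" -> 0.898. Sketch only: the real Kubernetes
# quantity grammar supports further binary/decimal suffixes.

def parse_quantity(q: str) -> float:
    if q.endswith("m"):
        return int(q[:-1]) / 1000.0
    return float(q)

print(parse_quantity("898m"))   # 0.898
print(parse_quantity("10"))     # 10.0
```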
php-apache-hpa.yml

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: php-apache-deployment
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 50
In this example the HPA scales based on the CPU consumption of php-apache-deployment. The minimum number of replicas is one and the maximum is five. The HPA scales the deployment out when average CPU utilization rises above 50%, and scales it back in when utilization drops below that target.
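The scale-out/scale-in decision follows the HPA's documented formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. A sketch of that calculation (the function name is illustrative):

```python
import math

def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """HPA replica calculation: ceil(current * metric/target), clamped."""
    desired = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, desired))

# Two pods at 90% average CPU against the 50% target -> scale out to 4.
print(desired_replicas(2, 90, 50, 1, 5))   # 4
# Two pods at 20% -> scale in, clamped at minReplicas=1.
print(desired_replicas(2, 20, 50, 1, 5))   # 1
```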
podinfo-hpa-custom.yaml

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
  namespace: kube-system
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: http_requests
      targetAverageValue: 10
In the second example the HPA uses a custom metric to manage performance. The podinfo application contains an HTTP server which exposes its metrics in Prometheus format. The minimum number of replicas is two and the maximum is ten. The custom metric is the rate of HTTP requests (http_requests) averaged over the pods, as reported by the exposed metrics.
~]$ kubectl create -f podinfo-hpa-custom.yaml --namespace=kube-system
Starting a core-metrics HPA works with the same command, using the corresponding manifest file.
~]$ kubectl describe hpa podinfo --namespace=kube-system
Name:               podinfo
Namespace:          kube-system
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Tue, 19 Feb 2019 10:08:21 +0100
Reference:          Deployment/podinfo
Metrics:            ( current / target )
  "http_requests" on pods:  901m / 10
Min replicas:       2
Max replicas:       10
Deployment pods:    2 current / 2 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric http_requests
  ScalingLimited  True    TooFewReplicas    the desired replica count is increasing faster than the maximum scale rate
Events:             <none>
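The replica count in the describe output is consistent with the HPA formula desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. A quick check against the figures above (2 current pods, an average http_requests of 901m, a target of 10, minReplicas 2, maxReplicas 10):

```python
import math

# 2 current pods at an average http_requests of 901m (0.901) vs a target of 10.
raw = math.ceil(2 * 0.901 / 10)   # raw recommendation: 1 replica
desired = max(2, min(10, raw))    # clamped to minReplicas=2
print(desired)                    # 2, matching "2 current / 2 desired"
```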
Note that the HPA API supports scaling based on both core and custom metrics within the same HPA object.
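As a sketch of what such a combined object could look like (illustrative names, assuming the autoscaling/v2beta1 API used earlier), a single HPA can list a Resource metric and a Pods metric side by side; the controller then scales to the highest replica count proposed by any of its metrics:

```yaml
# Illustrative only: one HPA combining a core (Resource) metric and a
# custom (Pods) metric for the same target deployment.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo-combined
  namespace: kube-system
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Pods
    pods:
      metricName: http_requests
      targetAverageValue: 10
```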
Example Kubernetes manifests for testing with custom metrics: