To send the node PATCH request conveniently, we first start kubectl proxy temporarily, then add six intel.com/devices resources to the capacity of a node (in the JSON Patch path, ~1 is the escape sequence for / and is decoded automatically):

curl --header "Content-Type: application/json-patch+json" \
--request PATCH \
--data '[{"op": "add", "path": "/status/capacity/intel.com~1devices", "value": "6"}]' \
http://localhost:8001/api/v1/nodes/<your-node-name>/status
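The ~1 escaping used in the path above comes from the JSON Pointer rules (RFC 6901), where ~ is written as ~0 and / as ~1. As a minimal sketch, assuming you want to generate the patch body programmatically, the helper names below are hypothetical and not part of any Kubernetes tooling:

```python
import json


def escape_json_pointer_token(token: str) -> str:
    """Escape a single JSON Pointer reference token (RFC 6901)."""
    # "~" must be escaped before "/", otherwise the "~" introduced by
    # "~1" would itself be escaped a second time.
    return token.replace("~", "~0").replace("/", "~1")


def build_capacity_patch(resource_name: str, quantity: int) -> str:
    """Build a JSON Patch body that adds an extended resource to node capacity."""
    path = "/status/capacity/" + escape_json_pointer_token(resource_name)
    # The quantity is sent as a string, matching the curl example.
    return json.dumps([{"op": "add", "path": path, "value": str(quantity)}])


if __name__ == "__main__":
    print(build_capacity_patch("intel.com/devices", 6))
```

The printed string can be passed to curl as the --data argument.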

The node now advertises six intel.com/devices extended resources, which we can see in the node description:

kubectl describe node xxx
...
Capacity:
 ephemeral-storage:  3650656984Ki
 cpu:                72
 memory:             263895388Ki
 intel.com/devices:  6
 pods:               110
...

We can now use these resources in a pod by adding intel.com/devices: "1" to spec.containers[].resources.requests and limits. The scheduler then accounts for the resource when placing the pod, so it is only scheduled onto a node with enough free intel.com/devices.
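For illustration, a pod requesting one such device might look like the manifest generated below. This is only a sketch: the pod name, container name, image and command are hypothetical placeholders. For extended resources, requests and limits must be equal when both are given, so the same quantity is used for both.

```python
import json


def pod_requesting(resource_name: str, count: int) -> dict:
    """Return a minimal Pod manifest that requests an extended resource."""
    quantity = str(count)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "extended-resource-demo"},  # hypothetical name
        "spec": {
            "containers": [
                {
                    "name": "demo",  # hypothetical container name
                    "image": "busybox",  # hypothetical image
                    "command": ["sleep", "3600"],
                    "resources": {
                        # Extended resources cannot be overcommitted,
                        # so requests and limits carry the same value.
                        "requests": {resource_name: quantity},
                        "limits": {resource_name: quantity},
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(pod_requesting("intel.com/devices", 1), indent=2))
```

Since kubectl accepts JSON manifests, the output can be piped into kubectl apply -f - to create the pod.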

To clean up the extended resource, execute the following command:

curl --header "Content-Type: application/json-patch+json" \
--request PATCH \
--data '[{"op": "remove", "path": "/status/capacity/intel.com~1devices"}]' \
http://localhost:8001/api/v1/nodes/<your-node-name>/status

Device plugin

Overview

Kubernetes provides vendors with a mechanism called device plugins to accomplish the following three tasks:

  • advertise devices.
  • monitor devices (currently perform health checks).
  • hook into the runtime to execute device-specific instructions (e.g. clean GPU memory) and take the steps needed to make the device available in the container.

Device plugins are simple gRPC servers that may run in a container deployed through the pod mechanism or in bare metal mode. They implement the following service:

service DevicePlugin {
	// returns a stream of []Device
	rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
	rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
}


[draw.io diagram: device_plugin]

...

  • Very few devices are handled natively by Kubelet (cpu and memory)
  • Need a sustainable solution for vendors to be able to advertise their resources to Kubelet and monitor them without writing custom Kubernetes code
  • Need a consistent and portable way to consume hardware devices of a particular type (GPU, QAT, FPGA, etc.) in pods across k8s clusters
  • ...

How it works

In Kubernetes, kubelet offers a registration gRPC server that allows a device plugin to register itself with kubelet. When registering, the device plugin gives kubelet the following information:

  1. Its own Unix socket name, on which it will receive requests from kubelet through the gRPC APIs.
  2. The API version of the device plugin itself.
  3. The resource name offered by the device plugin. The resource name must follow a specified format, such as intel.com/qat.
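The three pieces of registration information can be pictured as a small record. The sketch below is an assumption-laden illustration, not the kubelet API: the class name, the field values "qat.sock" and "v1beta1", and the simplified vendor-domain/resource check are all hypothetical, and the real validation done by Kubernetes is stricter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """What a device plugin reports to kubelet when it registers."""
    endpoint: str       # name of the plugin's own Unix socket
    version: str        # device plugin API version
    resource_name: str  # e.g. "intel.com/qat"


def looks_like_vendor_resource(name: str) -> bool:
    """Simplified check for the "vendor-domain/resource" shape.

    Assumption: exactly one "/" with non-empty text on both sides.
    """
    domain, sep, resource = name.partition("/")
    return bool(sep) and bool(domain) and bool(resource) and "/" not in resource


if __name__ == "__main__":
    # Example values are placeholders for illustration only.
    reg = Registration(endpoint="qat.sock", version="v1beta1",
                       resource_name="intel.com/qat")
    print(reg, looks_like_vendor_resource(reg.resource_name))
```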

After successful registration, kubelet calls the device plugin's ListAndWatch function. ListAndWatch lets kubelet discover the devices and their properties, and be notified of any status change (for example, a device becoming unhealthy).
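The streaming behaviour of ListAndWatch can be sketched without gRPC as a generator that sends the full device list once and then again whenever it changes. This is a simplified model under stated assumptions: the function name, the (device ID, health) tuples and the polled snapshots are hypothetical stand-ins for the real stream of ListAndWatchResponse messages.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Device = Tuple[str, str]  # (device ID, health state)


def list_and_watch(snapshots: Iterable[Dict[str, str]]) -> Iterator[List[Device]]:
    """Yield the device list initially and after every change.

    Each snapshot maps a device ID to its current health, as a plugin
    might observe it when polling the hardware.
    """
    last = None
    for snapshot in snapshots:
        current = sorted(snapshot.items())
        if current != last:
            # A real plugin would send a ListAndWatchResponse here.
            yield current
            last = current


if __name__ == "__main__":
    polls = [
        {"dev0": "Healthy", "dev1": "Healthy"},
        {"dev0": "Healthy", "dev1": "Healthy"},    # unchanged: nothing is sent
        {"dev0": "Healthy", "dev1": "Unhealthy"},  # change: list is sent again
    ]
    for update in list_and_watch(polls):
        print(update)
```

Sending the whole list on every change, rather than a delta, keeps the receiver simple: it only has to replace its view of the devices with the latest message.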

Enable QAT supported by virtlet

Bug detection in source code


Fix


Example