Kubevirt

What is KubeVirt?

KubeVirt is a virtual machine management add-on for Kubernetes, implemented with Custom Resource Definitions (CRDs). Its aim is to provide a common ground for virtualization solutions on top of Kubernetes.

...

  • The virt-controller and the virt-handler monitor node status through the watch interface provided by the API server.
  • A client posts a new VMI definition, similar to a plain pod definition, to the K8s API Server.
  • The K8s API Server validates the input and creates a VMI custom resource object.
  • The virt-controller observes the creation of the new VMI object and creates a corresponding pod.
  • Kubernetes schedules the pod on a host.
  • The virt-controller observes that a pod for the VMI got started and updates the nodeName field in the VMI object. 
  • The virt-handler (DaemonSet) observes that a VMI has been assigned to the host on which it is running.
  • The virt-handler uses the VMI specification and signals the creation of the corresponding domain using a libvirtd instance in the VMI's pod.
  • A client deletes the VMI object through the virt-api-server.
  • When the Kubernetes runtime attempts to shut down the virt-launcher pod before the VM/VMI has exited, the virt-launcher forwards signals from Kubernetes to the VM/VMI process and attempts to hold off the termination of the pod until the VM/VMI has shut down successfully.
  • The virt-handler observes the deletion and turns off the domain.

QAT

Brief Introduction

Intel QuickAssist Technology (QAT) is developed by Intel and runs on Intel Architecture to provide security and compression acceleration that improves performance and efficiency. It offloads workloads such as cryptography and compression from the CPU to dedicated hardware. Server, networking, big data, and storage applications use Intel QuickAssist to offload compute-intensive operations, such as:

  • Symmetric cryptography functions, including cipher operations and authentication operations
  • Public key functions, including RSA, Diffie-Hellman, and elliptic curve cryptography
  • Compression and decompression functions, including DEFLATE

QAT has brought benefits in many areas, such as Hadoop acceleration, OpenSSL integration, and SDN and NFV solutions, for example:

  • 4G LTE and 5G encryption algorithm offload for mobile gateways and infrastructure.
  • VPN traffic acceleration, with up to 50 Gbps crypto throughput and support for IPsec and SSL acceleration.
  • Compression/decompression with up to 24 Gbps throughput.
  • I/O virtualization using PCI-SIG Single-Root I/O Virtualization (SR-IOV).

Hardware

  • Chipset: Intel® C6xx Series Chipsets
  • PCIe: Intel® QuickAssist Adapter 89xx
  • SoC: Intel Atom® Processor C3000 Series (Denverton NS) / Rangeley

Features

  • QAT provides security (encryption) HW acceleration and compression HW acceleration
  • QAT makes use of a set of APIs to abstract out the hardware, so the same application can run on multiple generations of QAT hardware
  • Customers can also make use of patches that QAT has provided to popular open source software, so they can minimize or eliminate their effort to learn the API

With this support, QAT makes it easier for developers to integrate the accelerators into their designs and thus reduces development time. It can increase business flexibility by offering solutions that best fit changing business requirements. It also frees up valuable processor cycles, allowing the processor to perform value-added functions.

What's more, QAT provides a uniform means of communication between accelerators, applications, and acceleration technologies, so resources are managed more productively. It can then boost application throughput by reducing the demand on the platform and maximizing CPU utilization.

Gaps

First of all, to use a QAT card in a Kubernetes cluster and assign one of its virtual functions (VFs) to a container, we need to compile the QAT driver on the host and deploy the QAT device plugin. After the QAT device plugin registers successfully, the kubelet calls its ListAndWatch function to discover the devices and their properties and to be notified of any status change (for example, a device becoming unhealthy). The device list is returned as an array describing every device of the resource (ID, health status). The kubelet records the resource and its device count in node.status.capacity/allocatable and reports them to the API server.
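The reporting step can be sketched in Go as follows. This is only an illustration under stated assumptions: Device and countDevices are hypothetical stand-ins, not the real kubelet device plugin API types, and the PCI addresses are made up. Each value sent on the channel plays the role of one ListAndWatch response.

```go
package main

import "fmt"

// Device is a hypothetical stand-in for the device description (ID, health)
// that a device plugin reports to the kubelet. It is not the real API type.
type Device struct {
	ID      string // for example, the PCI address of a QAT VF
	Healthy bool
}

// countDevices mimics how capacity (all reported devices) and allocatable
// (healthy devices only) can be derived from one reported device list.
func countDevices(devs []Device) (capacity, allocatable int) {
	for _, d := range devs {
		capacity++
		if d.Healthy {
			allocatable++
		}
	}
	return capacity, allocatable
}

func main() {
	// Each slice stands for one ListAndWatch response: the full device
	// list, sent again whenever the status of a device changes.
	updates := make(chan []Device, 2)
	updates <- []Device{{"0000:3d:02.1", true}, {"0000:3d:02.2", true}}
	updates <- []Device{{"0000:3d:02.1", true}, {"0000:3d:02.2", false}}
	close(updates)

	for devs := range updates {
		capacity, allocatable := countDevices(devs)
		fmt.Printf("intel.com/qat capacity=%d allocatable=%d\n", capacity, allocatable)
	}
}
```

The point of the sketch is that an unhealthy device stays in the capacity count but no longer counts as allocatable.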

In this way, when creating a plain pod, a resource such as intel.com/qat: "1" can be added under spec.containers[].resources.limits/requests to tell Kubernetes to schedule the pod onto a node with at least one allocatable intel.com/qat resource. When the pod is about to run, the kubelet calls the device plugin's Allocate function. The device plugin may perform some initialization, such as QAT configuration or QRNG initialization. If initialization succeeds, the function returns the configuration needed to expose the assigned device to the pod, and this configuration is passed to the container runtime as a parameter when the container is created.
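The scheduling condition described above amounts to a simple comparison between what the pod requests and what a node still has available. A minimal Go sketch, assuming whole-device counts only; ResourceList, fits, and the node names are hypothetical and are not the Kubernetes scheduler's own types:

```go
package main

import "fmt"

// ResourceList maps an extended resource name to a whole-device count.
// It is a simplified, hypothetical stand-in for the Kubernetes type.
type ResourceList map[string]int

// fits reports whether a node with the given free resources can host
// a pod with the given requests.
func fits(free, requests ResourceList) bool {
	for name, want := range requests {
		if free[name] < want {
			return false
		}
	}
	return true
}

func main() {
	// The pod asks for one QAT VF.
	requests := ResourceList{"intel.com/qat": 1}

	nodes := map[string]ResourceList{
		"node-a": {"intel.com/qat": 0},
		"node-b": {"intel.com/qat": 2},
	}
	for _, name := range []string{"node-a", "node-b"} {
		fmt.Printf("%s fits: %v\n", name, fits(nodes[name], requests))
	}
}
```

A node that does not advertise the resource at all behaves like a node with zero devices, because a missing map key reads as 0.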

This workflow runs well in Kubernetes, but KubeVirt does not support it. KubeVirt is a CRD implementation, so it does not process pod-type configuration in its YAML file; it has its own API server and controller that verify the custom resource definition and create a corresponding pod. This means we cannot assign a QAT VF to a KubeVirt VM by adding spec.containers[].resources.limits/requests with the QAT resource name to the VMI configuration file. The gaps to close in order to enable QAT in KubeVirt are therefore:

  1. Create a corresponding custom resource to hold the QAT device.
  2. Make the QAT feature optional and configurable through a ConfigMap.
  3. When assigning a QAT VF to a VM, have KubeVirt mount the required PCI device into it.
  4. Use ordinary device passthrough to assign the device.
  5. Change virt-api so that it validates and accepts a VMI that requests QAT.
  6. Provide an example YAML file that creates a VMI with QAT.
  7. Add test cases.

Integration

CRD Definition

KubeVirt uses the Kubernetes feature named Dynamic Admission Control and validates KubeVirt API objects through a ValidatingAdmissionWebhook. This feature allows KubeVirt to dynamically register an HTTPS webhook with Kubernetes at KubeVirt install time. After registering the custom webhook, all requests related to KubeVirt API objects are forwarded from the Kubernetes API server to our HTTPS endpoint for validation. If our endpoint rejects a request for any reason, the object will not be persisted into etcd and the client receives our response outlining the reason for the rejection.

So, to enable QAT in KubeVirt, it is necessary to add the related fields to the validation service and add a check for the QAT feature gate.

  • Add the necessary information to swagger.json, which describes the KubeVirt add-on to the Kubernetes API

...

"qats": {
    "description": "Whether to assign a QAT vf device to the vmi.\n+optional",
    "type": "array",
    "items": {
        "$ref": "#/definitions/v1.QAT"
    }
},

...

"v1.QAT": {
    "required": [
        "name",
        "deviceName"
    ],
    "properties": {
        "deviceName": {
            "type": "string"
        },
        "name": {
            "description": "Name of the QAT device as exposed by a device plugin",
            "type": "string"
        }
    }
},
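On the Go side, the schema above would correspond to a small struct with matching JSON tags. The following is a sketch, not the code from the patch: the type names and parseDevices are assumptions made for illustration, and the required-field check simply mirrors the "required" list in the schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QAT mirrors the v1.QAT schema: both fields are required.
// The Go type itself is a hypothetical illustration.
type QAT struct {
	Name       string `json:"name"`
	DeviceName string `json:"deviceName"`
}

// Devices holds only the "qats" array relevant to this example.
type Devices struct {
	QATs []QAT `json:"qats,omitempty"`
}

// parseDevices decodes a devices block and enforces the required fields.
func parseDevices(raw []byte) (Devices, error) {
	var d Devices
	if err := json.Unmarshal(raw, &d); err != nil {
		return Devices{}, err
	}
	for i, q := range d.QATs {
		if q.Name == "" || q.DeviceName == "" {
			return Devices{}, fmt.Errorf("qats[%d]: name and deviceName are required", i)
		}
	}
	return d, nil
}

func main() {
	raw := []byte(`{"qats":[{"name":"qat1","deviceName":"qat.intel.com/generic"}]}`)
	d, err := parseDevices(raw)
	if err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Printf("%d QAT device(s), first: %s -> %s\n", len(d.QATs), d.QATs[0].Name, d.QATs[0].DeviceName)
}
```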

  • Add a feature gate check to the webhook validation service so that QAT can be enabled in the ConfigMap

...

if spec.Domain.Devices.QATs != nil && !config.QATPassthroughEnabled() {
    causes = append(causes, metav1.StatusCause{
        Type: metav1.CauseTypeFieldValueInvalid,
        Message: "QAT feature gate is not enabled in kubevirt-config",
        Field: field.Child("QATs").String(),
    })
}
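How config.QATPassthroughEnabled() is implemented is not shown here. A minimal sketch of one way such a check could work, assuming the feature-gates value is a comma-separated list and that gate names are matched exactly; featureGateEnabled and validateQATs are hypothetical helpers:

```go
package main

import (
	"fmt"
	"strings"
)

// featureGateEnabled reports whether gate appears in a comma-separated
// feature-gates value such as "DataVolumes,QAT".
func featureGateEnabled(featureGates, gate string) bool {
	for _, g := range strings.Split(featureGates, ",") {
		if strings.TrimSpace(g) == gate {
			return true
		}
	}
	return false
}

// validateQATs returns a rejection message when a VMI requests QAT
// devices while the QAT feature gate is disabled.
func validateQATs(qatCount int, featureGates string) []string {
	var causes []string
	if qatCount > 0 && !featureGateEnabled(featureGates, "QAT") {
		causes = append(causes, "QAT feature gate is not enabled in kubevirt-config")
	}
	return causes
}

func main() {
	fmt.Println(validateQATs(1, "DataVolumes"))      // rejected: gate missing
	fmt.Println(validateQATs(1, "DataVolumes, QAT")) // accepted: no causes
}
```

Exact matching avoids accepting a gate whose name merely contains "QAT" as a substring.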

...

...

Non-privileged

To meet the Kubernetes and KubeVirt community specifications, the pod should be non-privileged. We therefore mount the assigned QAT PCI device into the VM through the interfaces that KubeVirt provides.

if util.IsQATVMI(vmi) {
    for _, qat := range vmi.Spec.Domain.Devices.QATs {
        requestResource(&resources, qat.DeviceName)
    }
}

This causes KubeVirt to mount the resources needed by QAT, such as /sys/devices/, into the VM.
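The effect of the loop above on the pod's resource requirements can be sketched as follows. This assumes that each QAT entry adds one device to both limits and requests; the types are simplified stand-ins for the Kubernetes ones, and this requestResource is an illustration, not KubeVirt's implementation.

```go
package main

import "fmt"

// ResourceRequirements is a simplified, hypothetical stand-in for the
// Kubernetes type, using plain integer device counts.
type ResourceRequirements struct {
	Limits   map[string]int
	Requests map[string]int
}

// requestResource adds one device of the named resource to both the
// limits and the requests of the pod.
func requestResource(r *ResourceRequirements, name string) {
	if r.Limits == nil {
		r.Limits = map[string]int{}
	}
	if r.Requests == nil {
		r.Requests = map[string]int{}
	}
	r.Limits[name]++
	r.Requests[name]++
}

func main() {
	var resources ResourceRequirements
	// Two QAT entries in the VMI that share one device plugin resource.
	for _, deviceName := range []string{"qat.intel.com/generic", "qat.intel.com/generic"} {
		requestResource(&resources, deviceName)
	}
	fmt.Println(resources.Limits, resources.Requests)
}
```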

PCI passthrough

Fundamentally, KubeVirt creates a Kubernetes CRD to hold the resource configuration that fits a libvirt instance. So assigning a QAT VF to a VM actually means creating a libvirt domain and inserting the specific device into a hostdev block. (In libvirt, the hostdev element means a plain host device assignment with all its limitations.)

<hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x3d' slot='0x02' function='0x2'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</hostdev>

So when a QAT VF resource is required, I call the KubeVirt method to get the device ID (PCI address) of the assigned QAT device and integrate it into the libvirt domain.

// Append HostDev to libvirt DomXML if QAT is required
if util.IsQATVMI(vmi) {
    qatPCIAddresses := append([]string{}, c.QATDevices...)
    hostDevices, err := createHostDevicesFromPCIAddresses(qatPCIAddresses)
    if err != nil {
        log.Log.Reason(err).Error("Unable to parse PCI addresses")
    } else {
        domain.Spec.Devices.HostDevices = append(domain.Spec.Devices.HostDevices, hostDevices...)
    }
}
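The conversion from a PCI address to a hostdev element can be sketched as follows. This is a self-contained illustration, not the body of createHostDevicesFromPCIAddresses: the struct types, the regular expression, and hostDeviceFromPCIAddress are assumptions, and the driver, alias, and guest address elements from the XML example are omitted for brevity.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"regexp"
)

// pciAddrRE matches a full PCI address of the form domain:bus:slot.function,
// for example 0000:3d:02.2.
var pciAddrRE = regexp.MustCompile(`^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$`)

// Address models the <address> element inside <source>.
type Address struct {
	Domain   string `xml:"domain,attr"`
	Bus      string `xml:"bus,attr"`
	Slot     string `xml:"slot,attr"`
	Function string `xml:"function,attr"`
}

// HostDevice models a reduced <hostdev> element for PCI passthrough.
type HostDevice struct {
	XMLName xml.Name `xml:"hostdev"`
	Mode    string   `xml:"mode,attr"`
	Type    string   `xml:"type,attr"`
	Managed string   `xml:"managed,attr"`
	Source  Address  `xml:"source>address"`
}

// hostDeviceFromPCIAddress parses one PCI address into a HostDevice.
func hostDeviceFromPCIAddress(addr string) (HostDevice, error) {
	m := pciAddrRE.FindStringSubmatch(addr)
	if m == nil {
		return HostDevice{}, fmt.Errorf("invalid PCI address %q", addr)
	}
	return HostDevice{
		Mode:    "subsystem",
		Type:    "pci",
		Managed: "yes",
		Source: Address{
			Domain:   "0x" + m[1],
			Bus:      "0x" + m[2],
			Slot:     "0x" + m[3],
			Function: "0x" + m[4],
		},
	}, nil
}

func main() {
	hd, err := hostDeviceFromPCIAddress("0000:3d:02.2")
	if err != nil {
		fmt.Println(err)
		return
	}
	out, err := xml.MarshalIndent(hd, "", "  ")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(out))
}
```

Rejecting malformed addresses up front keeps an unparsable device from producing a broken domain definition.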

...

Example

This example is based on the official KubeVirt example VMI file. To use my QAT-enabled version, after building its Docker images and pushing them to a registry, we should enable the QAT feature gate through the ConfigMap.

apiVersion: v1
kind: ConfigMap
metadata:
    name: kubevirt-config
    namespace: kubevirt
    labels:
        kubevirt.io: ""
data:
    feature-gates: "QAT"

The key part of the QAT VMI definition is:

...

spec:
    domain:
        devices:
            qats:
                - deviceName: qat.intel.com/generic
                  name: qat1

...


Conclusion

The patch has not yet been merged into the KubeVirt project and is currently under review. (PR: https://github.com/kubevirt/kubevirt/pull/2980)