...

  • Virtlet treats every other device bound to the vfio-pci driver as a volume device and adds it to the libvirt domain XML (libvirtxml) as a block disk with a disk driver. This causes VM startup errors.
  • Virtlet binds the network devices after the libvirt domain file is created, and its default hostdev ID numbering starts from 0, so it conflicts with any other type of device we add to the libvirt domain file by PCI passthrough.
  • Virtlet cannot recognize other SR-IOV devices.
  • ...

...

In Kubernetes, Kubelet offers a registration gRPC server that allows a device plugin to register itself with Kubelet. When starting, the device plugin makes a (client) gRPC call to the Register function that Kubelet exposes. The device plugin sends a RegisterRequest describing itself to Kubelet, and Kubelet answers the RegisterRequest with a RegisterResponse containing any error it might have encountered (API version not supported, resource name already registered). The device plugin then starts its gRPC server if it did not receive an error.
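The registration handshake can be sketched as a small in-process model. This is not real gRPC code: fakeKubelet and its error values are hypothetical, and the RegisterRequest fields (Version, Endpoint, ResourceName) are taken from the v1beta1 device plugin API rather than from this page.

```go
package main

import (
	"errors"
	"fmt"
)

// RegisterRequest is a local stand-in that mirrors three fields of the
// device plugin API's RegisterRequest message (v1beta1).
type RegisterRequest struct {
	Version      string // device plugin API version the plugin was built against
	Endpoint     string // name of the plugin's unix socket
	ResourceName string // extended resource name, e.g. "qat.intel.com/generic"
}

// fakeKubelet is a hypothetical model of Kubelet's registration server.
type fakeKubelet struct {
	supportedVersion string
	registered       map[string]string // resource name -> endpoint
}

var (
	errVersion   = errors.New("api version not supported")
	errDuplicate = errors.New("resource name already registered")
)

// Register applies the two checks named in the text and records the plugin.
func (k *fakeKubelet) Register(req RegisterRequest) error {
	if req.Version != k.supportedVersion {
		return errVersion
	}
	if _, ok := k.registered[req.ResourceName]; ok {
		return errDuplicate
	}
	k.registered[req.ResourceName] = req.Endpoint
	return nil
}

func main() {
	k := &fakeKubelet{supportedVersion: "v1beta1", registered: map[string]string{}}
	req := RegisterRequest{Version: "v1beta1", Endpoint: "qat.sock", ResourceName: "qat.intel.com/generic"}
	fmt.Println(k.Register(req)) // first registration
	fmt.Println(k.Register(req)) // same resource name a second time
}
```

A real plugin would only start serving on its own socket after Register returns without an error.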

...

After successful registration, Kubelet calls the ListAndWatch function of the device plugin. ListAndWatch lets Kubelet discover the devices and their properties and be notified of any status change (for example, a device becoming unhealthy). The device list is returned as an array describing every device of the resource (ID, health status). Kubelet records the resource and its device count in node.status.capacity/allocatable and reports it to the API server. The function keeps checking in a loop: once a device becomes abnormal or is unplugged from the machine, it sends the updated device list to Kubelet.
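The streaming behaviour of ListAndWatch can be approximated with Go channels. This is a minimal sketch, not the real gRPC stream: the Device type, the events channel and healthyCount are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Device is a local stand-in for the ID/health pair described above.
type Device struct {
	ID      string
	Healthy bool
}

// snapshot copies the slice so the receiver never sees later mutations.
func snapshot(devs []Device) []Device {
	return append([]Device(nil), devs...)
}

// listAndWatch sends the full device list once, then again after every
// health event, until the events channel is closed.
func listAndWatch(devs []Device, events <-chan Device, out chan<- []Device) {
	defer close(out)
	out <- snapshot(devs)
	for ev := range events {
		for i := range devs {
			if devs[i].ID == ev.ID {
				devs[i].Healthy = ev.Healthy
			}
		}
		out <- snapshot(devs)
	}
}

// healthyCount is what a Kubelet-like consumer could report as allocatable.
func healthyCount(devs []Device) int {
	n := 0
	for _, d := range devs {
		if d.Healthy {
			n++
		}
	}
	return n
}

func main() {
	events := make(chan Device)
	out := make(chan []Device)
	go listAndWatch([]Device{{"vf0", true}, {"vf1", true}}, events, out)

	fmt.Println("healthy:", healthyCount(<-out)) // initial list
	events <- Device{ID: "vf1", Healthy: false}
	fmt.Println("healthy:", healthyCount(<-out)) // list after vf1 turned unhealthy
	close(events)
}
```

Sending the whole list on every change, rather than a delta, matches the description above: the consumer simply replaces its view of the resource.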

In this way, when creating a pod, a resource such as intel.com/qat can be added to spec.containers.resources.limits/requests with the value "1" to tell Kubernetes to schedule the pod onto a node with at least one allocatable intel.com/qat resource. When the pod is about to run, Kubelet calls the device plugin's Allocate function. The device plugin may perform some initialization, such as QAT configuration or QRNG initialization. If initialization succeeds, the function returns how to configure the device assigned to the pod when the container is created, and this configuration is passed to the container runtime as a parameter used to run the container.
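As a rough illustration of what Allocate hands back, the sketch below maps allocated device IDs to environment variables. The response type is a simplified stand-in (the real response can also carry mounts, device nodes and annotations), and the QAT0, QAT1, ... variable names and PCI addresses are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// ContainerAllocateResponse is a simplified, hypothetical stand-in that
// only carries environment variables.
type ContainerAllocateResponse struct {
	Envs map[string]string
}

// allocate looks up the PCI address of each requested device and exposes
// it to the container as QAT<n>=<address> (naming assumed for this sketch).
func allocate(deviceIDs []string, pciAddrs map[string]string) (ContainerAllocateResponse, error) {
	resp := ContainerAllocateResponse{Envs: map[string]string{}}
	for i, id := range deviceIDs {
		addr, ok := pciAddrs[id]
		if !ok {
			return ContainerAllocateResponse{}, fmt.Errorf("unknown device %q", id)
		}
		resp.Envs[fmt.Sprintf("QAT%d", i)] = addr
	}
	return resp, nil
}

func main() {
	// Made-up inventory of QAT virtual functions.
	known := map[string]string{"vf0": "0000:3d:02.0", "vf1": "0000:3d:02.1"}

	resp, err := allocate([]string{"vf1"}, known)
	if err != nil {
		fmt.Println("allocate failed:", err)
		return
	}
	keys := make([]string, 0, len(resp.Envs))
	for k := range resp.Envs {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map order is random, so sort before printing
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, resp.Envs[k])
	}
}
```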

User Flow

To use the extended resource, we add intel.com/qat to spec.containers.resources.limits/requests. We expect the request to have limits == requests.
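The limits == requests expectation can be written down as a small check. The Resources type below is a hypothetical, simplified view of a container's resources block, and the sketch assumes that a missing request defaults to the limit.

```go
package main

import "fmt"

// Resources is a simplified stand-in for a container's resources block,
// keyed by resource name.
type Resources struct {
	Limits   map[string]int64
	Requests map[string]int64
}

// checkExtendedResource verifies that the named extended resource has a
// limit and that an explicit request, if present, equals that limit.
func checkExtendedResource(r Resources, name string) error {
	lim, ok := r.Limits[name]
	if !ok {
		return fmt.Errorf("%s: no limit set", name)
	}
	if req, ok := r.Requests[name]; ok && req != lim {
		return fmt.Errorf("%s: requests (%d) != limits (%d)", name, req, lim)
	}
	return nil
}

func main() {
	r := Resources{
		Limits:   map[string]int64{"intel.com/qat": 1},
		Requests: map[string]int64{"intel.com/qat": 1},
	}
	if err := checkExtendedResource(r, "intel.com/qat"); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("valid")
}
```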

...

Gaps detection in source code

When testing QAT SR-IOV support with the official Virtlet image together with the QAT device plugin, we take the straightforward approach of adding the resource name qat.intel.com/generic, advertised by the QAT device plugin, to the fields spec.containers.resources.limits and spec.containers.resources.requests with the value "1". This works correctly in plain Kubernetes pods. In a Virtlet VM pod, however, we encountered a conflict caused by the way Virtlet translates configuration between pod and virtual machine. The issue is that when a QAT VF device is allocated to a Virtlet VM pod, Kubelet adds the extended device to kubeapi.PodSandboxConfig.Devices (k8s.io/kubernetes/pkg/kubelet/apis/cri/runtime/v1alpha2 - v1.14). Virtlet then incorrectly transforms all of these devices into its volume devices and later treats them as block disks with disk drivers bound to them.

...

After a VM pod starts, this causes errors such as too many disks, disk read failures, and permission denied. Regardless of this, I want to assign a QAT VF to the Virtlet pod by PCI passthrough, so I want to add the corresponding fields to the libvirt domain XML created by Virtlet. Code analysis shows that Virtlet is a CRI implementation, and in its createDomain(config *types.VMConfig) *libvirtxml.Domain (pkg/libvirttools/virtualization.go) I found where the XML is created, using the libvirtxml "github.com/libvirt/libvirt-go-xml" Go module. The whole workflow is now clear and I can fix it.

domain := &libvirtxml.Domain{
        Devices: &libvirtxml.DomainDeviceList{
            Emulator: "/vmwrapper",
            Inputs: []libvirtxml.DomainInput{
                {Type: "tablet", Bus: "usb"},
            },
            Graphics: []libvirtxml.DomainGraphic{
                {VNC: &libvirtxml.DomainGraphicVNC{Port: -1}},
            },
            Videos: []libvirtxml.DomainVideo{
                {Model: libvirtxml.DomainVideoModel{Type: "cirrus"}},
            },
            Controllers: []libvirtxml.DomainController{
                {Type: "scsi", Index: &scsiControllerIndex, Model: "virtio-scsi"},
            },
        },

...


Key Point

Because Virtlet creates a VM through a libvirt instance, we configure the QAT devices in its domain file to complete the QAT device assignment. Virtlet can get the QAT device ID from the environment variables that are advertised by the QAT device plugin and passed on by Kubelet. We can then easily assign a QAT VF device to a Virtlet VM by PCI passthrough, which is supported by the libvirt hostdev API.
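To make the hostdev step concrete, here is a minimal sketch that turns a PCI address string into a libvirt hostdev element. It uses only encoding/xml so that it runs without extra modules; the struct types, the hostdevFromBDF helper and the sample address are illustrative assumptions and are not taken from the Virtlet fork, which builds its domain with the libvirtxml module instead.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"regexp"
)

// address holds the four parts of a PCI address as libvirt attributes.
type address struct {
	Domain   string `xml:"domain,attr"`
	Bus      string `xml:"bus,attr"`
	Slot     string `xml:"slot,attr"`
	Function string `xml:"function,attr"`
}

// hostdev models <hostdev mode='subsystem' type='pci' managed='yes'>.
type hostdev struct {
	XMLName xml.Name `xml:"hostdev"`
	Mode    string   `xml:"mode,attr"`
	Type    string   `xml:"type,attr"`
	Managed string   `xml:"managed,attr"`
	Source  struct {
		Address address `xml:"address"`
	} `xml:"source"`
}

// bdf matches addresses of the form domain:bus:slot.function, e.g. 0000:3d:02.1.
var bdf = regexp.MustCompile(`^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$`)

// hostdevFromBDF builds a PCI passthrough entry from an address string,
// such as one read from an environment variable set by the device plugin.
func hostdevFromBDF(s string) (*hostdev, error) {
	m := bdf.FindStringSubmatch(s)
	if m == nil {
		return nil, fmt.Errorf("not a PCI address: %q", s)
	}
	h := &hostdev{Mode: "subsystem", Type: "pci", Managed: "yes"}
	h.Source.Address = address{
		Domain:   "0x" + m[1],
		Bus:      "0x" + m[2],
		Slot:     "0x" + m[3],
		Function: "0x" + m[4],
	}
	return h, nil
}

func main() {
	h, err := hostdevFromBDF("0000:3d:02.1") // made-up VF address
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := xml.MarshalIndent(h, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Validating the address before building the element keeps a malformed environment value from producing a domain file that libvirt would reject at VM start.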

...

For further code details, see my fork of Virtlet at https://github.com/leyao-daily/virtlet

Example

I have uploaded the QAT-enabled image to Docker Hub, and you can download it with 'docker pull integratedcloudnative/virtlet-qat:test'. Once the Virtlet pod is running, you can set up a VM with a QAT VF device. Add the orange line, with the number of QAT VFs you want to assign, to spec.containers.resources.limits/requests in your Virtlet VM YAML file, as shown below.

...