...

The whole system will also run in a hybrid mode. By hybrid, we mean the KubeEdge cluster will manage worker nodes both in the cloud and at the edge. For the components running in the cloud, we can leverage the full power of standard K8s, i.e. pods, deployments, services, ingress/egress, load balancers, etc. This means cloud components will be deployed exactly as in a standard Kubernetes cluster. On the other hand, edge components will leverage the KubeEdge framework.

...

An instance of the InferenceModel CRD specifies a single serving service for the provided model. There are two scenarios:

  • DeployToLayer == cloud: the controller will create a deployment with the specified number of replicas, and a service to expose the deployment to the outside world.
  • DeployToLayer == edge: the controller will create a single pod running on the specified edge node (through a NodeSelector).
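
As an illustrative sketch of the edge scenario, an InferenceModel resource might look like the following. Note that the API group/version and the exact field names (deployToLayer, targetNode) are assumptions for illustration, not a finalized spec:

```yaml
# Hypothetical InferenceModel custom resource (field names are assumptions)
apiVersion: ai.kubeedge.io/v1alpha1   # assumed group/version
kind: InferenceModel
metadata:
  name: mnist-serving
spec:
  deployToLayer: edge        # cloud | edge
  targetNode: edge-node-1    # becomes a NodeSelector when deployToLayer == edge
  replicas: 1                # honored only for cloud deployments
  image: registry.example.com/models/mnist-serving:v1
```

With deployToLayer set to cloud instead, the controller would create a Deployment with the given replica count plus a Service, as described above.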

Two ways of deployment

The InferenceModel CRD supports both of the following deployment methods. If image is provided, manifest and targetVersion will be ignored.

Deployment method: Docker image

  Pros:
  • Docker images encapsulate all the runtime dependencies.
  • Very flexible, as users can build whatever image they want.
  • Docker manages the whole life cycle of the "model offloading".

  Cons:
  • Users have to build the images themselves: write the Dockerfile, build the image, and upload it to a docker registry.
  • Users have to provide a private docker registry if they don't want to use the public Docker Hub.

Deployment method: Machine learning model file manifests

  Pros:
  • Data scientists work directly with model files; ideally they can just drop their model files somewhere.
  • By using a data store, it opens the door for serverless computing.

  Cons:
  • Our framework has to manage the whole life cycle of model files: deployment, update, delete, etc.
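
The two deployment methods above might be expressed as alternative field groups in the spec, sketched below (field names and the manifest structure are assumptions for illustration). Per the precedence rule stated earlier, when image is set, manifest and targetVersion are ignored:

```yaml
# Method 1: Docker image (takes precedence when present)
spec:
  deployToLayer: cloud
  image: registry.example.com/team/mnist-serving:v2

---
# Method 2: model file manifest (used only when image is absent)
spec:
  deployToLayer: cloud
  targetVersion: v2          # which version of the model files to serve
  manifest:
    - modelFile: s3://models/mnist/v2/model.pb   # hypothetical data-store URI
```

The second form is what enables the serverless direction mentioned in the pros: the framework, rather than Docker, fetches the model files from the data store and manages their life cycle.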