Create an instance of InferenceModel with DeployToLayer set to "edge".

Joint Inference by edge and cloud

Create three resources:

An instance of InferenceModel deployed to the cloud
An instance of InferenceModel deployed to the edge
A pod running on the edge that serves customer traffic. It contains the logic for deciding whether to call the cloud model serving API.
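
The decision logic in the edge pod could be sketched as follows. This is a minimal illustration, not the design's prescribed method: it assumes the edge model returns a confidence score and that a low score is the trigger for calling the cloud. The names joint_infer, edge_model, cloud_model, and threshold are hypothetical, and the two models are passed in as plain callables standing in for the real serving APIs.

```python
from typing import Callable, Sequence, Tuple

# Assumed shape of a model response: (label, confidence in [0, 1]).
Prediction = Tuple[str, float]
Model = Callable[[Sequence[float]], Prediction]


def joint_infer(
    features: Sequence[float],
    edge_model: Model,
    cloud_model: Model,
    threshold: float = 0.8,  # hypothetical cut-off, tune per workload
) -> Tuple[str, float, str]:
    """Return (label, confidence, layer_that_answered)."""
    # Always try the local edge model first: it is the cheapest call.
    label, confidence = edge_model(features)
    if confidence >= threshold:
        return label, confidence, "edge"

    # The edge model is not confident enough, so escalate to the cloud.
    label, confidence = cloud_model(features)
    return label, confidence, "cloud"
```

In a real pod, edge_model and cloud_model would wrap requests to the two InferenceModel serving endpoints. Other escalation criteria, such as input size or a latency budget, would fit the same structure.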

Joint inference by device, edge and cloud

We can have three models, with different sizes and accuracies, running on device, edge, and cloud respectively.
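
One way to combine the three models is a cascade that tries the smallest model first and moves to the next layer only when the answer is not confident enough. The sketch below assumes that escalation policy and a per-model confidence score. Neither is specified in this page, and cascade_infer, tiers, and threshold are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed shape of a model response: (label, confidence in [0, 1]).
Model = Callable[[Sequence[float]], Tuple[str, float]]


def cascade_infer(
    features: Sequence[float],
    tiers: List[Tuple[str, Model]],  # ordered smallest to largest model
    threshold: float = 0.8,          # hypothetical cut-off
) -> Tuple[str, float, str]:
    """Return (label, confidence, layer_that_answered)."""
    if not tiers:
        raise ValueError("at least one tier is required")

    for name, model in tiers:
        label, confidence = model(features)
        if confidence >= threshold:
            break  # this layer is confident enough, stop escalating

    # If no layer reached the threshold, the last (largest) model's
    # answer is returned as the best available result.
    return label, confidence, name
```

The tiers list would be ordered device, edge, cloud, so most requests are answered by the smallest model and only the hard cases reach the cloud.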