NVIDIA K8s Device Plugin for Wind River Linux
Introduction
The advent of containers has changed the way computational workloads are managed and orchestrated in modern computing environments. Given the paradigm shift towards microservices, container orchestration has become critically important in today’s distributed and cloud systems [1].
Managing edge devices at the scale of hundreds or thousands of units is an onerous task. Fortunately, orchestrators such as Kubernetes take the complexity out of updates, roll-backs, and more in a platform-agnostic environment [2]. Orchestrators provide the means to manage heterogeneous edge clusters. It is necessary not only to orchestrate containers but also to discover the specialized hardware devices that the containers and the orchestrator can leverage. Failing to manage these resources can lead to inefficiency, wasted time, and concurrency issues.
Background
In the context of Service-Oriented Architectures (SOA), orchestration is the automated configuration, coordination, and management of computer systems and software. Orchestration provides an automation model where process logic is centralized yet still extensible and composable. The use of this paradigm can significantly reduce the complexity of solution environments [3]. Containers provide a convenient means to deploy software in a managed and controlled way, and orchestrators such as Kubernetes control their deployment, making computing at the edge cloud-native, intelligent, scalable, and secure [2].
Kubernetes
Kubernetes (K8s) is a portable, extensible, open-source platform orchestrator for managing containerized workloads and services. It facilitates both declarative configuration and automation [4].
A K8s cluster is formed by two types of nodes: master nodes and worker nodes. Master nodes run essential cluster services, while worker nodes run the scheduled containerized workloads in units called pods (see Figure 1 below). Pods are the smallest deployable computing units that can be created and managed in Kubernetes; they encapsulate one or more containers that share resources, including a single IP address [5].
Device Plugins
Nodes with specialized hardware need to make the orchestrator aware of it so that the orchestrator can manage those resources and control the concurrency of applications. To address this need, the K8s community developed an interface called device plugins.
A K8s device plugin is a gRPC (gRPC Remote Procedure Calls) server that adds support for vendor-specific devices. Plugins handle device discovery and health checks, hook into the container runtime to make devices available inside containers, and clean up afterward. K8s designed these servers to run outside the core cluster components so that they remain independent and vendors can customize them to their needs [7].
When setting up a cluster, an administrator knows what kinds of devices are present on the different machines and can install the device plugin to manage those resources automatically. The plugin detects the devices and advertises the resources to the K8s cluster under a code name. On the end-user side, the application specifies its hardware requirements, and the cluster allocates the application to the specific resource on the best node available [7].
K3S
In this tutorial, we use K3s, a custom K8s distribution developed by Rancher Labs with a focus on the edge. It has many enhancements that make it suitable for this project; for example, it is lightweight and packaged in a single binary, which makes it well suited for embedded systems [8].
Another enhancement that K3s offers is a lightweight storage backend built on top of sqlite3, along with minimized versions of its dependencies. These dependencies are tailored for the embedded world and still cover all the basic functionality of a K8s cluster. For example, Rancher Labs developed its own storage driver, Local Path Provisioner, which enables creating persistent volume claims out of the box using local storage on the respective node [8]. These improvements and more make the Rancher distribution of K8s suitable for the needs of this tutorial.
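As an illustration, a minimal persistent volume claim against the bundled Local Path Provisioner might look like the following (the claim name and size here are hypothetical; "local-path" is the storage class K3s ships by default):

```yaml
# Hypothetical claim; local storage is provisioned on whichever node
# the consuming pod is scheduled to.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 128Mi
```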
NVIDIA Containers Orchestration
The NVIDIA device plugin for K8s is a DaemonSet that automatically exposes the number of GPUs available on each node, keeps track of their health, and runs GPU-enabled containers in a K8s cluster. To use this plugin, the node must have the NVIDIA Docker stack installed, as well as the NVIDIA and CUDA drivers [9]. The plugin automatically registers and communicates with the cluster so that a user of the cluster can request GPU resources in their pods.
One limitation of this plugin is that it only supports Kubernetes clusters running on Intel 64-bit architectures. As a result, it is not possible to orchestrate containers on embedded ARM-based devices such as NVIDIA Jetson boards [9]. This article utilizes a custom device plugin that enables the use of NVIDIA GPUs on Jetson boards. By modifying the existing plugin to support other architectures and building an ARM64 container from the modified source code, we can orchestrate both Intel and ARM-based nodes.
Prerequisites
This article is part two of a series on orchestrating container workloads on NVIDIA GPUs. Read the first part, NVIDIA container runtime for Wind River Linux.
We assume a booted Jetson board that meets the following requirements:
- NVIDIA drivers >= 384.81
- nvidia-docker version > 2.0 (see how to install it and its prerequisites)
- Docker configured with nvidia as the default runtime
- Kubernetes version >= 1.10
- Wind River Linux >= LTS 19
Source: [9]
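The exact commands vary by image, but these version requirements can be sanity-checked with a small script. The helper below is a sketch (the function name is ours); it compares dotted version strings using sort -V:

```shell
# version_ge A B: succeeds when version A >= version B.
# min(A, B) is found with sort -V; A >= B exactly when the minimum is B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Usage sketch (assumes docker is on the PATH; 18.09 is an example threshold):
if version_ge "$(docker --version | sed 's/[^0-9.]*\([0-9.]*\).*/\1/')" 18.09; then
    echo "Docker version OK"
fi
```

The same helper can be pointed at the driver and Kubernetes versions listed above.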
Wind River Linux Image
For the OS image, make sure to add the following packages into your project:
- docker-ce
- git
- openssh
Install K3s
The following script downloads and installs the latest available version of K3s; note that this tutorial uses K3s version v1.18.2+k3s1 (the installer can be pinned to a specific release by setting the INSTALL_K3S_VERSION environment variable).
mkdir -p /usr/local/bin
curl -sfL https://get.k3s.io | sh -
After installation, check for status:
kubectl get nodes
Example output:
root@jetson-nano-qspi-sd:~# kubectl get nodes
NAME                  STATUS   ROLES    AGE   VERSION
jetson-nano-qspi-sd   Ready    master   88s   v1.18.2+k3s1
Change the K3s Default Runtime
To use the NVIDIA runtime, first configure K3s to use Docker as its container runtime:
sed -i 's/server \\/server --docker \\/' /etc/systemd/system/k3s.service
systemctl daemon-reload
systemctl restart k3s
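The sed edit above fails silently if its pattern does not match, so it is worth confirming that the flag actually landed in the unit file before relying on it (a minimal check; path as in the command above):

```shell
# The ExecStart block of k3s.service should now carry the --docker flag.
if grep -q -- '--docker' /etc/systemd/system/k3s.service; then
    echo "k3s.service updated: K3s will use the Docker runtime"
else
    echo "warning: --docker not found in k3s.service" >&2
fi
```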
Then, set the NVIDIA container runtime as the default Docker runtime by modifying the file /etc/docker/daemon.json as follows:
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
Then, restart the Docker daemon:
systemctl restart docker
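A syntax error in daemon.json prevents the Docker daemon from starting at all, so validating the file before the restart can save a debugging round-trip. A sketch (assumes python3 is present on the image; the function name is ours):

```shell
# check_daemon_json FILE: verify FILE is valid JSON and selects the
# nvidia runtime as the default; exits non-zero with a diagnosis otherwise.
check_daemon_json() {
    python3 - "$1" <<'PYEOF'
import json, sys
try:
    with open(sys.argv[1]) as f:
        cfg = json.load(f)          # fails loudly on a JSON syntax error
except (OSError, ValueError) as exc:
    sys.exit("daemon.json problem: %s" % exc)
if cfg.get("default-runtime") != "nvidia":
    sys.exit("default-runtime is not set to nvidia")
print("daemon.json OK")
PYEOF
}

[ -f /etc/docker/daemon.json ] && check_daemon_json /etc/docker/daemon.json
```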
Install the NVIDIA K8s Device Plugin
To make the device plugin work on ARM64 architectures, we need to apply the following patches to the NVIDIA device plugin:
- 0001-arm64-add-support-for-arm64-architectures.patch
- 0002-nvidia-Add-support-for-tegra-boards.patch
- 0003-main-Add-support-for-tegra-boards.patch
Clone the original NVIDIA device plugin repo and apply the patches:
$ git clone -b 1.0.0-beta6 https://github.com/nvidia/k8s-device-plugin.git
$ cd k8s-device-plugin
$ wget https://labs.windriver.com/downloads/0001-arm64-add-support-for-arm64-architectures.patch
$ wget https://labs.windriver.com/downloads/0002-nvidia-Add-support-for-tegra-boards.patch
$ wget https://labs.windriver.com/downloads/0003-main-Add-support-for-tegra-boards.patch
$ git am 000*.patch
Then, build the device plugin container:
$ docker build -t nvidia/k8s-device-plugin:1.0.0-beta6 -f docker/arm64/Dockerfile.ubuntu16.04 .
Next, deploy the container into your cluster:
$ kubectl apply -f nvidia-device-plugin.yml
Finally, check the status of the pods and wait until all of them are running:
$ kubectl get pods -A
An example output of the device plugin is as follows:
root@jetson-nano-qspi-sd:~/test/k8s-device-plugin# kubectl logs nvidia-device-plugin-daemonset-k8g57 --namespace=kube-system
2020/05/29 19:49:07 NVIDIA Tegra device detected!
2020/05/29 19:49:07 Starting FS watcher.
2020/05/29 19:49:07 Starting OS watcher.
2020/05/29 19:49:07 Retreiving plugins.
2020/05/29 19:49:07 Starting GRPC server for 'nvidia.com/gpu'
2020/05/29 19:49:07 Starting to serve 'nvidia.com/gpu' on /var/lib/kubelet/device-plugins/nvidia.sock
2020/05/29 19:49:07 Registered device plugin for 'nvidia.com/gpu' with Kubelet
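Once the plugin has registered, the node should advertise an nvidia.com/gpu capacity. A quick way to read it back is a jsonpath query (a sketch; the helper name is ours, and note that the dots inside the resource name must be escaped in the jsonpath expression):

```shell
# gpu_capacity NODE: print the nvidia.com/gpu capacity advertised by NODE.
gpu_capacity() {
    kubectl get node "$1" -o jsonpath='{.status.capacity.nvidia\.com/gpu}'
}

# Example (node name from this tutorial's cluster):
#   gpu_capacity jetson-nano-qspi-sd
```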
Results
After following the installation steps, you will have a working Kubernetes node with an additional nvidia.com/gpu resource:
$ kubectl describe node
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                100m (2%)   0 (0%)
  memory             70Mi (1%)   170Mi (4%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
  nvidia.com/gpu     0           0
...
NVIDIA Container Runtime over K8s
Query the GPU device to verify that the NVIDIA runtime is working by deploying the following pod:
$ cat << EOF > query_pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: query-pod
spec:
  restartPolicy: OnFailure
  containers:
    - image: jitteam/devicequery
      name: query-ctr
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
$ kubectl apply -f query_pod.yml
Check the status of the pod query-pod and wait until its status is "Completed":
$ kubectl get pod query-pod
Then, check the logs:
$ kubectl logs query-pod
Output:
root@jetson-nano-qspi-sd:~/k8s-device-plugin# kubectl logs query-pod
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA Tegra X1"
The container detected 1 CUDA device correctly!
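Polling kubectl get pod by hand works, but the wait can also be scripted. The helper below is a sketch (the function name is ours); it blocks until a pod's .status.phase reaches the given value, e.g. Succeeded for a completed pod:

```shell
# wait_for_phase POD PHASE [TIMEOUT_S]: poll until POD reports PHASE,
# or return non-zero after TIMEOUT_S seconds (default 120).
wait_for_phase() {
    pod=$1; want=$2; timeout=${3:-120}; waited=0
    while [ "$(kubectl get pod "$pod" -o jsonpath='{.status.phase}')" != "$want" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 2
        waited=$((waited + 2))
    done
}

# Example ("mypod" is a placeholder pod name):
#   wait_for_phase mypod Succeeded
```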
GPU concurrency
Test that access to the GPU resource is arbitrated correctly by deploying two pods that each request the single GPU:
cat << EOF > concurrency.yml
apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  restartPolicy: OnFailure
  containers:
    - image: nvcr.io/nvidia/l4t-base:r32.4.2
      name: pod1-ctr
      command: ["sleep"]
      args: ["30"]
      resources:
        limits:
          nvidia.com/gpu: 1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod2
spec:
  restartPolicy: OnFailure
  containers:
    - image: nvcr.io/nvidia/l4t-base:r32.4.2
      name: pod2-ctr
      command: ["sleep"]
      args: ["30"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
Apply the changes and check the status of the second pod:
kubectl apply -f concurrency.yml
kubectl describe pod pod2
Output:
root@jetson-nano-qspi-sd:~/k8s-device-plugin# kubectl describe pod pod2
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
As you can see, the second pod failed to get the GPU allocated because the first pod is already using it. As soon as pod1 exits, the other pod runs successfully.
After the 30-second sleep specified for pod1 finishes, you will see this output instead:
root@jetson-nano-qspi-sd:~/k8s-device-plugin# kubectl describe pod pod2
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
Normal Scheduled <unknown> default-scheduler Successfully assigned default/pod2 to jetson-nano-qspi-sd
Normal Pulled 6s kubelet, jetson-nano-qspi-sd Container image "nvcr.io/nvidia/l4t-base:r32.4.2" already present on machine
Normal Created 6s kubelet, jetson-nano-qspi-sd Created container pod2-ctr
Normal Started 5s kubelet, jetson-nano-qspi-sd Started container pod2-ctr
root@jetson-nano-qspi-sd:~# kubectl get pods
NAME   READY   STATUS      RESTARTS   AGE
pod1   0/1     Completed   0          2m38s
pod2   0/1     Completed   0          2m38s
Conclusions
The use of device plugins alongside K8s allows the orchestration of GPU-enabled devices and corrects the concurrency issues described earlier. Device plugins effectively support the discovery of specialized devices so that containers can leverage different types of hardware accelerators. It is now possible to manage GPU workloads at the edge using state-of-the-art technologies, enabling HPC areas such as AI to benefit from this kind of acceleration.
References
[1] A. M. Beltre, P. Saha, M. Govindaraju, A. Younge, and R. E. Grant, “Enabling hpc workloads on cloud infrastructure using kubernetes container orchestration mechanisms,” in 2019 ieee/acm international workshop on containers and new orchestration paradigms for isolated environments in hpc (canopie-hpc), 2019, pp. 11–20.
[2] C. Tarbett, “Why K3s Is the Future of Kubernetes at the Edge,” Rancher Labs, Nov. 2019.
[3] T. Erl, Service-Oriented Architecture: Concepts, Technology, and Design. USA: Prentice Hall PTR, 2005.
[4] “What is Kubernetes?” 2020.
[5] M. E. Piras, L. Pireddu, M. Moro, and G. Zanetti, “Container Orchestration on HPC Clusters,” SpringerLink, pp. 25–35, Jun. 2019.
[6] “Kubernetes: part 1 architecture and main components overview,” RTFM: Linux, DevOps and system administration, May 2020. https://rtfm.co.ua/en/kubernetes-part-1-architecture-and-main-components-overview/
[7] Kubernetes, “Community,” GitHub. 2020.
[8] “K3s - 5 less than K8s,” Rancher Labs. Apr-2020.
[9] Nvidia, “k8s-device-plugin,” GitHub. May-2020.
All product names, logos, and brands are property of their respective owners.
All company, product and service names used in this software are for identification purposes only. Wind River is a registered trademark of Wind River Systems.
Disclaimer of Warranty / No Support: Wind River does not provide support and maintenance services for this software, under Wind River’s standard Software Support and Maintenance Agreement or otherwise. Unless required by applicable law, Wind River provides the software (and each contributor provides its contribution) on an “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, either express or implied, including, without limitation, any warranties of TITLE, NONINFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the software and assume any risks associated with your exercise of permissions under the license.
Docker is a trademark of Docker, Inc.
Kubernetes is a trademark of The Linux Foundation.
NVIDIA, NVIDIA EGX, CUDA, Jetson, and Tegra are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.
By Pablo Rodriguez Quesada