May 27, 2022



Deploy your containerized AI applications with nvidia-docker

More and more products and services are taking advantage of the modeling and prediction capabilities of AI. This article presents the nvidia-docker tool for integrating AI (Artificial Intelligence) software bricks into a microservice architecture. The main advantage explored here is the use of the host system’s GPU (Graphics Processing Unit) resources to accelerate multiple containerized AI applications.

To understand the usefulness of nvidia-docker, we will start by describing what kind of AI can benefit from GPU acceleration. Secondly, we will present how to set up the nvidia-docker tool. Finally, we will describe what tools are available to use GPU acceleration in your applications and how to use them.

Why use GPUs in AI applications?

In the field of artificial intelligence, two main subfields are used: machine learning and deep learning. The latter is part of a larger family of machine learning methods based on artificial neural networks.

In the context of deep learning, where operations are essentially matrix multiplications, GPUs are more efficient than CPUs (Central Processing Units). This is why the use of GPUs has grown in recent years. Indeed, GPUs are considered the heart of deep learning because of their massively parallel architecture.
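To make the matrix-multiplication point concrete, here is a minimal pure-Python sketch of a dense neural-network layer. It does not use a GPU, and the function names and toy values are illustrative assumptions rather than part of any library. The point to notice is that every output cell is an independent dot product, which is the kind of work a massively parallel architecture can spread across many cores.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    # Each output cell is an independent dot product: no cell needs the
    # result of another one, so they could all be computed in parallel.
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
        for i in range(n)
    ]


def dense_layer(inputs, weights):
    """Toy dense layer: a matrix product followed by a ReLU activation."""
    return [[max(0.0, value) for value in row] for row in matmul(inputs, weights)]


if __name__ == "__main__":
    batch = [[1.0, 2.0], [3.0, 4.0]]        # 2 samples, 2 features (toy values)
    weights = [[0.5, -1.0], [0.25, 1.0]]    # 2 features -> 2 units (toy values)
    print(dense_layer(batch, weights))
```

On a CPU these dot products are evaluated one after another. Libraries built on CUDA hand the same computation to the GPU.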

However, GPUs cannot execute just any program. They use a specific language (CUDA for NVIDIA) to take advantage of their architecture. So, how do you use and communicate with GPUs from your applications?

The NVIDIA CUDA technology

NVIDIA CUDA (Compute Unified Device Architecture) is a parallel computing architecture combined with an API for programming GPUs. CUDA translates application code into an instruction set that GPUs can execute.

A CUDA SDK and libraries such as cuBLAS (Basic Linear Algebra Subroutines) and cuDNN (Deep Neural Network) have been developed to communicate easily and efficiently with a GPU. CUDA is available in C, C++ and Fortran. There are wrappers for other languages including Java, Python and R. For example, deep learning libraries like TensorFlow and Keras are based on these technologies.

Why use nvidia-docker?

Nvidia-docker addresses the needs of developers who want to add AI functionality to their applications, containerize them and deploy them on servers powered by NVIDIA GPUs.

The objective is to set up an architecture that allows the development and deployment of deep learning models in services available via an API. The utilization rate of GPU resources is thus optimized by making them available to multiple application instances.

In addition, we benefit from the advantages of containerized environments:

  • Isolation of the instances of each AI model.
  • Colocation of several models with their specific dependencies.
  • Colocation of the same model under several versions.
  • Consistent deployment of models.
  • Model performance monitoring.

Natively, using a GPU in a container requires installing CUDA in the container and granting privileges to access the device. With this in mind, the nvidia-docker tool was developed, allowing NVIDIA GPU devices to be exposed in containers in an isolated and secure manner.

At the time of writing this article, the latest version of nvidia-docker is v2. This version differs greatly from v1 in the following ways:

  • Version 1: Nvidia-docker is implemented as an overlay to Docker. That is, to create the container you had to use nvidia-docker (Ex: nvidia-docker run ...), which performs the actions (among others, the creation of volumes) that make the GPU devices visible in the container.
  • Version 2: The deployment is simplified by replacing Docker volumes with Docker runtimes. To launch a container, it is now necessary to use the NVIDIA runtime via Docker (Ex: docker run --runtime nvidia ...).

Note that due to their different architectures, the two versions are not compatible. An application written for v1 must be rewritten for v2.
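As an illustration of the v2 invocation style, the following minimal sketch assembles a docker run command that selects the NVIDIA runtime. The helper name build_gpu_run_command and the image name my-cuda-image:latest are hypothetical placeholders. Only the --runtime nvidia option comes from the text above, and --rm is the standard Docker flag that removes the container on exit. The command is printed rather than executed, so the sketch works on a machine without Docker.

```python
import shlex


def build_gpu_run_command(image, command, runtime="nvidia"):
    """Return the argv list for a v2-style `docker run` using the given runtime."""
    return ["docker", "run", "--rm", "--runtime", runtime, image, *command]


if __name__ == "__main__":
    # Hypothetical image name: replace it with a CUDA-enabled image you use.
    argv = build_gpu_run_command("my-cuda-image:latest", ["nvidia-smi"])
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(argv))
```

Passing the resulting list to subprocess.run would launch the container on a host where Docker and the NVIDIA runtime are installed.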

Setting up nvidia-docker

The elements required to use nvidia-docker are:

  • A container runtime.
  • An available GPU.
  • The NVIDIA Container Toolkit (the main part of nvidia-docker).



A container runtime is required to run the NVIDIA Container Toolkit. Docker is the recommended runtime, but Podman and containerd are also supported.

The official documentation gives the installation procedure for Docker.


Drivers are required to use a GPU device. In the case of NVIDIA GPUs, the drivers corresponding to a given OS can be obtained from the NVIDIA driver download page by filling in the information on the GPU model.

The drivers are installed via the downloaded executable. For Linux, use the following commands, replacing the file name with that of the downloaded file:

chmod +x NVIDIA-Linux-x86_64-470.94.run
./NVIDIA-Linux-x86_64-470.94.run

Reboot the host machine at the end of the installation so that the installed drivers are taken into account.
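After the reboot, one way to check that the driver is active is to query nvidia-smi, the command-line utility installed with the NVIDIA driver. The sketch below is a minimal example under that assumption. The helper names parse_gpu_report and query_gpus are illustrative, and the sample line in the usage section is made-up data, not real output.

```python
import shutil
import subprocess


def parse_gpu_report(csv_text):
    """Parse `name, driver_version` CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so the driver version is always isolated.
        name, driver_version = (field.strip() for field in line.rsplit(",", 1))
        gpus.append({"name": name, "driver_version": driver_version})
    return gpus


def query_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_gpu_report(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    # Made-up sample line, only to show the expected shape of the input.
    print(parse_gpu_report("Example GPU, 470.94\n"))
    print(query_gpus())
```

An empty list from query_gpus suggests that the driver is not installed or not loaded.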

Installing nvidia-docker

Nvidia-docker is available on the GitHub project page. To install it, follow the installation manual for your server and architecture.

We now have an infrastructure that provides isolated environments with access to GPU resources. To use GPU acceleration in applications, NVIDIA has developed several tools (non-exhaustive list):

  • CUDA Toolkit: a set of tools for developing software that can perform computations using the CPU, RAM, and GPU. It can be used on x86, Arm and POWER platforms.
  • NVIDIA cuDNN: a library of primitives to accelerate deep learning networks and optimize GPU performance for major frameworks such as TensorFlow and Keras.
  • NVIDIA cuBLAS: a library of GPU-accelerated linear algebra subroutines.

By using these tools in application code, AI and linear algebra tasks are accelerated. With the GPUs now visible, the application is able to send data and operations to the GPU for processing.

The CUDA Toolkit is the lowest-level option. It offers the most control (memory and instructions) for building custom applications. Libraries provide an abstraction of CUDA functionality. They let you focus on application development rather than on the CUDA implementation.
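As an example of the library-level approach, the sketch below asks TensorFlow which GPUs it can see from inside the container. It assumes TensorFlow 2.x is installed in the container image, which the article does not specify, and the helper name list_visible_gpus is illustrative. The import is guarded so that the script still runs, and returns an empty list, when TensorFlow is absent.

```python
def list_visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or [] without TensorFlow."""
    try:
        import tensorflow as tf  # assumption: TensorFlow 2.x is installed
    except ImportError:
        return []
    # list_physical_devices reports the devices exposed to this process,
    # for example those made visible to the container by the NVIDIA runtime.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_visible_gpus()
    if gpus:
        print("GPUs visible to TensorFlow:", gpus)
    else:
        print("No GPU visible (or TensorFlow is not installed).")
```

When a GPU is listed, TensorFlow places supported operations on it by default, so the application code does not need to call CUDA directly.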

Once all these elements are in place, the architecture using the nvidia-docker service is ready to use.

Here is a diagram to summarize everything we have seen:



We have set up an architecture that allows our applications to use GPU resources in isolated environments. To summarize, the architecture is composed of the following bricks:

  • Operating system: Linux, Windows …
  • Docker: isolation of the environment using Linux containers
  • NVIDIA driver: installation of the driver for the hardware in question
  • NVIDIA container runtime: orchestration of the previous three
  • Applications in Docker containers:
    • CUDA
    • cuDNN
    • cuBLAS
    • TensorFlow/Keras

NVIDIA continues to develop tools and libraries around AI technologies, with the goal of establishing itself as a leader. Other technologies may complement nvidia-docker or may be more suitable than nvidia-docker depending on the use case.
