Set Up a Multi-GPU Server from Scratch

Shivaraj karki
3 min read · Mar 18, 2021

This blog is the outcome of hands-on work on a multi-GPU server by Bhabani Mohapatra and team. A good learning environment not only helps you learn to code for AI but also gives you a proper understanding of the hardware beneath it. I am lucky to be part of this team.

This blog walks through how to start and how to set up a fully working multi-GPU server from scratch. Fasten your seat belt and be patient, because getting there takes some time and effort.

multi GPU setup from ipoor.org

We can roughly divide the effort into the following steps:

1. Get a multi GPU server setup

2. Which GPU cards to select

3. Ubuntu server installation

4. Install packages to run AI/ML code

Which GPU card to select

The AI community prefers NVIDIA GPU cards; the latest one at the time of writing is the GeForce RTX 3090. AI libraries are well supported on NVIDIA hardware, so most of the time it's the default choice.

Place these GPUs in the server's GPU slots (as shown in the videos above) with the proper power cables connected.

Ubuntu server installation

For a multi-GPU server, Ubuntu is usually the first choice because of its large community support, and being open source makes it economical.

Here are a few links that guide you through installing Ubuntu on the server:

Set up SSD

These servers work best with an SSD setup. You can hot-plug the SSDs and then configure the disks in Ubuntu:

1. To configure in ext4 format

2. To configure in RAID 5
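When choosing between plain ext4 disks and RAID 5, it helps to know how much space you actually end up with. RAID 5 needs at least three disks and spends one disk's worth of capacity on parity, so the usable space is (number of disks - 1) times the smallest disk. Here is a minimal sketch of that arithmetic; the function name and the disk sizes are made up for illustration.

```python
def raid5_usable_capacity_gb(disk_sizes_gb):
    """Usable capacity of a RAID 5 array: (n - 1) * smallest member disk."""
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three disks")
    # Every member contributes only as much as the smallest disk,
    # and one disk's worth of space is consumed by parity.
    return (len(disk_sizes_gb) - 1) * min(disk_sizes_gb)


# Four hypothetical 1000 GB SSDs (illustrative sizes, not from the original setup).
print(raid5_usable_capacity_gb([1000, 1000, 1000, 1000]))  # 3000
```

With the same four disks formatted individually as ext4 you would keep roughly all 4000 GB, but without the protection against a single disk failure.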

Install packages to run AI/ML code

Verify that you have a CUDA-capable GPU and install the appropriate CUDA and cuDNN versions. This link will guide you through the entire process.
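Before installing anything, you can check that the operating system sees the cards on the PCI bus at all. The sketch below is one possible way to do it, assuming the standard `lspci` tool is available on the server; the helper names `nvidia_gpu_lines` and `list_nvidia_gpus` are hypothetical, and the sample line in the comment only illustrates the usual `lspci` format.

```python
import shutil
import subprocess


def nvidia_gpu_lines(lspci_output):
    """Return the lspci lines that describe NVIDIA display devices."""
    # GPUs normally appear under one of these PCI device classes.
    # A line typically looks like:
    # "01:00.0 VGA compatible controller: NVIDIA Corporation ... (rev a1)"
    gpu_classes = ("VGA compatible controller", "3D controller")
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "nvidia" in line.lower() and any(c in line for c in gpu_classes)
    ]


def list_nvidia_gpus():
    """Run lspci if it is installed; return an empty list otherwise."""
    if shutil.which("lspci") is None:
        return []
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    return nvidia_gpu_lines(result.stdout)
```

Calling `list_nvidia_gpus()` on the server should return one line per installed card. If a card is missing here, it is worth re-checking the slot and power cables before looking at drivers.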

Once nvidia-smi detects all your GPUs and the proper CUDA and cuDNN versions are installed, your GPU server is ready to accelerate AI simulations.

This blog is a brief compilation of the resources available for building a multi-GPU setup, to help the reader get a server ready without many glitches. If something goes wrong along the way, do not hesitate to start over from the beginning.
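On a server with several cards, it is easy to overlook one that did not come up. A small script can compare what nvidia-smi reports against the number of cards you installed. This is a rough sketch, assuming the driver is installed and `nvidia-smi` is on the PATH; it relies on the `--query-gpu` and `--format=csv,noheader` options of nvidia-smi, while the helper names and the expected count are hypothetical.

```python
import csv
import shutil
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY_FIELDS = ["index", "name", "driver_version", "memory.total"]


def parse_gpu_csv(text):
    """Turn the CSV output of nvidia-smi into a list of dicts, one per GPU."""
    rows = csv.reader(text.strip().splitlines(), skipinitialspace=True)
    return [
        dict(zip(QUERY_FIELDS, (cell.strip() for cell in row)))
        for row in rows
        if row
    ]


def query_gpus():
    """Ask nvidia-smi for the GPU list; return [] if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


def all_gpus_detected(gpus, expected_count):
    """True when the driver reports exactly as many GPUs as were installed."""
    return len(gpus) == expected_count
```

For example, on a box with four cards you would call `all_gpus_detected(query_gpus(), 4)`. Note that nvidia-smi reports on the driver and GPUs only, so the cuDNN version still has to be checked separately.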

Written by Shivaraj karki

Technical Lead @HCL. I implement Deep Learning, Computer Vision models on production line.
