GPULab Client CLI

Prerequisites

You need a Fed4FIRE account to use GPULab. You can sign up at the imec User Authority.

To install the gpulab-client, you need pip for Python 3. To install pip on Debian/Ubuntu, try:

sudo apt-get install python3-pip

Make sure you have at least Python 3.4. You can check with:

python3 --version
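If you want to do the same check from a script, a minimal sketch could look like the one below. It only uses the standard library; the helper name is_supported is made up for this example.

```python
import sys

# Minimum version stated above.
MIN_VERSION = (3, 4)


def is_supported(version_info=sys.version_info):
    """Return True if the given interpreter version is at least MIN_VERSION."""
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    if is_supported():
        print("Python version is recent enough")
    else:
        print("Python %d.%d or newer is required" % MIN_VERSION)
```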

Tip

If your Linux distribution does not have a recent enough Python, try pyenv, which works on (almost) any Linux:

curl -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bash
pyenv update
# optional on Debian: apt-get install libbz2-dev libreadline-dev libsqlite3-dev
pyenv install 3.6.2
pyenv local 3.6.2
pyenv versions
python3 --version

The last command should show that you now have a recent enough Python version.

(Note: If you install python locally using this method, you do not need to add “sudo” in front of the installation command in the next section.)

Installation

The current version of the GPULab client is 1.8.

You can download it here: gpulab-client-1.8.tar.gz.

All versions can be found on the GitLab GPULab tags page, which also includes the changelog for each version.

To install, run:

sudo pip3 install gpulab-client-1.8.tar.gz

Python's “pip” takes care of all the details. You will end up with a local install of gpulab-cli.

Basic CLI usage

After installation, the gpulab-cli command is available:

$ gpulab-cli --help
Usage: gpulab-cli [OPTIONS] COMMAND [ARGS]...

   GPULab client version 1.5

   Send bugreports, questions and feedback to: jfedbugreports@ilabt.imec.be

   Documentation: https://doc.ilabt.imec.be/ilabt-documentation/gpulab.html

Options:
  --cert PATH          Login certificate  [required]
  -p, --password TEXT  Password associated with the login certificate
  --dev                Use the GPULab development environment
  --servercert PATH    The file containing the servers (self-signed)
                       certificate. Only required when the server uses a self
                       signed certificate.
  --version            Print the GPULab client version number and exit.
  -h, --help           Show this message and exit.

Commands:
  cancel    Cancel running job
  clusters  Retrieve info about the available clusters
  debug     Retrieve a job's debug info. (Do not rely on the presence or
            format of this info. It will never be stable between versions. If
            this has the only source o info you need, ask the developrs to
            expose that info in a different way!)
  hold      Hold queued job(s). Status will change from QUEUED to ONHOLD
  jobs      Get info about one or more jobs
  log       Retrieve a job's log
  release   Release held job(s). Status will change from ONHOLD to QUEUED
  rm        Remove job
  submit    Submit a jobDefinition to run
  wait      Wait for a job to change state

To get a list of currently running jobs:

$ gpulab-cli --cert /home/me/my_wall2_login.pem jobs
TASK ID                             NAME                      COMMAND                   CREATED              USER            PROJECT         STATUS

This command is quite long. You can store some of this info in environment variables, so you don’t have to type it each time:

export GPULAB_CERT='/home/me/my_wall2_login.pem'
export GPULAB_DEV='False'

Recommendation

If you append these exports to ~/.bashrc you’ll never have to type them again!

The same command to get a list of currently running jobs is now much shorter:

gpulab-cli jobs
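The same idea applies when you call the CLI from a script: --cert is only needed when GPULAB_CERT is not set. The sketch below only builds the argument list and does not run anything. The helper name jobs_command is hypothetical, and it assumes the CLI reads GPULAB_CERT from the environment as described above.

```python
import os


def jobs_command(cert=None, environ=os.environ):
    """Build the argument list for 'gpulab-cli jobs'.

    --cert is added only when an explicit path is given; otherwise the CLI
    is expected to pick up GPULAB_CERT from the environment.
    """
    args = ["gpulab-cli"]
    if cert is not None:
        args += ["--cert", cert]
    elif "GPULAB_CERT" not in environ:
        raise ValueError("set GPULAB_CERT or pass a certificate path")
    args.append("jobs")
    return args
```

The returned list can be passed to subprocess.run() as is.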

Note

Using the CLI without password

You can use the CLI without a password. Be aware that this lowers security.

You need to install openssl to execute the commands below. On Debian, try:

sudo apt-get install openssl

The password is “stored” in the PEM file, because it is used to encrypt the private RSA key inside the PEM file. You can decrypt the RSA key and store it, to remove the password. Below, we assume that your (password protected) wall2 PEM file is in my_wall2_login.pem. The commands will create the file my_wall2_login_decrypted.pem which will not be password protected.

Use these commands:

openssl rsa -in my_wall2_login.pem > my_wall2_login_decrypted.pem
openssl x509 -in my_wall2_login.pem >> my_wall2_login_decrypted.pem

(The first command will ask for your password, the second won’t.)
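To check whether a PEM file still holds a password protected key, you can look for the markers that OpenSSL writes for encrypted private keys. The sketch below is a heuristic that only inspects the text and does not parse the key itself. The function name pem_key_is_encrypted is made up for this example.

```python
def pem_key_is_encrypted(pem_text):
    """Heuristically report whether PEM text holds an encrypted private key."""
    return (
        # Traditional OpenSSL format: encryption is flagged in a header line.
        "Proc-Type: 4,ENCRYPTED" in pem_text
        # PKCS#8 format: encryption is flagged in the BEGIN line.
        or "-----BEGIN ENCRYPTED PRIVATE KEY-----" in pem_text
    )
```

You can apply it to the contents of a file, for example pem_key_is_encrypted(open("my_wall2_login_decrypted.pem").read()), which should return False for the decrypted file.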

Submitting a GPULab Job

A GPULab job is defined by a JSON job definition, which looks as follows:

my-first-jobDefinition.json
{
  "jobDefinition": {
    "name": "Iterativenet",
    "description": "hello world",
    "clusterId": 1,
    "dockerImage": "gpulab.ilabt.imec.be:5000/sample:nvidia-smi",
    "jobType": "BATCH",
    "command": "",
    "resources": {
      "gpus": 1,
      "systemMemory": 2000,
      "cpuCores": 2
    },
    "jobDataLocations": [ ],
    "portMappings": [ ]
  }
}
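If you submit many similar jobs, you may prefer to generate the job definition instead of editing JSON by hand. The sketch below builds the same structure as the example above. The function make_job_definition and its parameter names are made up for this example; the JSON field names and default values are copied from the example.

```python
import json


def make_job_definition(name, docker_image, command="", gpus=1,
                        system_memory=2000, cpu_cores=2, cluster_id=1,
                        description=""):
    """Return a job definition dict with the same fields as the example."""
    return {
        "jobDefinition": {
            "name": name,
            "description": description,
            "clusterId": cluster_id,
            "dockerImage": docker_image,
            "jobType": "BATCH",
            "command": command,
            "resources": {
                "gpus": gpus,
                "systemMemory": system_memory,
                "cpuCores": cpu_cores,
            },
            "jobDataLocations": [],
            "portMappings": [],
        }
    }


job = make_job_definition(
    name="Iterativenet",
    description="hello world",
    docker_image="gpulab.ilabt.imec.be:5000/sample:nvidia-smi",
)
# Print the JSON text; redirect it to a file to use it as a job definition.
print(json.dumps(job, indent=2))
```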

To submit the job, you’ll have to specify the name of the project on the wall2 authority in which you want it to run. The command is:

$ gpulab-cli submit --project=myproject < my-first-jobDefinition.json
78125766-0b45-11e8-be1c-0fbd357c0b05

The command prints the ID of the new job. In this example, the ID has the format of a UUID.
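When you submit jobs from a script, it can help to check that the output really is a job ID before using it. The ID in the example has the shape of a UUID, so the sketch below validates it with the standard uuid module. It assumes that submit prints only the job ID, as in the example; the helper name parse_job_id is hypothetical.

```python
import uuid


def parse_job_id(submit_output):
    """Return the job ID printed by 'gpulab-cli submit', or raise ValueError."""
    job_id = submit_output.strip()
    # uuid.UUID raises ValueError if the text is not a well-formed UUID.
    return str(uuid.UUID(job_id))
```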

You can now query the status of this job using the job ID, or only the first part of it, as long as that part is unique:

$ gpulab-cli jobs 7812
          Job ID: 78125766-0b45-11e8-be1c-0fbd357c0b05
           Name: no name
        Project: fed4fire
       Username: wvdemeer
   Docker image: gpulab.ilabt.imec.be:5000/sample:nvidia-smi
        Command: -
         Status: FINISHED
        Created: 2018-02-06T13:56:02-07:00
      Worker ID: -
Worker hostname: 192.168.0.1
        Started: 2018-02-06T14:09:44-07:00
       Duration: 1 second
       Finished: 2018-02-06T14:09:45-07:00
       Deadline: 2018-02-07T00:09:44-07:00
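If you want to use these fields in a script, you can split each line at the first colon. This is only a sketch based on the output shown above. The function name parse_job_details is made up, and the output format may differ between client versions.

```python
def parse_job_details(text):
    """Turn the 'Key: value' lines of 'gpulab-cli jobs <id>' into a dict."""
    details = {}
    for line in text.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            details[key.strip()] = value.strip()
    return details
```

For the output above, parse_job_details(output)["Status"] would give the string FINISHED.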

You can also view the command line output of the job:

$ gpulab-cli log 7812
2018-02-06T14:09:45.185400608Z
2018-02-06T14:09:45.185451167Z ==============NVSMI LOG==============
2018-02-06T14:09:45.185459009Z
2018-02-06T14:09:45.185466771Z Timestamp                           : Tue Feb  6 14:09:45 2018
2018-02-06T14:09:45.185471972Z Driver Version                      : 390.12
2018-02-06T14:09:45.185477068Z
2018-02-06T14:09:45.185490896Z Attached GPUs                       : 1
2018-02-06T14:09:45.185612743Z GPU 00000000:02:00.0
2018-02-06T14:09:45.185713708Z     Product Name                    : GeForce GTX 580
2018-02-06T14:09:45.186226030Z     Product Brand                   : GeForc
...
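Each log line starts with a timestamp, followed by the text that the job printed. To get the job output without the timestamps, you can split each line at the first space. A small sketch, assuming the log format shown above; the function name split_log_line is made up for this example.

```python
def split_log_line(line):
    """Split a log line into (timestamp, message)."""
    # Only the first space separates the timestamp from the job output,
    # so any indentation in the output itself is preserved.
    timestamp, _, message = line.rstrip("\n").partition(" ")
    return timestamp, message
```

Lines that contain only a timestamp give an empty message.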

You can also view the internal event log of GPULab. This is mostly useful for debugging purposes:

$ gpulab-cli debug 7812