Storage

The storage volumes attached to your job are specific to the project in which it runs, i.e. all jobs run within one project will see the same files in a given volume.

Only files saved within a mounted volume are stored permanently. All files stored outside a mounted volume are ephemeral and will be lost when the job ends.
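
For example, assuming a job that has /project_antwerp mounted, only the first copy below survives after the job ends (the file name is purely illustrative):

cp results.csv /project_antwerp/results.csv    # inside a mounted volume: kept after the job ends
cp results.csv /tmp/results.csv                # outside any mounted volume: lost when the job ends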

Available storage volumes

/project_antwerp

On the Antwerp-based clusters 7 and 8, a 100TB DDN A3I storage cluster is available under /project_antwerp.

The most straightforward way to mount this storage is:

"storage": [
    {
       "containerPath": "/project_antwerp"
    }
],

This will make the directory /project_antwerp available inside your container.

If you want it to be mounted under /project, you can specify:

"storage": [
    {
       "hostPath": "/project_antwerp",
       "containerPath": "/project"
    }
],

/project_scratch

The scratch storage is fast, slave-specific storage, typically backed by SSDs in RAID0. If you want to access files stored on a specific scratch storage, bind your job to that slave with slaveName (see the sketch after the list below).

The following slaves have a scratch storage available:

  • slave6A: The HGX-2 in Ghent has a 94TB scratch storage;
  • slave7A: The DGX-2 in Antwerp has a 28TB scratch storage;
  • slave8A: The DGX-1 in Antwerp has a 7TB scratch storage.
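
For example, a job that should use the scratch storage of the DGX-2 in Antwerp can be pinned to that slave. This is a minimal sketch: the resource numbers, image and command are illustrative, and it assumes slaveName is accepted alongside clusterId in the resources section (check the job definition reference if your version differs).

"request": {
    "resources": {
        "cpus": 2,
        "gpus": 1,
        "cpuMemoryGb": 8,
        "clusterId": 7,
        "slaveName": "slave7A"
    },
    "docker": {
        "image": "ubuntu:22.04",
        "command": "ls /project_scratch",
        "storage": [
            {
                "containerPath": "/project_scratch"
            }
        ]
    }
}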

Caution

As these storages are backed by a RAID0 disk array, a single disk failure can corrupt the whole storage. For example, on the HGX-2 the storage is backed by 16 enterprise SSDs, each with an MTBF of 2,000,000 hours; with 16 disks in RAID0, the expected time until the first disk failure, and thus potential loss of the whole array, drops to roughly 2,000,000 / 16 ≈ 125,000 hours.

Only store files here that you can afford to lose.

To mount the project scratch folder to /project_scratch, you specify it as containerPath:

"storage": [
    {
       "containerPath": "/project_scratch"
    }
],

This will bind the directory /project_scratch inside your Docker container to the local scratch disk.

/project

GPULab jobs running on Ghent-based slaves (everything except clusters 7 and 8) can access the same storage as Virtual Wall 2 projects. It contains the same data as /groups/wall2-ilabt-iminds-be/MyProject/, as the same NFS share is mounted behind the scenes.

Because it is the same NFS share everywhere, data written in one place is immediately visible everywhere else.
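
Mounting this storage follows the same pattern as the volumes above. A minimal sketch, assuming /project can be requested on Ghent-based slaves by containerPath alone:

"storage": [
    {
       "containerPath": "/project"
    }
],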

Please note that the NFS server backing this storage is at the limit of its capacity, which results in intermittent I/O errors. It has a total capacity of 12TB, but typically only a few hundred GB are free, and it sometimes runs out of space.

This storage will be phased out in the future, when a new /project_ghent storage is introduced.

Important

There are no automatic backups for this storage! You need to keep backups of important files yourself!

On Antwerp-based slaves, /project is an alias for /project_scratch for legacy reasons. Please stop using this storage path, as it will be phased out in the future.

Accessing the storages outside of GPULab

In this section, we discuss some options to access the storages from elsewhere.

Access from Virtual Wall 2 (/project only)

When you start an experiment with wall2 resources in the project MyProject, you can find the shared /project storage on all your nodes in one of these directories:

  • /groups/ilabt-imec-be/MyProject/ (for accounts on the imec or Fed4FIRE+ Testbed Portal)
  • /groups/wall2-ilabt-iminds-be/MyProject/ (for accounts on the Legacy authority)

Use the jFed experimenter GUI to reserve a resource, and access the data from that resource. You can find a detailed tutorial on how to do this in the Fed4FIRE first experiment tutorial. Note that jFed has basic scp functionality to make transferring files easier.
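
For example, assuming your wall2 node is reachable as node0.myexperiment.myproject.wall2.ilabt.iminds.be and your testbed username is myuser (hostname and username are purely illustrative), you could copy a results folder to your local machine with scp:

scp -r myuser@node0.myexperiment.myproject.wall2.ilabt.iminds.be:/groups/ilabt-imec-be/MyProject/results ./results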

Access from JupyterHub

The iLab.t JupyterHub allows you to select which storage(s) you want to mount. JupyterHub will show (one of) the selected storage(s) as the default folder, to prevent accidental data loss.

[Screenshot: JupyterHub storage selection]

When using a terminal in JupyterHub, remember to switch to the correct folder to retrieve your files.

cd /project_scratch

Note

In some cases, you might get Invalid response: 403 Forbidden when you try to access your files.

This happens because the permissions on /project have been changed and are too restrictive. This is typically done by the Virtual Wall, and may be triggered by other users in your project.

To fix this, open a terminal in JupyterHub and type:

sudo chmod uog+rwx /project

Access over SFTP

The job request below shows how you can use the atmoz/sftp Docker image to get SFTP access to your storage in GPULab:

sftp-server.json
{
    "name": "SFTP Server",
    "description": "Temporary SFTP server to easily upload/download files",
    "request": {
        "resources": {
            "cpus": 1,
            "gpus": 0,
            "cpuMemoryGb": 2,
            "clusterId": 6
        },
        "docker": {
            "image": "atmoz/sftp:latest",
            "command": "gpulabuser:gpulabpass::<MY GID>:",
            "storage": [
                {
                    "containerPath": "/home/gpulabuser/project_scratch",
                    "hostPath": "/project_scratch"
                }
            ],
            "portMappings": [
                {
                    "containerPort": 22,
                    "hostPort": null,
                    "hostIp": null
                }
            ]
        },
        "scheduling": {
            "interactive": true
        }
    }
}

Before launching, you need to replace <MY GID> in the command with the numeric group ID that is assigned to your data folders in GPULab. You can retrieve it via ls -ln in a terminal:

root@0fda924deb83:/# ls -ln | grep project
drwxrwx---    10 10003 6059  512 Dec  3 15:38 project
drwxrwx---     2 10003 6059   38 Dec  5 07:45 project_scratch

In this instance, the command must be adapted to gpulabuser:gpulabpass::6059:

By setting this value, we make sure that the user gpulabuser created on startup belongs to the correct group, and thus has access to your data folders.
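
Alternatively, stat can print just the numeric group ID of a mounted data folder; run this in a terminal of any job or JupyterHub session that has the volume mounted:

stat -c '%g' /project_scratch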

Note that we mount /project_scratch to /home/gpulabuser/project_scratch, as only folders under /home/gpulabuser are visible in the SFTP server.

On connectivity

The GPULab slaves have no public IPv4 addresses. To reach the exposed SFTP port, you need either IPv6 connectivity, or you must connect to the IDLab VPN and use their private IPv4 addresses.

Usage Example:

thijs@ibcn055:~$ gpulab-cli submit --project twalcari-test < sftp-server.json
3f83d2c2-1736-11ea-93a1-1ba327a60b6f
thijs@ibcn055:~$ gpulab-cli jobs 3f83d2
         Job ID: 3f83d2c2-1736-11ea-93a1-1ba327a60b6f
           Name: SFTP Server
    Description: Temporary SFTP server to easily upload/download files
        Project: twalcari-test
       Username: twalcari
   Docker image: atmoz/sftp:latest
        Command: gpulabuser:gpulabpass::6059:
         Status: RUNNING
        Created: 2019-12-05T09:07:14+01:00
  State Updated: 2019-12-05T09:07:21+01:00
         Queued: 2019-12-05T09:07:14+01:00
     Cluster ID: 6
      Worker ID: 6
    Worker Name: hgx-2
  Port Mappings: 22/tcp -> 33022
    Worker Host: 10.2.46.110
     SSH login:: ssh LTXHA2KL@10.2.46.110
        Started: 2019-12-05T09:07:19+01:00
       Duration: 11 seconds
       Finished: -
       Deadline: 2019-12-05T19:07:19+01:00
thijs@ibcn055:~$ sftp -P 33022 gpulabuser@10.2.46.110
The authenticity of host '[10.2.46.110]:33022 ([10.2.46.110]:33022)' can't be established.
ED25519 key fingerprint is SHA256:etWA3AH6wgsbrUEUiDc1jvYe+/ScUYdbf8s2hxaIaH8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[10.2.46.110]:33022' (ED25519) to the list of known hosts.
gpulabuser@10.2.46.110's password:
Connected to 10.2.46.110.
sftp> ls
project_scratch
sftp> cd project_scratch/
sftp> ls
my-data.txt
sftp>