
Data Storage Organisation

As explained in the Overview, the available storage is divided into filesets, each with its own specific use.

/gpfs/home

The home directories are stored on the /gpfs/home fileset. This is the starting point when you log on to Lucia. Each user has their own private space to store personal data, configurations, codes, etc. It is usually referred to as $HOME or ~.
By default, the permissions are read/write/execute for the owner only, i.e. drwx------, and the group ownership is set to the user's personal group.
The /gpfs/home fileset uses user quotas, with limits set as follows:

Block soft limit | Block hard limit | Block grace period | File soft limit | File hard limit | File grace period
200GB            | 230GB            | 7 days             | 1000k files     | 1300k files     | 7 days

You can still write data after exceeding the soft limit, until either the 7-day grace period expires or you reach the hard limit. Once the grace period has expired, you'll have to reduce your usage below the soft limit to be able to write data again.

You can check your usage and limits with:

mmlsquota -u <username> --block-size g ess:home
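The report printed by mmlsquota can be summarised with a little post-processing. The sketch below is only an illustration: `quota_summary` is a hypothetical helper, the sample report is copied from the group quota example further down this page, and the column layout is assumed to match that sample.

```shell
# Sample report: the header and data lines from the example on this page.
report='Filesystem Fileset    type             GB      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
ess        scratch    GRP             123       1000       1300          2     none |   141592 1024000  1331200       80     none ess.lucia.cenaero.be'

# Hypothetical helper: print usage relative to the soft limits ("quota" columns).
# Each line is split on "|" into its block part and its file part, so header
# lines (no numeric usage) and entries without a soft limit are skipped.
quota_summary() {
    awk -F'|' 'NF == 2 {
        split($1, b, " ")   # b[2]=fileset, b[4]=usage in GB, b[5]=block soft limit
        split($2, f, " ")   # f[1]=number of files, f[2]=file soft limit
        if (b[4] ~ /^[0-9]+$/ && b[5] > 0 && f[2] > 0) {
            printf "%s blocks: %d/%d GB (%d%%)\n", b[2], b[4], b[5], int(100 * b[4] / b[5])
            printf "%s files: %d/%d (%d%%)\n", b[2], f[1], f[2], int(100 * f[1] / f[2])
        }
    }'
}

printf '%s\n' "$report" | quota_summary
# scratch blocks: 123/1000 GB (12%)
# scratch files: 141592/1024000 (13%)
```

On Lucia, the same helper could read the live report instead of the sample, for instance by piping the mmlsquota command shown above into `quota_summary`.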

Project directories

Each project on Lucia has two main directories: one on the /gpfs/projects fileset for storage, and one on the /gpfs/scratch fileset as a working directory. Those directories are intended for sharing data among members of the project.

The default permissions on the directories are "2770", i.e. drwxrws---, with the setgid bit set. This allows newly created files and subdirectories to automatically inherit the same group as the parent directory; see setgid below for more information.

Group quotas are set on blocks and files on both filesets. The limits vary from one project to another, following what was requested for the project. Note that the file limit depends on the block limit: by default, the minimum is 500k files for projects with a block limit ≤500GB, this limit is then increased by 1k files per additional GB, and it is capped at 10000k files for projects with block limits >10000GB (it can be increased on demand).
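The rule above can be written as a small calculation. This is a sketch of the rule as stated here, not the tool used by the administrators; `default_file_limit_k` is a hypothetical helper name, and it ignores any limit that was increased on demand.

```shell
# Hypothetical helper: default file limit (in thousands of files)
# for a given block limit (in GB), following the rule described above.
default_file_limit_k() {
    block_gb=$1
    if [ "$block_gb" -le 500 ]; then
        echo 500                                # minimum: 500k files
    elif [ "$block_gb" -ge 10000 ]; then
        echo 10000                              # cap: 10000k files
    else
        echo $(( 500 + (block_gb - 500) ))      # +1k files per additional GB
    fi
}

default_file_limit_k 200      # 500
default_file_limit_k 1000     # 1000
default_file_limit_k 25000    # 10000
```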

You can check the usage and limits for your project on all filesets with:

mmlsquota -g <project_name> --block-size g ess
If not specified in the DoW, the default quota values for industrial projects are:

Fileset        | Block soft limit | Block hard limit | Block grace period | File soft limit | File hard limit | File grace period
/gpfs/projects | 1000GB           | 1300GB           | 7 days             | 1000k files     | 1300k files     | 7 days
/gpfs/scratch  | 1000GB           | 1300GB           | 7 days             | 1000k files     | 1300k files     | 7 days

Default quota

The default quota on /gpfs/projects and /gpfs/scratch for any Unix group that isn't a project group is set to an extremely low value (16KB and 1 file). As a consequence, you might get a "Disk quota exceeded" error message if the project directory you're working in doesn't have its permissions and ownership properly set; see setgid below.

/gpfs/projects ($PROJECT_HOME)

The /gpfs/projects fileset is used to store and share data throughout the project's lifespan, typically software, developments, input files and important files that need to be kept after a job is completed.

You can specifically check the project usage and limits for the /gpfs/projects fileset with:

mmlsquota -g <project_name> --block-size g ess:projects

/gpfs/scratch ($SCRATCH_HOME)

The /gpfs/scratch fileset is the workspace used for temporary data during job execution, and it partly consists of NVMe SSD for better performance.

You can specifically check the scratch usage and limits for the /gpfs/scratch fileset with:

mmlsquota -g <project_name> --block-size g ess:scratch

Periodical clean-up of the /gpfs/scratch fileset

As the /gpfs/scratch fileset is a temporary workspace built for performance, it will be cleaned up periodically to avoid dormant data. The clean-up will usually occur during the spring (end of May) and the fall (end of November) maintenance windows. A reminder will be sent 15 days prior to the maintenance window.

Umask

The default umask on Lucia is currently quite permissive as it is set to 0002 (RHEL8’s default), meaning files and directories you create will be group writable and world readable.

While group writable permissions can be useful for instance when collaborating with other users in projects’ directories, be aware that other users of the same project might also intentionally or unintentionally modify or even delete your files and directories.

Depending on your preferences, you might want to restrict the default permissions and only relax them when needed using chmod. Here are some examples of the umask command:

  • Display your current umask in octal values: umask
  • Display your current umask in symbolic values: umask -S
  • Set your umask so that new files are readable by the group and not accessible at all by other users: umask 0027

Note that setting your umask on the command line only modifies it for the current session. If you want a permanent change, you'll have to add the command to your ~/.bashrc.
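To see what a given umask does before adopting it, you can try it in a subshell so that your current session is left untouched. A minimal sketch, assuming GNU stat (as on RHEL) and a throwaway directory created with mktemp:

```shell
demo=$(mktemp -d)          # throwaway directory for the experiment

(
    umask 0027             # only applies inside this subshell
    touch "$demo/file"
    mkdir "$demo/dir"
)

# Print the resulting permissions in octal.
stat -c '%a %n' "$demo/file" "$demo/dir"
# file: 640 (rw-r-----), dir: 750 (rwxr-x---)
```

Files are created from a base of 666 and directories from 777, and the umask removes the listed bits, which is why 0027 yields 640 and 750.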

For more information on umask and permissions, see Red Hat’s documentation

Setgid

The setgid bit is set on the project directories so that new files and directories created inside the project directories inherit the same group ownership as their parent directory instead of the primary group of the user.

When listing directory contents with ls -l, or the directories themselves with ls -ld, the setgid bit is indicated by an s (lowercase s) in the group part of the permissions field, replacing the usual x. It might also be indicated by an S (uppercase s) when the execute permission has been removed for the group.
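The three notations can be reproduced in a scratch location. The sketch below assumes GNU stat and uses a temporary directory rather than a real project directory; the names dir1, dir2 and dir3 are only illustrative.

```shell
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2" "$demo/dir3"

chmod 2770 "$demo/dir1"    # setgid + rwx for the group  -> drwxrws---
chmod 2700 "$demo/dir2"    # setgid, no group permissions -> drwx--S---
chmod 0770 "$demo/dir3"    # group rwx, no setgid         -> drwxrwx---

stat -c '%A %n' "$demo"/dir?

# The shell test operator -g reports the setgid bit directly.
for d in "$demo"/dir?; do
    if [ -g "$d" ]; then
        echo "${d##*/}: setgid set"
    else
        echo "${d##*/}: setgid not set"
    fi
done
```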

In the example below, we first show the main project and scratch directories with their default permissions, then dir1 and dir2 which both have the setgid bit set, but dir2 doesn't have the read (r), write (w) and execute (x) permissions set for the group, while dir3 has those permissions set without the setgid bit.

myusername@frontal01:~> ls -ld /gpfs/{projects,scratch}/company/project01/
drwxrws--- 3 root project01 4096 Mar 14 13:37 /gpfs/projects/company/project01/
drwxrws--- 4 root project01 4096 Mar 14 13:37 /gpfs/scratch/company/project01/

myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/
total 2
drwxrws--- 2 myusername project01 4096 May  4 13:37 dir1
drwx--S--- 2 myusername project01 4096 May  4 13:37 dir2
drwxrwx--- 2 myusername project01 4096 May  4 13:37 dir3
As explained above, when the setgid bit is set on a directory, the files and subdirectories created in that directory should, in theory, inherit the same group ownership, and subdirectories should also inherit the setgid bit itself. However, the bit might not be preserved, or might get removed, under certain circumstances, which generally leads to a "Disk quota exceeded" error message.

Here's an example when trying to create a new file or a subdirectory in each of the 3 directories shown in the previous example:

myusername@frontal01:~> touch /gpfs/scratch/company/project01/dir1/newfile_1
myusername@frontal01:~> mkdir /gpfs/scratch/company/project01/dir1/newsubdir_1
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/dir1/
total 1
-rw-rw---- 1 myusername project01    0 May  4 16:32 newfile_1
drwxrws--- 2 myusername project01 4096 May  4 16:32 newsubdir_1

myusername@frontal01:~> touch /gpfs/scratch/company/project01/dir2/newfile_2
myusername@frontal01:~> mkdir /gpfs/scratch/company/project01/dir2/newsubdir_2
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/dir2/
total 1
-rw-rw---- 1 myusername project01    0 May  4 16:32 newfile_2
drwxrws--- 2 myusername project01 4096 May  4 16:32 newsubdir_2

myusername@frontal01:~> touch /gpfs/scratch/company/project01/dir3/newfile_3
touch: cannot touch '/gpfs/scratch/company/project01/dir3/newfile_3': Disk quota exceeded
myusername@frontal01:~> mkdir /gpfs/scratch/company/project01/dir3/newsubdir_3
mkdir: cannot create directory ‘/gpfs/scratch/company/project01/dir3/newsubdir_3’: Disk quota exceeded
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/dir3/
total 0

myusername@frontal01:~> mmlsquota -g project01 --block-size g ess:scratch

Disk quotas for group project01 (gid 666666):
                         Block Limits                                               |     File Limits
Filesystem Fileset    type             GB      quota      limit   in_doubt    grace |    files   quota    limit in_doubt    grace  Remarks
ess        scratch    GRP             123       1000       1300          2     none |   141592 1024000  1331200       80     none ess.lucia.cenaero.be
As we can see, the usage of the project "project01" on the scratch fileset is well below its block and file limits. In this case, the "Disk quota exceeded" message is caused by the fact that, in the absence of the setgid bit on dir3, the system tried to create the new file and new subdirectory using our primary group (usually the same as our username) instead of the group of the project, and only project groups have an allocated quota on the projects and scratch filesets.
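When this error shows up, a quick check of the directory's group and setgid bit usually points to the cause. The helper below is a hypothetical sketch (`check_project_dir` is not a command provided on Lucia) and assumes GNU stat; it is demonstrated on a temporary directory.

```shell
# Hypothetical helper: report the two conditions that make new files
# fall under the near-zero default quota instead of the project quota.
check_project_dir() {
    dir=$1
    expected_group=$2
    problems=0
    actual_group=$(stat -c '%G' "$dir")
    if [ "$actual_group" != "$expected_group" ]; then
        echo "$dir: group is $actual_group, expected $expected_group"
        problems=1
    fi
    if [ ! -g "$dir" ]; then
        echo "$dir: setgid bit is missing"
        problems=1
    fi
    if [ "$problems" -eq 0 ]; then
        echo "$dir: ok"
    fi
}

# Demonstration on a throwaway directory, using its own group as the
# "expected" one so that only the setgid check can fail.
demo=$(mktemp -d)
mkdir "$demo/dir3"
check_project_dir "$demo/dir3" "$(stat -c '%G' "$demo/dir3")"   # reports the missing setgid bit
chmod g+s "$demo/dir3"
check_project_dir "$demo/dir3" "$(stat -c '%G' "$demo/dir3")"   # reports ok
```

For the situation described above, the call would be `check_project_dir /gpfs/scratch/company/project01/dir3 project01`.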

Some commands can also cause this issue, such as mv, which preserves the permissions and ownership of the source files and directories at their destination, so it is highly recommended to use cp instead (without the -p option, obviously). scp might also cause issues: destination files and directories will get the proper group ownership during the transfer, but the setgid bit won't be added to the directories, possibly causing problems later. We therefore recommend using rsync instead, with the --no-p (turns off preserving permissions), --no-g (turns off preserving the group) and --chmod=ug=rwX (ensures that all non-masked bits get enabled) options, for instance:

rsync -av --progress --no-p --no-g --chmod=ug=rwX <src> <dest>
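The difference between mv and cp can be observed without touching a real project directory. The sketch below assumes GNU coreutils on Linux and uses a temporary directory with the setgid bit as a stand-in for a project directory; all names are illustrative.

```shell
work=$(mktemp -d)
mkdir "$work/project" "$work/src"
chmod 2770 "$work/project"                      # stand-in for a project directory
mkdir "$work/src/data_mv" "$work/src/data_cp"   # created elsewhere: no setgid bit

mv    "$work/src/data_mv" "$work/project/"      # keeps the source permissions
cp -r "$work/src/data_cp" "$work/project/"      # creates a new directory in place

for d in "$work/project"/data_*; do
    if [ -g "$d" ]; then
        echo "${d##*/}: setgid inherited"
    else
        echo "${d##*/}: setgid missing"
    fi
done
```

The moved directory keeps the mode it had at its source, while the copied one is created inside the setgid directory and therefore inherits the bit.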
In cases where there are already lots of files and subdirectories in a project directory with wrong permissions or ownership, you might want to use chmod and chgrp combined with find to set the proper permissions and ownership, here are a couple of examples:

  • Set the setgid bit again on your project directories where it is missing:

    # This find command will look in /gpfs/scratch/company/project01 for directories (-type d) belonging to the user named "myusername", that don't have the setgid bit set (! -perm -g=s), and add it by executing the "chmod g+s" on those directories (-exec chmod g+s {} \;)
    find /gpfs/scratch/company/project01 -type d -user myusername ! -perm -g=s -exec chmod g+s {} \;
    

  • Set the project group again on your project directories:

    # This find command will look in /gpfs/scratch/company/project01 for files and directories belonging to the user named "myusername", that don't belong to the group "project01", and change their group ownership to "project01" by executing "chgrp project01" on those files or directories (-exec chgrp project01 {} \;)
    find /gpfs/scratch/company/project01 -user myusername ! -group project01 -exec chgrp project01 {} \;
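Before changing anything, it can be useful to run the same find expression with -print to preview what would be modified. The sketch below does this on a small temporary tree; the -user filter from the commands above is left out only to keep the demonstration independent of the account running it.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/b" "$demo/c"
chmod g+s "$demo" "$demo/a" "$demo/c"      # "b" is deliberately left without setgid

# Dry run: list the directories that lack the setgid bit.
find "$demo" -type d ! -perm -g=s -print   # prints only .../a/b

# Apply the fix, then confirm that nothing is left to fix.
find "$demo" -type d ! -perm -g=s -exec chmod g+s {} \;
remaining=$(find "$demo" -type d ! -perm -g=s | wc -l)
echo "directories still missing setgid: $remaining"
```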
    

Setting the setgid bit recursively

Be cautious with the chmod command and avoid using it recursively (with the -R option), especially when setting the setgid bit, as this will also set it on files. On files, the bit doesn't have the same effect as on directories: it allows executables to be run using the privileges of the group of the file instead of the group of the user.
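If the bit has already been set recursively by mistake, it can be located and cleared on regular files only. A minimal sketch on a temporary tree, assuming GNU find and chmod:

```shell
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/sub/data.txt"

chmod -R g+s "$tree"                 # the mistake: regular files get the bit too

# List regular files carrying the setgid bit.
find "$tree" -type f -perm -g=s

# Clear it on regular files only; directories keep their setgid bit.
find "$tree" -type f -perm -g=s -exec chmod g-s {} +
```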

And finally, when working on projects it might sometimes be more convenient to use the newgrp and/or sg commands to temporarily change your primary group to the group of the project you're working on, see man newgrp and man sg for the differences between the two commands.