Data Storage Organisation
As explained in the Overview, the available storage is divided into filesets, each with its own specific use.
/gpfs/home
The home directories are stored on the /gpfs/home fileset, which is your starting point when you log on to Lucia. Each user has their own private space to store personal data, configurations, codes, etc. It is usually referred to as $HOME or ~.
By default, the permissions are read/write/execute for the owner only, i.e. drwx------, and the group ownership is set to the user's personal group.
The /gpfs/home fileset uses user quotas, and limits are set as follows:
Block soft limit | Block hard limit | Block grace period | File soft limit | File hard limit | File grace period |
---|---|---|---|---|---|
200GB | 230GB | 7 days | 1000k files | 1300k files | 7 days |
You can still write data after exceeding the soft limit, until either the 7-day grace period expires or you reach the hard limit. Once the grace period expires, you must reduce your usage below the soft limit before you can write data again.
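As a minimal sketch of this rule only (not of how GPFS itself enforces it), the hypothetical `quota_state` function below hard-codes the /gpfs/home block limits from the table. It takes the current usage in GB and the number of days since the soft limit was first exceeded:

```bash
#!/usr/bin/env bash
# Hypothetical helper: models the soft/hard/grace rule for /gpfs/home blocks.
# $1 = current usage in GB, $2 = days since the soft limit was first exceeded.
quota_state() {
    local usage_gb=$1 days_over_soft=$2
    local soft_gb=200 hard_gb=230 grace_days=7   # values from the table above

    if (( usage_gb <= soft_gb )); then
        echo "ok"
    elif (( usage_gb >= hard_gb )); then
        echo "blocked: hard limit reached"
    elif (( days_over_soft >= grace_days )); then
        echo "blocked: grace period expired"
    else
        echo "grace: writes still allowed"
    fi
}

quota_state 150 0   # ok
quota_state 210 3   # grace: writes still allowed
quota_state 210 7   # blocked: grace period expired
quota_state 230 0   # blocked: hard limit reached
```

The same reasoning applies to the file limits, with the file soft and hard limits in place of the block limits.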
You can check your usage and limits with:
Project directories
Each project on Lucia has two main directories: one on the /gpfs/projects fileset for storage, and one on the /gpfs/scratch fileset as a working directory. These directories are intended for sharing data among members of the project.
The default permissions on the directories are "2770", i.e. drwxrws---, with the setgid bit set, so that newly created files and subdirectories automatically inherit the same group as the parent directory. See Setgid below for more information.
Group quotas are set on blocks and files on both filesets. The limits vary from one project to another, following what was requested for the project. Note that the file limit depends on the block limit: by default, the minimum is 500k files for projects with a block limit ≤500GB, this limit is then increased by 1k files per additional GB, and it is capped at 10000k files for projects with block limits >10000GB (it can be increased on demand).
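The default file limit rule can be written as a small calculation. The sketch below is an illustration of the rule as stated above, not an official tool; the function name `project_file_limit_k` is hypothetical, and the result is expressed in thousands of files:

```bash
#!/usr/bin/env bash
# Hypothetical helper: default file limit (in k files) for a given block limit (in GB).
# Rule: 500k minimum up to 500GB, +1k files per additional GB, capped at 10000k.
project_file_limit_k() {
    local block_gb=$1
    local min_k=500 cap_k=10000
    local limit_k=$min_k

    if (( block_gb > 500 )); then
        limit_k=$(( min_k + (block_gb - 500) ))
    fi
    if (( limit_k > cap_k )); then
        limit_k=$cap_k
    fi
    echo "$limit_k"
}

project_file_limit_k 300     # 500
project_file_limit_k 1000    # 1000
project_file_limit_k 20000   # 10000
```

Limits raised on demand are outside the scope of this calculation.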
You can check the usage and limits for your project on all filesets with:
If not specified in the DoW, the default quota values for industrial projects are:
Fileset | Block soft limit | Block hard limit | Block grace period | File soft limit | File hard limit | File grace period |
---|---|---|---|---|---|---|
/gpfs/projects | 1000GB | 1300GB | 7 days | 2000k files | 1300k files | 7 days |
/gpfs/scratch | 1000GB | 1300GB | 7 days | 1000k files | 1300k files | 7 days |
Default quota
The default quota for any Unix group that isn't a project on /gpfs/projects and /gpfs/scratch is set to an extremely low value (16KB and 1 file). As a consequence, you might get a Disk quota exceeded error message if the project directory you're working in doesn't have its permissions and ownership properly set, see Setgid below.
/gpfs/projects ($PROJECT_HOME)
The /gpfs/projects fileset is used to store and share data throughout the project's lifespan, typically software, developments, input files and important files that need to be kept after a job is completed.
You can specifically check the project usage and limits for the /gpfs/projects fileset with:
/gpfs/scratch ($SCRATCH_HOME)
The /gpfs/scratch fileset is the workspace used for temporary data during job execution, and it partly consists of NVMe SSD for better performance.
You can specifically check the scratch usage and limits for the /gpfs/scratch fileset with:
Periodic clean-up of the /gpfs/scratch fileset
As the /gpfs/scratch fileset is a temporary workspace built for performance, it is cleaned up periodically to avoid dormant data. The clean-up usually occurs during the spring (end of May) and fall (end of November) maintenance windows. A reminder is sent 15 days prior to the maintenance window.
Umask
The default umask on Lucia is currently quite permissive as it is set to 0002 (RHEL8’s default), meaning files and directories you create will be group writable and world readable.
While group writable permissions can be useful for instance when collaborating with other users in projects’ directories, be aware that other users of the same project might also intentionally or unintentionally modify or even delete your files and directories.
Depending on your preferences, you might want to restrict the default permissions and only relax them when needed using chmod. Here are some examples of the umask command:
- Display your current umask in octal values:
umask
- Display your current umask in symbolic values:
umask -S
- Set your umask so that the group gets read-only access and other users get no permissions:
umask 0027
Note that setting your umask on the command line only modifies it for the current session. If you want a permanent change, add the command to your ~/.bashrc.
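To see what a given umask does in practice, the sketch below creates a file and a directory under two different umasks and prints the resulting octal permissions. It works in a throwaway temporary directory and assumes GNU coreutils (`stat -c`); the function name `show_modes` is hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: compare the permissions of new files and directories under a given umask.
show_modes() {
    local mask=$1 workdir
    workdir=$(mktemp -d)
    (
        # The subshell keeps the umask change from leaking into the caller
        umask "$mask"
        touch "$workdir/file"
        mkdir "$workdir/dir"
    )
    # stat -c %a prints the octal permission bits (GNU coreutils)
    echo "umask $mask: file=$(stat -c %a "$workdir/file") dir=$(stat -c %a "$workdir/dir")"
    rm -rf "$workdir"
}

show_modes 0002   # umask 0002: file=664 dir=775
show_modes 0027   # umask 0027: file=640 dir=750
```

With 0027, the group keeps read access to files and can enter directories, while other users get no access at all.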
For more information on umask and permissions, see Red Hat’s documentation.
Setgid
The setgid bit is set on the project directories so that new files and directories created inside the project directories inherit the same group ownership as their parent directory instead of the primary group of the user.
When listing directory contents using ls -l or directories themselves using ls -ld, the setgid bit is indicated by an s (lowercase s) in the group part of the permissions field, replacing the usual x. It might also be indicated by an S (uppercase S) when the execute permission has been removed for the group.
In the example below, we first show the main project and scratch directories with their default permissions, then dir1 and dir2, which both have the setgid bit set, although dir2 doesn't have the read (r), write (w) and execute (x) permissions set for the group, while dir3 has those permissions set without the setgid bit.
myusername@frontal01:~> ls -ld /gpfs/{projects,scratch}/company/project01/
drwxrws--- 3 root project01 4096 Mar 14 13:37 /gpfs/projects/company/project01/
drwxrws--- 4 root project01 4096 Mar 14 13:37 /gpfs/scratch/company/project01/
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/
total 2
drwxrws--- 2 myusername project01 4096 May 4 13:37 dir1
drwx--S--- 2 myusername project01 4096 May 4 13:37 dir2
drwxrwx--- 2 myusername project01 4096 May 4 13:37 dir3
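The three permission strings shown for dir1, dir2 and dir3 can be reproduced outside of a project directory. The sketch below does so in a temporary directory; it assumes GNU coreutils (`stat -c`), and the function name `demo_setgid_modes` is hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: reproduce the permission strings of dir1, dir2 and dir3 in a
# temporary directory, to show when "s", "S" and "x" appear.
demo_setgid_modes() {
    local workdir d
    workdir=$(mktemp -d)
    mkdir "$workdir/dir1" "$workdir/dir2" "$workdir/dir3"

    chmod 2770 "$workdir/dir1"                 # setgid + rwx for the group
    chmod 2700 "$workdir/dir2"                 # setgid, no group permissions
    chmod u=rwx,g=rwx,g-s,o= "$workdir/dir3"   # group rwx, setgid cleared

    for d in dir1 dir2 dir3; do
        # %A prints the symbolic permission string, as in "ls -ld"
        printf '%s %s\n' "$(stat -c %A "$workdir/$d")" "$d"
    done
    rm -rf "$workdir"
}

demo_setgid_modes
# drwxrws--- dir1
# drwx--S--- dir2
# drwxrwx--- dir3
```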
Here's an example when trying to create a new file or a subdirectory in each of the 3 directories shown in the previous example:
myusername@frontal01:~> touch /gpfs/scratch/company/project01/dir1/newfile_1
myusername@frontal01:~> mkdir /gpfs/scratch/company/project01/dir1/newsubdir_1
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/dir1/
total 1
-rw-rw---- 1 myusername project01 0 May 4 16:32 newfile_1
drwxrws--- 2 myusername project01 4096 May 4 16:32 newsubdir_1
myusername@frontal01:~> touch /gpfs/scratch/company/project01/dir2/newfile_2
myusername@frontal01:~> mkdir /gpfs/scratch/company/project01/dir2/newsubdir_2
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/dir2/
total 1
-rw-rw---- 1 myusername project01 0 May 4 16:32 newfile_2
drwxrws--- 2 myusername project01 4096 May 4 16:32 newsubdir_2
myusername@frontal01:~> touch /gpfs/scratch/company/project01/dir3/newfile_3
touch: cannot touch '/gpfs/scratch/company/project01/dir3/newfile_3': Disk quota exceeded
myusername@frontal01:~> mkdir /gpfs/scratch/company/project01/dir3/newsubdir_3
mkdir: cannot create directory ‘/gpfs/scratch/company/project01/dir3/newsubdir_3’: Disk quota exceeded
myusername@frontal01:~> ls -l /gpfs/scratch/company/project01/dir3/
total 0
myusername@frontal01:~> mmlsquota -g project01 --block-size g ess:scratch
Disk quotas for group project01 (gid 666666):
Block Limits | File Limits
Filesystem Fileset type GB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
ess scratch GRP 123 1000 1300 2 none | 141592 1024000 1331200 80 none ess.lucia.cenaero.be
In dir3, which lacks the setgid bit, the system tried to create the new file and the new subdirectory using our primary group (usually the same as our username) instead of the group of the project, and only project groups have an allocated quota on the projects and scratch filesets.
Some commands can also cause this issue, such as mv, which preserves the permissions and ownership of the source files and directories at their destination, so it is highly recommended to use cp instead (without the -p option, obviously). scp might also cause issues: destination files and directories get the proper group ownership during the transfer, but the setgid bit isn't added to the directories, possibly causing problems later. We therefore recommend using rsync instead, with the --no-p (turns off the preserve permissions option), --no-g (turns off the preserve group option) and --chmod=ug=rwX (ensures that all non-masked bits get enabled) options, for instance:
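A minimal sketch of such a transfer, wrapped in a hypothetical `sync_to_project` function. The `-av` options are an assumption (archive mode, verbose); the other options are the ones named above, with `--chmod=ug=rwX` being the form that enables the user and group bits according to rsync's manual page:

```bash
#!/usr/bin/env bash
# Sketch: copy a directory tree into a project directory without carrying
# over the source permissions or group. Both paths are chosen by the caller.
sync_to_project() {
    local src=$1 dest=$2
    # -a              archive mode (recursion, times, symlinks, ...)
    # -v              list transferred files
    # --no-p, --no-g  do not preserve source permissions / group
    # --chmod=ug=rwX  treat incoming entries as rwX for user and group,
    #                 which the umask of the receiving side then filters
    rsync -av --no-p --no-g --chmod=ug=rwX "$src" "$dest"
}

# Illustrative call (paths are placeholders):
#   sync_to_project ./results/ /gpfs/scratch/company/project01/results/
```

Because permissions are not preserved, directories created at the destination inherit the setgid bit and the group from their parent directory, as for any directory created there directly.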
For existing files and directories, you can use chmod and chgrp combined with find to set the proper permissions and ownership. Here are a couple of examples:
- Set the setgid bit again on your project directories where it is missing:
# This find command looks in /gpfs/scratch/company/project01 for directories (-type d)
# belonging to the user named "myusername" that don't have the setgid bit set (! -perm -g=s),
# and adds it by executing "chmod g+s" on those directories (-exec chmod g+s {} \;)
find /gpfs/scratch/company/project01 -type d -user myusername ! -perm -g=s -exec chmod g+s {} \;
- Set the project group again on your project directories:
# This find command looks in /gpfs/scratch/company/project01 for files and directories
# belonging to the user named "myusername" that don't belong to the group "project01",
# and changes their group ownership to "project01" by executing "chgrp project01"
# on those files or directories (-exec chgrp project01 {} \;)
find /gpfs/scratch/company/project01 -user myusername ! -group project01 -exec chgrp project01 {} \;
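Before changing anything, it can help to list what a find expression would match. The sketch below uses a hypothetical `list_missing_setgid` function and demonstrates it on a temporary directory tree rather than on a real project directory:

```bash
#!/usr/bin/env bash
# Sketch: list directories that lack the setgid bit, without modifying them.
list_missing_setgid() {
    local root=$1
    find "$root" -type d ! -perm -g=s -print
}

# Demonstration on a throwaway tree
workdir=$(mktemp -d)
mkdir "$workdir/a" "$workdir/b"
chmod g+s "$workdir" "$workdir/a"   # top level and "a" have the setgid bit
chmod g-s "$workdir/b"              # "b" does not

list_missing_setgid "$workdir"      # prints only the path of "b"

# Apply the fix, then check again: nothing should be listed any more
find "$workdir" -type d ! -perm -g=s -exec chmod g+s {} +
list_missing_setgid "$workdir"

rm -rf "$workdir"
```

The `{} +` form passes several directories to a single chmod call, whereas `{} \;` runs chmod once per directory; both give the same result.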
Setting the setgid bit recursively
Be cautious with the chmod command and avoid using it recursively (with the -R option), especially when setting the setgid bit, as this also sets it on files. On files, the bit doesn't have the same effect as on directories: it allows them to be run using the privileges of the group of the file instead of the group of the user.
Finally, when working on projects, it might sometimes be more convenient to use the newgrp and/or sg commands to temporarily change your primary group to the group of the project you're working on. See man newgrp and man sg for the differences between the two commands.