System Overview
The Lucia supercomputer consists of three main compute partitions, a storage partition, a service partition, and a management partition. These partitions are interconnected via an HDR Infiniband network and a 10Gb/s Ethernet network. The storage partition is based on the IBM Spectrum Scale parallel filesystem, provides a capacity of about 3PiB, and is complemented by an offsite backup system based on IBM Spectrum Protect.
Compute nodes
CPU nodes
- 75 HPE Apollo n2600 2U chassis, each hosting 4 HPE ProLiant XL225n compute nodes.
- 1137 TFLOPS Rmax (LINPACK performance) / 1505 TFLOPS Rpeak (theoretical performance)
- Total CPU cores: 38400
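As a sanity check, the node, core, and peak-performance totals follow from the per-chassis and per-node figures above; the Rpeak value is consistent with assuming 16 double-precision FLOP per cycle per Zen 3 core (two 256-bit FMA units), an assumption made here for illustration:

$$
75 \times 4 = 300\ \text{nodes}\ (270\ \text{standard} + 30\ \text{medium}), \qquad 300 \times 128 = 38\,400\ \text{cores}
$$

$$
38\,400\ \text{cores} \times 2.45\ \text{GHz} \times 16\ \tfrac{\text{FLOP}}{\text{cycle}} \approx 1505\ \text{TFLOPS}\ (\text{Rpeak})
$$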
Node details | Standard nodes | Medium nodes |
---|---|---|
Num. of nodes | 270 | 30 |
Node model | HPE XL225n | HPE XL225n |
Processors | 2x AMD EPYC 7763 64-core | 2x AMD EPYC 7763 64-core |
Processor freq. | 2.45GHz, boost up to 3.5GHz | 2.45GHz, boost up to 3.5GHz |
Processor L3 cache | 256MB | 256MB |
Cores per node | 128 | 128 |
Hyperthreading | Disabled | Disabled |
Memory | 256GB DDR4-3200 | 512GB DDR4-3200 |
User available mem. | 240GB | 492GB |
Ethernet | 2x 10Gbps | 2x 10Gbps |
Fast interconnect | 1x Infiniband HDR-100 | 1x Infiniband HDR-100 |
Local disk | SATA SSD 480GB (system) | SATA SSD 480GB (system) |
Node hostnames | cns[001-270] | cnm[001-030] |
GPU nodes
- 25 HPE Apollo d6500 6U chassis, each hosting 2 HPE ProLiant XL645d compute nodes.
- 2717 TFLOPS Rmax (LINPACK performance) / 3900 TFLOPS Rpeak (theoretical performance)
- Ranked 245th on the Nov. 2022 TOP500 List
- Total GPUs: 200
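Likewise, the node and GPU totals follow from the chassis count, and the Rpeak figure is consistent with the commonly quoted 19.5 TFLOPS FP64 (Tensor Core) peak per A100, an assumption used here for illustration:

$$
25 \times 2 = 50\ \text{nodes}, \qquad 50 \times 4 = 200\ \text{GPUs}, \qquad 200 \times 19.5\ \text{TFLOPS} = 3900\ \text{TFLOPS}\ (\text{Rpeak})
$$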
Node details | GPU nodes |
---|---|
Num. of nodes | 50 |
Node model | HPE XL645d |
Processors | 1x AMD EPYC 7513 32-core |
Processor freq. | 2.6GHz, boost up to 3.65GHz |
Processor L3 cache | 128MB |
Cores per node | 32 |
Hyperthreading | Disabled |
Memory | 256GB DDR4-3200 |
User available mem. | 240GB |
Accelerators | 4x Nvidia A100 40GB |
Ethernet | 2x 10Gbps |
Fast interconnect | 2x Infiniband HDR-200 |
Local disk | SATA SSD 480GB (system) |
Node hostnames | cna[001-050] |
Other nodes
- 7 large memory nodes, 1 extra large memory node, 4 visualization nodes, and 2 AI nodes:
Node details | Large memory | Extra large mem. | Visualization | AI |
---|---|---|---|---|
Node model | HPE ProLiant DL385 | HPE ProLiant DL385 | HPE ProLiant DL385 | HPE XL645d |
Num. of nodes | 7 | 1 | 4 | 2 |
Processors | 2x AMD EPYC 7513 32-core | 2x AMD EPYC 7513 32-core | 2x AMD EPYC 7313 16-core | 1x AMD EPYC 7513 32-core |
Processor freq. | 2.6GHz, boost up to 3.65GHz | 2.6GHz, boost up to 3.65GHz | 3GHz, boost up to 3.7GHz | 2.6GHz, boost up to 3.65GHz |
Processor L3 cache | 128MB | 128MB | 128MB | 128MB |
Cores per node | 64 | 64 | 32 | 32 |
Hyperthreading | Disabled | Disabled | Disabled | Disabled |
Memory | 2048GB DDR4-3200 | 4096GB DDR4-3200 | 512GB DDR4-3200 | 2048GB DDR4-3200 |
User available mem. | 2000GB | 4000GB | 492GB | 2000GB |
Accelerators | No | No | No | 8x Nvidia A100 SXM4 80GB |
Graphic Card | No | No | 4x Nvidia T4 16GB | No |
Ethernet | 2x 10Gbps | 2x 10Gbps | 2x 10Gbps | 2x 10Gbps |
Fast interconnect | 1x Infiniband HDR-100 | 1x Infiniband HDR-100 | 1x Infiniband HDR-100 | 2x Infiniband HDR-200 |
Local disk | SATA SSD 480GB (system) | SATA SSD 480GB (system) | 2x SATA SSD 480GB (system) | 2x SATA SSD 480GB (system) + 1x NVMe SSD 6.4TB (local scratch) |
Node hostnames | cnl[001-007] | cnx001 | cng[001-004] | cni[001-002] |
Storage
The storage system is based on IBM's Spectrum Scale filesystem (formerly known as GPFS). The home, projects and scratch spaces use a unified filesystem consisting of two tiers:
- A flash tier of about 200TB, consisting of NVMe SSDs, acting as a burst-buffer. This storage space is also used to store the filesystem's metadata.
- A standard disk tier of about 2.87PB, consisting of NL-SAS disks, used for their increased capacity.
The logical partitioning is managed directly by IBM Spectrum Scale through "filesets", and data is migrated seamlessly between the two physical storage tiers.
filesets | home | projects | scratch | softs |
---|---|---|---|---|
Size | 200TB | 1.5PB | 1PB | 50TB |
Burst-buffer | no | no | yes | no |
Snapshots | yes | yes | no | yes |
Backup | yes | yes | no | yes |
Quota | user | group | group | no |
The scratch fileset, which uses the burst-buffer, has an aggregated sequential read performance of about 270GB/s and a write performance of about 200GB/s, and can peak at 4 to 5 million read IOPS with 4k blocks. The other filesets have an aggregated sequential performance of 18GB/s for both read and write, and about 450k IOPS reading 4k blocks.
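As a rough capacity check, the fileset sizes sum to a bit less than the combined raw capacity of the two tiers, which presumably leaves headroom for metadata and filesystem overhead:

$$
200\ \text{TB} + 1.5\ \text{PB} + 1\ \text{PB} + 50\ \text{TB} = 2.75\ \text{PB} \quad \text{vs.} \quad 200\ \text{TB} + 2.87\ \text{PB} \approx 3.07\ \text{PB}
$$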
Backup
The backup solution is based on IBM Spectrum Protect and uses an IBM TS4500 tape library equipped with 4 tape drives. The system currently has a net uncompressed capacity of 4PB (200x 20TB tapes).
Interconnect
Lucia's communication network consists of two main parts, a 10Gb/s Ethernet network and an HDR Infiniband network:
Ethernet Network
The 10Gb Ethernet network is mainly used for administrative communication and tasks, as well as for connecting to the cluster over SSH and for user data transfers in and out of the cluster. The Ethernet network is divided into multiple subnets/VLANs for dedicated tasks such as node deployment, user access, or server/device management.
Infiniband Network
Lucia features a high-speed, low-latency HDR Infiniband network in a non-blocking fat-tree topology. The Infiniband network is primarily used by the compute nodes to communicate with each other and transfer data during jobs, as well as by the high-performance IBM Spectrum Scale storage system.
Software Environment
- Operating system: Red Hat Enterprise Linux 8
- Job scheduler: Slurm 23.02
- Web portal: Open OnDemand
- Programming environment: Cray PE 22.09
- Main software installation framework: EasyBuild 4.9.0