System Overview
The Lucia supercomputer is structured as follows:
- A Compute Partition composed of CPU nodes, GPU nodes, and specialized nodes designed to handle various computational tasks, memory-intensive jobs, visualization, and AI processing.
- A Storage Partition based on the IBM Spectrum Scale parallel filesystem (formerly known as GPFS) with approximately 3 PiB of storage capacity. This partition includes an offsite backup system managed by IBM Spectrum Protect.
- Service and Management Partitions: These partitions handle essential system operations and management tasks but are not directly accessible or relevant to end users.
These partitions are interconnected via an HDR InfiniBand network and a 10Gb/s Ethernet network.
Compute Partition
The distribution of the various categories of compute nodes is depicted in the following pie chart:
The Rpeak performance of the various categories of compute nodes is depicted in the following bar chart:
CPU Nodes
The CPU section consists of 75x HPE Apollo n2600 chassis, each supporting 4x HPE ProLiant XL225n compute nodes, offering:
- Total Performance: 1137 TFLOPS Rmax (LINPACK) / 1505 TFLOPS Rpeak (theoretical).
- Total CPU Cores: 38,400 cores.
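As a rough sanity check, the Rpeak figure above can be reconstructed from the per-node specifications in the table below, assuming the usual 16 FP64 operations per cycle for a Zen 3 core (2x 256-bit FMA units) and the 2.45GHz base clock:

```python
# Rough reconstruction of the CPU partition's theoretical peak (Rpeak).
# Assumption: 16 FP64 ops per cycle per EPYC 7763 (Zen 3) core at the
# 2.45 GHz base frequency; boost clocks are ignored, as is usual for Rpeak.
flops_per_cycle = 16                  # FP64 ops/cycle/core (2x 256-bit FMA)
frequency_hz = 2.45e9                 # base clock
cores_per_node = 128
nodes = 270 + 30                      # standard + medium nodes

rpeak_tflops = flops_per_cycle * frequency_hz * cores_per_node * nodes / 1e12
print(f"~{rpeak_tflops:.0f} TFLOPS")  # ~1505 TFLOPS, matching the figure above
```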
Node Details | Standard Nodes | Medium Nodes |
---|---|---|
Num. of Nodes | 270 | 30 |
Node Model | HPE XL225n | HPE XL225n |
Processors | 2x AMD EPYC 7763 64-core | 2x AMD EPYC 7763 64-core |
Processor Frequency | 2.45GHz (boost up to 3.5GHz) | 2.45GHz (boost up to 3.5GHz) |
Processor L3 Cache | 256MB | 256MB |
Cores per Node | 128 | 128 |
Hyperthreading | Disabled | Disabled |
Memory | 256GB DDR4-3200 | 512GB DDR4-3200 |
User-Available Memory | 240GB | 492GB |
Ethernet | 2x 10Gbps | 2x 10Gbps |
Fast Interconnect | 1x InfiniBand HDR-100 | 1x InfiniBand HDR-100 |
Local Disk | SATA SSD 480GB (system) | SATA SSD 480GB (system) |
Node Hostnames | cns[001-270] | cnm[001-030] |
GPU Nodes
The GPU nodes are housed in 25x HPE Apollo d6500 chassis, each holding 2x HPE ProLiant XL645d nodes with 4x Nvidia A100 GPUs per node. They provide:
- Total Performance: 2717 TFLOPS Rmax (LINPACK) / 3900 TFLOPS Rpeak (theoretical).
- Total GPUs: 200 Nvidia A100.
- TOP500 Ranking: Ranked 245th on the Nov. 2022 TOP500 List.
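The 3900 TFLOPS Rpeak figure is consistent with counting the A100's FP64 Tensor Core peak of 19.5 TFLOPS per GPU (the regular FP64 peak is 9.7 TFLOPS); a minimal check under that assumption:

```python
# Quick check of the GPU partition's Rpeak, counting only the GPUs and
# assuming the A100's FP64 Tensor Core peak of 19.5 TFLOPS per device.
gpus = 50 * 4                      # 50 nodes x 4 A100 per node
a100_fp64_tc_tflops = 19.5
print(gpus * a100_fp64_tc_tflops)  # 3900.0 TFLOPS
```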
Node Details | GPU Nodes |
---|---|
Num. of Nodes | 50 |
Node Model | HPE XL645d |
Processors | 1x AMD EPYC 7513 32-core |
Processor Frequency | 2.6GHz (boost up to 3.65GHz) |
Processor L3 Cache | 128MB |
Cores per Node | 32 |
Hyperthreading | Disabled |
Memory | 256GB DDR4-3200 |
User-Available Memory | 240GB |
Accelerators | 4x Nvidia A100 40GB |
Ethernet | 2x 10Gbps |
Fast Interconnect | 2x InfiniBand HDR-200 |
Local Disk | SATA SSD 480GB (system) |
Node Hostnames | cna[001-050] |
Specialized Nodes
The Compute Partition also includes nodes tailored for specific high-memory, visualization, and AI workloads.
- Large Memory Nodes: Ideal for memory-intensive applications, with up to 4TB of memory.
- Visualization Nodes: Equipped with Nvidia T4 GPUs for rendering and visualization tasks.
- AI Nodes: Optimized for deep learning and machine learning workloads, each featuring 8x Nvidia A100 GPUs.
Node Details | Large Memory | Extra Large Memory | Visualization | AI |
---|---|---|---|---|
Node Model | HPE ProLiant DL385 | HPE ProLiant DL385 | HPE ProLiant DL385 | HPE XL645d |
Num. of Nodes | 7 | 1 | 4 | 2 |
Processors | 2x AMD EPYC 7513 32-core | 2x AMD EPYC 7513 32-core | 2x AMD EPYC 7313 16-core | 1x AMD EPYC 7513 32-core |
Processor Frequency | 2.6GHz (boost up to 3.65GHz) | 2.6GHz (boost up to 3.65GHz) | 3GHz (boost up to 3.7GHz) | 2.6GHz (boost up to 3.65GHz) |
Processor L3 Cache | 128MB | 128MB | 128MB | 128MB |
Cores per Node | 64 | 64 | 32 | 32 |
Hyperthreading | Disabled | Disabled | Disabled | Disabled |
Memory | 2048GB DDR4-3200 | 4096GB DDR4-3200 | 512GB DDR4-3200 | 2048GB DDR4-3200 |
User-Available Memory | 2000GB | 4000GB | 492GB | 2000GB |
Accelerators | No | No | No | 8x Nvidia A100 SXM4 80GB |
Graphics Card | No | No | 4x Nvidia T4 16GB | No |
Ethernet | 2x 10Gbps | 2x 10Gbps | 2x 10Gbps | 2x 10Gbps |
Fast Interconnect | 1x InfiniBand HDR-100 | 1x InfiniBand HDR-100 | 1x InfiniBand HDR-100 | 2x InfiniBand HDR-200 |
Local Disk | SATA SSD 480GB (system) | SATA SSD 480GB (system) | 2x SATA SSD 480GB (system) | 2x SATA SSD 480GB (system) + 1x NVMe SSD 6.4TB (local scratch) |
Node Hostnames | cnl[001-007] | cnx001 | cng[001-004] | cni[001-002] |
Storage Partition
The storage system is based on IBM's Spectrum Scale filesystem (formerly known as GPFS). The home, projects, and scratch spaces use a unified filesystem consisting of two tiers:
- A flash tier of about 200TB, consisting of NVMe SSDs, acting as a burst-buffer. This storage space is also used to store the filesystem's metadata.
- A standard disk tier of about 2.87PB, consisting of NL-SAS disks, chosen for their larger capacity.
The logical partitioning is managed directly by IBM Spectrum Scale through "filesets", and data are migrated seamlessly between the two physical storage tiers.
Fileset | home | projects | scratch | softs |
---|---|---|---|---|
Size | 200TB | 1.5PB | 1PB | 50TB |
Burst-buffer | no | no | yes | no |
Snapshots | yes | yes | no | yes |
Backup | yes | yes | no | yes |
Quota | user | group | group | no |
The scratch fileset, which uses the burst-buffer, has an aggregated sequential read performance of about 270GB/s, a write performance of about 200GB/s, and peaks at 4 to 5 million read IOPS with 4k blocks.
The other filesets have an aggregated sequential performance of 18GB/s for both reads and writes, and reach about 450k read IOPS with 4k blocks.
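To put these figures in perspective, the following sketch estimates the time needed to write a hypothetical 10TB dataset at the quoted aggregate rates (idealized streaming I/O, ignoring contention and metadata overhead):

```python
# Idealized staging time for a hypothetical 10 TB dataset at the quoted
# aggregate write rates (pure streaming I/O, no contention assumed).
dataset_gb = 10_000                   # 10 TB
rates_gbps = {"scratch (burst-buffer)": 200, "other filesets (disk tier)": 18}

for fileset, rate in rates_gbps.items():
    print(f"{fileset}: ~{dataset_gb / rate:.0f} s")
# scratch (burst-buffer): ~50 s
# other filesets (disk tier): ~556 s
```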
Backup
The backup solution is based on IBM Spectrum Protect and uses an IBM TS4500 tape library equipped with 4 tape drives. The system currently has a net uncompressed capacity of 4PB (200x 20TB tapes).
Interconnect
Lucia's communication network consists of two main parts: a 10Gb/s Ethernet network and an HDR InfiniBand network.
Ethernet Network
The 10Gb Ethernet network is mainly used for administrative communication and tasks, but also for connecting to the cluster over SSH and for transferring user data in and out of the cluster. It is divided into multiple subnets/VLANs dedicated to tasks such as node deployment, user access, and server/device management.
InfiniBand Network
Lucia features a high-speed, low-latency HDR InfiniBand network in a non-blocking fat-tree topology. The InfiniBand network is primarily used by the compute nodes to communicate and transfer data during jobs, and by the high-performance IBM Spectrum Scale storage system.
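For reference, the per-node injection bandwidths implied by the link types listed in the node tables are roughly as follows (raw link rates only; protocol and encoding overhead are ignored):

```python
# Approximate per-node InfiniBand injection bandwidth derived from the link
# types in the node tables (raw link rate only; overhead is ignored).
def gbytes_per_s(gbits_per_s):
    return gbits_per_s / 8

print(gbytes_per_s(100))      # 1x HDR-100 (CPU, large-memory, visualization nodes): 12.5 GB/s
print(gbytes_per_s(2 * 200))  # 2x HDR-200 (GPU and AI nodes): 50.0 GB/s
```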
Software Environment
- Operating system: Red Hat Enterprise Linux 8
- Job scheduler: Slurm 23.02
- Web portal: Open OnDemand
- Programming environment: Cray PE 22.09
- Main software installation framework: EasyBuild 4.9.0