Cluster description
Hardware
Compute nodes
Node | Entity | Type | Threads | Memory | GPUs |
---|---|---|---|---|---|
cpu-node[32-39] | RPBS | Supermicro X8DTT-H | 12 (2x Intel Xeon CPU X5650 @ 2.67GHz) | 48GB | |
cpu-node[69-72] | EDC | Dell PowerEdge FC630 | 32 (2x Intel Xeon CPU E5-2630 v3 @ 2.40GHz) | 64GB | |
cpu-node79 | RPBS | Dell PowerEdge FC630 | 56 (2x Intel Xeon CPU E5-2660 v4 @ 2.00GHz) | 128GB | |
cpu-node80 | RPBS | Dell PowerEdge FC640 | 96 (2x Intel Xeon Gold 5220R @ 2.20GHz) | 192GB | |
cpu-node[81-84] | RPBS | Dell PowerEdge FC640 | 64 (2x Intel Xeon Gold 5218 @ 2.30GHz) | 256GB | |
cpu-node110 | RPBS | Dell PowerEdge R920 | 120 (4x Intel Xeon E7-4870 v2 @ 2.30GHz) | 512GB | |
cpu-node[130-145] | iPOP-UP | Dell PowerEdge C6525 | 128 (2x AMD EPYC 7452 @ 2.35GHz) | 256GB | |
cpu-node146 | RPBS | ProLiant DL560 Gen10 | 144 (4x Intel Xeon Gold 6254 CPU @ 3.10GHz) | 1536GB | |
gpu-node[1,3] | Master BI | ProLiant DL380 Gen10 Plus | 64 (2x Intel Xeon Silver 4314 CPU @ 2.40GHz) | 256GB | 2x Tesla A30 48GB |
gpu-node2 | Master ISDD | ProLiant DL380 Gen10 Plus | 64 (2x Intel Xeon Silver 4314 CPU @ 2.40GHz) | 256GB | 2x Tesla A30 48GB |
gpu-node4 | CMPLI | Asus Z10PE-D8 WS | 40 (2x Intel Xeon E5-2630 v4 @ 2.20GHz) | 32GB | 2x GeForce GTX1080Ti |
gpu-node[5-6] | CMPLI | Asus Z10PE-D8 WS | 40 (2x Intel Xeon E5-2630 v4 @ 2.20GHz) | 32GB | 2x GeForce RTX2080Ti |
gpu-node[7-8] | CMPLI | Asus Z10PE-D8 WS | 40 (2x Intel Xeon E5-2630 v4 @ 2.20GHz) | 32GB | 2x GeForce RTX2080Ti |
gpu-node9 | CMPLI | Dell PowerEdge R7525 | 64 (2x AMD EPYC 7282 @ 2.80GHz) | 128GB | 3x Quadro RTX6000 |
gpu-node10 | Master BI | Dell PowerEdge R7525 | 32 (2x AMD EPYC 7262 @ 3.20GHz) | 64GB | 3x Quadro RTX6000 |
gpu-node11 | RPBS | Dell PowerEdge T640 | 32 (2x Intel Xeon Silver 4208 @ 2.10GHz) | 96GB | 1x GeForce RTX3090 / 1x RTX4000 Ada |
gpu-node13 | RPBS | Dell PowerEdge T640 | 32 (2x Intel Xeon Silver 4208 @ 2.10GHz) | 96GB | 2x GeForce RTX3080Ti |
gpu-node14 | CMPLI | Dell Precision 7920 Rack | 64 (2x Intel Xeon Gold 6226R @ 2.90GHz) | 192GB | 3x RTXA6000 |
gpu-node[15,17-18] | RPBS | ProLiant DL385 Gen10 Plus v2 | 64 (2x AMD EPYC 7313 @ 3.0GHz) | 512GB | 2x Tesla A100 80GB |
gpu-node16 | CMPLI | ProLiant DL385 Gen10 Plus v2 | 64 (2x AMD EPYC 7313 @ 3.0GHz) | 512GB | 2x Tesla A100 80GB |
gpu-node19 | RPBS | Dell Precision 7960 Rack | 64 (2x Intel Xeon Gold 6426Y @ 2.60GHz) | 64GB | 2x RTXA4500 |
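The node names in these tables use a bracketed range notation, for example `cpu-node[32-39]` or `gpu-node[15,17-18]`. The following is a minimal Python sketch of how such an expression expands into individual host names. The function name `expand_hostlist` is hypothetical, and the sketch only handles a single bracket group, which is all this page uses.

```python
import re
from typing import List


def expand_hostlist(expr: str) -> List[str]:
    """Expand a name such as 'gpu-node[15,17-18]' into individual host names.

    Illustrative helper (not part of any cluster tooling); it supports
    exactly one bracket group containing comma-separated numbers or ranges.
    """
    match = re.fullmatch(r"(.*?)\[([^\]]+)\](.*)", expr)
    if match is None:
        return [expr]  # plain name without a bracket group
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            width = len(start)  # preserve zero padding, if any
            for i in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{i:0{width}d}{suffix}")
        else:
            hosts.append(f"{prefix}{part}{suffix}")
    return hosts


if __name__ == "__main__":
    for name in ("cpu-node[32-39]", "gpu-node[15,17-18]", "cpu-node79"):
        print(name, "->", expand_hostlist(name))
```

On a machine with Slurm installed, `scontrol show hostnames` performs the same kind of expansion, so a helper like this is mainly useful for scripts that run elsewhere.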
Infrastructure
Node | Entity | Type | Role |
---|---|---|---|
master1 | iPOP-UP | Dell PowerEdge R440 | Slurm controller, Slurm database, gateway |
virtserv4 | RPBS | Dell PowerEdge R730 | Virtualization |
virtserv5 | RPBS | Dell PowerEdge R440 | Virtualization |
virtserv6 | iPOP-UP | HP ProLiant DL380 Gen10 | Virtualization |
virtserv7 | RPBS | HP ProLiant DL380 Gen11 | Virtualization |
directory | CMPLI | HP ProLiant DL160 Gen10 | LDAP server, DHCP, DNS |
bastion | CMPLI | HP ProLiant DL160 Gen10 | Bastion |
get-away | RPBS | Dell PowerEdge R420 | Reverse proxy, mailer |
Storage
Hot storage
Node | Entity | Type | Storage | Role |
---|---|---|---|---|
hot-storage-mds1 | iPOP-UP | Dell PowerEdge R740xd | 8x 1.92TB SSDs | Metadata / management server (Lustre) |
hot-storage-oss[1-2] | iPOP-UP | Dell PowerEdge R740xd | 20x 3.84TB SSDs | Object Storage server (Lustre) |
Mountpoint | Capacity | Backup |
---|---|---|
/shared | 125 TB | No |
Cold storage
Node | Entity | Type | Storage | Role |
---|---|---|---|---|
cold-storage1 | CMPLI | Dell PowerEdge R740xd | 12x 12TB HDD | NFS server |
cold-storage1 (extension) | iPOP-UP | Dell PowerVault MD1400 | 12x 12TB HDD | NFS server (extension) |
cold-storage2 | CMPLI | Dell PowerEdge R740xd | 12x 12TB HDD | Backup |
Mountpoint | Capacity | Backup |
---|---|---|
/cold-storage | 214 TB | Yes |
Old cluster (Mobyle jobs only)
Node | Partition | Type | CPUs | Memory |
---|---|---|---|---|
node73 | mobyle | Dell PowerEdge FC630 | 32 (2x Intel Xeon CPU E5-2630 v3 @ 2.40GHz) | 64GB |
cpu-node[74-76] | mobyle | Dell PowerEdge FC630 | 40 (2x Intel Xeon CPU E5-2630 v4 @ 2.20GHz) | 64GB |
cpu-node[77-78] | mobyle | Dell PowerEdge FC630 | 56 (2x Intel Xeon CPU E5-2660 v4 @ 2.00GHz) | 128GB |
Node | Entity | Type | Role |
---|---|---|---|
slurmmaster | RPBS | Dell PowerEdge R720 | Slurm controller, Slurm database, gateway, Docker registry |
Node | Entity | Type | Storage | Role |
---|---|---|---|---|
joule2 | RPBS | Dell PowerEdge R730xd | 10x 4TB HDD | NFS server |
backup | RPBS | Dell PowerEdge R730xd | 20x 4TB HDD | Backup |
lustre-mds | RPBS | Dell PowerEdge R720xd | 10x 300GB HDD | Metadata / management server (Lustre) |
lustre-oss[1-2] | RPBS | Dell PowerEdge R720xd | 20x 1TB HDD | Object Storage server (Lustre) |
Network
Node | Entity | Type | Description |
---|---|---|---|
cmpli-rtr1 | iPOP-UP | Juniper SRX1500 | 16x 1GbE + 4x 10GbE ports |
cmpli-rtr2 | CMPLI | Juniper SRX1500 | 16x 1GbE + 4x 10GbE ports |
cmpli-sw | CMPLI | Juniper EX4600 | 24x SFP+ 10GbE ports |
sw-data-mgmt-[1-2] | RPBS | Juniper EX3300 | 24x 1000BASE-T 1GbE ports |
sw-dell-10g-[1,3] | RPBS | Dell N4032F | 24x SFP+ 10GbE ports |
sw-dell-10g-2 | RPBS | Dell S4128F-ON | 28x SFP+ 10GbE ports |
sw-dell-100g | RPBS | Dell S5232F-ON | 32x QSFP28 100GbE |
Software layer
The cluster is managed by Slurm (version 21.08.8-2).
Scientific software and tools are available through Environment Modules and are mainly based on Conda packages or Singularity images.
Operating systems: CentOS 7 and Rocky Linux 9
Virtualization for cluster management services: Proxmox VE
Deployment and configuration are handled with SaltStack, Terraform and GitLab.
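Since jobs are scheduled by Slurm and software is loaded through Environment Modules, a batch script typically combines `#SBATCH` directives with a `module load` line. The sketch below renders such a script as text. The function `render_batch_script`, the module name `example-tool/1.0` and the command are hypothetical placeholders, and requesting GPUs through `--gres=gpu:N` is an assumption about how GPU resources are configured on this cluster.

```python
from typing import Optional


def render_batch_script(command: str, module: str, cpus: int = 1,
                        mem_gb: int = 4, gpus: Optional[int] = None) -> str:
    """Return the text of a Slurm batch script (illustrative helper only)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --cpus-per-task={cpus}",  # threads reserved for the task
        f"#SBATCH --mem={mem_gb}G",         # memory per node
    ]
    if gpus:
        # Assumption: GPUs are exposed as a generic resource named "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    # Load the tool through Environment Modules, then run the command.
    lines += ["", f"module load {module}", command]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical module name and command, for illustration only.
    print(render_batch_script("python analysis.py",
                              module="example-tool/1.0",
                              cpus=8, mem_gb=16, gpus=1))
```

The printed text can be saved to a file and submitted with `sbatch`; the modules actually installed can be listed with `module avail`.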