Discovery Cluster Details
Discovery is a Linux cluster that, in aggregate, contains 128 nodes, 6712 CPU cores, 54.7 TB of memory, and more than 2.8 PB of disk space.
Cell | Vendor | CPU | Cores | RAM | GPU | Scratch | Nodes |
---|---|---|---|---|---|---|---|
a | Dell | AMD EPYC 75F3 (2.95GHz) | 64 | 1TB | 2 A100 | 5.9TB | a01-a05 |
p | Dell | Intel Xeon Gold 6248 (2.50GHz) | 40 | 565GB | 4 Tesla V100 | 1.5TB | p01-p04 |
q | HPE | AMD EPYC 7532 (2.4GHz) | 64 | 512GB | None | 820GB | q01-q10 |
s | Dell | AMD EPYC 7543 (2.80GHz) | 64 | 512GB | None | 718GB | s01-s44 |
t | Lenovo | ThinkSystem SR645 V3 | 64 | 768GB | None | 719GB | t01-t10 |
centurion | EXXACT | AMD EPYC 7453 (2.7GHz) | 56 | 506GB | 8 A5500 | 7TB | centurion01-centurion09 |
amp | EXXACT | Intel Xeon Gold 6258R (2.70GHz) | 56 | 506GB | 10 A5000 | 7TB | amp01-amp06 |
adanova01 | EXXACT | AMD EPYC 7513 (2.60GHz) | 64 | 2TB | 4 L40S | 7TB | adanova01 |
Discovery offers researchers the ability to have specialized head nodes available inside the cluster for dedicated compute. These nodes can come equipped with up to 64 compute cores and 1.5TB of memory.
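To confirm what hardware the scheduler has registered for a given node or partition, it can be queried directly from the command line. The sketch below assumes Discovery uses Slurm (the partition and preemption terminology on this page suggests this); the node name a01 is taken from the table above.

```bash
# Show the hardware Slurm has registered for one node from cell "a"
# (CPU count, memory, and GPUs appear in the CfgTRES/Gres fields).
scontrol show node a01

# Summarize each partition: node count, CPUs per node, memory per node, and GPUs (gres).
sinfo -o "%P %D %c %m %G"
```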
Operating System:
GPU compute nodes are available to free-tier members of Discovery through the gpuq partition. Additional specialized GPU partitions include:
- gpuq – High-end compute nodes with A100 GPUs in MIG mode with 40GB slices (free members)
- a100 – High-end GPU nodes with A100 GPUs (paid tier)
- v100 – High-end GPU nodes with V100 GPUs (paid tier)
- v100_preemptable – High-end compute nodes with V100 GPUs (preemptable)
- a5000 – Mid-range GPU nodes optimized for general GPU workloads (preemptable)
- a5500 – Mid-range GPU nodes optimized for general GPU workloads (preemptable)
- adanova01 – High-end GPU nodes optimized for general GPU workloads (preemptable)
Partitions marked (preemptable) run the risk of preemption. Use them with caution and always make sure you are checkpointing your work if possible!
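As a concrete illustration, a minimal batch script requesting a single GPU from the free gpuq partition might look like the sketch below. This assumes Slurm; the job name, resource numbers, and train.py are placeholders to be adapted to your own workload.

```bash
#!/bin/bash
#SBATCH --job-name=gpu-test        # placeholder job name
#SBATCH --partition=gpuq           # free-tier A100 MIG partition listed above
#SBATCH --gres=gpu:1               # request one GPU (a 40GB MIG slice on gpuq)
#SBATCH --cpus-per-task=8          # adjust to your workload
#SBATCH --mem=64G
#SBATCH --time=04:00:00
#SBATCH --requeue                  # on preemptable partitions, requeue the job instead of failing

# train.py is a hypothetical application; replace with your own program.
# If you target a preemptable partition (v100_preemptable, a5000, a5500, adanova01),
# make sure the program writes checkpoints it can resume from after preemption.
srun python train.py
```

Submit the script with `sbatch <scriptname>` and monitor it with `squeue -u $USER`.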
Interactive nodes, named andes and polaris, are available for testing and debugging your programs interactively before submitting them to the main cluster through the scheduler.
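For quick interactive testing, one common pattern is to log in to an interactive node directly; another is to request a short interactive allocation from the scheduler. The commands below are a sketch assuming Slurm and standard SSH access; the username and exact hostnames are placeholders, not documented values.

```bash
# Log in to an interactive node (andes or polaris); the full site-specific
# hostname may differ from this placeholder.
ssh username@andes

# Or request a short interactive Slurm allocation on the free GPU partition
# and get a shell on the allocated node.
srun --partition=gpuq --gres=gpu:1 --time=01:00:00 --pty bash
```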
Node Interconnects