Rocket is a heterogeneous HPC cluster that currently consists of about 64 compute nodes, featuring almost 13,000 cores, just over 50 terabytes of memory and 64 GPUs, interconnected by high-speed, low-latency Infiniband networking. The cluster also uses two General Parallel File System (GPFS) filesystems, which together provide almost 10 petabytes of usable storage space.

The cluster is available to both University of Tartu and ETAIS users. After requesting and receiving a user account, you can access the cluster over SSH by logging into the rocket.hpc.ut.ee headnode.
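For convenience, you can add a host alias to your local `~/.ssh/config` (a sketch; `ut_username` is a placeholder for your own account name, and the alias `rocket` is our choice):

```
# ~/.ssh/config
Host rocket
    HostName rocket.hpc.ut.ee
    User ut_username
```

After this, `ssh rocket` opens a session on the headnode.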

Overview

The main part of the Rocket cluster consists of:

  • 40 high density AMD CPU nodes (called ares 1-20, artemis 1-20)
  • 8 compute nodes with GPUs (falcon 1 to 6, pegasus 1 and 2)
  • 4 high memory Intel machines (called bfr 1 to 4)
  • 12 Intel CPU nodes (called sfr 1 to 12)
  • 2 headnodes (login1.hpc.ut.ee, login2.hpc.ut.ee)
  • 8 testing nodes (called stage1 – stage8)

In addition to these nodes, there are a few GPFS filesystem servers that provide fast storage for the entire cluster.

All the machines mentioned above are connected to a fast Infiniband fabric, powered by Mellanox switches.

In addition to Infiniband, all of the machines above are also connected to a regular Ethernet network for easier access. The machines are linked by 1/10/25/40 Gbit/s Ethernet, which provides fast access from the cluster to the University's central network and beyond, as needed.

All nodes in the Rocket cluster are running the latest RHEL 9.

You can submit your computations to the cluster using the SLURM workload manager.
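A minimal batch script sketch (the job name and resource numbers are arbitrary examples, not recommendations; the job body is a placeholder for your own program):

```shell
#!/bin/bash
#SBATCH --job-name=hello          # job name shown in the queue
#SBATCH --ntasks=1                # one task
#SBATCH --cpus-per-task=4         # 4 cores for that task
#SBATCH --mem=8G                  # 8 GB of RAM
#SBATCH --time=00:10:00           # wall-time limit of 10 minutes

# Job body: replace with your own program or srun invocation.
msg="Job started on $(hostname)"
echo "$msg"
```

Submit the script with `sbatch hello.sh` on the headnode and monitor it with `squeue -u $USER`; available partitions can be listed with `sinfo`.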

20 high density nodes, Ares 1-20

  • 2x AMD EPYC 7702 64-Core Processor (128 cores total)
  • 1 TB RAM
  • 8TB of fast SSD temporary space
  • HDR Infiniband @ 100 Gbps

20 AMD nodes, Artemis 1-20

  • 2x AMD EPYC 7763 64-Core Processor (128 cores total)
  • 1 TB RAM
  • 8TB of fast SSD temporary space
  • HDR Infiniband @ 100 Gbps

4 big memory nodes, BFR1-4 (Lenovo ThinkSystem SR630)

  • 2x Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz (40 cores total)
  • 1 TB RAM
  • 8TB of fast SSD temporary space
  • FDR Infiniband, clocked down to 4x QDR for cluster cohesion

12 CPU nodes, SFR1-12 (Lenovo ThinkSystem SR630)

  • 2x Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz (40 cores total)
  • 256 GB RAM
  • 8TB of fast SSD temporary space
  • FDR Infiniband, clocked down to 4x QDR for cluster cohesion

6 GPU nodes, Falcon 1-6, purchase funded by the Institute of Computer Science:

  • 2 x Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (48 cores total)
  • 512 GB RAM
  • 5TB of local SSD storage
  • Infiniband:
    • Falcon 1-3 – 2x 40 Gbps each
    • Falcon 4-6 – 5x 100 Gbps each
  • 24x NVIDIA Tesla V100 GPUs (4 per node):
    • Falcon 1-3 GPUs have 16 GB of VRAM each.
    • Falcon 4-6 GPUs have 32 GB of VRAM each.

2 GPU nodes with Tesla A100 GPUs:

pegasus.hpc.ut.ee

  • 2 x AMD EPYC 7642 48-Core Processors (192 cores total)
  • 512 GB RAM
  • 1.6TB of local SSD storage
  • 7 x Tesla A100 with 40 GB VRAM each
  • Infiniband:
    • 1x 200Gb connection

pegasus2.hpc.ut.ee

  • 2x AMD EPYC 7713 64-Core Processors (256 cores total)
  • 2 TB RAM
  • 15TB of local SSD storage
  • 8 x Tesla A100 with 80 GB VRAM each
  • Infiniband:
    • 9 x 100Gb connections

8 testing nodes, stage1 – stage8 (HP ProLiant SL230s Gen8)

  • 2 x Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz (20 cores total)
  • 64GB RAM
  • 1TB HDD (~860GB usable)
  • 4x QDR Infiniband

The following storage branches are mounted to all machines in the Rocket cluster:

  • /gpfs/space – 4.7 PB
    Declustered-RAID-based, GPFS-specific, very high performance disk storage with a transparent flash tier.

  • /gpfs/terra – 4.0 PB
    Declustered-RAID-based, GPFS-specific, very high performance disk storage with a transparent flash tier.

Pricing

The prices below apply to the structural units of the University of Tartu and to users outside of the University. For additional information, please see our pricing page.

HPC Compute servers
  • CPU: 0.012 EUR/core/h
  • Memory: 0.012 EUR/6 GB/h
  • GPU: 0.5 EUR/GPU/h

  • Memory usage is calculated in 6 GB segments. A job is billed for whichever resource use is greater: accounting is based on the amount of memory (1 unit = 6 GB of RAM per hour) and processors (1 unit = 1 core per hour) allocated to the user's job(s).

  • When determining the order in which compute jobs are started, jobs with higher priority are given preference.
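As an illustration of the billing rule above, consider a hypothetical job that is allocated 8 cores and 64 GB of RAM for 10 hours (all numbers are made up for the example): 64 GB rounds up to 11 six-gigabyte segments, which exceeds the 8 core units, so the job is billed for 11 units per hour.

```shell
# Hypothetical job: 8 cores, 64 GB RAM, 10 hours.
cores=8; mem_gb=64; hours=10
core_units=$cores                        # 1 unit = 1 core
mem_units=$(( (mem_gb + 5) / 6 ))        # 1 unit = 6 GB segment, rounded up
# Billed for whichever unit count is greater.
units=$(( core_units > mem_units ? core_units : mem_units ))
# 0.012 EUR per unit-hour; use awk for the floating-point multiply.
cost=$(awk -v u="$units" -v h="$hours" 'BEGIN { printf "%.2f", u * h * 0.012 }')
echo "$cost EUR"   # 11 units * 10 h * 0.012 EUR = 1.32 EUR
```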
Storage space
  • Storage: 80 EUR/TB/year
  • 2x replicated data: 160 EUR/TB/year
  • A copy stored on tape: 30 EUR/TB/year
  • Replicated + tape-stored data: 190 EUR/TB/year

  • Usable protocols are Samba, NFS, dsmc (the TSM command-line tool) and direct use from the HPC cluster.
    If you are looking for a simpler access protocol, S3 is a better option.
Administrator’s hourly rate
  • Rate 60 EUR/h

  • Applied when the desired software requires an unusually long and complicated installation process.