[Ekhi-users] Ekhi two new "fat" nodes

Inigo Aldazabal Mensa inigo.aldazabalm at ehu.eus
Thu Oct 14 11:08:55 CEST 2021


Hi all,

I have deployed the two new Ekhi big-memory nodes, ekhi[29-30]. They
differ from the existing nodes. Each node has:

2 x Intel CLX-SP 4210R 2.2GHz processors, 10 cores each (20 cores/node)
12 x 64GB DDR4-2933 memory  (768GB/node)
NO Infiniband network

These nodes now belong to three partitions: "all", "test" and a new
"fat" partition that contains only these two nodes and has a job time
limit of 14 days. The "fat" partition is intended for big-memory
calculations and its priority is much higher than that of the other
partitions, so jobs requesting the "fat" partition (only nodes
ekhi[29-30]) will have higher priority than jobs submitted to the "all"
partition and will likely be executed first (on the fat nodes, that is).
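
As an illustration, a batch script targeting the "fat" partition could
look roughly like the sketch below. Only the partition name, the 20
cores per node and the 14-day limit come from this message; the job
name, the memory request and the commented-out program are hypothetical
placeholders, and the memory actually available to jobs on each node
may be lower than the installed 768GB.

```shell
#!/bin/bash
#SBATCH --job-name=bigmem-example   # hypothetical job name
#SBATCH --partition=fat             # big-memory partition, nodes ekhi[29-30]
#SBATCH --nodes=1                   # fat nodes have no Infiniband: one node only
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20          # each fat node has 20 cores
#SBATCH --mem=500G                  # assumed value, adjust to what the job needs
#SBATCH --time=14-00:00:00          # 14 days, the limit of the "fat" partition

# Outside Slurm these variables are unset, so fall back to placeholders.
job_nodes="${SLURM_JOB_NUM_NODES:-1}"
echo "Job ${SLURM_JOB_ID:-none} on $(uname -n), ${job_nodes} node(s)"

# The real calculation would be launched here (hypothetical command):
# ./my_big_memory_program
```

The script would be submitted with sbatch as usual; the #SBATCH lines
are ordinary comments for bash, so they are only interpreted by Slurm.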

Now, if you submit a job to the default "all" partition and the job
requirements fit the characteristics of ekhi[29-30], the job may run on
either of those two nodes. Some caution is needed here: ekhi[29-30] do
NOT have the high-speed, low-latency Infiniband network that the rest
of the nodes have, so only single-node jobs must be run there.

Thus, if you are submitting a multi-node calculation, please exclude
the ekhi[29-30] nodes explicitly in your batch script by adding:

#SBATCH --exclude=ekhi[29-30]
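
In context, a multi-node batch script might look like the following
sketch. The node count and the commented-out MPI program are
hypothetical; only the "all" partition and the exclude line come from
this message.

```shell
#!/bin/bash
#SBATCH --partition=all             # default partition, also contains ekhi[29-30]
#SBATCH --nodes=2                   # hypothetical multi-node request
#SBATCH --exclude=ekhi[29-30]       # keep the job off the nodes without Infiniband

# Print the allocation so the job output shows which nodes were used.
# Outside Slurm the variable is unset, so fall back to a placeholder.
nodelist="${SLURM_JOB_NODELIST:-unknown}"
echo "Allocated nodes: ${nodelist}"

# The parallel run would be launched here (hypothetical command):
# srun ./my_mpi_program
```

Checking the printed node list in the job output is a simple way to
confirm that neither ekhi29 nor ekhi30 was allocated.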


If you find any problem or need any clarification, just let me know.
Best regards,

Iñigo

-- 
Iñigo Aldazabal Mensa, Ph.D.
HPC Computing Centre Manager / Scientific Computing Specialist
Centro de Física de Materiales (CSIC-UPV/EHU)
Paseo Manuel de Lardizabal, 5
20018 San Sebastian - Guipuzcoa
SPAIN

phone: +34-943-01-8780
e-mail: inigo.aldazabal at csic.es inigo.aldazabalm at ehu.eus
pgp key id: 0xDBCC8369


