[Ekhi-users] Ekhi queue down due to high temperature

Inigo Aldazabal Mensa inigo.aldazabalm at ehu.eus
Fri Aug 21 22:32:02 CEST 2020


Hi all,

It seems some of you are having problems running jobs on ekhi, while
others are not, as you can see with "squeue".

In order to troubleshoot your problems please report:

1.- The directory you ran "sbatch" from
2.- slurm.script used
3.- JobID
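
As a rough sketch of how these three items could be collected in one go:
the helper name "job_report" and the job ID in the usage comment are
illustrative assumptions, not something installed on ekhi. It only uses
standard shell tools, plus Slurm's "scontrol show job" if it is available.

```shell
#!/bin/sh
# Sketch: print the three items requested above for one job.
# job_report is a hypothetical helper, not an Ekhi-provided tool.
job_report() {
    jobid="$1"     # JobID printed by sbatch ("Submitted batch job <JobID>")
    script="$2"    # path to the batch script that was passed to sbatch

    echo "1.- Submit directory: $(pwd)"

    echo "2.- Script: $script"
    if [ -r "$script" ]; then
        # Indent the script contents so they stand out in the report.
        sed 's/^/    /' "$script"
    else
        echo "    (script not readable from here)"
    fi

    echo "3.- JobID: $jobid"

    # If scontrol is on PATH, append what the scheduler knows about the
    # job. This only works while Slurm still keeps a record of the job.
    if command -v scontrol >/dev/null 2>&1; then
        scontrol show job "$jobid" 2>&1 ||
            echo "    (scontrol has no record of job $jobid)"
    fi
}

# Example, run from the directory you submitted from (123456 is made up):
#   job_report 123456 slurm.script > report.txt
```

Run it from the same directory you submitted from, so that item 1 is
correct, and attach the resulting file to your report.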

ekhi11 had some problems and it should be fixed now.


Iñigo


On Wed, 19 Aug 2020 18:50:42 +0200 Inigo Aldazabal Mensa
<inigo.aldazabalm at ehu.eus> wrote:

> Hi,
> 
> We again had problems with the CFM Data Center 2 cooling system,
> which forced me to cancel all running jobs and shut down all Ekhi
> computing nodes :-(( 
> 
> Currently I'm working remotely most of the time (in fact, again, I
> had a day off today) but tomorrow I'll go to the CFM and talk to
> the cooling system technicians to check this. Also, a new cooling
> system to avoid these problems is on the way, but it is still a
> couple of weeks away from being installed.
> 
> Depending on my talk with the technicians tomorrow I'll power the
> Ekhi computing nodes back on, but I cannot say for sure. I'll keep
> you informed.
> 
> Best (so to speak),
> 
> Iñigo
> 

