[Ekhi-users] Ekhi queue down due to high temperature

Inigo Aldazabal Mensa inigo.aldazabalm at ehu.eus
Thu Aug 20 23:03:55 CEST 2020


Hi all,

Ekhi cluster computing nodes are back online and you can submit your
jobs again. Sorry for the inconvenience!

Regarding the cooling system, one of the Data Center's two coolers went
down on Tuesday and the technician could not come to fix it until
today. The other one was over-stressed due to being "alone" and to the
high temperatures, and it went down because of this. This is different
from what happened 20 days ago, but these really high temperatures
(for our local climate) are very hard on our cooling systems, and
chances are that in such cases something may go wrong.

In a couple of weeks I expect to have the new cooling system
installed, which will reinforce the present one, making this kind of
failure much less likely. Another, different system is also on its
way to make the whole data center even more resilient to this kind of
situation. We'll see next summer :-)

Bests,

Iñigo

On Wed, 19 Aug 2020 18:50:42 +0200
Inigo Aldazabal Mensa <inigo.aldazabalm at ehu.eus> wrote:

> Hi,
> 
> Again we had problems with the CFM Data Center 2 cooling system
> that forced me to cancel all running jobs and shut down all Ekhi
> computing nodes :-(( 
> 
> Currently I'm working remotely most of the time (in fact, again, I
> had the day off today) but tomorrow I'll go to the CFM and talk to
> the cooling system technicians to check this. Also, a new cooling
> system to avoid these problems is on the way, but it is still a
> couple of weeks away from being installed.
> 
> Depending on my talk with the technicians tomorrow I'll power the
> Ekhi computing nodes back on, but I cannot say for sure. I'll keep
> you informed.
> 
> Bests (to say something),
> 
> Iñigo
> 

