[Ekhi-users] CFM clusters partially down by Sunday August the 11th

Irene Azaceta irene.azaceta at ehu.eus
Mon Aug 12 10:56:48 CEST 2024


Dear all,

We finished powering the nodes back on, so the clusters are fully 
operational now.

Sorry for the trouble.

Kind regards,

Irene

On 8/9/24 11:18, Inigo Aldazabal Mensa wrote:
> Hi all,
>
> Due to the high temperatures expected next Sunday, and in order to be
> able to power off the computing nodes without affecting the service,
> we are draining most of the nodes of the CFM computer clusters. Jobs
> expected to end after Sunday the 11th at 5am (as indicated by the
> --time option at the moment they would start) will not be run and will
> be kept queued.
>
> That is, you can still submit jobs as normal and, if they are "in
> time", they'll be run. If not, they'll be kept in PENDING state and run
> once we bring everything back up.
>
> This shows why you should try to specify the job duration with --time
> in your Slurm scripts rather than letting Slurm use the default value,
> which is usually one week or so (it depends on each cluster's
> policies).
>
> Remember that for any help or information regarding the clusters you can
> write to the computing service common email
>
> CFM  Scientific Computing Service <hpc.cfm at ehu.eus>
>
> and Irene and/or I will take care of your request.
>
> Bests,
>
> Iñigo
>
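
As an illustration of the --time advice in the quoted message, here is a
minimal Python sketch of the "in time" check. It is a simplified model,
not Slurm's actual scheduling logic: it only assumes that a job can start
if its start time plus its --time limit does not pass the drain deadline.
The function names parse_slurm_time and would_start are hypothetical; the
accepted time formats are the ones listed in the sbatch documentation.

```python
from datetime import datetime, timedelta


def parse_slurm_time(spec: str) -> timedelta:
    """Convert a Slurm --time string into a timedelta.

    Accepted forms: M, M:S, H:M:S, D-H, D-H:M, D-H:M:S.
    """
    days = 0
    if "-" in spec:
        # A day prefix means the remaining fields are hours[:minutes[:seconds]].
        day_part, rest = spec.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in rest.split(":")]
        hours, minutes, seconds = (parts + [0, 0])[:3]
    else:
        parts = [int(p) for p in spec.split(":")]
        if len(parts) == 1:
            # A bare number is a count of minutes.
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:
            # Two fields are minutes:seconds.
            hours, minutes, seconds = 0, parts[0], parts[1]
        else:
            hours, minutes, seconds = parts
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def would_start(now: datetime, time_limit: str, deadline: datetime) -> bool:
    """Return True if a job started at `now` would end by `deadline`.

    Simplified assumption: the job runs for at most its --time limit.
    """
    return now + parse_slurm_time(time_limit) <= deadline


# Dates taken from the announcement: sent on Aug 9 at 11:18,
# nodes drained for Sunday Aug 11 at 05:00.
now = datetime(2024, 8, 9, 11, 18)
deadline = datetime(2024, 8, 11, 5, 0)

for limit in ("12:00:00", "1-12", "7-00:00:00"):
    state = "can start" if would_start(now, limit, deadline) else "stays PENDING"
    print(f"--time={limit}: {state}")
```

Under this model a job submitted with --time=12:00:00 could still start,
while one left at a one-week limit would wait until the nodes are back.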
-- 
Dra. Irene Azáceta
HPC Engineer
Centro de Física de Materiales (CSIC-UPV/EHU)
Paseo Manuel de Lardizabal 5
20018 San Sebastián (Gipuzkoa)
SPAIN

e-mail: irene.azaceta at ehu.eus


