[Ekhi-users] CFM clusters partially down by Sunday August the 11th
Irene Azaceta
irene.azaceta at ehu.eus
Mon Aug 12 10:56:48 CEST 2024
Dear all,
We finished powering the nodes back on, so the clusters are fully
operational now.
Sorry for the trouble.
Kind regards,
Irene
On 8/9/24 11:18, Inigo Aldazabal Mensa wrote:
> Hi all,
>
> Due to the high temperatures expected next Sunday, and so that we can
> power off the computing nodes without affecting the service, we are
> draining most of the nodes of the CFM computing clusters. Jobs expected
> to end after Sunday the 11th at 5am (as indicated by the --time option
> at the moment they would start) will not be run and will be kept
> queued.
>
> That is, you can still submit jobs as normal and, if they are "in
> time", they'll be run. If not, they'll be kept in PENDING state and run
> once we bring everything back up.
>
> This shows why you should specify the job duration with --time in
> your Slurm scripts instead of letting Slurm use the default value,
> which is usually one week or so (it depends on each cluster's
> policies).
>
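
For illustration, a minimal sketch of a batch script that sets an explicit
time limit could look like the one below. The job name, task count, and
the two-hour limit are placeholder assumptions, not values recommended
for any particular CFM cluster, and the final command stands in for the
real workload.

```shell
#!/bin/bash
# Hypothetical job name, used only for this example.
#SBATCH --job-name=example-job
# One task; adjust to what the job actually needs.
#SBATCH --ntasks=1
# Explicit wall-time limit in hours:minutes:seconds (here 2 hours).
# Without this line Slurm falls back to the partition's default limit.
#SBATCH --time=02:00:00

# Placeholder workload; replace with the real program.
echo "Job running on $(hostname)"
```

With a limit like this, the scheduler can tell whether the job would
finish before a planned shutdown and start it if so. The default and
maximum limits of each partition are listed as DefaultTime and MaxTime
in the output of "scontrol show partition".
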
> Remember that for any help or information regarding the clusters you can
> write to the computing service's shared email address
>
> CFM Scientific Computing Service <hpc.cfm at ehu.eus>
>
> and Irene and/or I will take care of your request.
>
> Bests,
>
> Iñigo
>
--
Dra. Irene Azáceta
HPC Engineer
Centro de Física de Materiales (CSIC-UPV/EHU)
Paseo Manuel de Lardizabal 5
20018 San Sebastián (Gipuzkoa)
SPAIN
e-mail: irene.azaceta at ehu.eus