Basics

Before explaining the various functionalities of GLUON and the use of the queue manager, it is important to highlight certain basic features that all users must be aware of.

Directories

Your personal, unique, and non-transferable directory (home) is located in

[gluon_user@glui01 ~]$ cd /lhome/ific/u/username

where “username” is your username. Personal directories are distributed into subdirectories according to the initial letter of each username. Thus, the user “username”, whose initial is “u”, is found under the directory /lhome/ific/u/.

In addition to this directory, each workgroup is assigned a group directory with a larger storage capacity. Each group must manage the space and administer its subdirectories as it sees fit. This group directory is located at:

[gluon_user@glui01 ~]$ cd /lustre/ific.uv.es/prj/gl

Within this directory, there is a subdirectory named after your group. As already mentioned, managing the space in this directory is the responsibility of each research group.
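The letter-bucketing scheme described above can be sketched in shell. The username "maria" below is purely illustrative:

```shell
# Build a GLUON home path following the scheme described above:
# /lhome/ific/<first letter of username>/<username>.
# The username is a hypothetical example.
username="maria"
home="/lhome/ific/${username:0:1}/${username}"
echo "$home"   # /lhome/ific/m/maria
```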

User Running Limits

Although GLUON may at times have high node availability, this does not mean that all nodes can be used by a single user. Doing so could lead to conflicts, since high availability at a given moment does not imply it will be maintained for the duration of a running job (in fact, availability is highly dynamic).

To avoid these conflicts, there is a limit on the number of CPUs/nodes that a single user may occupy at any one time. Currently, this limit is set at 576 CPUs / 6 nodes (i.e., six full 96-CPU nodes).

Time Reservation by Job: JobFlavour

It is very important when submitting jobs to HTCondor to request a reservation time for the job. GLUON is configured with a default reservation time of 20 minutes, unless a specific time is requested. These times are requested by selecting a JobFlavour. The equivalences between the different JobFlavours and the reserved time are as follows:

Request time   JobFlavour (option 1)   JobFlavour (option 2)   JobFlavour (option 3)
-------------  ----------------------  ----------------------  ----------------------
20 minutes     espresso                cafe                    prueba
1 hour         microcentury            siesta                  breve
2 hours        longlunch               comida                  corto
8 hours        workday                 jornada                 normal
1 day          tomorrow                dia                     largo
4 days         testmatch               puente                  muylargo
7 days         nextweek                semana                  eterno
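As an illustration, a JobFlavour is normally requested in the HTCondor submit description file. This sketch assumes GLUON follows the common HTCondor convention of a custom +JobFlavour attribute; the executable and file names are placeholders:

```
# my_job.sub -- minimal HTCondor submit description (file names are examples)
executable = my_script.sh
output     = my_job.out
error      = my_job.err
log        = my_job.log

# Request a 2-hour reservation via its JobFlavour (any equivalent name
# from the table above, e.g. "comida" or "corto", should also work)
+JobFlavour = "longlunch"

queue
```

The job would then be submitted with condor_submit my_job.sub.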

Depending on the requested time, the job will be launched on one node or another within GLUON; not all nodes are allowed to start all jobs. The per-node authorization matrix according to JobFlavour is as follows:

glwn         01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
20 minutes    T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
1 hour        T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
2 hours       T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T
8 hours       F  F  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  F  F  F
1 day         F  F  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  F  F  F
4 days        F  F  F  F  F  F  F  F  F  F  F  F  F  F  T  T  T  T  T  T  T  T  T  F  F  F
7 days        F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  T  T  F  F  F  F  F  F  F  F

This table shows whether a job can run on each worker node, depending on the requested time.

Use of Compute Nodes According to Job Type

GLUON features a low-latency 200 Gb/s InfiniBand interconnect, enabling jobs not only to operate within a single compute node but also to utilize CPUs across multiple nodes. This capability allows users to launch jobs that exceed the single-node limit of 96 CPUs, as the system can handle multi-node reservations to optimize job execution. However, this feature might cause conflicts between different jobs. To mitigate this, some nodes are specifically configured to run multi-node (parallel) jobs, and others are set up for single-node (vanilla) jobs. The current setup of GLUON is as follows:

glwn         01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Vanilla       T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  T  F  F  F  F  F  F
Parallel      F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  F  T  T  T  T  T  F  F  F

If your job fits on a single node (fewer than 96 CPUs), it is preferable to use the vanilla universe directly, as the wait times for your job in the queue will be shorter.
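For multi-node jobs, HTCondor's parallel universe is the standard mechanism. The following is a hedged sketch of a submit description; the machine count and file names are illustrative, and GLUON may require additional site-specific settings:

```
# parallel_job.sub -- sketch of a multi-node (parallel universe) submission
universe      = parallel
executable    = mpi_wrapper.sh   # placeholder wrapper that launches your MPI program
machine_count = 2                # reserve two nodes (up to 192 CPUs total)
output        = parallel.out
error         = parallel.err
log           = parallel.log

+JobFlavour = "workday"

queue
```

Note from the table above that only a subset of nodes (glwn19-23) accept parallel jobs, so multi-node submissions may wait longer in the queue.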

Problems?

If there is any issue or if you have any questions, please use the IFIC ticket service to let us know. Within the IT section, there is a ticket queue dedicated exclusively to GLUON.