Basics
Before explaining the various features of GLUON and the use of the queue manager, it is important to highlight some basic points that every user should be aware of.
Directories
Your personal, unique, and non-transferable directory (home) is located at:
```
[gluon_user@glui01 ~]$ /lhome/ific/u/username
```
where “username” is your user name. Personal directories are distributed into subdirectories according to the initial letter of the user name; thus the user “username”, whose initial is “u”, is found within /lhome/ific/u/.
In addition to this directory, each workgroup is assigned a group directory with a larger storage capacity. Each group must manage the space and administer its subdirectories as it sees fit. This group directory is located at:
```
[gluon_user@glui01 ~]$ /lustre/ific.uv.es/prj/gl
```
Within this directory, there is a subdirectory named after your group. As already mentioned, managing the space in this directory is the responsibility of each research group.
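For example, to look at both locations from the user interface node (here “username” and “mygroup” are placeholders; substitute your own user and group names):

```
# Personal home directory: note the intermediate subdirectory
# matching the first letter of the user name
ls -l /lhome/ific/u/username

# Group directory on Lustre ("mygroup" is a placeholder for your
# group's subdirectory)
ls -l /lustre/ific.uv.es/prj/gl/mygroup
```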
User Running Limits
Although GLUON may at times have many nodes available, this does not mean that they can all be used by a single user. That could lead to conflicts, since high availability at a given moment does not imply that it will be maintained for the entire duration of a running job (in fact, availability is highly dynamic).
To avoid these conflicts, there is a limit on the number of CPUs/nodes that can be used by a single user at the same time. Currently, this limit is set at 576 CPUs / 6 nodes.
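As a reference, one way to check how many CPUs your running jobs currently occupy, using standard HTCondor queries, is:

```
# Sum the CPUs requested by your currently running jobs
# (JobStatus == 2 means "running"); keep the total below 576
condor_q -constraint 'JobStatus == 2' -af RequestCpus | awk '{s+=$1} END {print s+0}'
```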
Time Reservation by Job: JobFlavour
It is very important when submitting to HTCondor to request a reservation time for the job. GLUON is configured with a default reservation time of 20 minutes, which applies unless a specific time is requested. These times are requested by selecting a JobFlavour (see the example submit file after the table). The equivalences between the different JobFlavours and the reserved times are as follows:
| Request time | JobFlavour Option 1 | JobFlavour Option 2 | JobFlavour Option 3 |
| --- | --- | --- | --- |
| 20 minutes | espresso | cafe | prueba |
| 1 hour | microcentury | siesta | breve |
| 2 hours | longlunch | comida | corto |
| 8 hours | workday | jornada | normal |
| 1 day | tomorrow | dia | largo |
| 4 days | testmatch | puente | muylargo |
| 7 days | nextweek | semana | eterno |
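As a minimal sketch, a submit file requesting an 8-hour reservation could look like this (the executable and file names are placeholders; the flavour is set through the +JobFlavour custom attribute, and any of the three equivalent names from the table should work):

```
# job.sub -- minimal submit file with an 8-hour reservation
universe    = vanilla
executable  = my_script.sh
output      = job.out
error       = job.err
log         = job.log
+JobFlavour = "workday"
queue
```

The job is then submitted with condor_submit job.sub and can be monitored with condor_q.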
Depending on the requested time, the job will be launched on one node or another within GLUON; not all nodes can start all jobs. The per-node authorization matrix according to the JobFlavour is as follows:
| glwn | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 20 minutes | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T |
| 1 hour | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T |
| 2 hours | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T |
| 8 hours | F | F | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | F | F |
| 1 day | F | F | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | F | F |
| 4 days | F | F | F | F | F | F | F | F | F | F | F | F | F | F | T | T | T | T | T | T | T | T | T | F | F | F |
| 7 days | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | T | T | F | F | F | F | F | F | F | F |
This table shows whether a job with the given requested time is authorized to run on each worker node (T) or not (F).
Use of Compute Nodes According to Job Type
GLUON features a low-latency 200 Gb/s InfiniBand interconnect, which allows jobs not only to run within a single compute node but also to use CPUs across multiple nodes. This makes it possible to launch jobs that exceed the 96 CPUs of a single node, since the system can handle multi-node reservations. However, multi-node jobs can conflict with other jobs; to mitigate this, some nodes are configured to run multi-node (parallel) jobs and others to run single-node (vanilla) jobs (a sketch of a multi-node submission is given after the table). The current setup of GLUON is as follows:
| glwn | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Vanilla | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | F | F | F | F | F |
| Parallel | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | T | T | T | T | T | F | F | F |
If your job can run on a single node (96 CPUs or fewer), it is preferable to use the vanilla universe directly, as your job's wait time in the queue will be shorter.
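If your job does need more than one node, the submission goes through the HTCondor parallel universe. A minimal sketch, assuming the standard parallel-universe syntax (the wrapper script name and CPU counts are illustrative):

```
# mpi_job.sub -- sketch of a multi-node (parallel) submission
universe      = parallel
executable    = my_mpi_wrapper.sh
machine_count = 2            # reserve two full nodes
request_cpus  = 96           # CPUs per node
output        = job.$(Node).out
error         = job.$(Node).err
log           = job.log
+JobFlavour   = "longlunch"
queue
```

In the parallel universe, $(Node) expands to the index of each allocated node, so every node writes its own output and error files.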
Problems?
If you run into any issue or have any question, please let us know through the IFIC ticket service. Within the IT section there is a ticket queue dedicated exclusively to GLUON.