CernVM-FS
CernVM-FS (CVMFS) is a distributed file system designed for efficiently delivering software to large-scale computing environments. Originally developed at CERN, it provides a scalable and lightweight solution for distributing software across clusters and grid infrastructures. Instead of requiring full local installations or manual synchronizations, CVMFS allows software to be accessed on demand from a central repository over HTTP, reducing storage needs and administrative overhead.
CVMFS is widely used in scientific computing, particularly in high-energy physics and research institutions, to ensure consistency across multiple nodes. By mounting a remote repository, users can access pre-installed software as if it were stored locally, without the need for manual downloads or updates. This simplifies software management and guarantees that all nodes in a cluster have access to the same versions of applications and libraries.
Introduction
Basic Functioning of CernVM-FS (CVMFS)
CernVM-FS (CVMFS) is a read-only file system designed to distribute software efficiently across computing clusters. It allows users to access software on demand without needing local installations. Instead of downloading full software packages, CVMFS fetches files over HTTP and caches them locally to optimize performance.
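As a quick sanity check, a sketch like the following can confirm that a repository is readable on the current node. It assumes repositories are mounted under /cvmfs, as the paths used later on this page suggest; the function name cvmfs_repo_readable is made up for illustration and is not part of CVMFS.

```shell
# Hypothetical helper: succeed if the given repository directory can be listed.
# Nothing is fetched until a path is actually accessed, so listing the
# directory is a cheap way to confirm the repository responds.
cvmfs_repo_readable() {
    [ -d "$1" ] && ls "$1" >/dev/null 2>&1
}

# The path below is the repository location used throughout this page.
if cvmfs_repo_readable /cvmfs/sft.cern.ch; then
    echo "sft.cern.ch is available"
else
    echo "sft.cern.ch is not available on this node" >&2
fi
```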
The cvmfs/sft.cern.ch Repository
In this cluster, only the cvmfs/sft.cern.ch repository is available. This repository is maintained by CERN and provides a variety of scientific software commonly used in physics and computational research. Some examples are:
ROOT - A framework for data analysis and visualization.
Geant4 - A toolkit for the simulation of particle interactions.
Python environments - Pre-configured Python versions with scientific libraries.
Compilers and development tools - Including GCC and Clang versions.
Machine learning and AI libraries - Such as TensorFlow and PyTorch.
Releases and LCG Distributions
Software in cvmfs/sft.cern.ch is organized into releases, which provide pre-compiled software stacks with various tools and libraries. These releases ensure a stable and consistent environment for users.
The most common type of release available in cvmfs/sft.cern.ch is the LCG (LHC Computing Grid) release. LCG releases are curated software environments specifically designed for high-energy physics (HEP) and scientific computing. Each release includes a set of well-tested tools, compilers, and libraries that work seamlessly together.
Depending on the user’s needs, it is possible to:
Load an entire LCG release, which provides a pre-configured environment with all available software included in that release.
Load specific software from a release, allowing users to work with only the required tools without loading unnecessary packages.
Loading Specific Software from a Release
Finding and loading specific software versions in cvmfs/sft.cern.ch can be challenging due to the way software is distributed across LCG releases. Each software package is stored inside a release (e.g., LCG_104d), and different versions may be compiled with various compilers and for various system architectures.
To simplify this process, we have a tool available on gluon called lcgenvsearch. This script allows users to search for a specific software package, version, and architecture within the available releases, providing the exact commands needed to load it.
How lcgenvsearch Works
lcgenvsearch searches through the LCG releases and identifies:
Available releases that contain the requested software.
The versions available for each release.
The system architecture and compiler used for compilation.
The exact commands needed to load the software.
Using lcgenvsearch
To search for a specific software package, simply run:
[user@glui01 ~]$ lcgenvsearch <software_name>
For example, to find ROOT:
[user@glui01 ~]$ lcgenvsearch ROOT
The output returned by lcgenvsearch will include all possible LCG releases that contain the ROOT package, along with all relevant information: release, version, architecture, compiler used, etc. Similarly, it will also provide the necessary commands to load the package.
------------------------------
LCG Release: LCG_102b_LHCB_8
Version: 6.26.08
Arquitecture: x86_64-centos7-gcc11-dbg
Software Name: ROOT
Commands to load:
[user@glui01 ~]$ export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
[user@glui01 ~]$ eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p LCG_102b_LHCB_8 x86_64-centos7-gcc11-dbg ROOT`"
------------------------------
LCG Release: LCG_102b_LHCB_8
Version: 6.26.08
Arquitecture: x86_64-centos7-gcc11-opt
Software Name: ROOT
Commands to load:
[user@glui01 ~]$ export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
[user@glui01 ~]$ eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p LCG_102b_LHCB_8 x86_64-centos7-gcc11-opt ROOT`"
------------------------------
For example, if we wanted to use version 6.26.08 of ROOT from the LCG_102b_LHCB_8 release on the User Interface (gluiXX), we would simply execute the commands provided:
[user@glui01 ~]$ export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
[user@glui01 ~]$ eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p LCG_102b_LHCB_8 x86_64-centos7-gcc11-opt ROOT`"
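If the same load is repeated often, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name load_lcg_package and the LCGENV_BIN override are hypothetical and are not provided by lcgenvsearch or lcgenv.

```shell
# Hypothetical wrapper around the two commands printed by lcgenvsearch.
# LCGENV_BIN may be set to point at a different lcgenv script (an assumption
# added here for flexibility); by default the path used above is taken.
load_lcg_package() {
    # $1 = LCG release, $2 = architecture string, $3 = software name
    if [ "$#" -ne 3 ]; then
        echo "usage: load_lcg_package <LCG_release> <architecture> <software>" >&2
        return 2
    fi
    export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
    _lcgenv_bin="${LCGENV_BIN:-$LCGENV_PATH/lcgenv/latest/lcgenv}"
    if [ ! -x "$_lcgenv_bin" ]; then
        echo "lcgenv not found at $_lcgenv_bin" >&2
        return 1
    fi
    # lcgenv prints shell commands; capture them first so that a failed
    # call is reported instead of being passed to eval.
    _lcgenv_out="$("$_lcgenv_bin" -p "$1" "$2" "$3")" || return 1
    eval "$_lcgenv_out"
}

# Equivalent to the commands above:
# load_lcg_package LCG_102b_LHCB_8 x86_64-centos7-gcc11-opt ROOT
```

Capturing the output before calling eval is a design choice of this sketch: it makes a missing release or package show up as a non-zero return code rather than as a silently empty environment.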
By default, the script will return all versions of ROOT available in cvmfs/sft.cern.ch, along with the corresponding compilers and architectures.
There are some special cases of software, such as GCC, where the loading process differs as it is independent of the LCG release. However, for users, the process remains exactly the same: simply use the commands provided by lcgenvsearch.
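The architecture strings in the output above, such as x86_64-centos7-gcc11-opt, appear to pack four fields separated by hyphens. The sketch below splits such a string in a script. The field names are illustrative, and reading opt and dbg as optimized and debug builds is an assumption rather than something stated by the tool.

```shell
# Split an architecture string of the form <arch>-<os>-<compiler>-<build>.
# The body runs in a subshell, so changing IFS does not leak to the caller.
split_platform() (
    set -f          # disable globbing while the string is split
    IFS=-
    set -- $1       # intentionally unquoted: split on hyphens
    [ "$#" -eq 4 ] || { echo "unexpected platform string" >&2; exit 1; }
    printf 'arch=%s os=%s compiler=%s build=%s\n' "$1" "$2" "$3" "$4"
)

split_platform x86_64-centos7-gcc11-opt
# prints: arch=x86_64 os=centos7 compiler=gcc11 build=opt
```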
Advanced Filtering
The script provides additional options for filtering results:
-ver <version> → Filters by software version (e.g., 6.28 will list 6.28.01, 6.28.02, etc.).
-rel <release> → Filters by LCG release (e.g., LCG_104d).
-c <compiler> → Filters by compiler (e.g., gcc, clang).
-os <O.system> → Filters by operating system (default: AlmaLinux 9 → EL9, CentOS 7 → centos7).
-ar <arch> → Filters by architecture (default: x86_64).
Example:
[user@glui01 ~]$ lcgenvsearch ROOT -ver 6.28 -c gcc -rel LCG_104d
This returns only the releases where the ROOT package is at version 6.28.XX, was compiled with GCC, and whose release name contains LCG_104d:
------------------------------
LCG Release: LCG_104d_ATLAS_22
Version: 6.28.12
Arquitecture: x86_64-centos7-gcc11-dbg
Software Name: ROOT
Commands to load:
[user@glui01 ~]$ export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
[user@glui01 ~]$ eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p LCG_104d_ATLAS_22 x86_64-centos7-gcc11-dbg ROOT`"
------------------------------
LCG Release: LCG_104d_ATLAS_22
Version: 6.28.12
Arquitecture: x86_64-centos7-gcc11-opt
Software Name: ROOT
Commands to load:
[user@glui01 ~]$ export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
[user@glui01 ~]$ eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p LCG_104d_ATLAS_22 x86_64-centos7-gcc11-opt ROOT`"
------------------------------
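The matching rules described for the filters can be pictured with two small shell functions. This is a sketch of the behaviour as described on this page (the version filter matches by prefix, the release filter by substring), not the actual lcgenvsearch code; both function names are hypothetical.

```shell
# version filter: prefix match, so 6.28 matches 6.28.12
version_matches() {
    case "$2" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# release filter: substring match, so LCG_104d matches LCG_104d_ATLAS_22
release_matches() {
    case "$2" in
        *"$1"*) return 0 ;;
        *)      return 1 ;;
    esac
}

if version_matches 6.28 6.28.12 && release_matches LCG_104d LCG_104d_ATLAS_22; then
    echo "LCG_104d_ATLAS_22 / ROOT 6.28.12 passes both filters"
fi
```

Under this reading, the ROOT 6.26.08 entries from LCG_102b_LHCB_8 shown earlier are excluded by both filters.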
Loading Software on UI Nodes and Worker Nodes
The script provides two sets of commands:
For User Interface (UI) nodes (gluiXX):
Once the software is found, it provides a command like:
[user@glui01 ~]$ export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
[user@glui01 ~]$ eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p <LCG_release> <architecture> <software>`"
Running these commands in a UI node will correctly load the environment. In the case of GCC, the loading command follows a different structure since GCC versions are independent of LCG releases. Generally, they are loaded using commands of the following type:
[user@glui01 ~]$ source /cvmfs/sft.cern.ch/lcg/releases/gcc/<version>/<architecture>/setup.sh
However, there is no need to worry about this: lcgenvsearch will provide the correct command for you.
For Worker Nodes (WN) via HTCondor:
If running jobs on HTCondor, the environment needs to be set up inside the job script (.sh). The same commands can be included in that script to ensure the correct environment is loaded on execution. Remember to specify getenv = True in the submission file (.sub).
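As an illustration, the snippet below writes a minimal job script and submission file. The file names job.sh and job.sub, the output file names, and the root-config payload are placeholders chosen for this sketch; the release and architecture are taken from the ROOT example earlier on this page.

```shell
# Write a job script that loads ROOT with the commands from lcgenvsearch
# and then runs a payload.
cat > job.sh <<'EOF'
#!/bin/bash
export LCGENV_PATH=/cvmfs/sft.cern.ch/lcg/releases
eval "`/cvmfs/sft.cern.ch/lcg/releases/lcgenv/latest/lcgenv -p LCG_102b_LHCB_8 x86_64-centos7-gcc11-opt ROOT`"
# Placeholder payload: print the ROOT version to confirm the environment loaded.
root-config --version
EOF
chmod +x job.sh

# Write a minimal submission file that includes getenv = True.
cat > job.sub <<'EOF'
executable = job.sh
getenv     = True
output     = job.out
error      = job.err
log        = job.log
queue
EOF

# Submit from a UI node with: condor_submit job.sub
```

If the job ran correctly, job.out should contain the ROOT version reported on the worker node.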
This tool makes it significantly easier to find and load the exact software needed, ensuring compatibility across both UI and WN environments in the cluster.
Using a Complete LCG Release
Instead of loading individual software packages, it is possible to load an entire LCG release, which provides a pre-configured environment containing a complete set of compatible tools and libraries. This approach has both advantages and disadvantages, making it suitable for specific use cases.
Advantages of Loading a Full LCG Release
Consistency: Ensures that all software versions are compatible with each other, avoiding conflicts.
Pre-configured Environment: Automatically sets up environment variables, paths, and dependencies.
Convenience: No need to manually search for and load individual software packages.
Disadvantages of Loading a Full LCG Release
Increased Resource Usage: Loads many packages that may not be needed, consuming memory and environment space.
Less Flexibility: If a specific software version is required that is not included in the release, additional setup may be needed.
System Compatibility Restrictions: The loaded release must match the operating system and architecture of the machine.
When to Use a Full LCG Release
Loading a complete LCG release is recommended when:
A user needs multiple software packages that are known to be compatible within a specific release.
The software dependencies are complex, and using a pre-configured environment ensures stability.
The user prefers a ready-to-use setup without manually configuring multiple packages.
Checking Available LCG Releases
To see the available releases, explore the following directory:
[user@glui01 ~]$ ls /cvmfs/sft.cern.ch/lcg/views/
Each subdirectory represents an available LCG release.
Loading an LCG Release
To load a specific release, use the following command:
[user@glui01 ~]$ source /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc11-opt/setup.sh
Important Notes:
- Only releases compatible with the current GLUON operating system (CentOS 7) and x86_64 architecture can be loaded.
- If working with a different architecture (e.g., aarch64) or a different OS (EL9 for AlmaLinux 9), the release cannot be loaded directly. In such cases, a compatible environment must be used, such as a container or a virtual machine that matches the required architecture and OS.
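To find which releases can be loaded on a given node, one option is to look for a setup.sh under the matching architecture directory of each release. The sketch below assumes every release follows the <release>/<architecture>/setup.sh layout seen in the command above; the function name list_views_for_platform is hypothetical.

```shell
# Hypothetical helper: print the releases under a views directory that
# ship a setup.sh for the given architecture string.
list_views_for_platform() (
    views_dir="$1"
    platform="$2"
    for setup in "$views_dir"/*/"$platform"/setup.sh; do
        # If nothing matches, the pattern stays literal and is skipped here.
        [ -f "$setup" ] || continue
        basename "$(dirname "$(dirname "$setup")")"
    done
)

# On the cluster this would be called as:
# list_views_for_platform /cvmfs/sft.cern.ch/lcg/views x86_64-centos7-gcc11-opt
```

Any release printed this way can then be loaded by sourcing the corresponding setup.sh.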
This method provides a stable, ready-to-use software environment, making it a good choice for users who need a complete, well-tested software stack for their work.