FAQs
How do I update the HPC Client to the latest release?
The HPC Client will receive upgrades with enhancements and fixes. This document describes the process for upgrading the source code for your HPC Client.
How do I update my Flywheel engine?
As Flywheel instances are updated, new features become available. Updating the Flywheel engine binary will ensure these features are available. See this document for directions on updating the Flywheel compute engine.
What are the Best Practices for building gears to run an HPC Cluster?
Gears running on HPC clusters use one of two container executors: Singularity (Apptainer) or Podman. To ensure effective execution of gears in these restricted environments, follow these two guidelines:
- The gear must only write to `/flywheel/v0/work` (`gear_context.work_dir`) and `/flywheel/v0/output` (`gear_context.output_dir`).
- The algorithm, gear code, and included packages must be world-readable (`chmod -R a+rX /{target}/{directory}`). Avoid installation in any `/root/` environment.

Temporary folders (e.g. `/tmp`) in container executors may be restrictively small. Furthermore, the remainder of the container file system will be read-only.
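For example, in a gear's Dockerfile these guidelines might look like the following sketch (the `/opt/app` install path is illustrative, not prescribed by this document):

```dockerfile
# Install the algorithm outside /root and make it world-readable
COPY ./app /opt/app
RUN chmod -R a+rX /opt/app

# At runtime, write only to /flywheel/v0/work and /flywheel/v0/output
WORKDIR /flywheel/v0
```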
How do I use GPUs on my Slurm Cluster?
With effective Slurm, hpc-client, and compute engine configuration, gears can be scheduled on a GPU node with a Slurm Scheduler. See this document for details.
How do I set a Slurm Job Priority from a Flywheel Job Priority?
It can be convenient to impose a priority for scheduling a Slurm job. These directions demonstrate how to map the Flywheel Job Priorities (e.g. low, medium, high, critical) to scheduling a job on a Slurm Scheduler.
How do I send Flywheel jobs to specific HPC clusters on my network?
It is possible to schedule jobs from a Flywheel instance to run on specific HPC clusters on your network. With the configuration below, it is as simple as adding additional tags to the job submission.
Configuration
First, the hpc-client must be installed on each cluster.
Next, as is demonstrated in examples/settings/multi_hpc_cast.yml, adding the following will enable additional cluster-specific filtering.
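A minimal sketch of that excerpt, using the `cluster_1_tag` name referenced in this section (confirm the exact key layout against `examples/settings/multi_hpc_cast.yml`; the nesting under a `cast:` key is an assumption):

```yaml
# settings/cast.yml (excerpt) -- only pick up jobs carrying this cluster's tag
cast:
  filter_tags:
    - cluster_1_tag
```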
Likewise, add the above with a cluster-specific tag to each cluster you wish to schedule specific jobs on.
After the configuration is performed, adding the cluster_1_tag to the Flywheel job will ensure that the job will be scheduled only on the tag-specific cluster.
NOTE: All clusters must have specific filter tags or they will duplicate the job execution.
How do I set up multiple HPC Clients to run on each cluster?
You can set up multiple HPC Clients with different cast tags, so that you need only use one tag in the Flywheel job to run on the desired cluster. This is useful if you have multiple HPC clusters that cannot be accessed from the same head/login node.
Use the variable custom_cast_tags in the settings/cast.yml file to set up multiple HPC Clients. For example, to set up two HPC Clients, one for each of two clusters, you would set the custom_cast_tags variable as follows:
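A minimal sketch, assuming the settings are nested under a `cast:` key and using `hpc-1`-style tags as cluster names (both are assumptions, not confirmed by this document):

```yaml
# Cluster 1's settings/cast.yml (excerpt)
cast:
  custom_cast_tags:
    - hpc-1
---
# Cluster 2's settings/cast.yml (excerpt)
cast:
  custom_cast_tags:
    - hpc-2
```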
This will replace the typical 'hpc' tag with the custom tag(s) you set. Any tag that is added to the lists would also need to be configured on your Flywheel site's hold engine configuration (done by the Flywheel team).
If you have multiple HPC clusters that can be accessed from the same head/login node (i.e., installing only one HPC Client), use the filter_tags cast variable instead (see above).
Note
It is also possible to set up multiple HPC Clients using only the filter_tags variable, but one would always have to add a minimum of two tags to the job: one for the HPC Client ('hpc') and one for the cluster (e.g., 'hpc-1'). Using this method, however, the custom_cast_tags variable would not be used and no Flywheel site configuration would be necessary.
Can I use Podman instead of Singularity to execute my gears?
Podman's ability to run gears without conversion and execute as root may make it desirable to use Podman in some circumstances. See this document for details about deploying the HPC Client with Podman.
How do I set RAM and CPU settings for my job?
Starting in version 2.0.0, the HPC Client performs the following checks, in order, to set RAM and CPU settings:
1. Was `scheduler_ram` or `scheduler_cpu` set in the gear config when the Flywheel job was launched? If so, use this. The gear must have these as config variables to set them. See the table below for formatting.
2. If no settings were found in the gear config, check the gear job tags for qualifying tags indicating RAM and CPU settings. Valid tags are of the following forms and will be validated in a scheduler-specific manner:
    - RAM: `ram=23G`, `RAM=32G`, `ram=32`, `scheduler_ram=12`
    - CPU: `cpu=10`, `CPU=12`, `cpus=2`, `scheduler_cpu=4`
3. If no setting was found for that specific job, check the `settings/cast.yml` file for these variables. Setting these will apply to all HPC jobs submitted by the HPC Client. Only step 1 overrides this.
4. If the setting is still not found, use the default set for that specific scheduler type (e.g., Slurm). This is hardcoded and should not be changed.
Formatting guide for variables 'scheduler_ram' and 'scheduler_cpu'
| scheduler/cluster | RAM | CPU |
|---|---|---|
| Slurm | '8G' | '8' |
| LSF | 'rusage[mem=4000]' | '1' |
| SGE | '8G' | '4-8' (sets CPU range) |
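For instance, site-wide defaults for a Slurm cluster could be sketched in settings/cast.yml as follows (the nesting under a `cast:` key is an assumption):

```yaml
# settings/cast.yml (excerpt)
cast:
  scheduler_ram: '8G'   # Slurm RAM format, per the table above
  scheduler_cpu: '8'    # Slurm CPU format
```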
How do I use a custom script template for the jobs submitted to my HPC?
The HPC Client creates a shell script (.sh) for every job that is submitted to your HPC through your scheduler (e.g. Slurm). It creates this using a default script template for the type of scheduler on your HPC. If you would like to use a custom one, you can do so by using the script variable in the settings/cast.yml file. It is not recommended to edit the default templates in the source code (e.g. fw_hpc_client/cluster/slurm.py).
How do I send my jobs to a specific partition on my HPC?
When you use a custom script template, you can set the partition(s) to which all your jobs will be sent. For example, if your scheduler is Slurm, you can add the following line in your custom script template:
Example:
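For Slurm, such a line is an `#SBATCH` directive in the script template; the partition name here is hypothetical:

```shell
#SBATCH --partition=my_partition
```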
How do I check my version of the HPC Client?
As of version 2.1.0, you can check the version of the HPC Client by running the following command:
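The exact command is not preserved in this copy of the document; a plausible invocation, assuming the client is run as the `fw_hpc_client` Python package (the `--version` flag is an assumption, not confirmed here), would be:

```shell
# Hypothetical -- check your installation's entry point
python -m fw_hpc_client --version
```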
How do I use a custom python script to modify my HPC job settings?
The HPC Client allows you to use a custom python script to modify the job settings before the job is submitted to the HPC. This can be useful if you want to set the job settings based on the Flywheel job's configuration (e.g., gear name, gear version, etc.).
To use a custom python script, set the site_modification_script variable in the settings/cast.yml file. The script should be a python script that takes a Flywheel job, input job settings, and a python logger as inputs. It must define the function modify_job_settings that takes these inputs and returns the modified job settings. For example:
```python
def modify_job_settings(job, inp_job_settings, logger):
    """
    Modify the job settings based on the input job and the input job settings.

    Args:
    -----
    job: Job
        A Flywheel job object. Some of the relevant fields are:
        - id: str
        - origin: flywheel.Origin
            - id: str  # The ID of the user who created the job
        - gear_info: flywheel.GearInfo
            - name: str
            - version: str
            - category: str
        - config: flywheel.JobOutputConfig
            - config: dict  # The gear configuration
            - inputs: dict  # The input files; names are based on the gear's manifest
                - <input_1_name>: flywheel.JobFileInput
                    - location: flywheel.Location
                        - name: str  # The name of the file
                    - object: flywheel.JobFileObject
                        - classification: dict
                        - info: dict  # The file custom info
                        - size: int  # The size of the file in bytes
                        - type: str  # The type of the file, e.g., 'dicom'
                        - version: int  # The version of the file
    inp_job_settings: fw_hpc_client.util.defn.JobSettings
        The input job settings. The units of ram, cpu, gpu, time, ntasks, and
        priority vary by cluster type. Available fields are:
        - fw_id: str  # The associated Flywheel job ID
        - singularity_debug: bool
        - singularity_writable: bool
        - ram: Optional[str]
        - cpu: Optional[str]
        - gpu: Optional[str]  # 0 or more GPUs
        - time: Optional[str]  # time limit before cluster kills job
        - ntasks: Optional[str]  # number of tasks to request
        - priority: Optional[int]
    logger: Logger
        The logger object.

    Returns:
    --------
    JobSettings
        The modified job settings. See inp_job_settings for available fields.
    """
    # Modify the job settings here
    gear_name = job.gear_info.name
    if gear_name == "bids-qsiprep" and job.config["config"].get("recon-only", False):
        inp_job_settings.cpu = "10"
        logger.info("Setting CPU to 10 for bids-qsiprep recon-only job")
    return inp_job_settings
```
You can then set the site_modification_script variable in the settings/cast.yml file to the path to your custom python script:
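A sketch of the corresponding entry (the path is hypothetical, and the nesting under a `cast:` key is an assumption):

```yaml
# settings/cast.yml (excerpt)
cast:
  site_modification_script: /path/to/my_modification_script.py
```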
For security, certain modules and functions are not allowed in the custom python script.
```python
# Modules that are NEVER allowed (for security reasons)
FORBIDDEN_MODULES = {"os", "sys", "subprocess", "socket", "shutil"}

# The ONLY allowed modules from the HPC Client package
ALLOWED_MODULES = {"fw_hpc_client.util.file_utils", "fw_hpc_client.util.defn"}

FORBIDDEN_FUNCTIONS = {"exec", "eval", "open", "__import__", "compile"}
```
For additional examples, see examples/settings/custom_mods_to_HPC_job_settings from the main repository.