FAQs
How do I update the HPC Client to the latest release?
The HPC Client will receive upgrades with enhancements and fixes. This document describes the process for upgrading the source code for your HPC Client.
How do I update my Flywheel engine?
As Flywheel instances are updated, new features become available. Updating the Flywheel engine binary will ensure these features are available. See this document for directions on updating the Flywheel compute engine.
What are the Best Practices for building gears to run an HPC Cluster?
Gears running on HPC clusters use one of two container executors: Singularity (Apptainer) or Podman. To ensure effective execution of gears in these restricted environments, follow these two guidelines:
- The gear must only write to `/flywheel/v0/work` (`gear_context.work_dir`) and `/flywheel/v0/output` (`gear_context.output_dir`).
- The algorithm, gear code, and included packages must be world-readable (`chmod -R a+rX /{target}/{directory}`). Avoid installation in any `/root/` environment.

Temporary folders (e.g. `/tmp`) in container executors may be restrictively small. Furthermore, the remainder of the container file system will be read-only.
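For example, in a gear's Dockerfile these guidelines might look like the following sketch (the `/opt/app` install path is illustrative, not prescribed by this document):

```dockerfile
# Install the algorithm outside /root and make it world-readable
COPY ./app /opt/app
RUN chmod -R a+rX /opt/app

# At runtime, write only to /flywheel/v0/work and /flywheel/v0/output
WORKDIR /flywheel/v0
```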
How do I use GPUs on my Slurm Cluster?
With effective Slurm, hpc-client, and compute engine configuration, gears can be scheduled on a GPU node with a Slurm Scheduler. See this document for details.
How do I set a Slurm Job Priority from a Flywheel Job Priority?
It can be convenient to impose a priority for scheduling a Slurm job. These directions demonstrate how to map the Flywheel Job Priorities (e.g. low, medium, high, critical) to scheduling a job on a Slurm Scheduler.
How do I send Flywheel jobs to specific HPC clusters on my network?
It is possible to schedule jobs from a Flywheel instance to run on specific HPC clusters on your network. With the configuration below, it is as simple as adding additional tags to the job submission.
Configuration
First, the hpc-client must be installed on each cluster.
Next, as is demonstrated in examples/settings/multi_hpc_cast.yml, adding the following will enable additional cluster-specific filtering.
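A minimal sketch of that excerpt, using the `cluster_1_tag` name referenced in this section (confirm the exact key layout against `examples/settings/multi_hpc_cast.yml`; the nesting under a `cast:` key is an assumption):

```yaml
# settings/cast.yml (excerpt) -- only pick up jobs carrying this cluster's tag
cast:
  filter_tags:
    - cluster_1_tag
```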
Likewise, add the above with a cluster-specific tag to each cluster you wish to schedule specific jobs on.
After the configuration is performed, adding the cluster_1_tag to the Flywheel job will ensure that the job will be scheduled only on the tag-specific cluster.
NOTE: All clusters must have specific filter tags or they will duplicate the job execution.
How do I set up multiple HPC Clients to run on each cluster?
You can set up multiple HPC Clients with different cast tags, so that you need only use one tag in the Flywheel job to run on the desired cluster. This is useful if you have multiple HPC clusters that cannot be accessed from the same head/login node.
Use the variable custom_cast_tags in the settings/cast.yml file to set up multiple HPC Clients. For example, to set up two HPC Clients, one for each of two clusters, you would set the custom_cast_tags variable as follows:
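A minimal sketch, assuming the settings are nested under a `cast:` key and using `hpc-1`-style tags as cluster names (both are assumptions, not confirmed by this document):

```yaml
# Cluster 1's settings/cast.yml (excerpt)
cast:
  custom_cast_tags:
    - hpc-1
---
# Cluster 2's settings/cast.yml (excerpt)
cast:
  custom_cast_tags:
    - hpc-2
```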
This will replace the typical 'hpc' tag with the custom tag(s) you set. Any tag that is added to the lists would also need to be configured on your Flywheel site's hold engine configuration (done by the Flywheel team).
If you have multiple HPC clusters that can be accessed from the same head/login node (i.e., installing only one HPC Client), use the filter_tags cast variable instead (see above).
Note
It is also possible to set up multiple HPC Clients using only the filter_tags variable, but one would always have to add a minimum of two tags to the job: one for the HPC Client ('hpc') and one for the cluster (e.g., 'hpc-1'). Using this method, however, the custom_cast_tags variable would not be used and no Flywheel site configuration would be necessary.
Can I use Podman instead of Singularity to execute my gears?
Podman's ability to run gears without conversion and execute as root may make it desirable to use Podman in some circumstances. See this document for details about deploying the HPC Client with Podman.
How do I set RAM and CPU settings for my job?
Starting in version 2.0.0, the HPC Client performs the following checks, in order, to set RAM and CPU settings:
1. Was `scheduler_ram` or `scheduler_cpu` set in the gear config when the Flywheel job was launched? If so, use this. The gear must have these as config variables to set them. See the table below for formatting.
2. If no settings were found in the gear config, check the gear job tags for qualifying tags indicating RAM and CPU settings. Valid tags are of the following forms and will be validated in a scheduler-specific manner:
    - RAM: `ram=23G`, `RAM=32G`, `ram=32`, `scheduler_ram=12`
    - CPU: `cpu=10`, `CPU=12`, `cpus=2`, `scheduler_cpu=4`
3. If no setting was found for that specific job, check the `settings/cast.yml` file for these variables. Setting these will apply to all HPC jobs submitted by the HPC Client. Only step 1 overrides this.
4. If the setting is still not found, use the default set for that specific scheduler type (e.g., Slurm). This is hardcoded and should not be changed.
Formatting guide for variables 'scheduler_ram' and 'scheduler_cpu'
| scheduler/cluster | RAM | CPU |
|---|---|---|
| Slurm | '8G' | '8' |
| LSF | 'rusage[mem=4000]' | '1' |
| SGE | '8G' | '4-8' (sets CPU range) |
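For instance, site-wide defaults for a Slurm cluster could be sketched in settings/cast.yml as follows (the nesting under a `cast:` key is an assumption):

```yaml
# settings/cast.yml (excerpt)
cast:
  scheduler_ram: '8G'   # Slurm RAM format, per the table above
  scheduler_cpu: '8'    # Slurm CPU format
```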
How do I use a custom script template for the jobs submitted to my HPC?
The HPC Client creates a shell script (.sh) for every job that is submitted to your HPC through your scheduler (e.g. Slurm). It creates this using a default script template for the type of scheduler on your HPC. If you would like to use a custom one, you can do so by using the script variable in the settings/cast.yml file. It is not recommended to edit the default templates in the source code (e.g. fw_hpc_client/cluster/slurm.py).
How do I send my jobs to a specific partition on my HPC?
When you use a custom script template, you can set the partition(s) to which all your jobs will be sent. For example, if your scheduler is Slurm, you can add the following line in your custom script template:
Example:
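For Slurm, such a line is an `#SBATCH` directive in the script template; the partition name here is hypothetical:

```shell
#SBATCH --partition=my_partition
```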
How do I check my version of the HPC Client?
As of version 2.1.0, you can check the version of the HPC Client by running the following command:
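The exact command is not preserved in this copy of the document; a plausible invocation, assuming the client is run as the `fw_hpc_client` Python package (the `--version` flag is an assumption, not confirmed here), would be:

```shell
# Hypothetical -- check your installation's entry point
python -m fw_hpc_client --version
```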
How do I use a custom python script to modify my HPC job settings?
The HPC Client allows you to use a custom python script to modify the job settings before the job is submitted to the HPC. This can be useful if you want to set the job settings based on the Flywheel job's configuration (e.g., gear name, gear version, etc.).
To use a custom python script, set the site_modification_script variable in the settings/cast.yml file. The script should be a python script that takes a Flywheel job, input job settings, and a python logger as inputs. It must define the function modify_job_settings that takes these inputs and returns the modified job settings. For example:
```python
def modify_job_settings(job, inp_job_settings, logger):
    """
    Modify the job settings based on the input job and the input job settings.

    Args:
    -----
    job: Job
        A Flywheel job object. Some of the relevant fields are:
        - id: str
        - origin: flywheel.Origin
            - id: str  # The ID of the user who created the job
        - gear_info: flywheel.GearInfo
            - name: str
            - version: str
            - category: str
        - config: flywheel.JobOutputConfig
            - config: dict  # The gear configuration
            - inputs: dict  # The input files; names are based on the gear's manifest
                - <input_1_name>: flywheel.JobFileInput
                    - location: flywheel.Location
                        - name: str  # The name of the file
                    - object: flywheel.JobFileObject
                        - classification: dict
                        - info: dict  # The file custom info
                        - size: int  # The size of the file in bytes
                        - type: str  # The type of the file, e.g., 'dicom'
                        - version: int  # The version of the file
    inp_job_settings: fw_hpc_client.util.defn.JobSettings
        The input job settings. The units of ram, cpu, gpu, time, ntasks, and
        priority vary by cluster type. Available fields are:
        - fw_id: str  # The associated Flywheel job ID
        - singularity_debug: bool
        - singularity_writable: bool
        - ram: Optional[str]
        - cpu: Optional[str]
        - gpu: Optional[str]  # 0 or more GPUs
        - time: Optional[str]  # time limit before cluster kills job
        - ntasks: Optional[str]  # number of tasks to request
        - priority: Optional[int]
    logger: Logger
        The logger object.

    Returns:
    --------
    JobSettings
        The modified job settings. See inp_job_settings for available fields.
    """
    # Modify the job settings here
    gear_name = job.gear_info.name
    if gear_name == "bids-qsiprep" and job.config["config"].get("recon-only", False):
        inp_job_settings.cpu = "10"
        logger.info("Setting CPU to 10 for bids-qsiprep recon-only job")
    return inp_job_settings
```
You can then set the site_modification_script variable in the settings/cast.yml file to the path to your custom python script:
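A sketch of the corresponding entry (the path is hypothetical, and the nesting under a `cast:` key is an assumption):

```yaml
# settings/cast.yml (excerpt)
cast:
  site_modification_script: /path/to/my_modification_script.py
```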
For security, certain modules and functions are not allowed in the custom python script.
```python
# Modules that are NEVER allowed (for security reasons)
FORBIDDEN_MODULES = {"os", "sys", "subprocess", "socket", "shutil"}

# The ONLY allowed modules from the HPC Client package
ALLOWED_MODULES = {"fw_hpc_client.util.file_utils", "fw_hpc_client.util.defn"}

FORBIDDEN_FUNCTIONS = {"exec", "eval", "open", "__import__", "compile"}
```
For additional examples, see examples/settings/custom_mods_to_HPC_job_settings from the main repository.