0 - Directory Configuration and Setup

Overview

Before installing and configuring the fw-hpc-client, review the directory structure that setup creates and how each directory is used. This guide explains which directories you'll need and what each one is for.

Directory Structure

The fw-hpc-client uses a specific directory structure to organize configuration files, logs, and temporary data. Here's what gets created:

<configuration-directory>/         # Your main working directory
├── settings/                      # Configuration files (created by setup)
│   ├── cast.yml                   # Main configuration file
│   ├── credentials.sh             # Environment variables and secrets
│   └── start-cast.sh              # Bootstrap script for running fw-hpc-client
├── logs/                          # Log files and temporary data (created by setup)
│   ├── generated/                 # Generated HPC job scripts
│   ├── queue/                     # HPC job log files
│   └── temp/                      # Temporary engine runtime files
│       └── log.json -> /dev/null  # Symlink to suppress engine logs
├── .gitignore                     # Git ignore file (optional, user-created)
└── .git/                          # Git repository (optional, user-created)
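
As a quick sanity check after setup, the layout shown above can be verified with a few file tests. This is a minimal sketch: `check_layout` is a hypothetical helper written for this guide, not part of fw-hpc-client, and it only checks the directories and settings files listed in the tree.

```shell
#!/bin/sh
# check_layout CONFIG_DIR
# Hypothetical helper (not part of fw-hpc-client): report any directory or
# settings file from the documented layout that is missing under CONFIG_DIR.
check_layout() {
    config_dir=$1
    missing=0
    for sub in settings logs/generated logs/queue logs/temp; do
        if [ ! -d "$config_dir/$sub" ]; then
            echo "missing directory: $config_dir/$sub"
            missing=1
        fi
    done
    for file in cast.yml credentials.sh start-cast.sh; do
        if [ ! -f "$config_dir/settings/$file" ]; then
            echo "missing file: $config_dir/settings/$file"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}

# Check the directory given as the first argument, or the current directory.
check_layout "${1:-.}" || echo "layout incomplete"
```

Run it from inside your configuration directory, or pass the directory path as the first argument.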

Directory Purposes

Configuration Directory (`<configuration-directory>/`)

This is your main working directory where you'll run fw-hpc-client commands. You can name this anything you like. The documentation examples use fw-cast.

Location: Can be anywhere you have write access.

Settings Directory (`settings/`)

Contains all configuration files that you'll need to customize for your environment.

Files created by fw-hpc-client setup:

cast.yml: Main configuration file containing:
  Scheduler type (slurm, lsf, sge)
  Job filtering settings
  Resource allocation defaults
  Script templates (optional)

credentials.sh: Environment variables including:
  Flywheel site connection details
  Singularity/Podman configuration
  Storage directory paths
  ⚠️ Contains secrets - protect with chmod 0600

start-cast.sh: Bootstrap script that:
  Sources credentials
  Activates Python environment (if needed)
  Runs fw-hpc-client with proper logging

Logs Directory (`logs/`)

Contains runtime data and logs generated during operation.

Subdirectories:

generated/: HPC job scripts (.sh files)
  One script per Flywheel job
  Named like job-<flywheel-job-id>.sh
  Used to submit jobs to your HPC scheduler

queue/: HPC job output logs (.txt files)
  Captures stdout/stderr from HPC jobs
  Named like job-<flywheel-job-id>.txt
  Useful for debugging failed jobs

temp/: Temporary engine runtime files
  Contains log.json symlinked to /dev/null
  Used by Flywheel engine during job execution
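
When debugging a failed job, it helps to go straight from a Flywheel job ID to its generated script and output log. The sketch below assumes the files follow the job-<flywheel-job-id> naming pattern exactly as described above. `job_files` and `show_job_log` are hypothetical helpers, not fw-hpc-client commands.

```shell
#!/bin/sh
# job_files CONFIG_DIR JOB_ID
# Print the expected script and log paths for one Flywheel job, assuming the
# job-<flywheel-job-id> naming pattern holds exactly.
job_files() {
    echo "$1/logs/generated/job-$2.sh"
    echo "$1/logs/queue/job-$2.txt"
}

# show_job_log CONFIG_DIR JOB_ID
# Print the last 50 lines of the job's output log, or fail if it is absent.
show_job_log() {
    log="$1/logs/queue/job-$2.txt"
    if [ -f "$log" ]; then
        tail -n 50 "$log"
    else
        echo "no log found: $log" >&2
        return 1
    fi
}
```

For example, `show_job_log . <flywheel-job-id>` run from the configuration directory prints the end of that job's stdout/stderr capture.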

Storage Considerations

Local Storage Requirements

From the system requirements:

Minimum: 64 GB (assumes the main storage drive is mapped and contains the cached Singularity images)
Alternative: ~2 TB if no mapped drive is used (for cached Singularity files, downloaded files, and intermediate files)
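
To compare a candidate location against these figures, the free space on its filesystem can be read from `df`. This is a rough sketch: `has_free_space` is a hypothetical helper, and it counts in GiB (1024-based), which is slightly stricter than decimal GB.

```shell
#!/bin/sh
# has_free_space PATH MIN_GB
# Succeed if the filesystem holding PATH has at least MIN_GB (GiB) available.
has_free_space() {
    path=$1
    min_gb=$2
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # the fourth column of the data line is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    min_kb=$((min_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$min_kb" ]
}

# Compare the current directory against the 64 GB minimum.
if has_free_space . 64; then
    echo "at least 64 GB available"
else
    echo "less than 64 GB available"
fi
```

For the no-mapped-drive case, call it with 2048 instead of 64 to approximate the ~2 TB figure.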

Shared Storage

If your HPC has shared storage accessible from both login nodes and compute nodes, you can set ENGINE_CACHE_DIR and ENGINE_TEMP_DIR to shared locations in credentials.sh.

Directory Permissions

From the cluster installation guide: because credentials.sh contains secrets, restrict it to owner read and write.

# Protect credentials file
chmod 0600 ./settings/credentials.sh
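
To confirm the permissions took effect, the mode string reported by `ls` can be checked. This is a small sketch in which `is_private` is a hypothetical helper; it treats only the exact mode 0600 (rw-------) as acceptable.

```shell
#!/bin/sh
# is_private FILE
# Succeed only if FILE is a regular file whose mode is exactly 0600.
is_private() {
    [ -f "$1" ] || return 1
    # The first ten characters of `ls -ld` output are the type and mode bits.
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "-rw-------" ]
}

if is_private ./settings/credentials.sh; then
    echo "credentials.sh is private"
else
    echo "credentials.sh is missing or not mode 0600" >&2
fi
```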

Environment Variables

Per the cluster installation guide, you can customize several directories through environment variables in credentials.sh:

SINGULARITY_TMPDIR - Temporary working space for building containers
SINGULARITY_WORKDIR - Working directory for /tmp, /var/tmp and $HOME
SINGULARITY_CACHEDIR - Where docker and sif images are stored
ENGINE_CACHE_DIR - Where gear inputs and output files are stored
ENGINE_TEMP_DIR - Where gear inputs and output files are stored
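
As an illustration, the variables listed above might be set in credentials.sh as follows. This is a sketch under assumptions: `FW_STORAGE_ROOT` is a hypothetical helper variable introduced here, and the path and subdirectory names are placeholders to replace with locations that exist on your cluster.

```shell
# Fragment for settings/credentials.sh (illustrative values only).

# Hypothetical base path, not an fw-hpc-client setting; replace the
# placeholder with a real location, ideally on shared storage.
FW_STORAGE_ROOT="${FW_STORAGE_ROOT:-/path/to/shared/storage}"

# Singularity working locations
export SINGULARITY_TMPDIR="$FW_STORAGE_ROOT/singularity/tmp"
export SINGULARITY_WORKDIR="$FW_STORAGE_ROOT/singularity/work"
export SINGULARITY_CACHEDIR="$FW_STORAGE_ROOT/singularity/cache"

# Flywheel engine locations
export ENGINE_CACHE_DIR="$FW_STORAGE_ROOT/engine/cache"
export ENGINE_TEMP_DIR="$FW_STORAGE_ROOT/engine/temp"
```

Deriving every path from one base variable keeps the locations together, so moving to a different storage volume means changing a single line.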

Git Repository Setup

The tracking changes guide recommends using Git to track your configuration files.
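
One possible starting point is a .gitignore that keeps secrets and runtime data out of the repository. The ignore rules below are an assumption based on the directory purposes described earlier (credentials.sh holds secrets, logs/ holds generated runtime data), not a list taken from the tracking changes guide. `write_gitignore` is a hypothetical helper.

```shell
#!/bin/sh
# write_gitignore CONFIG_DIR
# Create CONFIG_DIR/.gitignore with rules that exclude secrets and logs.
# The rule set is an assumption; adjust it to your site's policy.
write_gitignore() {
    cat > "$1/.gitignore" <<'EOF'
# Secrets: never commit
settings/credentials.sh
# Runtime data generated during operation
logs/
EOF
}

# Typical use from inside the configuration directory:
#   git init
#   write_gitignore .
#   git add .gitignore settings/cast.yml settings/start-cast.sh
#   git commit -m "Track fw-hpc-client configuration"
```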

Next Steps

After understanding this directory structure, proceed to Choose an Integration Method.