A conceptual depiction of a modern node with four eight-core processors that share a common memory pool. A node typically also handles local storage, network connectivity, and power. It is increasingly common for nodes to include special accelerator hardware like GPUs or TPUs. This node has a GPU.[1]
HPC vs. Regular Computing
| Feature | HPC | Regular Computing |
|---|---|---|
| Processing Power | Many CPUs & GPUs across nodes | One or a few CPUs with multiple cores |
| Parallelism | Massively parallel (distributed computing) | Limited parallelism (multi-threading within a CPU) |
| Scalability | High (thousands of nodes) | Low (single machine or a few nodes) |
| Job Scheduling | Slurm, PBS, or LSF for batch scheduling | OS task scheduler (e.g., Windows Task Scheduler, cron) |
MSI uses Slurm to manage computational resources efficiently.
Jobs are submitted to a queue (partition) and wait for available resources.
Users must create Slurm job scripts to request resources and execute calculations.
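Before submitting, it can help to see which partitions exist and how busy they are. The following is a minimal sketch, assuming the standard Slurm client tools are on the PATH; `show_queue_overview` is a hypothetical helper name, and the partition names printed depend on the cluster.

```shell
#!/bin/bash
# Queue overview sketch; show_queue_overview is a hypothetical helper name.
show_queue_overview() {
    # Bail out politely when Slurm is not installed (e.g., on a laptop).
    if ! command -v sinfo >/dev/null 2>&1; then
        echo "Slurm client tools not found; run this on a cluster login node."
        return 0
    fi
    # One summary line per partition: availability, time limit, node counts.
    sinfo -s
    # Your pending jobs, with the scheduler's estimated start times.
    squeue -u "$USER" --start
}

show_queue_overview
```

The estimated start times are only the scheduler's current guess and change as other jobs finish or are submitted.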
Writing and Submitting Job Scripts
Writing Job Scripts – Define resource requests and execution commands.
Submitting Job Scripts – Use sbatch to submit jobs to Slurm.
Slurm Directives – Specify job parameters with #SBATCH directives.
Walkthrough
Create job script (.sh)
#!/bin/bash -l
# The line above selects bash as the shell; -l makes it a login shell.
# (Keep comments off the #! line itself: everything after the interpreter path is passed to bash as one argument.)

# Slurm directives to request resources
#SBATCH --time=0:10:00                  # Set the maximum runtime for the job to 10 minutes
#SBATCH --ntasks=1                      # Request 1 task (process) for this job
#SBATCH --mem=10g                       # Request 10 GB of RAM
#SBATCH --tmp=10g                       # Request 10 GB of temporary (scratch) disk space
#SBATCH --mail-type=ALL                 # Receive emails on all events (start, end, fail)
#SBATCH --mail-user=braak014@umn.edu    # Set email address to receive notifications

# Change to the specified working directory
cd ~/Files/natcap_teems/skill_session_msi/luep

# Load the necessary software modules
module load python3    # Load the Python 3 module
module load conda      # Load the Conda module

# Manually source the .bashrc file to initialize Conda
source ~/.bashrc       # Ensures that Conda is properly initialized before use

# Activate the Conda environment where dependencies are set up
conda activate teems01 # Activate the Conda environment "teems01"

# Run the Python script
python3 run_deposition_calculation.py   # Executes the specified Python script using Python 3
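A quick check before submitting can catch typos that would otherwise only surface after the job has waited in the queue. The sketch below is illustrative, not part of the MSI workflow: it writes a small stand-in script (`demo_job.sh`, a hypothetical name) so the example is self-contained, parses it with `bash -n`, and lists the `#SBATCH` lines Slurm would read. The final step uses `sbatch --test-only`, which asks the scheduler to validate the request without queueing anything, and runs only if `sbatch` is installed.

```shell
#!/bin/bash
# Pre-submission check (illustrative sketch; demo_job.sh is a hypothetical file).

workdir=$(mktemp -d)
job_script="$workdir/demo_job.sh"

# Write a small stand-in job script so the example is self-contained.
cat > "$job_script" <<'EOF'
#!/bin/bash -l
#SBATCH --time=0:10:00
#SBATCH --ntasks=1
#SBATCH --mem=10g
echo "hello from the job"
EOF

# 1. Parse the script without executing it; non-zero status means a syntax error.
if bash -n "$job_script"; then
    syntax_status="ok"
else
    syntax_status="error"
fi

# 2. List the resource requests Slurm will read from the script.
directive_count=$(grep -c '^#SBATCH' "$job_script")
grep '^#SBATCH' "$job_script"

echo "syntax: $syntax_status, directives: $directive_count"

# 3. If Slurm is available, validate the request without queueing the job.
if command -v sbatch >/dev/null 2>&1; then
    sbatch --test-only "$job_script" || echo "sbatch --test-only reported a problem"
fi
```

To check the real script, point `job_script` at `submit_run_deposition_calc.sh` instead of the stand-in.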
Walkthrough
Submit with sbatch
sbatch submit_run_deposition_calc.sh
squeue -u username       # to check status of jobs
# scancel jobIDnumber    # to cancel a job
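When several jobs are in flight, it is easier to track one job by its ID than to scan everything under a username. The sketch below shows one possible way to do that, assuming the script name from the walkthrough; `job_id_from_parsable` and `submit_and_show` are hypothetical helper names, while `sbatch --parsable`, `squeue -j`, and `scancel` are standard Slurm commands. Nothing is submitted unless both `sbatch` and the script are present.

```shell
#!/bin/bash
# Submit-and-track sketch. Helper names are hypothetical; only sbatch, squeue,
# and scancel are Slurm commands.

# With --parsable, sbatch prints "jobid" or "jobid;cluster" instead of
# "Submitted batch job jobid"; keep only the job ID part.
job_id_from_parsable() {
    local raw="$1"
    printf '%s\n' "${raw%%;*}"
}

submit_and_show() {
    local script="$1"
    local raw job_id
    raw=$(sbatch --parsable "$script") || return 1
    job_id=$(job_id_from_parsable "$raw")
    echo "Submitted job $job_id"
    # Show only this job rather than every job owned by the user.
    squeue -j "$job_id"
    # To cancel it later: scancel "$job_id"
}

if command -v sbatch >/dev/null 2>&1 && [ -f submit_run_deposition_calc.sh ]; then
    submit_and_show submit_run_deposition_calc.sh
else
    echo "sbatch or the job script is not available here; nothing submitted."
fi
```

Capturing the ID at submission time avoids copying it by hand from the `sbatch` message, which matters most when a wrapper script submits many jobs.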