Can I use more than one CPU for my job?

Not every software package can make use of multiple CPUs (a trait called multithreading), so before allocating more CPUs to a job, determine whether your package can actually use them; otherwise you will just be wasting CPU cycles and money.

There are two ways to determine whether your software package can make use of multiple CPUs:

  • Find it in a list of known multithreaded software
  • Run it with multiple CPUs and analyze the resulting job statistics

Known multithreaded software packages

Below is a list of some commonly-used software packages which are known to be multithreaded:

  • bwa mem
  • samtools
  • STAR
  • CIBERSORTx
  • BLAST+
  • HISAT2
  • Trinity
  • SPAdes
  • GATK
  • Picard
  • Kraken2
  • Cufflinks
  • Salmon
  • Kallisto
  • Clustal Omega
  • RSEM
  • SRA Toolkit
  • PLINK
  • Trimmomatic
  • Prokka
  • VCFtools
  • BCFtools
  • Bowtie2
  • FastQC

You can also check your software package's documentation for command-line parameters that contain the words thread or cpu.

Run with multiple CPUs and analyze the job statistics

To set up a multiple-CPU test run, add the --cpus-per-task=2 parameter either to your sbatch command line or as an #SBATCH line in your script, then submit your job.

sbatch [other-slurm-arguments-if-any] --cpus-per-task=2 [your-script]

The number of "CPUs per task" you request as above is available inside your job as the environment variable ${SLURM_CPUS_PER_TASK}, so you can pass it to whatever argument your software uses to control the number of CPUs/threads:

mycommand -arg1 -arg2 --threads=${SLURM_CPUS_PER_TASK} ...
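Putting the two pieces together, a complete job script might look like the sketch below. Here mycommand and its arguments are placeholders for your own software, and the 1-thread default is just a safeguard for running the script outside of Slurm:

```shell
#!/bin/bash
#SBATCH --cpus-per-task=2
#SBATCH --job-name=cpu-test

# Default to 1 thread if Slurm did not set the variable
# (e.g. when testing this script outside of a job).
THREADS=${SLURM_CPUS_PER_TASK:-1}

# "mycommand" and its flags are placeholders; substitute your own
# software and its thread/cpu parameter.
mycommand -arg1 -arg2 --threads="${THREADS}"
```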

When that job is complete, use the seff (Slurm efficiency) tool to analyze the job statistics by running it on your job's ID:

seff [your-job-number]

Job ID: [your-job-number]
Cluster: scg
User/Group: bettingr/upg_bettingr
State: COMPLETED (exit code 0)
Nodes: 2
CPU Utilized: 08:04:12
CPU Efficiency: 98.22% of 08:13:00 core-walltime
Memory Utilized: 5.42 GB (estimated maximum)
Memory Efficiency: 45.13% of 12.00 GB (1.00 GB/core)

If the CPU Efficiency is close to 100% (as in the example above), then your software used the extra CPU you gave it, and you can safely assign multiple CPUs to future jobs. If that number is nearer to 50%, the package ignored your extra CPU and ran on just one, leaving the other idle – you should not bother adding any more CPUs to these jobs.

If you have determined that your job can make use of extra CPUs, experiment to find the best number to use. Ideally, a job with 2 CPUs will run twice as fast as one with 1, and a job with 4 CPUs will run 4 times as fast, but this linear scaling rarely holds in practice. Try different counts to see where the linearity breaks down – that point is your optimal number of CPUs.
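One way to compare runs is to record the wall time of otherwise-identical test jobs at several CPU counts and compute the speedup and parallel efficiency at each. A minimal sketch – the timings below are invented for illustration, not real measurements; substitute the wall-clock times that seff reports for your own jobs:

```shell
#!/bin/bash
# Hypothetical wall times (in seconds) for the same workload run with
# 1, 2, 4, and 8 CPUs. These are made-up numbers for illustration.
t1=480; t2=250; t4=140; t8=120

for n in 1 2 4 8; do
    t_var="t${n}"
    t=${!t_var}
    # speedup = time with 1 CPU / time with n CPUs
    speedup=$(awk "BEGIN {printf \"%.2f\", $t1 / $t}")
    # parallel efficiency = speedup / n, as a percentage
    eff=$(awk "BEGIN {printf \"%.0f\", 100 * $t1 / ($t * $n)}")
    echo "${n} CPUs: ${speedup}x speedup, ${eff}% efficiency"
done
```

In this made-up example, going from 4 to 8 CPUs only raises the speedup from about 3.4x to 4x (50% efficiency), so 4 CPUs would be the sweet spot for that workload.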

One final reminder: the more CPUs you request, the harder it will be for Slurm to schedule your job. If you request more than 24, the number of nodes that can support the request goes down considerably. You might choose to run your jobs with fewer than your optimal number of CPUs to make it easier for Slurm to find resources for them.