Garpur and Jötunn both use the Slurm workload manager.
Users should always run their computations on the compute nodes, not the login nodes. This is done by using the queuing system.
If you have previously used Torque, the following comparison of Torque/Moab and Slurm commands may be useful: http://www.nersc.gov/users/computational-systems/cori/running-jobs/for-edison-users/torque-moab-vs-slurm-comparisons/
To run a job on the system, you need to create a job script. A job script is a regular shell script (bash or csh) with some directives that specify the number of CPUs, memory, etc. The batch system interprets these directives when the job is submitted. Below is a very basic sample job script:
#!/bin/bash
#SBATCH -n 1
#SBATCH -J ExampleJobName
# Print the hostname of the node on which the job is running
hostname
Once you have your job script ready, you can submit it with the sbatch command as follows:
sbatch <your job script filename>
To check the status of your job you can then do:
squeue -u <username>
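The submit and status steps above can be combined in a small helper. This is a minimal sketch, assuming sbatch and squeue are on the PATH (that is, you are on a login node); the function names submit_job and job_status and the file name example_job.sh are made up for illustration.

```shell
#!/bin/bash
# Submit a job script and print only its job ID.
submit_job() {
    local script="$1" job_id
    # --parsable makes sbatch print "jobid" or "jobid;cluster"
    # instead of the usual "Submitted batch job ..." sentence.
    job_id=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix.
    echo "${job_id%%;*}"
}

# Show the queue entry for a single job; squeue -j takes a job ID.
job_status() {
    squeue -j "$1"
}

# Example usage (hypothetical file name):
#   id=$(submit_job example_job.sh) && job_status "$id"
```

Capturing the job ID this way avoids scanning the full squeue -u listing when only one job is of interest.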
Other options of note for job scripts
- --nodes=2 --ntasks-per-node=4
  - Requests 2 nodes with 4 tasks (cores) on each node for the job
- -p himem
  - Requests the high-memory nodes or GPU nodes [Garpur only]
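As an illustration of the node and task options, a job script might look roughly like the sketch below. The job name is hypothetical, and the fallback values 2 and 4 are assumptions that apply only when the script is run outside Slurm (for example, to check it for syntax errors).

```shell
#!/bin/bash
#SBATCH -J MultiNodeExample       # hypothetical job name
#SBATCH --nodes=2                 # two compute nodes
#SBATCH --ntasks-per-node=4       # four tasks on each node

# Slurm exports the granted layout as environment variables;
# the defaults are used only when the script runs outside Slurm.
nodes=${SLURM_NNODES:-2}
per_node=${SLURM_NTASKS_PER_NODE:-4}
total=$(( nodes * per_node ))
echo "Running $total tasks on $nodes nodes"

# Inside a job, srun starts one copy of the command per task.
if command -v srun >/dev/null 2>&1 && [ -n "${SLURM_JOB_ID:-}" ]; then
    srun hostname
fi
```

When submitted with sbatch, the srun line should print one hostname per task, so each node name is expected to appear four times in the job output.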
If you want to run a job on the compute nodes interactively, you can use the command:
salloc -N 1
To leave this interactive session use the “exit” command.
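When only a single command needs to run on a compute node, salloc can also be given that command directly, so that the allocation is released as soon as it finishes. The wrapper below is a sketch of this pattern; the function name run_on_node is invented for illustration, and how salloc behaves in detail may depend on the site configuration.

```shell
#!/bin/bash
# Allocate one node, run the given command there through srun,
# and release the allocation when the command exits.
run_on_node() {
    salloc -N 1 srun "$@"
}

# Example usage:
#   run_on_node hostname
```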