SCC uses the Grid Engine (GE) queuing system (Son of Grid Engine 8.1.8) for simulation job management (please see the GE tutorial). The GE can be used in text mode or via the graphical interface (qmon) on the frontend server. All nodes can submit and execute jobs.
The following GE queues are currently available:
Queue | Slots | Default/max. run time | Usage | User |
---|---|---|---|---|
scc | 3892 | 7 days/10 days | All SCC nodes | all |
long | 2736 | 7 days/120 days | Long running jobs | all |
old | 488 | 7 days/30 days | Older Nodes (AMD and Intel) | all |
pc | 616 | 7 days/7 days | Workstations | @theophys |
gpu | 308 | 7 days/10 days | GPU nodes | all |
You can select a single queue or let GE decide by specifying the needed resources (see below), but keep in mind that requesting very high values may negatively impact your job's scheduling.
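As a minimal sketch (job name, resource values and executable are placeholders), a serial job that lets GE pick a suitable queue could look like this:

```bash
#!/bin/bash
#$ -N my_serial_job            # job name (placeholder)
#$ -l h_rt=24:00:00            # 1 day run time (see the resource table below)
#$ -l h_vmem=2G                # 2 GB memory for the single slot
# -cwd is already a default option, so the job starts in the submission directory

./my_program input.dat         # placeholder executable
```

Submit it with `qsub job.sh`.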
All queues support serial and parallel jobs. For parallel jobs, use the appropriate parallel environment depending on whether your job uses shared memory (e.g. OpenMP) or distributed memory (e.g. MPI); example scripts follow the table below.
Parallel Environment | Usage | Max. Slots | Example |
---|---|---|---|
smp/openmp | Shared Memory (single node) | 16-64 | -pe smp 20 |
mpi | Distributed Memory | all | -pe mpi 42 |
mpi-20 | Distributed Memory (exclusive nodes) | n x 20 | -pe mpi-20 160 |
mpi-8/mpi-12/mpi-16/mpi-24 | Distributed Memory (exclusive nodes) | n x 8/12/16/24 | -pe mpi-8 16 |
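Hedged sketches of a shared-memory and a distributed-memory job script (job names and executables are placeholders; the exact mpirun call depends on your MPI module):

```bash
#!/bin/bash
#$ -N omp_job                  # placeholder job name
#$ -pe smp 20                  # 20 slots on a single node
#$ -l h_rt=48:00:00
# OMP_NUM_THREADS is set to NSLOTS by the starter method (see the settings list below)
./my_openmp_program
```

```bash
#!/bin/bash
#$ -N mpi_job                  # placeholder job name
#$ -pe mpi-20 160              # 160 slots = 8 exclusive nodes with 20 cores each
#$ -l h_rt=48:00:00
#$ -l ib                       # Infiniband interconnect for fast MPI communication
mpirun -np $NSLOTS ./my_mpi_program
```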
The GE slots refer to real CPU cores. To use hyper-threading you need to specify the number of cores to use explicitly in your job script (see the GE tutorial).
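A hedged sketch of how this might look for an OpenMP job that runs two hyper-threads per physical core (the factor 2 is an assumption about the nodes' SMT configuration):

```bash
#!/bin/bash
#$ -pe smp 16                          # 16 GE slots = 16 physical cores
# override the default OMP_NUM_THREADS=NSLOTS to run 2 hyper-threads per core
export OMP_NUM_THREADS=$((2 * NSLOTS))
./my_openmp_program                    # placeholder executable
```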
All queues are configured for fair scheduling (ticket-based job priority) and reservation (handling serial and parallel jobs at the same time) to treat all users and jobs fairly. The available resources per user depend on the contribution of the user's group to SCC.
For GPU-jobs please check the GPU-page.
Resources can be requested with the -l option; h_vmem and h_rt in particular are relevant for most jobs (an example command follows the table). All available resources are:
Resource | Example (qsub option) | Explanation |
---|---|---|
h_vmem | -l h_vmem=4G | request 4 GB memory PER SLOT for the job (default 1 GB, max 768 GB) |
h_rt | -l h_rt=48:00:00 | request 2 days run time (default 7 days, max 120 days) |
infiniband | -l ib | request Infiniband interconnect (fast network) |
exclusive | -l ex | request exclusive usage of a single node (use only for MPI jobs on single nodes, add "-w w" if job is rejected) |
max10 | -l m10 | limit your number of used slots by all jobs to 10 (useful if you don't want to fill the group quota) |
max100 | -l m100=2 | limit your number of used slots by all jobs to 100/2 = 50 (the value acts as a divisor) |
max1000 | -l m1000=2.5 | limit your number of used slots by all jobs to 1000/2.5 = 400 |
cputype | -l p="haswell|ivybridge" | request CPU type (epyc3, epyc, cascadelake, skylake, broadwell, haswell, ivybridge, sandybridge, phi, corei7, core2, core2duo) |
epyc3, epyc, cl, sl, bw, hw, ivy, sandy, phi, corei7, core2, core2duo | -l hw | request exact CPU type (EPYC 3 / EPYC / Cascadelake / Skylake / Broadwell / Haswell / Ivy Bridge / Sandy Bridge / Phi / Core i7 / Core 2 Quad / Core 2 Duo) |
avx | -l avx | Only nodes supporting AVX |
avx2 | -l avx2 | Only nodes supporting AVX2 (Haswell and higher) |
avx512 | -l avx512 | Only nodes supporting AVX512 (Skylake and Cascadelake) |
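For example, a job that needs 4 GB per slot, two days of run time, and a node with AVX2 support could be submitted like this (the qsub options are taken from the table above; the script name is a placeholder):

```bash
qsub -l h_vmem=4G -l h_rt=48:00:00 -l avx2 -pe smp 8 job.sh
```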
The queues are configured with the following settings:
- default GE options: -cwd -q scc,old,pc,long -R y
- work in current working directory
- default queues: scc, old, pc and long (see the -q option above)
- reservation with back filling for parallel jobs
- fair scheduling (ticket based job priority)
- multiple queues per node without oversubscription
- starter method that sets OMP_NUM_THREADS to NSLOTS and initializes the Modules environment (see the sketch below)
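As a small hedged illustration of the starter method, a job script can rely on OMP_NUM_THREADS being pre-set and on the Modules environment being available (module and program names are placeholders):

```bash
#!/bin/bash
#$ -pe smp 8
module load gcc                # Modules environment is initialized by the starter method (module name is a placeholder)
echo "Running with $OMP_NUM_THREADS threads ($NSLOTS slots)"
./my_openmp_program            # placeholder executable
```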
You can use qquota to see the limitations that apply to you. If you have a lot of jobs, please consider using array jobs (see the sketch below).
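A hedged sketch of an array job that processes 100 input files with a single qsub call (the file naming scheme is an assumption):

```bash
#!/bin/bash
#$ -N my_array_job             # placeholder job name
#$ -t 1-100                    # 100 tasks; SGE_TASK_ID runs from 1 to 100
#$ -l h_rt=12:00:00
./my_program input_${SGE_TASK_ID}.dat   # one input file per task (naming is an assumption)
```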
Please do not send jobs to single nodes like "qsub -q scc@scc042". The best choice is almost always to use any queue and let the queuing system decide based on the resources you specify. You may limit the selection of nodes to certain hostgroups, e.g. "-q scc@@scc-ivy-64GB" (see the example after the table).
Hostgroup | Specification | Nodes |
---|---|---|
@scc-epyc3-256GB | AMD EPYC 3 CPU, 256 GB RAM | 4 |
@scc-cascadelake-192GB | Cascadelake CPU, 192 GB RAM | 8 |
@scc-skylake-192GB | Skylake CPU, 192 GB RAM | 28 |
@scc-broadwell-128GB | Broadwell CPU, 128 GB RAM | 4 |
@scc-broadwell-512GB | Broadwell CPU, 512 GB RAM | 12 |
@scc-haswell-64GB | Haswell CPU, 64 GB RAM | 8 |
@scc-haswell-256GB | Haswell CPU, 256 GB RAM | 4 |
@scc-ivy-64GB | Ivy Bridge CPU, 64 GB RAM | 16 |
@scc-ivy-256GB | Ivy Bridge CPU, 256 GB RAM | 35 |
@scc-sandy-64GB | Sandy Bridge CPU, 64 GB RAM | 2 |
@scc-sandy-128GB | Sandy Bridge CPU, 128 GB RAM | 5 |
@scc-sandy-256GB | Sandy Bridge CPU, 256 GB RAM | 3 |
@scc-gpu | Ivy Bridge CPU, Tesla K20 GPU | scc066 |
@scc-gpu2 | Haswell CPU, Tesla K80 GPU | scc116,scc117 |
@scc-gpu3 | Silver 4114 CPU, 4 NVIDIA V100 GPU | scc146 |
@scc-gpu-epyc | AMD EPYC 7401P, 8 RTX 2080TI | scc195-scc199 |
 | AMD EPYC 7713, 4 NVIDIA L40 | scc192 |
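For example, to restrict a job to the Ivy Bridge nodes with 256 GB RAM (hostgroup taken from the table above; the script name is a placeholder):

```bash
qsub -q scc@@scc-ivy-256GB -l h_vmem=12G job.sh
```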