kuacc-nodes

Lists cluster nodes with their specifications (CPU types, GPU list, memory, features, etc.). It is a more detailed version of the sinfo command.

CODE

(base) [yakarken18@login02 ~]$ kuacc-nodes
NODELIST        STATE       CPUS    S:C:T    MEMORY   TMP_DISK GRES                          AVAIL_FEATURES                                    
ai[01,04-06,09] mixed         40   2:20:1    503000          0 gpu:tesla_t4:8                ai,ib,compute,40cpu,gpu                                 
ai[11-14]       mixed         40   2:20:1    503000          0 gpu:tesla_v100:8              ai,ib,compute,40cpu,gpu                                          
buyukliman      mixed         72   2:18:2    250000          0 (null)                        hamsi,ib,compute,72cpu                                     
da[03-04]       mixed         20   2:10:1    250000          0 gpu:tesla_k20m:1              biyofiz,ib,compute,20cpu                                          
dy02            mixed         36   2:18:1    504000          0 gpu:tesla_k80:4               ai,ib,compute,36cpu,gpu                                         
dy03            mixed         24   2:12:1    450000          0 gpu:tesla_k80:8               ai,ib,compute,24cpu,gpu                                  
it[01-02]       mixed         40   2:20:1    500000          0 gpu:tesla_v100:1              IT,ib,compute,tesla_v100                                          
sm01            mixed         20   2:10:1     64000          0 gpu:tesla_k40m:2,gpu:tesla_k80:1 iui,ib,compute,20cpu                                      
ag01            allocated      8    2:4:1    124000          0 gpu:gtx_1080ti:2              cosbi,ib,compute,8cpu,gpu                                       
ai[02-03,07-08, allocated     40   2:20:1    503000          0 gpu:tesla_t4:8                ai,ib,compute,40cpu,gpu                                  
be[01-12]       allocated     12    2:6:1    126000          0 gpu:tesla_k20m:1              ilac,ib,compute,12cpu                                                 
da02            allocated     20   2:10:1    250000          0 gpu:tesla_k20m:1              biyofiz,ib,compute,20cpu                                         
it[03-04]       allocated     40   2:20:1    500000          0 gpu:tesla_v100:1              IT,ib,compute,tesla_v100                                          
rk01            allocated     72   2:18:2    504000          0 (null)                        kutem,ib,compute,72cpu,HT                                         
da01            idle          20   2:10:1    250000          0 gpu:tesla_k20m:1              biyofiz,ib,compute,20cpu                                         
ke[01-08]       allocated     36   2:18:1    504000          0 (null)                        cosmos,ib,compute,36cpu                                                   

============================================================================================================
KUACC NODES CPU LIST
============================================================================================================
login02-login03  |model name      : Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
ag01             |model name      : Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz
be01 - be14      |model name      : Intel(R) Xeon(R) CPU E5-2640    @ 2.50GHz
buyukliman       |model name      : Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
da01 - da04      |model name      : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
dy02             |model name      : Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
dy03             |model name      : Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.10GHz
ke01 - ke08      |model name      : Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
rk01             |model name      : Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
sm01             |model name      : Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
it01 - it04      |model name      : Intel(R) Xeon(R) Gold 6148      @ 2.40GHz
ai01 - ai14      |model name      : Intel(R) Xeon(R) Gold 6248      @ 2.50GHz
============================================================================================================

Columns: NODELIST, STATE, CPUS, S:C:T (Sockets:CoresPerSocket:ThreadsPerCore), MEMORY, TMP_DISK (temporary disk), GRES, AVAIL_FEATURES
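The S:C:T value multiplies out to the CPUS column: in the listing above, 2:18:2 means 2 sockets x 18 cores x 2 threads = 72 CPUs. A minimal sketch of that arithmetic, assuming the value is passed as a single argument (the function name sct_to_cpus is hypothetical and not part of kuacc-nodes):

```shell
#!/bin/bash
# Hypothetical helper: multiply out an S:C:T value as printed by kuacc-nodes.
sct_to_cpus() {
    # Split on ":" and multiply sockets * cores per socket * threads per core.
    printf '%s\n' "$1" | awk -F: '{ print $1 * $2 * $3 }'
}

sct_to_cpus 2:18:2   # prints 72 (buyukliman, rk01)
sct_to_cpus 2:20:1   # prints 40 (ai nodes)
```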

This command lists the specifications of every node in the cluster, which is useful when you need a specific resource. For example, the GRES column shows all GPU types in the cluster. To choose a specific GPU, pass the GRES value from this command's output to the gres flag in your job script.

CODE

#SBATCH --gres=gpu:tesla_k80:1
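To see which GPU types are available before writing the gres line, the GRES column can be reduced to a list of distinct types. The sketch below assumes the column layout shown above (GRES is the 7th field, entries look like gpu:<type>:<count>) and reads the command output on standard input. The function name gpu_types is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read kuacc-nodes output on stdin, print each GPU type once.
# Assumes GRES is the 7th whitespace-separated field, formatted gpu:<type>:<count>.
gpu_types() {
    awk '$7 ~ /^gpu:/ {
        n = split($7, entries, ",")          # a node may list several GRES entries
        for (i = 1; i <= n; i++) {
            split(entries[i], parts, ":")    # parts[2] is the GPU type
            print parts[2]
        }
    }' | sort -u
}

# Demo on three rows copied from the listing above.
gpu_types <<'EOF'
sm01            mixed         20   2:10:1     64000          0 gpu:tesla_k40m:2,gpu:tesla_k80:1 iui,ib,compute,20cpu
ag01            allocated      8    2:4:1    124000          0 gpu:gtx_1080ti:2              cosbi,ib,compute,8cpu,gpu
rk01            allocated     72   2:18:2    504000          0 (null)                        kutem,ib,compute,72cpu,HT
EOF
```

The demo prints gtx_1080ti, tesla_k40m and tesla_k80, one per line. On the cluster, the same helper would be used as kuacc-nodes | gpu_types, and a type from its output goes into the gres flag as gpu:<type>:<count>.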

The AVAIL_FEATURES column in the command output lists the features of each node. Features are set by the system admins and are used with the constraint flag in a Slurm job script.

For example,

CODE

(base) [yakarken18@login02 ~]$ kuacc-nodes |grep ai
ai[01,04-06,09] mixed         40   2:20:1    503000          0 gpu:tesla_t4:8   ai,ib,compute,40cpu,gpu,tesla_t4,6248,molpro,vnc                                                       
ai[11-14]       mixed         40   2:20:1    503000          0 gpu:tesla_v100:8 ai,ib,compute,40cpu,gpu,tesla_v100,6248,molpro,vnc                                                      
dy02            mixed         36   2:18:1    504000          0 gpu:tesla_k80:4  ai,ib,compute,36cpu,gpu,tesla_k80,e52695,e52695v4,                                                      
dy03            mixed         24   2:12:1    450000          0 gpu:tesla_k80:8  ai,ib,compute,24cpu,gpu,tesla_k80,e52695,e52695v2,                                                       
ai[02-03,07-08, allocated     40   2:20:1    503000          0 gpu:tesla_t4:8   ai,ib,compute,40cpu,gpu,tesla_t4,6248,molpro,vnc                                                        
ai01 - ai14      |model name      : Intel(R) Xeon(R) Gold 6248      @ 2.50GHz

CODE

ai[01-04,06-10] mixed 40 2:20:1 503000 0 gpu:tesla_t4:8                ai,ib,compute,40cpu,gpu,tesla_t4,6248,molpro,vnc

Features for ai nodes:

ai[01-10]: Feature=ai,ib,compute,40cpu,gpu,tesla_t4,6248,molpro,vnc
ai[11-14]: Feature=ai,ib,compute,40cpu,gpu,tesla_v100,6248,molpro,vnc
dy02: Feature=ai,ib,compute,36cpu,gpu,tesla_k80,e52695,e52695v4,vnc
dy03: Feature=ai,ib,compute,24cpu,gpu,tesla_k80,e52695,e52695v2,vnc

ai: ai partition
ib: node with InfiniBand network
compute: compute node
24cpu: node with 24 cores
36cpu: node with 36 cores
40cpu: node with 40 cores
tesla_XX: node with a tesla_XX GPU
molpro: node with 1TB local disk
6248: node with Intel Xeon Gold 6248 CPU
e52695: node with Intel Xeon E5-2695 CPU
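A plain grep matches substrings anywhere on the line (grep ai also matches the CPU list row), so an exact match on the feature list can be more reliable. The sketch below assumes the column layout shown above (S:C:T is the 4th field, AVAIL_FEATURES the 8th) and prints the node groups that carry a given feature. The function name nodes_with_feature is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print node groups whose AVAIL_FEATURES contain a feature.
# Reads kuacc-nodes output on stdin; the feature name is the first argument.
nodes_with_feature() {
    # Only rows with an S:C:T value in field 4 are node rows; this skips the
    # header, the separator lines and the CPU list.
    awk -v want="$1" '$4 ~ /^[0-9]+:[0-9]+:[0-9]+$/ {
        n = split($8, feats, ",")
        for (i = 1; i <= n; i++)
            if (feats[i] == want) { print $1; break }   # exact match only
    }'
}

# Demo on rows copied from the "kuacc-nodes | grep ai" output above.
nodes_with_feature tesla_k80 <<'EOF'
ai[11-14]       mixed         40   2:20:1    503000          0 gpu:tesla_v100:8 ai,ib,compute,40cpu,gpu,tesla_v100,6248,molpro,vnc
dy02            mixed         36   2:18:1    504000          0 gpu:tesla_k80:4  ai,ib,compute,36cpu,gpu,tesla_k80,e52695,e52695v4,
dy03            mixed         24   2:12:1    450000          0 gpu:tesla_k80:8  ai,ib,compute,24cpu,gpu,tesla_k80,e52695,e52695v2,
ai01 - ai14      |model name      : Intel(R) Xeon(R) Gold 6248      @ 2.50GHz
EOF
```

The demo prints dy02 and dy03. On the cluster, the equivalent call would be kuacc-nodes | nodes_with_feature tesla_k80.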

Some users need to run their jobs on specific nodes, for example on nodes with Intel Gold 6248 CPUs. Passing the 6248 feature to the constraint flag limits the job to nodes with 6248 CPUs.
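As a sketch of the 6248 case, a job script might look like the following. The job name and task count are placeholders, and other options a real job needs (partition, time limit, and so on) are left out. Only the constraint line comes from the text above.

```shell
#!/bin/bash
#SBATCH --job-name=gold6248-demo   # hypothetical job name
#SBATCH --ntasks=1                 # placeholder resource request
#SBATCH --constraint=6248          # only nodes whose feature list contains 6248

# Report where the job landed; the name can be checked against the
# KUACC NODES CPU LIST printed by kuacc-nodes.
node_name=$(uname -n)
echo "Job running on ${node_name}"
```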

Any feature can be added to a node's list of available features.

Examples:

Using the constraint flag with the tesla_k80 feature requests only tesla_k80 nodes.

CODE

#SBATCH --constraint=tesla_k80

Some applications create too many temporary files, which causes issues on the file system. Such applications should use the node's local disk for scratch data instead. Nodes with the molpro feature have local disks over 1TB, and the tmp2 folder on these nodes can be used as a scratch folder.

Using the constraint flag with the molpro feature, with the tmp2 folder as scratch:

CODE

#SBATCH --constraint=molpro
export TMP=/tmp2

At the end of the job, the scratch data should be cleaned up.
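One way to make the cleanup automatic is to give each job its own directory under /tmp2 and remove it with an exit trap. This is a sketch under assumptions: the function name setup_scratch and the job-<id> directory naming are hypothetical, and TMPDIR is exported alongside TMP only because many tools read TMPDIR rather than TMP.

```shell
#!/bin/bash
#SBATCH --constraint=molpro

# Hypothetical helper: create a private scratch directory and remove it on exit.
# The base directory defaults to /tmp2, the scratch folder on molpro nodes.
setup_scratch() {
    base=${1:-/tmp2}
    # One directory per job; falls back to the shell PID outside Slurm.
    SCRATCH_DIR=$(mktemp -d "${base}/job-${SLURM_JOB_ID:-$$}-XXXXXX") || return 1
    export TMP="$SCRATCH_DIR"
    export TMPDIR="$SCRATCH_DIR"   # assumption: the application may read TMPDIR
    # Delete the scratch data when the script exits, including on failure.
    trap 'rm -rf "$SCRATCH_DIR"' EXIT
}
```

In a job script, setup_scratch would be called once before the application starts. Everything written under $TMP is then deleted when the script exits.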