CPU and Memory Usage

After a job is submitted, the user can use ssh to connect to the compute node on which the job is running.

CODE

ssh username@compute_node

After connecting to the compute node, use the following commands to check the memory and CPU usage of the submitted job.

ps: lists processes. The -u option restricts the list to one user, and -o selects the columns to print.

CODE

ps -u username -o %cpu,rss,args

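This prints a snapshot each time you run the command. Memory usage (rss, the resident set size) is in kilobytes. Note that on Linux the %cpu column is the CPU time used divided by the time the process has been running, so it is an average over the process lifetime rather than a momentary reading.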
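
Because ps reports rss per process in kilobytes, a job that runs several processes needs a sum to show its total memory footprint. The following is a minimal sketch, assuming awk is available on the compute node; sum_rss_mib is a hypothetical helper name, not part of ps.

```shell
# Hypothetical helper: add up rss values (kilobytes, one per line on stdin)
# and print the total in MiB with one decimal place.
sum_rss_mib() {
    awk '{ total += $1 } END { printf "%.1f\n", total / 1024 }'
}

# Example for the compute node (commented out; replace username).
# "rss=" prints the column without a header line.
# ps -u username -o rss= | sum_rss_mib
```

Shared pages are counted once per process in rss, so the sum can overstate the real footprint of a job whose processes share memory.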

top/htop: run interactively and show live usage statistics.

CODE

htop -u username

Note: for more on the top and htop commands, see:

https://gridpane.com/kb/how-to-use-the-top-command-to-monitor-system-processes-and-resource-usage/

https://gridpane.com/kb/how-to-use-the-htop-command-to-monitor-system-processes-and-resource-usage/
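
top and htop are interactive, so they leave no record to review after the job ends. One alternative is to log repeated ps snapshots. The sketch below assumes awk is available; summarize_sample is a hypothetical helper name, and the 5-second interval and the usage.log file name are arbitrary choices.

```shell
# Hypothetical helper: collapse one ps snapshot into a single log line.
# Reads "%cpu rss" pairs on stdin; $1 is a label such as a timestamp.
summarize_sample() {
    awk -v label="$1" '
        { cpu += $1; rss += $2 }
        END { printf "%s %.1f %d\n", label, cpu, rss }
    '
}

# Example loop for the compute node (commented out; replace username):
# while sleep 5; do
#     ps -u username -o %cpu= -o rss= | summarize_sample "$(date +%T)"
# done >> usage.log
```

Each log line holds the label, the summed %cpu, and the summed rss in kilobytes.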

nvidia-smi: shows GPU parameters and GPU memory usage.

CODE

[root@it04 ~]# nvidia-smi
Wed Oct 27 09:14:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   37C    P0    59W / 250W |    607MiB / 32510MiB |     24%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    390761      C   namd2                             603MiB |
+-----------------------------------------------------------------------------+

In this output, 607MiB / 32510MiB is the GPU memory in use out of the total available, and 24% is the GPU utilization. Consider whether your job really needs this much GPU capacity.
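
To put the memory figure in proportion, nvidia-smi can print selected fields as CSV through its --query-gpu and --format options, which is easier to post-process than the table. The following is a minimal sketch, assuming awk is available; gpu_mem_percent is a hypothetical helper name. For the values shown above (607 of 32510 MiB) it prints 1.9.

```shell
# Hypothetical helper: read "used, total" pairs in MiB (one GPU per line)
# and print the percentage of GPU memory in use for each.
gpu_mem_percent() {
    awk -F', *' '$2 > 0 { printf "%.1f\n", 100 * $1 / $2 }'
}

# Example for a GPU node (commented out):
# nvidia-smi --query-gpu=memory.used,memory.total \
#            --format=csv,noheader,nounits | gpu_mem_percent
```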