squeue & kuacc-queue & valar-queue

After submitting jobs on KUACC or VALAR, users should monitor them to check their status and resource usage.

squeue: Displays information about jobs currently in the Slurm scheduling queue (pending, running, held, etc.).

CODE
squeue -u username

Shows only your jobs.
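
The output can also be tailored with squeue's -o format option. A minimal sketch (the format string and username are illustrative; %i is the job ID, %P the partition, %j the name, %t the compact state, %M the elapsed time, and %R the node list or pending reason):

CODE
# Custom columns: job ID, partition, name, state, elapsed time, nodes/reason
squeue -u username -o "%.10i %.9P %.20j %.2t %.10M %R"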

kuacc-queue / valar-queue: Cluster-specific helper scripts that provide a simplified or customized view of the job queue (often with extra formatting or filtering for those systems).

CODE
kuacc-queue | grep username 
CODE
valar-queue | grep username 

or

CODE
kuacc-queue | more 
CODE
valar-queue | more 
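
To watch the queue continuously instead of re-running the command, the standard watch utility (available on most Linux systems) can wrap either script; a sketch:

CODE
# Refresh the filtered view every 30 seconds (press Ctrl-C to exit)
watch -n 30 'kuacc-queue | grep username'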

Below is an example of the valar-queue command output, along with explanations of each column.

[Image: example output of the valar-queue command]

JOBID: Unique identifier assigned to the job by Slurm.

PARTITION: Partition (queue) where the job was submitted, e.g., short, long, gpu.

NAME: Job name (set with #SBATCH -J; if not provided, the script name is used).

USER: Username of the job owner.

TIME_LEFT: Remaining walltime before the job reaches its limit.

TIME_LIMIT: Maximum walltime allocated to the job (from #SBATCH -t).

START_TIME: Actual or scheduled start time of the job.

ST (STATE/STATUS): Current state of the job. Common values include: R = Running, PD = Pending, CG = Completing, CD = Completed, CA = Cancelled, F = Failed, TO = Timeout, NF = Node Fail, OOM = Out of Memory, S = Suspended.

NODES: Number of nodes allocated to the job.

CPUS: Total number of CPU cores allocated to the job.

NODELIST(REASON): If running, shows the node(s) allocated. If pending, shows the reason (e.g., Resources, Priority, Dependency).
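
To list only jobs in a given state, squeue's -t flag filters on the ST column above; a sketch using a placeholder username:

CODE
# Show only pending jobs, so NODELIST(REASON) explains the wait
squeue -u username -t PD

# Ask Slurm for the estimated start time of pending jobs
squeue -u username --start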

NODELIST(REASON) Column: Meaning of Common Reasons

InvalidQOS: The job's Quality of Service (QOS) is invalid.

Priority: One or more higher-priority jobs are ahead in the queue; the job will run eventually.

Resources: The job is waiting for required resources (CPU, memory, nodes) to become available.

PartitionNodeLimit: The job's requested nodes exceed the partition's current limits, or required nodes are DOWN/DRAINED.

PartitionTimeLimit: The job's requested runtime exceeds the partition's time limit.

QOSJobLimit: The maximum number of jobs allowed for this QOS has been reached.

QOSResourceLimit: The QOS has reached a resource allocation limit.

QOSTimeLimit: The QOS has reached its maximum allowed time.

QOSMaxCpuPerUserLimit: The maximum number of CPUs per user for this QOS has been reached; the job will run eventually.

QOSGrpMaxJobsLimit: The maximum number of jobs for the QOS group has been reached; the job will run eventually.

QOSGrpCpuLimit: All CPUs allocated to the job's QOS group are in use; the job will run eventually.

QOSGrpNodeLimit: All nodes allocated to the job's QOS group are in use; the job will run eventually.

For a complete list of reason codes, see the Slurm squeue documentation (https://slurm.schedmd.com/squeue.html). In practice, the most commonly encountered reasons are Priority, Resources, and QOSMaxCpuPerUserLimit.
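
To see why one specific job is pending, scontrol prints the full job record, including a Reason field; a sketch with a placeholder job ID:

CODE
# Print the full record for job 12345; the JobState line includes Reason=...
scontrol show job 12345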
