
Job Scheduling

Job scheduling is a complex problem. Periodically, the scheduler will attempt to assign the highest priority jobs in the pending list to the most appropriate available resources. The process of scheduling a job has two distinct stages:

  1. Job Selection - every job in the pending job list is assigned a priority (a scalar value), and the entire list is sorted in order of priority, highest priority first.
  2. Job Scheduling - this is where a job is assigned to a set of free resources. The system attempts to find suitable resources for the jobs in priority sequence.
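The two stages above can be illustrated with a minimal Python sketch. This is not Slurm's implementation: the `Job` class, the `cores` field, and the single pool of free cores are assumptions made for illustration, and a real scheduler also weighs nodes, memory, time limits, and partitions.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: float  # scalar priority assigned during job selection
    cores: int       # hypothetical resource request, for illustration only


def schedule_pass(pending, free_cores):
    """Run one simplified scheduling pass.

    Returns (started, waiting) as lists of job names.
    """
    # Stage 1 (job selection): sort the pending list, highest priority first.
    ordered = sorted(pending, key=lambda job: job.priority, reverse=True)

    # Stage 2 (job scheduling): walk the list in priority order and start
    # each job whose request fits in the resources that are still free.
    started, waiting = [], []
    for job in ordered:
        if job.cores <= free_cores:
            free_cores -= job.cores
            started.append(job.name)
        else:
            waiting.append(job.name)
    return started, waiting


pending = [Job("a", 10, 4), Job("b", 50, 8), Job("c", 30, 6)]
started, waiting = schedule_pass(pending, free_cores=12)
```

With 12 free cores, job `b` (highest priority) starts first, job `c` does not fit in the remaining 4 cores and keeps waiting, and job `a` fits and starts. A lower-priority job can therefore start before a higher-priority one when only the smaller request fits.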

Fair Tree in Slurm

The Slurm job engine uses the Fair Tree algorithm to implement fairshare job scheduling, ensuring equitable resource allocation among users and accounts. The Fair Tree algorithm prioritizes users based on their fairshare factors, which are calculated by considering the shares and usage of each user and their siblings within the association tree. This hierarchical approach ensures that if one account has a higher fairshare factor than another, all users within the higher-priority account will have higher fairshare factors than those in the lower-priority account.

The algorithm works by creating a rooted plane tree, logically sorting it by fairshare values, and then traversing it in a depth-first manner. This method allows Slurm to rank users in descending order of priority, ensuring that users from higher-priority accounts receive more favorable scheduling. The fairshare value for each user is determined by their rank divided by the total number of user associations, with the highest-ranked user receiving a fairshare value of 1.0. This system helps prevent precision loss and ensures that account coordinators cannot inadvertently harm the priority of their users relative to others.
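The sort-and-traverse idea can be sketched in a few lines of Python. This is a simplified illustration, not Slurm's code: the `Assoc` class and function names are hypothetical, the per-level value is assumed to be each sibling's normalized shares divided by its normalized usage, account usage is supplied by hand, and ties between siblings are not handled specially.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Assoc:
    """Hypothetical association node: an account if it has children, else a user."""
    name: str
    shares: float
    usage: float = 0.0
    children: list = field(default_factory=list)


def level_fairshare(node, siblings):
    # Assumed per-level metric: share of sibling shares / share of sibling usage.
    total_shares = sum(s.shares for s in siblings)
    total_usage = sum(s.usage for s in siblings)
    s = node.shares / total_shares if total_shares else 0.0
    u = node.usage / total_usage if total_usage else 0.0
    return math.inf if u == 0 else s / u


def fair_tree(root):
    """Return {user name: fairshare value} using rank / number of users."""
    ranked = []

    def visit(account):
        # Sort siblings by level fairshare, best first, then go depth-first,
        # so every user in a better-ranked account precedes users in a worse one.
        ordered = sorted(
            account.children,
            key=lambda c: level_fairshare(c, account.children),
            reverse=True,
        )
        for child in ordered:
            if child.children:
                visit(child)
            else:
                ranked.append(child.name)

    visit(root)
    total = len(ranked)
    # The first user visited gets rank == total, i.e. a fairshare value of 1.0.
    return {name: (total - i) / total for i, name in enumerate(ranked)}


root = Assoc("root", 1, 400, [
    Assoc("physics", 1, 100, [Assoc("alice", 1, 80), Assoc("bob", 1, 20)]),
    Assoc("bio", 1, 300, [Assoc("carol", 1, 300)]),
])
fairshare = fair_tree(root)
```

In this example the `physics` account has equal shares but lower usage than `bio`, so both of its users rank above `carol`. Within `physics`, `bob` has used less than `alice`, so `bob` receives 1.0, `alice` 2/3, and `carol` 1/3.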

Override Policy

More than one scheduling policy may be in effect at any time. When an external deadline or crunch period is known in advance, URCF staff may reserve some resources to ensure that the affected jobs complete on time. In such a case, a Resource Override may be put in place. Please contact URCF Support if you require increased priority to meet a deadline.