# Usage Rates
Picotte has a "tiered" pricing structure, with two tiers of usage:
- A free tier, which, as the name suggests, does not cost money, but has restrictions on usage and lower priority scheduling
- A paid tier, which has costs for compute and storage, but removes the free tier restrictions and has higher priority scheduling
The goal of this structure is to enable anyone at Drexel interested in using HPC in their research to try Picotte for free, while helping offset costs by having groups with more computational needs pay for their usage.
The tiers are implemented in SLURM, the Picotte job scheduler, using the SLURM concepts of "accounts" and "partitions".
Your research group will typically have two accounts[^1]. For example, if the PI's name is Sara Zhang, these would be `zhangprj` and `zhangfreeprj`. Use `--account=zhangfreeprj` to submit free tier jobs and `--account=zhangprj` for paid tier jobs.
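As a sketch, submitting the same batch script to each tier would look like this (the `zhangprj`/`zhangfreeprj` accounts are the example names above; `myjob.sh` is a placeholder for your own batch script):

```shell
# Submit to the free tier: free-tier account + free-tier partition.
sbatch --account=zhangfreeprj --partition=def-sm myjob.sh

# Submit to the paid tier: paid account + paid partition.
sbatch --account=zhangprj --partition=def myjob.sh
```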
"Partitions" are what SLURM calls groups of worker nodes with the same hardware and resource limits. When you submit a job, you submit it to a particular partition depening on your needs. The paid partitions on Picotte are:
| Name | Description |
|---|---|
| `def` | Standard compute nodes. "def" is short for "default". If you're unsure, use this one. |
| `bm` | "Big memory" nodes. Use for jobs that need a lot of RAM. |
| `gpu` | GPU nodes. Use for jobs that need GPUs, like AI, machine learning, or shaders. |
| `long` | Same as `def`, but with longer runtime limits. Use for CPU jobs that you expect to run for many days. |
| `gpulong` | Same as `gpu`, but with longer runtime limits. Use for GPU jobs that you expect to run for many days. |
The free tier has two partitions, `def-sm` and `gpu-sm`, which are the same as the paid `def` and `gpu` partitions, but with tighter resource limits and lower priority scheduling. The free tier does not have access to the big memory or long partitions.
## Free tier
The free tier, as the name suggests, has no associated costs. You can use it indefinitely without paying anything.
Free tier jobs have the following limits:
| Partition | Maximum Nodes | Maximum Cores per Node | Maximum GPUs | Maximum Runtime |
|---|---|---|---|---|
| `def-sm` | 4 | 24 | N/A | 12 hours |
| `gpu-sm` | 1 | 12 | 1 | 4 hours |
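For instance, a submission that stays within the `gpu-sm` limits above might look like this (the account name follows the earlier `zhangfreeprj` example; `train.sh` is a placeholder script):

```shell
# Free-tier GPU job: 1 node, 12 cores, 1 GPU, 4 hours -- all at the gpu-sm limits.
sbatch --account=zhangfreeprj \
       --partition=gpu-sm \
       --nodes=1 --ntasks=1 --cpus-per-task=12 \
       --gres=gpu:1 \
       --time=04:00:00 \
       train.sh
```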
Free tier jobs are also scheduled with lower priority than paid jobs. Effectively, this means paid jobs get to "cut in line" in front of free jobs. At times of high usage, free tier jobs may wait in the queue a long time before starting.
## Paid tier
To access the Picotte paid tier, you (or your PI, if you're a student) need to provide what's called a "fund-org" or "cost center". This is a ten-digit code that allows us to bill your group using the University's accounting system[^2]. URCF staff will add it to the Picotte backend and then grant you access to the paid tier partitions.
### Costs
Picotte usage is billed in "service units" (SUs). All compute and storage utilization is measured in SUs and then charged to your cost center at a rate of $0.01 per SU.
Compute costs are as follows:
| Resource type | Partition | SUs |
|---|---|---|
| CPU compute | `def`, `long` | 1 per core-hour |
| Big memory | `bm` | 68 per TiB-hour |
| GPU | `gpu`, `gpulong` | 43 per GPU device-hour |
Resources not shown in this table don't contribute to costs. For example, jobs on `def` cost the same regardless of how much memory they use; the only thing that matters for billing is CPU core-hours. Jobs on `gpu` cost the same regardless of how many CPU cores they use; the only thing that matters for billing is GPU device-hours.
Storage costs 1000 SUs (= $10) per TiB-month. The first 500 GiB of storage for each group is free.
Examples:
- A `gpu` job that runs for 1 hour using 4 GPUs costs: 1 * 4 * 43 = 172 SUs = $1.72
- A `def` job that runs for 20 hours on 2 CPU cores costs: 20 * 2 * 1 = 40 SUs = $0.40
- A `bm` job that runs for 10 hours using 500 GiB of memory costs: 10 * 0.5 * 68 = 340 SUs = $3.40
- Storing 5 TiB of data for one month costs: (5 - 0.5) * 1000 = 4500 SUs = $45 (subtract 0.5 because the first 500 GiB is free)
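The arithmetic in these examples can be re-derived with plain shell arithmetic (1 SU = $0.01; to stay in integers, the fractional TiB quantities are expressed in GiB, treating 1 TiB as 1000 GiB as the rates above loosely do):

```shell
# Re-deriving the worked examples above, in SUs.
gpu_su=$(( 1 * 4 * 43 ))                      # gpu: 1 hour x 4 GPUs x 43 SU/GPU-hour
def_su=$(( 20 * 2 * 1 ))                      # def: 20 hours x 2 cores x 1 SU/core-hour
bm_su=$(( 10 * 68 * 500 / 1000 ))             # bm: 10 hours x 0.5 TiB x 68 SU/TiB-hour
storage_su=$(( (5000 - 500) * 1000 / 1000 ))  # (5 TiB - 0.5 TiB free) x 1000 SU/TiB-month
echo "$gpu_su $def_su $bm_su $storage_su"     # 172 40 340 4500
```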
**Note:** All resource usage above is billed based on the resources *reserved* by a job, over its *actual* (not requested) runtime. This is because reserved resources are made unavailable to others whether or not the job uses them. Examples:
- A job requests 4 GPU devices for 1 hour, but runs only on one GPU device for 1 hour. While the actual usage is 1 GPU-hour, the resources allocated to the job are 4 GPU-hours. The billable amount is 4 GPU-hours = 172 SU. This is because those resources are made unavailable to others.
- A job requests 1 GPU device for 4 hours, but completes in 1 hour. The actual usage is 1 GPU-hour. The charge is for 1 GPU-hour = 43 SU.
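Both cases reduce to the same formula, reserved GPUs x actual hours x 43 SU, sketched here in shell arithmetic:

```shell
# Charge = reserved resources x actual runtime, at 43 SU per GPU-hour.
over_reserved=$(( 4 * 1 * 43 ))   # reserved 4 GPUs, ran 1 hour (used only 1 GPU)
early_finish=$(( 1 * 1 * 43 ))    # reserved 1 GPU for 4 hours, finished in 1 hour
echo "$over_reserved $early_finish"   # 172 43
```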
### Limits
While the paid tier removes the tight resource limits imposed on free tier jobs, it still has some limits to prevent runaway usage. These are as follows:
| Partition | Maximum Nodes | Maximum Cores per Node | Maximum GPUs | Maximum Runtime |
|---|---|---|---|---|
| `def` | 24 | Unlimited | N/A | 48 hours |
| `gpu` | 8 | Unlimited | Unlimited | 36 hours |
| `bm` | 1 | Unlimited | N/A | 21 days |
| `long` | 18 | Unlimited | N/A | 8 days |
| `gpulong` | 2 | Unlimited | Unlimited | 8 days |
Some queues also have minimum resource requirements. For `gpu` and `gpulong`, you must request at least one GPU (`--gres=gpu:1`) or your job will not run. For `bm`, you must request at least 200G of memory (`--mem=200G`) or your job will not run. This is to prevent jobs that could run on the default compute nodes from accidentally wasting the specialized resources of the big memory or GPU nodes.
[^1]: These are SLURM "bank accounts", not the user accounts your group members use to log in to Picotte. The naming is confusing: an "account" in this sense is an abstraction SLURM uses to track where a job's usage should be billed. Accounts are sometimes also called "projects". Even if your group has many users, they'll typically all submit jobs under one or two accounts.
[^2]: Fund-orgs consist of a six-digit fund code and a four-digit org code, separated by a dash, e.g. 123456-7890.