
Picotte Hardware and Software

Picotte is the new URCF HPC cluster.

Nodes

  • Management node - picottemgmt
  • Login node - picottelogin
  • Standard compute nodes (74 nodes)
  • Big memory compute nodes (2 nodes)
  • GPU compute nodes (12 nodes)

Storage

Parallel Scratch Storage

  • BeeGFS on 4 Dell servers
    • Total usable volume: 175 TB
    • Connected to cluster via 100 Gbps HDR InfiniBand

Persistent Storage

  • Isilon scale-out storage
    • Total usable volume: 649 TB
      • 7.2 TB SSD caching
    • Connected to cluster using NFS via 6x 10 Gbps Ethernet
    • Connected to campus using SMB (Windows file sharing) via 2x 10 Gbps Ethernet

Local Scratch Storage

  • 960 GB or 1920 GB SSD
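Node-local SSD scratch is typically used for I/O-heavy temporary files during a single job, since it is faster than network storage but disappears when the job ends. A minimal sketch of that staging pattern, assuming the scheduler exports the local-scratch mount point in the `TMPDIR` environment variable (the variable name and the `/tmp` fallback are assumptions for illustration, not Picotte specifics):

```python
# Sketch: stage temporary files on node-local scratch inside a batch job.
# Assumption: the scheduler sets $TMPDIR to the local SSD scratch mount;
# /tmp is only an illustrative fallback -- check the cluster docs.
import os
import shutil

scratch = os.path.join(os.environ.get("TMPDIR", "/tmp"), "picotte-demo")
os.makedirs(scratch, exist_ok=True)

# Stand-in for copying real input data to fast local storage.
with open(os.path.join(scratch, "input.dat"), "w") as f:
    f.write("demo input\n")

# ... run the I/O-heavy work against files under `scratch` here ...
print(sorted(os.listdir(scratch)))

# Clean up before the job exits; local scratch is not persistent anyway.
shutil.rmtree(scratch)
```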

Network Fabrics

High Performance Cluster Network

  • Mellanox HDR InfiniBand @ 100 Gbps, latency < 0.2 μs

General Purpose Cluster Network

  • 10 Gbps Ethernet

Software

Theoretical Peak Performance

The theoretical peak performance is the sum of the theoretical peak performance of every individual CPU socket and GPU device; it does not account for any effects that occur during actual computation. The login and management nodes are excluded. Performance is measured in floating-point operations per second (FLOPS).

Theoretical performance of individual CPUs or GPUs:

  • Standard nodes ("def" partition) - Intel Xeon Platinum 8268[1]: 1459.2 GFLOPS (152 sockets total)
  • Big memory nodes ("bm" partition) - Intel Xeon Platinum 8260[2]: 1152.0 GFLOPS (24 sockets total)
  • GPU devices ("gpu" partition) - Nvidia Tesla V100 for NVLink[3]: 15,700 GFLOPS (48 devices total)

Total theoretical peak performance: 1.0 PFLOPS (1,003,046.4 GFLOPS)
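As a quick sanity check, the total can be reproduced from the per-unit figures and socket/device counts listed above:

```python
# Recompute the total theoretical peak from the per-unit figures above.
# (GFLOPS per socket/device, number of sockets/devices)
peaks_gflops = {
    "standard (Xeon Platinum 8268)": (1459.2, 152),
    "big memory (Xeon Platinum 8260)": (1152.0, 24),
    "GPU (Tesla V100 for NVLink)": (15_700.0, 48),
}
total = sum(per_unit * count for per_unit, count in peaks_gflops.values())
print(f"{total:,.1f} GFLOPS")  # -> 1,003,046.4 GFLOPS, i.e. ~1.0 PFLOPS
```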

Benchmark Results

  • Benchmarked by Dell using HPL (High-Performance Linpack), with a CUDA-enabled HPL build for the GPU nodes
    • All standard nodes: 145.88 TFLOPS
    • All big memory nodes: 5.54 TFLOPS
    • All GPU nodes: 251.00 TFLOPS
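One derived figure that may be useful is the measured HPL result as a fraction of the theoretical peak for each partition. The ratios below are computed from the numbers in this section, not taken from Dell's report:

```python
# HPL efficiency per partition: measured TFLOPS / theoretical peak TFLOPS.
# Theoretical peaks derived from the per-socket/per-device figures above.
theoretical_tflops = {
    "standard": 1459.2 * 152 / 1000,   # ~221.8 TFLOPS
    "big memory": 1152.0 * 24 / 1000,  # ~27.6 TFLOPS
    "GPU": 15_700.0 * 48 / 1000,       # ~753.6 TFLOPS
}
measured_tflops = {"standard": 145.88, "big memory": 5.54, "GPU": 251.00}

for partition, peak in theoretical_tflops.items():
    print(f"{partition}: {measured_tflops[partition] / peak:.1%} of peak")
```

This works out to roughly 65.8% of peak for the standard nodes, 20.0% for the big memory nodes, and 33.3% for the GPU nodes; HPL results well below peak are normal, since the theoretical figures ignore memory, interconnect, and power limits.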

References

[1] Export Compliance Metrics for Intel Xeon Processors (PDF)

[2]

[3] Nvidia V100 Tensor Core GPU