ALFAHD HPC

ALFAHD - High-Performance Computing

The College of Petroleum Engineering and Geosciences (CPG) at King Fahd University of Petroleum and Minerals has recently acquired a High-Performance Computing (HPC) facility. The HPC has been named as Alfahd. The system has been procured, installed, benchmarked, and tested. To ensure the appropriate use of the facility, the user training was also conducted.

The CPG HPC is a collection of computers and disk arrays (3944 cores) that are connected via one of the fastest networks available in the market today (Intel OmniPath). This network can transfer data at a delivery speed of 100 Gigabits per Second. We also rolled-out a parallel file system with 350 TeraBytes (TB) of memory and 153 Tera Floating Point Operations (Tflops) of computing capacity. Also, more than 30 applications that leverage HPC architecture and used by CPG researchers have been installed, tested, and rolled-out.

With this high-end infrastructure in place, Alfahd will enable students, faculty, and researchers to run programs at a larger scale than they would be able to on a laptop or personal computer. The application benchmarking results so far indicated excellent performance.

HPC Web Portal

An HPC portal has been installed and configured on the Alfahd cluster to make it easier for the users to submit, track and manage their jobs on the HPC cluster. The HPC portal is built over SLURM job scheduler - which together provides robust cluster and workload management capabilities that are accessible using the web-based interfaces, making it powerful and simple to use. For applications requiring MPI, the robust, commercial MPI library accelerates and scales HPC applications for a shorter time to solution.

Click below to access Alfahd HPC portal and use your Alfahd HPC credentials to Login and submit your jobs.

The HPC portal includes the following:

Central Web Portal
Workload management
Workload monitoring and reporting
Integrated application templates for job submission

Cluster Details

With the development of information technology, the traditional method of performing tasks (such as simulation, data processing, and data analysis) on a single node cannot meet the requirements for computing accuracy and efficiency. That is why high-performance parallel computing appeared. High-performance computing (HPC) clusters improve work efficiency through parallel processing on multiple nodes. Requirements of this project included hardware devices for computing, storage, and networking and upper-layer software.

The Alfahd cluster has 3944 cores, out of which 3304 cores are from 118 Compute servers with Intel Xeon Broadwell EP processors/128GB memory per node, and 640 cores are from Knights Landing Intel Xeon Phi processor with 96 GB memory per node.

Master Node Details:

Number of nodes – 2 nodes.
Processors – Intel Xeon CPU E5-2680 v4 -14 Cores @ 2.40GHz (Dual-processor).
Memory – 128 GB RAM.
Brand – Huawei Fusion server RH2288 2U rack servers

Management Node Details:

Number of nodes – 2 nodes.
Processors – Intel Xeon CPU E5-2680 v4 -14 Cores @ 2.40GHz (Dual-processor).
Memory – 64 GB RAM.
Brand – Huawei Fusion server RH2288 2U rack servers

Compute Nodes Details:

Number of nodes – 118 nodes.
Processors – Intel Xeon CPU E5-2680 v4 – 14 Cores @ 2.40GHz (Dual-processor).
Memory – 128 GB RAM.
Total no. of cores – 3304.
Brand – Huawei Fusion server RH1288v3 1U rack servers

Compute Xeon-Phi Nodes Details:

Number of nodes – 10 nodes.
Processors – Intel Xeon Phi CPU 7230 - 64 Cores @ 1.30GHz.
Memory – 96 GB.
Total no. of cores – 640.
Brand – Dell PowerEdge C6320p

Network Communication Details:

This cluster has four networks:

Inter-node Communication: 100Gbps Intel Omni Path Fabric, 2:1 blocking.
Cluster Administration: 10GbE Ethernet Interconnect.
Remote Management: 100/1000 Mbps Ethernet Interconnect.
Lustre Storage network: 16 Gbit/s Fiber Channel (FC) network

Storage Details:

This cluster has four Huawei RH2288 rack servers, and (2) Huawei OceanStor 5800V3 Storage, to support Intel Enterprise Lustre Parallel File Systems that provides 320TB usable storage with 10 Gbps throughput.

Operating System Details: