High Performance Computing

High Performance Computing (HPC).

HPC Types

HPC types:

Centralized (monolitic, mainframes)
Clustering
Grid computing

A supercomputer is a system focused on high speed processing, and are very specialized.

The most common solution in the 2020’s is clustering, as centralized is considered too old and grid computing too complex.

There is another architecture that is called Caches Only Memory Architecture (COMA).

The Amdahl Law analyzes what would be the speedup according to a load depending on the paralellizable parts and the number of processors available.

When we apply a network to MIMD, we get cluster computing and grid computing.

Mainframe

Mainframe is focused on many in and out operations.

IBM is the most popular manufacturer of mainframes. The scripting language for IBM mainframes is Job Control Language (JCL).

As of 2024, there are different organizations within the Spain public sector that use mainframes, such as Agencia Tributaria, Seguridad Social or the Ministry of Defense.

Mainframe OSs

z/OS is a popular mainframe OS developed by IBM.

IBM Customer Information Center System (CICS) is a family of mixed-language application servers that provide online transaction management and connectivity for applications on IBM mainframe systems under z/OS and z/VSE.

Mainframe Scripting Languages

Scripting languages are used to perform operations on mainframes.

Job Control Languages (JCL) is the scripting language for IBM mainframes.

To write a comment in JCL, it must start by //*.

Mainframe Programming Languages

Traditional programming languages for mainframe programs are COBOL, Rexx (Restructured Extended Executor) and Programming Language One (PL/I).

Most modern mainframes may use Java and C#.

Clustering

Cluster computing focuses on paralelism over distribution, and have a local scope.

Cluster has a middleware to control processing.

A Beowulf cluster is a computer cluster of what are normally identical, commodity-grade computers networked into a small local area network with libraries and programs installed which allow processing to be shared among them. The result is a high-performance parallel computing cluster from inexpensive personal computer hardware.

Components in a cluster system:

Computing nodes
High performance interconnection network
Cluster middleware / SSI

Cluster middleware is a specialized form of middleware that focuses on managing clustered environments, ensuring seamless operation among multiple machines.

It allows that all machines are interfaced in the same way, even if they are different.

Single System Image (SSI) gives the impression that the cluster system is a single computer.

Cluster middleware service layers:

SSI Hardware layer
SSI at the OS kernel llevel
SSI Application level

SSI hardware layer provides hardware-level abstractions for resource sharing. Examples: Virtualization of memory or storage across nodes.

Another example could be a hardware load balancer, with the possibilities of A+P (active+passive) or A+A (active+active).

SSI at the Operating System Kernel Level. Enables process migration, distributed scheduling, and system-wide resource management. Examples: Cluster-aware Linux kernel modifications.

Example: Scyld Beowulf, Solaris MC, Windows Server 2003.

SSI at the Application Level. Provides application-specific clustering, such as distributed databases or parallel processing frameworks. Examples: MPI (Message Passing Interface) for distributed computing.

MOre examples: Linux Virtual Server, Kimberlite, Open Solaris, Microsoft Cluster Server (Advanced Server o Databacenter Server).

It is important that the devices used in clusters must use open technologies (including hardware and software), interoperable and using standards.

Possible uses:

Databases
Analytics
AI

It is important to consider that the applications used in clustering are aimed for parallel computing.

The devices used in clustering should be dedicated for clustering, and shold try to follow the principle of symmetric archicture.

The cluster uses SAN networks for communications. You can read this post about SAN.

A successful case of cluster implementation is Google Web Server. It was originally written in Java, and it is now written in Go.

Cluster are usually synchronous.

Globus Toolkit 4 is an standard proposed Global Grid Forum (GGF).

The processors used in grid are Intel Xeon and Tanium Core.

Types of clusters

There are these types of clustering:

High-Performance Computing (HPC) Clusters:

These clusters are designed to deliver high computational power by combining the resources of multiple servers. They are often used for tasks that require significant processing power, such as scientific simulations, data analysis, and complex calculations.

High-Availability (HA) Clusters:

High-availability clusters are designed to ensure that services remain operational with minimal downtime. They achieve this by using redundant hardware and software to automatically failover to backup systems in case of a failure. These clusters are critical for applications where uptime is essential, such as in financial systems or e-commerce platforms.

Load-Balancing Clusters:

Load-balancing clusters distribute incoming network traffic across multiple servers to ensure no single server is overwhelmed. This improves the responsiveness and availability of applications. Load balancers can be hardware-based or software-based and are commonly used in web servers, databases, and other services that handle a large number of requests.

Storage Clusters:

These clusters are focused on providing high availability and scalability for data storage. They often use distributed file systems to ensure data is replicated across multiple nodes, providing redundancy and fault tolerance.

Database Clusters:

Database clusters are specialized clusters designed to manage and distribute database operations across multiple servers. They can provide high availability, scalability, and improved performance for database-intensive applications.

Failover Clusters:

Similar to high-availability clusters, failover clusters are designed to provide redundancy and ensure continuous operation. In the event of a failure, the workload is automatically transferred to a standby server or node

Grid Computing

Grid computing focuses on distribution over paralelism, and have a global scope. It has a more disperse nature than clusters, and they are frequently global.

Grid computing is distributed and parallelized, in this order.

It requires modularization of tasks.

Cloud computing would be a specific case of grid computing.

It is asymmetric, in contract with clustering.

It uses an IG engine.

The grid engine is logical, instead of physical clustering.

The decentralized management is used in grid computer, in contrast to the centralized management in clustering.

It is organized in a master-slave. The results of processing are sent to the master nodes.

Is asynchronous, in contrast to clustering, that is synchronous.

The process and the data can be distributed.

An open standard for grid computer is Open Grid Service Architecture (OGSA).

Flynn’s taxonomy

You can read a post about Flynn’s taxonomy.