HPC: high performance computing. Implies a higher percentage of CPU and memory usage than typical administrative computing. In academia, the term is used for, and implies, computing for research. Compare HTC, high throughput computing, which is similar but more oriented to processing large data files.
Node: a computer in a box, similar to a desktop computer but typically more powerful and packaged for rackmount.
Cluster: a group of nodes connected by a network. Depending on budget and the need for communication, the network may be inexpensive commodity hardware (Gigabit Ethernet) or faster and more expensive (InfiniBand).
Supercomputer: a relatively large and powerful cluster. Exact definitions vary.
Compute node: a computer dedicated to scientific computing tasks. Usually controlled by a scheduler program, and usually isolated from the public internet by head nodes.
Head node: a computer connected to the public internet and dedicated to logins, editing, moving data, and submitting jobs.
Data transfer node: a computer connected to the public internet and dedicated to moving data.
Sockets, cores, and threads: Circa 1980, one computer = one node = one CPU = one socket = one core = one thread. Since then computer design has become more complicated. Typical server computers used for HPC have two to eight sockets (usually two) on a mainboard or motherboard. Each socket holds a CPU, or central processing unit. Each CPU has 1 to 64 cores, each of which can run an independent program. Intel CPUs also have 1 or 2 hardware threads, or hyperthreads, per core, each capable of running an independent program.
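A quick way to see the product of sockets, cores per socket, and hardware threads per core is to ask the operating system how many logical processors it has. A minimal sketch using Python's standard library (the example server figures in the comment are illustrative, not a claim about any particular machine):

```python
import os

# Number of logical processors visible to the operating system:
# sockets x cores per socket x hardware threads per core.
logical_cpus = os.cpu_count()
print(f"logical CPUs visible to the OS: {logical_cpus}")

# On a hypothetical two-socket, 16-core, hyperthreaded Intel server this
# would report 2 * 16 * 2 = 64, even though there are only 32 real cores.
```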
GPU: a graphics processing unit, a specialized type of CPU derived from graphics cards. Effectively has hundreds of small cores. For certain tasks, much faster than a general-purpose CPU. Presently available versions must be attached to, and controlled by, a program running on a CPU.
Shared memory: In software, a program that runs multiple tasks or software threads, each of which sees the same memory available from the operating system and shares that memory using one of several shared-memory communication methods (OpenMP, pthreads, POSIX shm, MPI over shared memory, etc.). In hardware, usually a node or a single computer system that supports shared memory access across its memory; present-day versions are usually implemented as NUMA. Implies a limit on the memory size of the program, determined by presently available hardware.
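A minimal sketch of the shared-memory pattern, using Python's threading module rather than OpenMP or pthreads: all threads run in one address space, so they can all update a single shared counter, coordinated by a lock.

```python
import threading

counter = 0                # lives in the one address space all threads share
lock = threading.Lock()    # coordinates access to the shared data

def work(n):
    global counter
    for _ in range(n):
        with lock:         # without the lock, concurrent updates could be lost
            counter += 1

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: all four threads incremented the same variable
```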
NUMA: non-uniform memory access. Some CPUs (and their cores) have faster access to memory that is physically attached to them; memory attached to other CPUs is accessed over an internal network, causing additional latency.
Latency: delay, or the time it takes to send a minimal message over a given network. Used to characterize networks in combination with bandwidth.
Bandwidth: the amount of data that can be moved over a network per second.
Distributed memory: In software, a program or group of programs that run on multiple nodes or shared-memory instances and use libraries such as MPI to communicate between the nodes. In hardware, a cluster that runs distributed-memory programs. Distributed-memory programs are limited in memory size only by the size of the cluster that runs them.
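A minimal sketch of the distributed-memory pattern, using Python's multiprocessing module in place of MPI: each process has its own private address space, so data moves only through explicit messages (the MPI call names in the comments are for analogy only).

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # This process has its own private address space; nothing is shared.
    nums = conn.recv()        # receive a message (analogous to MPI_Recv)
    conn.send(sum(nums))      # send the partial result back (MPI_Send)
    conn.close()

def remote_sum(nums):
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send(nums)     # explicit message passing between processes
    result = parent_end.recv()
    p.join()
    return result

if __name__ == "__main__":
    print(remote_sum([1, 2, 3, 4]))  # 10
```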
Memory hierarchy: A design element used to make fast computers affordable. Memory is arranged in levels, with very small, very fast, very expensive levels close to the CPU; each succeeding level is larger and slower. Most modern computers have registers (very fast, roughly KB in total), L1 to L3 or L4 caches of MB size, and main memory of GB size (just "memory" if unspecified). The hardware and compiler automatically handle staging data from main memory through the caches and registers, unless the programmer uses assembly language to control that staging.
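The practical consequence of the cache levels is that memory access order matters: sequential accesses reuse data already staged into cache, while strided accesses do not. A rough illustration (both loops compute the same sum; in C the row-order loop is dramatically faster, while Python's object indirection mutes the effect, so treat this purely as a sketch of the access patterns):

```python
N = 500
matrix = [[1] * N for _ in range(N)]

# Row order: consecutive elements of a row are visited together,
# which is the cache-friendly traversal.
row_sum = 0
for i in range(N):
    for j in range(N):
        row_sum += matrix[i][j]

# Column order: each access jumps to a different row's storage,
# getting much less help from the cache.
col_sum = 0
for j in range(N):
    for i in range(N):
        col_sum += matrix[i][j]

print(row_sum == col_sum)  # True: same result, different memory behavior
```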