Beowulf cluster
Multi-computer architecture which can be used for parallel computations.
It is usually built from commodity, non-custom hardware and software components, and is trivially reproducible: any PC capable of running a Unix-like operating system will do, together with standard Ethernet adapters and switches.
The cluster usually consists of one server node, and one or more client nodes connected via some kind of network.
The server controls the whole cluster, and provides files to the clients. It is also the cluster's console and gateway to the outside world. Large Beowulf machines might have more than one server node, and possibly other nodes dedicated to particular tasks like consoles or monitoring stations.
In most cases, client nodes in a Beowulf system are dumb, and the dumber the better. Clients are configured and controlled by the server, and do only what they are told to do.
Beowulf clusters behave more like a single machine than like many workstations: each node can be thought of as a CPU and memory package which is plugged into the cluster, much like a CPU or memory module can be plugged into a motherboard.
Beowulf is no more than a technique for clustering computers to form a parallel, virtual supercomputer. One can build a Beowulf-class machine using a standard Linux distribution without any additional software; two networked computers that share a folder via NFS and trust each other to execute remote shells can be considered a two-node Beowulf machine.
Scheduler
Takes care of scheduling the jobs and juggling the resources in the cluster.
The most used one at the time of writing is Slurm.
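As an illustration of how a job is handed to the scheduler, here is a minimal sketch of a Slurm batch script. The job name, node count, time limit and output file name are assumptions chosen for the example, not values from any particular cluster. When `srun` is not available, the script falls back to running the command locally, so it can also be tried outside a cluster.

```shell
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=node-names
# Assumes the cluster has at least 2 compute nodes, with one task on each.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
# Wall-clock limit of one minute.
#SBATCH --time=00:01:00
# %j expands to the job ID.
#SBATCH --output=node-names-%j.out

# Inside a Slurm allocation, srun starts the command once per allocated task.
# Without srun, run it once locally instead.
if command -v srun >/dev/null 2>&1; then
  NODE_NAMES=$(srun uname -n)
else
  NODE_NAMES=$(uname -n)
fi

# Print the name of every node that ran a task.
echo "$NODE_NAMES"
```

Saved under a hypothetical name such as `node-names.sh`, the script would be submitted with `sbatch node-names.sh`, and the queue inspected with `squeue`.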
Create a quick and dirty cluster
Uses MPICH on Linux.
Just follow this procedure:
- prepare at least 2 Linux hosts
- assign a fixed, known IP address to all the hosts
- create a file on the server node listing the IP addresses of all the client nodes (e.g. machines_file)
- install, enable and start SSH on all the hosts
- configure SSH on all the hosts to let the server node connect to all the client nodes without using a password
- install MPICH on all the hosts, preferably the same version
- test the installation:
    # execute `hostname` on all hosts
    mpiexec -f 'machines_file' -n 'number_of_processes' 'hostname'
See the Vagrant example.
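The procedure above can be sketched as a small shell script, meant to be run from the server node. This is a minimal sketch under assumptions rather than a definitive setup: the client addresses, the `mpi` user name and the key path are hypothetical placeholders, and the script defaults to a dry run that only prints the SSH and MPICH commands instead of executing them. Installing the packages is left out because it depends on the distribution.

```shell
#!/bin/sh
set -e

# Hypothetical values: replace with the real client addresses, user and key path.
CLIENT_NODES="${CLIENT_NODES:-192.168.56.11 192.168.56.12}"
CLUSTER_USER="${CLUSTER_USER:-mpi}"
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519}"
MACHINES_FILE="${MACHINES_FILE:-machines_file}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# List the client nodes, one address per line.
: > "$MACHINES_FILE"
for node in $CLIENT_NODES; do
  printf '%s\n' "$node" >> "$MACHINES_FILE"
done

# Create a key pair with an empty passphrase once, then authorize it on every
# client node (ssh-copy-id asks for the remote password this one time).
[ -f "$SSH_KEY" ] || run ssh-keygen -t ed25519 -N '' -f "$SSH_KEY"
for node in $CLIENT_NODES; do
  run ssh-copy-id -i "$SSH_KEY.pub" "$CLUSTER_USER@$node"
done

# Start one process per listed node and print each node's hostname.
NUMBER_OF_PROCESSES=$(wc -l < "$MACHINES_FILE" | tr -d ' ')
run mpiexec -f "$MACHINES_FILE" -n "$NUMBER_OF_PROCESSES" hostname
```

Running it as is prints the planned commands and writes only the machines file; running it with `DRY_RUN=0` executes them for real.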
Further readings
- Protogonus: The FINAL Labs™ HPC Cluster
- A simple Beowulf cluster
- Building a Beowulf cluster from old MacBooks:
- Engineering a Beowulf-style compute cluster
- Parallel and distributed computing with Raspberry Pi clusters
- Sequence analysis on a 216-processor Beowulf cluster
- Setting up an MPICH2 cluster in Ubuntu
- The Beowulf howto
- BOINC
- Folding@Home