After going through the tutorial, you should be comfortable enough with OpenCHAMI to make changes to the deployment process and configuration. We can now use what we have learned to customize our nodes in various ways, such as changing how we serve images, deriving new images, and updating our cloud-init config. This section explores some of the more common use cases you may want to pursue to make OpenCHAMI fit your own needs.
Adding SLURM and MPI to the Compute Node
After getting our nodes to boot using our compute images, let's try running a test MPI job. We need to install and configure both SLURM and MPI to do so. We can do this in at least two ways:
- Create a new compute-mpi image similar to the compute-debug image, using the compute-base image as a base. You do not have to rebuild the parent images unless you want to make changes to them, but keep in mind that you will also have to rebuild any derivative images.
- Install the SLURM and MPI packages at boot time through the cloud-init config (covered below).
Building Into the Image
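The exact image-build workflow depends on how you built the compute-base image earlier in the tutorial, so treat the following as a rough sketch. Assuming the images are ordinary OCI layers stored in your local registry, a derived compute-mpi layer could be described with a Containerfile like this (the registry host, image names, and tags are placeholders):

# Containerfile for a hypothetical compute-mpi layer
# FROM points at the compute-base image in your local OCI registry (adjust host/tag)
FROM registry.local:5000/compute-base:latest

# install SLURM and OpenMPI into the image so they are present at boot
RUN dnf install -y slurm openmpi && dnf clean all

Build and push the result with your usual tooling (for example, podman build and podman push), then point the boot configuration at the new image.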
Installing via Cloud-Init
Alternatively, we can install the necessary SLURM and MPI packages in our cloud-init config and set up our nodes in the cmds section of the config file.
Let's start by making changes to the cloud-init config file in /opt/workdir/cloud-init/computes.yaml that we used previously. Note that we are using pre-built RPMs to install SLURM and OpenMPI from the Rocky 9 repos.
- name: compute
  description: "compute config"
  file:
    encoding: plain
    content: |
      ## template: jinja
      #cloud-config
      merge_how:
        - name: list
          settings: [append]
        - name: dict
          settings: [no_replace, recurse_list]
      users:
        - name: root
          ssh_authorized_keys: {{ ds.meta_data.instance_data.v1.public_keys }}
      disable_root: false
      packages:
        - slurm
        - openmpi
      cmds:
        - systemctl enable slurmctld
        - systemctl enable slurmdbd
We added the packages section to tell cloud-init to install the slurm and openmpi packages after the compute nodes boot, and the cmds section to enable the SLURM services.
Prepare SLURM on Head Node
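The details here are site-specific, but as a minimal sketch: the head node needs the SLURM controller (slurmctld) running with a slurm.conf that names the head node as the controller and lists the compute nodes. The hostnames and hardware values below are assumptions; adjust them to your cluster, and remember that the same munge key and slurm.conf must be distributed to the head node and every compute node.

# /etc/slurm/slurm.conf (minimal sketch; hostnames and hardware values are assumptions)
ClusterName=openchami-demo
# head-node hostname running slurmctld
SlurmctldHost=head
NodeName=compute[01-02] CPUs=4 State=UNKNOWN
PartitionName=compute Nodes=compute[01-02] Default=YES MaxTime=INFINITE State=UP

# on the head node
systemctl enable --now munge slurmctld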
Run a Sample MPI Job Across Two VMs
After we have installed both SLURM and OpenMPI on the compute node, let's try launching a "hello world" MPI job. To do so, we will need three things:
- Source code for MPI program
- Compiled MPI executable binary
- SLURM job script
We will write the MPI program in C. First, create a new directory to store our source code. Then, edit the /opt/workdir/apps/mpi/hello/hello.c file.
mkdir -p /opt/workdir/apps/mpi/hello
# edit /opt/workdir/apps/mpi/hello/hello.c
Now copy the contents below into the hello.c file.
/*The Parallel Hello World Program*/
#include <stdio.h>
#include <mpi.h>
int main(int argc, char **argv)
{
    int node;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);
    printf("Hello World from Node %d\n", node);
    MPI_Finalize();
    return 0;
}
Compile the program.
cd /opt/workdir/apps/mpi/hello
mpicc hello.c -o hello
You should now have a hello executable in the /opt/workdir/apps/mpi/hello directory. We can use this binary with SLURM to launch processes in parallel.
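Before involving SLURM, you can optionally sanity-check the binary directly with OpenMPI's own launcher (assuming mpirun is on the PATH of the node where you built it; note that OpenMPI refuses to run as root unless you also pass --allow-run-as-root):

mpirun -n 2 /opt/workdir/apps/mpi/hello/hello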
Let's create a job script to launch the executable we just created. Create a new directory to hold our SLURM job script. Then, edit a new file called launch-hello.sh in the new /opt/workdir/jobscripts directory.
mkdir -p /opt/workdir/jobscripts
cd /opt/workdir/jobscripts
# edit launch-hello.sh
Copy the contents below into the launch-hello.sh job script.
Note
The contents of your job script may vary significantly depending on your cluster. Refer to the documentation for your institution and adjust the script accordingly to your needs.
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --account=account_name
#SBATCH --partition=partition_name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --time=00:00:30
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# launch the MPI program (srun is one common way; mpirun also works)
srun /opt/workdir/apps/mpi/hello/hello
We should now have everything we need to test our MPI job with our compute node(s). Launch the job with the sbatch command.
sbatch /opt/workdir/jobscripts/launch-hello.sh
We can confirm the job is running with the squeue command.
squeue
You should see a listing that includes a job named hello, the job name given in the launch-hello.sh job script.
# TODO: add output of squeue above
If you saw the output above, you should now be able to inspect the output of the job when it completes.
# TODO: add output of MPI job (should be something like hello.o and/or hello.e)
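Since the job script does not set --output, SLURM's default behavior is to write the job's stdout and stderr to a file named slurm-<jobid>.out in the directory you submitted from, so you can inspect it with something like (the job ID here is just an example):

cat slurm-1.out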
And that's it! You have successfully launched an MPI job with SLURM from an OpenCHAMI deployed system.
Serving the Root Filesystem with NFS (import-image.sh)
For this tutorial, we served images over HTTP using a local S3 bucket (MinIO) and an OCI registry. We could instead serve our images over NFS by setting up and running an NFS server on the head node, including NFS tools in our base image, and configuring our nodes to work with NFS.
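The head-node side of this is standard NFS administration. As a rough sketch, assuming the imported images live under /opt/workdir/images (the path is an assumption; use wherever import-image.sh places them):

# on the head node
dnf install -y nfs-utils
echo "/opt/workdir/images *(ro,no_root_squash)" >> /etc/exports
systemctl enable --now nfs-server
exportfs -rav

The nodes would then need nfs-utils included in the base image and boot arguments (or dracut options) that point the root filesystem at the export.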
Enable WireGuard Security for the cloud-init-server
Using Image Layers to Customize Boot Images with a Common Base
Often, we want to allocate nodes for different purposes using different images. Let's take the base image that we created before and build another layer called kubernetes-worker on top of it. We would then need to modify the boot script to use this new Kubernetes image and update cloud-init to set up the nodes.
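Following the same pattern as the compute-mpi sketch above, a kubernetes-worker layer would add a container runtime and the kubelet on top of the shared base. The registry host, package names, and repo setup below are assumptions (Kubernetes packages come from the upstream Kubernetes repos, not Rocky's base repos):

# Containerfile for a hypothetical kubernetes-worker layer
FROM registry.local:5000/base:latest

# assumes the Kubernetes package repo has already been added to the base image
RUN dnf install -y containerd kubelet kubeadm && dnf clean all
RUN systemctl enable containerd kubelet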
Using kexec to Reboot Nodes for an Upgrade or Specific Kernel
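The core idea is that kexec loads a new kernel and jumps straight into it, skipping firmware and POST. As a rough sketch on a running node (the paths are assumptions; point them at the kernel and initrd for the image you want to boot into):

# load the new kernel and initrd, reusing the current kernel command line, then switch to it
kexec -l /boot/vmlinuz-new --initrd=/boot/initramfs-new.img --reuse-cmdline
systemctl kexec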
Discovering Nodes Dynamically with Redfish
In this tutorial, we used static discovery to populate our inventory in SMD instead of dynamically discovering nodes on our network. Static discovery works well when we know the MAC address, IP address, xname, and NID of our nodes beforehand and want deterministic behavior. However, if we don't know these properties, or if we want to update our inventory state, we can use magellan to scan the network, collect Redfish data from the BMCs, and populate SMD with these properties.
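The exact flags depend on your magellan version, so treat the following as an illustration of the flow rather than exact syntax (the subnet, BMC credentials, and SMD URL are assumptions; check magellan --help):

# scan the management network for Redfish BMC endpoints
magellan scan --subnet 172.16.0.0/24

# query the discovered BMCs and push the collected inventory into SMD
magellan collect --username root --password <bmc-password> --host http://localhost:27779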