From 60b4a0cf8b2dc131ecec46ff9fe6ec05bad061ef Mon Sep 17 00:00:00 2001
From: "David J. Allen"
Date: Tue, 22 Jul 2025 12:54:37 -0600
Subject: [PATCH 1/2] chore: fixed titles missing from documents

---
 Deployments/Deployments.md | 2 ++
 Getting Started.md | 2 ++
 Software/Magellan.md | 1 +
 Software/Software.md | 2 ++
 Software/State Management Database (SMD).md | 2 ++
 Troubleshooting.md | 2 ++
 Tutorial/Environments/AWS Tutorial Environment.md | 1 +
 Tutorial/Environments/Jetstream2 Tutorial Environment.md | 1 +
 Tutorial/OpenCHAMI Tutorial.md | 1 +
 Use Cases/Adding SLURM and MPI to the Compute Node.md | 2 ++
 Use Cases/Advanced Use Cases.md | 1 +
 Use Cases/Discovering Nodes Dynamically with Redfish.md | 2 ++
 .../Enable WireGuard Security for the `cloud-init-server`.md | 2 ++
 .../Serving the Root Filesystem with NFS (import-image.sh).md | 2 ++
 ...g Image Layers to Customize Boot Image with a Common Base.md | 2 ++
 ...xec` to Reboot Nodes For an Upgrade or Specialized Kernel.md | 1 +
 16 files changed, 26 insertions(+)

diff --git a/Deployments/Deployments.md b/Deployments/Deployments.md
index d79127c..51bb30f 100644
--- a/Deployments/Deployments.md
+++ b/Deployments/Deployments.md
@@ -1,3 +1,5 @@
+# Deployments
+
 OpenCHAMI offers deploying the microservices in several ways. This document covers the supported ways to deploy
 - [[Deploying with Podman Quadlets]]
diff --git a/Getting Started.md b/Getting Started.md
index 369d05b..4526523 100644
--- a/Getting Started.md
+++ b/Getting Started.md
@@ -1,3 +1,5 @@
+# Getting Started
+
 OpenCHAMI provides a [tutorial](https://github.com/OpenCHAMI/tutorial-2025) to introduce new users to the project. This tutorial demonstrates how to quickly jump start a development environment with the OpenCHAMI services using Podman quadlets and `systemd`. The main part of the tutorial is organized into 2 phases that covers the following topics:
 1.
 Preparing Head Node or Instance
diff --git a/Software/Magellan.md b/Software/Magellan.md
index e69de29..f825074 100644
--- a/Software/Magellan.md
+++ b/Software/Magellan.md
@@ -0,0 +1 @@
+# Magellan
\ No newline at end of file
diff --git a/Software/Software.md b/Software/Software.md
index fe7358f..980ca91 100644
--- a/Software/Software.md
+++ b/Software/Software.md
@@ -1,3 +1,5 @@
+# Software
+
 The OpenCHAMI project contains a collection of software built to discover, manage, and provision nodes. This sections contains a brief introduction and user guide to quickly get you started with each tool or service.
 - **[Magellan](Magellan.md)** - Redfish-based tool for automatic node discovery and firmware management
diff --git a/Software/State Management Database (SMD).md b/Software/State Management Database (SMD).md
index e69de29..0f2a4c9 100644
--- a/Software/State Management Database (SMD).md
+++ b/Software/State Management Database (SMD).md
@@ -0,0 +1,2 @@
+
+# State Management Database (SMD)
\ No newline at end of file
diff --git a/Troubleshooting.md b/Troubleshooting.md
index f38bdf2..1d2e827 100644
--- a/Troubleshooting.md
+++ b/Troubleshooting.md
@@ -1,3 +1,5 @@
+# Troubleshooting
+
 Sometimes, things don't always work out as we would expect them to when trying to install the services or boot nodes. Whether your issue is related to the services or configuration, this section covers a list of issues you may run into working with OpenCHAMI. Keep in mind that this list is continuously updated as the software is changed.
 ### Services Not Starting
diff --git a/Tutorial/Environments/AWS Tutorial Environment.md b/Tutorial/Environments/AWS Tutorial Environment.md
index 45ffff3..0145e5c 100644
--- a/Tutorial/Environments/AWS Tutorial Environment.md
+++ b/Tutorial/Environments/AWS Tutorial Environment.md
@@ -1,3 +1,4 @@
+# AWS Tutorial Environment
 For this tutorial, you will be provided with your own EC2 instance and ssh key for access to it.
 If you would like to replicate it outside the tutorial environment, here are the relevant details.
diff --git a/Tutorial/Environments/Jetstream2 Tutorial Environment.md b/Tutorial/Environments/Jetstream2 Tutorial Environment.md
index c0672f5..1843d98 100644
--- a/Tutorial/Environments/Jetstream2 Tutorial Environment.md
+++ b/Tutorial/Environments/Jetstream2 Tutorial Environment.md
@@ -1,3 +1,4 @@
+# Jetstream2 Tutorial Environment
 For this tutorial, you will be provided with your own compute instance and ssh key for access to it. If you would like to replicate it outside the tutorial environment, here are the relevant details.
diff --git a/Tutorial/OpenCHAMI Tutorial.md b/Tutorial/OpenCHAMI Tutorial.md
index 1defd8e..127e8ec 100644
--- a/Tutorial/OpenCHAMI Tutorial.md
+++ b/Tutorial/OpenCHAMI Tutorial.md
@@ -1,3 +1,4 @@
+# OpenCHAMI Tutorial
 Welcome to the OpenCHAMI hands-on tutorial! This guide walks you through building a complete PXE-boot & cloud-init environment for HPC compute nodes using libvirt/KVM.
diff --git a/Use Cases/Adding SLURM and MPI to the Compute Node.md b/Use Cases/Adding SLURM and MPI to the Compute Node.md
index 8982ced..e7049ed 100644
--- a/Use Cases/Adding SLURM and MPI to the Compute Node.md
+++ b/Use Cases/Adding SLURM and MPI to the Compute Node.md
@@ -1,3 +1,5 @@
+# Adding SLURM and MPI to the Compute Node
+
 After getting our nodes to boot using our compute images, let's try running a test MPI job. We need to install and configure both SLURM and MPI to do so. We can do this at least two ways here:
 - Create a new `compute-mpi` image similar to the `compute-debug` image using the `compute-base` image as a base. You do not have to rebuild the parent images unless you want to make changes to them, but keep in mind that you will also have to rebuild any derivative images.
diff --git a/Use Cases/Advanced Use Cases.md b/Use Cases/Advanced Use Cases.md
index 71d70a5..5cae571 100644
--- a/Use Cases/Advanced Use Cases.md
+++ b/Use Cases/Advanced Use Cases.md
@@ -1,3 +1,4 @@
+# Advanced Use Cases
 After going through the [tutorial](https://github.com/OpenCHAMI/tutorial-2025), you should be familiar and comfortable enough with OpenCHAMI to make changes to the deployment process and configuration. We're going to cover some of the more common use-cases that an OpenCHAMI user would want to pursue. At this point, we can use what we have learned so far in the OpenCHAMI tutorial to customize our nodes in various ways such as changing how we serve images, deriving new images, and updating our cloud-init config. This sections explores some of the use cases that you may want to explore to utilize OpenCHAMI to fit your own needs.
diff --git a/Use Cases/Discovering Nodes Dynamically with Redfish.md b/Use Cases/Discovering Nodes Dynamically with Redfish.md
index f8043e0..77cb042 100644
--- a/Use Cases/Discovering Nodes Dynamically with Redfish.md
+++ b/Use Cases/Discovering Nodes Dynamically with Redfish.md
@@ -1,3 +1,5 @@
+# Discovering Nodes Dynamically with Redfish
+
 In the tutorial, we used static discovery to populate our inventory in SMD instead of dynamically discovering nodes on our network. Static discovery is good when we know beforehand the MAC address, IP address, xname, and/or node ID of our nodes and guarantees deterministic behavior. However, sometimes we might not know these properties or we may want to check the current state of our hardware, say for a failure. In these scenario, we can probe our hardware dynamically using the scanning feature from `magellan` and then update the state of our inventory.
 For this demonstration, we have two prerequisites:
diff --git a/Use Cases/Enable WireGuard Security for the `cloud-init-server`.md b/Use Cases/Enable WireGuard Security for the `cloud-init-server`.md
index 1db5966..70227ef 100644
--- a/Use Cases/Enable WireGuard Security for the `cloud-init-server`.md
+++ b/Use Cases/Enable WireGuard Security for the `cloud-init-server`.md
@@ -1,3 +1,5 @@
+# Enable WireGuard Security for the `cloud-init-server`
+
 When nodes boot in OpenCHAMI, they make a request out to the `cloud-init-server` to retrieve a cloud-init config. The request is not encrypted and can be intercepted and modified.
 # Using WireGuard with Cloud-Init
diff --git a/Use Cases/Serving the Root Filesystem with NFS (import-image.sh).md b/Use Cases/Serving the Root Filesystem with NFS (import-image.sh).md
index 7295dc4..e687c4f 100644
--- a/Use Cases/Serving the Root Filesystem with NFS (import-image.sh).md
+++ b/Use Cases/Serving the Root Filesystem with NFS (import-image.sh).md
@@ -1,3 +1,5 @@
+# Serving the Root Filesystem with NFS (import-image.sh)
+
 For the [tutorial](https://github.com/OpenCHAMI/tutorial-2025), we served images via HTTP with a local S3 bucket using MinIO and an OCI registry. We could instead serve our images by network mounting the directories that hold our images with NFS. We can spin up a NFS server on the head node by including NFS tools in our base image and configure our nodes to mount the images. Configure NFS to serve your SquashFS `nfsroot` with as much performance as possible.
diff --git a/Use Cases/Using Image Layers to Customize Boot Image with a Common Base.md b/Use Cases/Using Image Layers to Customize Boot Image with a Common Base.md
index ab7c9b9..3de9731 100644
--- a/Use Cases/Using Image Layers to Customize Boot Image with a Common Base.md
+++ b/Use Cases/Using Image Layers to Customize Boot Image with a Common Base.md
@@ -1 +1,3 @@
+# Using Image Layers to Customize Boot Image with a Common Base
+
 Often, we want to allocate nodes for different purposes using different images. Let's use the base image that we created before and create another Kubernetes layer called `kubernetes-worker` based on the `base` image we created before. We would need to modify the boot script to use this new Kubernetes image and update cloud-init set up the nodes.
\ No newline at end of file
diff --git a/Use Cases/Using `kexec` to Reboot Nodes For an Upgrade or Specialized Kernel.md b/Use Cases/Using `kexec` to Reboot Nodes For an Upgrade or Specialized Kernel.md
index e69de29..a97e1fc 100644
--- a/Use Cases/Using `kexec` to Reboot Nodes For an Upgrade or Specialized Kernel.md
+++ b/Use Cases/Using `kexec` to Reboot Nodes For an Upgrade or Specialized Kernel.md
@@ -0,0 +1 @@
+# Using `kexec` to Reboot Nodes For an Upgrade or Specialized Kernel
\ No newline at end of file

From ce6881746334ee00bb7b70abe7558457583c0034 Mon Sep 17 00:00:00 2001
From: "David J. Allen"
Date: Tue, 22 Jul 2025 12:55:23 -0600
Subject: [PATCH 2/2] chore: add .gitignore

---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..dd33554
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+.obsidian