In a previous article, we introduced the MPI Operator from the Kubeflow project. We used it to perform a particular type of MPI processing job: computational fluid dynamics (CFD) with the OpenFOAM toolkit.
When you break down what’s going on in a typical OpenFOAM processing run, you have a lot of pre- and post-processing steps surrounding the meat of the fluid dynamics analysis. Many of these pre- and post-processing steps do not need to be run in parallel, meaning they do not need to run as part of the MPIJob. The MPIJob is only required for parallel processing operations.
As an additional experiment, we set out to see if OpenShift Piplines (Tekton) could be used to break up the work into a more logical sequence of steps. While OpenShift Piplines is typically thought of as being a solution for CI/CD, HPC workloads are decidedly not software application projects. So why would we associate one with the other?
Looking at the upstream Tekton documentation, you see that Tekton Pipelines are just a sequence of Tasks, and Tasks are made up of Steps. In the case of the complicated Morlind Engineering wing CFD analysis, the MPI job looks like the following:
- snappyHexMesh (parallel)
- renumberMesh (parallel)
- checkMesh (parallel)
- patchSummary (parallel)
- potentialFoam (parallel)
- simpleFoam (parallel)
There are ten things to perform in a sequence, where the next one should only be done if the previous one completes successfully. Of course, a complicated bash or shell script could be written to handle all of the error checking and sequential processing, but those types of things are exactly what OpenShift Pipelines is good for: processing in a sequence with error/condition checking.
Because an MPIJob is not a native OpenShift Pipelines Task, there was a little bit of improvisation that was required to make the MPIJob play nicely in a Pipeline, but the uplift was not too great above figuring out Pipelines in general by themselves.
In the end, version 2 of our experimental demo repository shows that you can, in fact, perform these types of workloads using OpenShift Pipelines. There are some benefits in terms of simplifying things in one place, but there is new complexity introduced with the Pipelines YAML syntax. There are also some new filesystem permission challenges that come into play that were able to be slightly glossed over when using the simpler, “pure” MPIJob methodology.
In all, whether or not you should use Pipelines to assist you in running your HPC workloads on OpenShift is more of a question about whether the tradeoffs are worth the benefits. Look at the sample repository, experiment with the examples, and see what works better for you!