In a previous article, we introduced the MPI Operator from the Kubeflow project. We used it to perform a particular type of MPI processing job: computational fluid dynamics (CFD) with the OpenFOAM toolkit.

When you break down what’s going on in a typical OpenFOAM processing run, you have a lot of pre- and post-processing steps surrounding the meat of the fluid dynamics analysis. Many of these pre- and post-processing steps do not need to be run in parallel, meaning they do not need to run as part of the MPIJob. The MPIJob is only required for parallel processing operations.
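To make that split concrete, here is a minimal sketch of what just the parallel portion might look like as an MPIJob. The job name, image, replica counts, and solver invocation are placeholders for illustration, not values taken from the demo repository:

```yaml
# Minimal MPIJob sketch (kubeflow.org/v2beta1) covering only the parallel work;
# image, replica counts, and the solver command are hypothetical placeholders.
apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: wing-cfd-parallel
spec:
  slotsPerWorker: 4
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          containers:
          - name: launcher
            image: example.com/openfoam:latest   # placeholder image
            command: ["mpirun", "-np", "8", "simpleFoam", "-parallel"]
    Worker:
      replicas: 2
      template:
        spec:
          containers:
          - name: worker
            image: example.com/openfoam:latest   # placeholder image
```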

As an additional experiment, we set out to see if OpenShift Pipelines (Tekton) could be used to break the work up into a more logical sequence of steps. While OpenShift Pipelines is typically thought of as a CI/CD solution, HPC workloads are decidedly not software application projects. So why associate one with the other?

Looking at the upstream Tekton documentation, you see that a Tekton Pipeline is just a sequence of Tasks, and each Task is made up of Steps. In the case of the complicated Morlind Engineering wing CFD analysis, the MPI job looks like the following:

  1. surfaceConvert
  2. surfaceFeatures
  3. blockMesh
  4. decomposePar
  5. snappyHexMesh (parallel)
  6. renumberMesh (parallel)
  7. checkMesh (parallel)
  8. patchSummary (parallel)
  9. potentialFoam (parallel)
  10. simpleFoam (parallel)

There are ten things to perform in sequence, where the next one should only run if the previous one completed successfully. Of course, a complicated bash or shell script could be written to handle all of the error checking and sequential processing, but that is exactly what OpenShift Pipelines is good at: running steps in sequence with error and condition checking.
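As a rough sketch of how that sequence maps onto a Pipeline, each step becomes a Task reference chained with runAfter, so a failure stops everything downstream. The pipeline, task, and workspace names below are hypothetical, not copied from the demo repository:

```yaml
# Illustrative Pipeline skeleton: sequential Tasks chained with runAfter.
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: wing-cfd-pipeline
spec:
  workspaces:
  - name: case-data                  # shared OpenFOAM case directory
  tasks:
  - name: surface-convert
    taskRef:
      name: openfoam-surface-convert # hypothetical Task name
    workspaces:
    - name: case-data
      workspace: case-data
  - name: block-mesh
    taskRef:
      name: openfoam-block-mesh
    runAfter:
    - surface-convert
    workspaces:
    - name: case-data
      workspace: case-data
  - name: snappy-hex-mesh            # parallel step, launched as an MPIJob
    taskRef:
      name: mpijob-snappy-hex-mesh
    runAfter:
    - block-mesh
    workspaces:
    - name: case-data
      workspace: case-data
  # ...the remaining steps follow the same runAfter pattern...
```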

Because an MPIJob is not a native OpenShift Pipelines Task, a little improvisation was required to make the MPIJob play nicely in a Pipeline, but the extra effort was modest compared to figuring out Pipelines in the first place.
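One way to improvise (not necessarily how the demo repository does it) is a Task whose single step applies an MPIJob manifest and then blocks until the MPI Operator reports success. The manifest parameter, MPIJob name, CLI image, and timeout below are all placeholders:

```yaml
# Sketch of a Task that submits an MPIJob and waits for it to succeed.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: run-mpijob
spec:
  params:
  - name: manifest
    description: Path to the MPIJob manifest inside the workspace
  workspaces:
  - name: case-data
  steps:
  - name: submit-and-wait
    image: registry.redhat.io/openshift4/ose-cli:latest   # placeholder CLI image
    script: |
      #!/usr/bin/env bash
      set -euo pipefail
      # Create (or update) the MPIJob from the supplied manifest
      oc apply -f "$(workspaces.case-data.path)/$(params.manifest)"
      # Block until the MPI Operator marks the job Succeeded, or fail on timeout
      oc wait mpijob/wing-cfd-parallel --for=condition=Succeeded --timeout=2h
```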

In the end, version 2 of our experimental demo repository shows that you can, in fact, perform these types of workloads using OpenShift Pipelines. There are some benefits in terms of consolidating everything in one place, but the Pipelines YAML syntax introduces new complexity. There are also new filesystem permission challenges that could largely be glossed over under the simpler, "pure" MPIJob methodology.

In all, whether you should use Pipelines to help run your HPC workloads on OpenShift comes down to whether the benefits are worth the tradeoffs. Look at the sample repository, experiment with the examples, and see what works better for you!

