Tuning COBALT
MPI
Note
This information was taken from the original Confluence page "Debugging COBALT2 networking performance issues".
As demonstrated in March 2025, COBALT experienced significant performance slowdowns, achieving less than 30% of the theoretical network peak bandwidth when the network was configured incorrectly. An analysis identified two key issues:
1. MPI configuration - Since Debian 12 does not include a precompiled MPI library, OpenMPI was initially built from source using Spack. That build routed all traffic through the 10 Gbit interfaces instead of the intended high-speed 100 Gbit interfaces, creating a bottleneck. Compiling OpenMPI with the UCX transport library ensures that RDMA is used, which improves throughput and latency for inter-node communication. With Spack, this boils down to installing OpenMPI with the spec openmpi fabrics=verbs,ucx ^ucx+dc+rc+verbs+rdmacm.
This spec can be read as follows:
fabrics=verbs,ucx: Specifies that OpenMPI should use the Verbs and UCX transport layers. Verbs is the low-level API for RDMA communication, while UCX provides a high-performance communication framework for RDMA and shared memory.
^ucx+dc+rc+verbs+rdmacm: Configures the UCX library with specific features:
+dc: Enables the Dynamic Connection (DC) transport for InfiniBand.
+rc: Enables the Reliable Connection (RC) transport for InfiniBand.
+verbs: Activates support for RDMA Verbs, enabling communication over InfiniBand or RDMA-capable Ethernet.
+rdmacm: Enables RDMA Connection Manager, which provides connection establishment for RDMA communication.
2. Kernel security mitigations - Even with the correct MPI configuration, performance remained degraded for smaller data sizes. Further investigation revealed that the Skylake CPUs in COBALT2 are affected by hardware security vulnerabilities. The Debian 12 kernel mitigates these by default, whereas the mitigations were not present in the previous CentOS 7 setup.
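To check whether a given OpenMPI build actually picked up UCX (the first issue above), one option is to inspect the component list printed by ompi_info. The following is a minimal sketch, not part of the original analysis: the helper names mca_components and has_ucx_pml are hypothetical, and the regular expression assumes the usual "MCA framework: component (...)" line layout of ompi_info output.

```python
import re
import shutil
import subprocess

# Assumed line layout of `ompi_info` output, for example:
#   "                 MCA pml: ucx (MCA v2.1.0, API v2.1.0, Component v5.0.7)"
_MCA_LINE = re.compile(r"^\s*MCA\s+(\w+):\s+(\w+)\b")


def mca_components(ompi_info_text: str) -> dict[str, set[str]]:
    """Map each MCA framework (e.g. 'pml') to the component names listed for it."""
    components: dict[str, set[str]] = {}
    for line in ompi_info_text.splitlines():
        match = _MCA_LINE.match(line)
        if match:
            framework, component = match.groups()
            components.setdefault(framework, set()).add(component)
    return components


def has_ucx_pml(ompi_info_text: str) -> bool:
    """True if the UCX point-to-point layer (pml 'ucx') is part of the build."""
    return "ucx" in mca_components(ompi_info_text).get("pml", set())


if __name__ == "__main__":
    if shutil.which("ompi_info") is None:
        print("ompi_info not found; load the OpenMPI installation first")
    else:
        text = subprocess.run(
            ["ompi_info"], capture_output=True, text=True, check=True
        ).stdout
        print("UCX pml available:", has_ucx_pml(text))
```

A build without the ucx component in the pml framework cannot use UCX at all. A build that has it may still need runtime checks, since the presence of the component does not by itself prove that traffic goes over the 100 Gbit interfaces.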
Tuning recommendations
To maintain good network performance:
1. Compile OpenMPI with UCX, using the settings below, to enable RDMA for efficient high-speed communication over InfiniBand and Ethernet (56G and 100G).
# This is a Spack Environment file.
#
# It describes a set of packages to be installed, along with
# configuration settings.
spack:
  # add package specs to the `specs` list
  specs:
  - osu-micro-benchmarks
  - gcc@13.1.0
  - openmpi@=5.0.7 fabrics=ofi,verbs,ucx ^ucx+cma+dc+rc+ud+verbs +thread_multiple+rdmacm target=zen2
  view: false
  concretizer:
    unify: when_possible
Note that the Spack environment provided in this repository already uses these settings by default.
2. Optionally (!) disable Linux kernel security mitigations to improve performance for workloads with a high per-packet overhead, particularly those with small data sizes.
This can be done by adding mitigations=off to the kernel boot arguments.
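Before and after changing the boot arguments, it can help to see which mitigations the running kernel reports. The sketch below reads the standard Linux sysfs directory /sys/devices/system/cpu/vulnerabilities and the boot command line in /proc/cmdline. The function names mitigation_status and mitigations_disabled are hypothetical helpers introduced for illustration; they are not taken from the original page.

```python
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")
CMDLINE = Path("/proc/cmdline")


def mitigation_status(vuln_dir: Path = VULN_DIR) -> dict[str, str]:
    """Return {vulnerability name: status string reported by the kernel}."""
    if not vuln_dir.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(vuln_dir.iterdir())
        if entry.is_file()
    }


def mitigations_disabled(cmdline: str) -> bool:
    """True if the boot command line contains the literal token mitigations=off."""
    return "mitigations=off" in cmdline.split()


if __name__ == "__main__":
    # Typical status values are "Not affected", "Vulnerable", or "Mitigation: ...".
    for name, status in mitigation_status().items():
        print(f"{name:28s} {status}")
    if CMDLINE.exists():
        print("mitigations=off on cmdline:", mitigations_disabled(CMDLINE.read_text()))
```

On Debian, kernel boot arguments are typically set through GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by update-grub and a reboot. That step is an assumption about the local setup and should be checked against how the nodes are actually provisioned.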