Midokura Technology Radar

RDMA & High-performance Networking

networking rdma infiniband team:mido/infra

Feb 2026

Trial

Why?

Large-scale model training and parallel simulations require low-latency, high-bandwidth interconnects between GPUs.
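RDMA transports such as InfiniBand and RoCEv2 bypass the kernel network stack and move data directly between application (or GPU) memory on different hosts, which reduces communication overhead and improves scaling efficiency.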
Emerging post-RoCEv2 protocols and fabrics (e.g., Ultra Ethernet, RoCE extensions, and proprietary RDMA-like stacks) are gaining traction because they promise better determinism, richer telemetry, and Ethernet-native deployment models.
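
The link between interconnect latency, bandwidth, and scaling efficiency can be made concrete with the standard alpha-beta cost model for a ring all-reduce. The sketch below is illustrative only: the function names are hypothetical, the model ignores compute/communication overlap and congestion, and the numbers in the demo are made-up placeholders rather than measurements from any cluster.

```python
def ring_allreduce_seconds(message_bytes, workers, bandwidth_bytes_per_s, latency_s):
    """Estimate ring all-reduce time with the alpha-beta cost model.

    A ring all-reduce runs 2 * (workers - 1) steps (reduce-scatter followed by
    all-gather). Each step pays one link latency and transfers one chunk of
    size message_bytes / workers.
    """
    if workers < 2:
        return 0.0  # nothing to exchange with a single worker
    steps = 2 * (workers - 1)
    chunk_bytes = message_bytes / workers
    return steps * (latency_s + chunk_bytes / bandwidth_bytes_per_s)


def scaling_efficiency(compute_s, comm_s):
    """Fraction of a training step spent on compute, assuming no overlap."""
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    # Placeholder inputs (assumptions, not benchmark results).
    gradient_bytes = 1 * 1024**3   # 1 GiB of gradients per step
    workers = 64
    compute_s = 0.5                # hypothetical compute time per step

    fabrics = {
        "slower fabric": (12.5e9, 5e-6),  # 100 Gb/s, 5 us per hop
        "faster fabric": (50e9, 2e-6),    # 400 Gb/s, 2 us per hop
    }
    for name, (bandwidth, latency) in fabrics.items():
        comm_s = ring_allreduce_seconds(gradient_bytes, workers, bandwidth, latency)
        eff = scaling_efficiency(compute_s, comm_s)
        print(f"{name}: comm={comm_s:.4f}s efficiency={eff:.2%}")
```

For large messages the bandwidth term dominates. For small messages or large worker counts the per-step latency term dominates, which is where low-latency fabrics matter most.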

What?

Standardize on supported network fabrics for distributed training clusters.
Track and evaluate post-RoCEv2 protocols and fabrics (e.g., Ultra Ethernet): benchmark performance, test interoperability, and assess vendor ecosystem maturity.
Validate RDMA capabilities, NUMA/topology effects, and software stack readiness.
Explore DPU/SmartNIC offloads for network and security functions in AI clusters.
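
As a starting point for validating RDMA capabilities and NUMA placement, a node-level inventory can be read from Linux sysfs. The following is a minimal sketch, assuming a Linux host whose RDMA drivers expose devices under /sys/class/infiniband. The function names are hypothetical, and the sysfs root is a parameter so the logic can be exercised against a fake directory tree.

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def list_rdma_ports(sysfs_root="/sys/class/infiniband"):
    """Collect link layer, state, rate, and NUMA node for each RDMA port."""
    root = Path(sysfs_root)
    ports = []
    if not root.is_dir():
        return ports  # no RDMA devices, or the drivers are not loaded

    for dev in sorted(root.iterdir()):
        # NUMA node of the underlying PCI device ("-1" means no affinity reported).
        numa_node = read_attr(dev / "device" / "numa_node")
        ports_dir = dev / "ports"
        if not ports_dir.is_dir():
            continue
        for port in sorted(ports_dir.iterdir()):
            ports.append({
                "device": dev.name,
                "port": port.name,
                "link_layer": read_attr(port / "link_layer"),  # "InfiniBand" or "Ethernet"
                "state": read_attr(port / "state"),
                "rate": read_attr(port / "rate"),
                "numa_node": numa_node,
            })
    return ports


if __name__ == "__main__":
    for entry in list_rdma_ports():
        print(entry)
```

Comparing each port's NUMA node with the NUMA node of the GPUs it serves is one way to spot cross-socket placements before running bandwidth benchmarks. An Ethernet link layer indicates a RoCE-capable port rather than native InfiniBand.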