Trial
Why?
- Scheduling, partitioning, and multiplexing scarce GPU resources efficiently is critical to utilization and cost control.
- Integrations with ML platforms and Kubernetes-native tooling (Kubeflow, Ray, K8s device plugins) streamline developer workflows.
What?
- Adopt GPU device plugins, topology-aware scheduling, resource quotas, and autoscaling policies (see the first sketch after this list).
- Pilot workload-specific orchestration (e.g., low-latency inference vs. large-batch training; see the priority-class sketch below).
- Integrate with tenant isolation, cost tracking, and quota management systems (see the per-tenant quota sketch below).
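
A minimal sketch of the first step, using the official `kubernetes` Python client: a pod requests a GPU through the device plugin's extended resource and carries a topology-aware placement hint. The namespace, image, and node label are illustrative assumptions; `nvidia.com/gpu` assumes the NVIDIA device plugin is installed.

```python
from kubernetes import client

# Sketch: a Pod requesting one GPU via the device plugin's extended resource.
# Namespace, image, and the topology label below are illustrative assumptions.
gpu_pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="trainer", namespace="ml-team-a"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        # Hypothetical topology-aware placement hint; real labels depend on
        # what GPU feature discovery exposes in your environment.
        node_selector={"example.com/nvlink-island": "0"},
        containers=[
            client.V1Container(
                name="trainer",
                image="registry.example.com/train:latest",  # placeholder image
                resources=client.V1ResourceRequirements(
                    # Extended resources such as GPUs are requested via limits.
                    limits={"nvidia.com/gpu": "1"},
                ),
            )
        ],
    ),
)

if __name__ == "__main__":
    # Print the manifest; submitting it would use
    # client.CoreV1Api().create_namespaced_pod("ml-team-a", gpu_pod).
    print(client.ApiClient().sanitize_for_serialization(gpu_pod))
```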
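
One way to pilot workload-specific orchestration is to separate latency-sensitive inference from preemptible batch training with priority classes. A minimal sketch; the class names and values are arbitrary assumptions, not a recommendation:

```python
from kubernetes import client

# Sketch: two PriorityClasses so latency-sensitive inference pods can preempt
# large-batch training pods when GPUs are contended.
inference_priority = client.V1PriorityClass(
    metadata=client.V1ObjectMeta(name="gpu-inference"),
    value=100000,
    description="Latency-sensitive GPU inference",
)

training_priority = client.V1PriorityClass(
    metadata=client.V1ObjectMeta(name="gpu-batch-training"),
    value=1000,
    preemption_policy="Never",  # batch jobs queue rather than preempt others
    description="Large-batch, preemptible GPU training",
)

if __name__ == "__main__":
    # Pods opt in via spec.priority_class_name; creating these objects would
    # use client.SchedulingV1Api().create_priority_class(...).
    api = client.ApiClient()
    for pc in (inference_priority, training_priority):
        print(api.sanitize_for_serialization(pc))
```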
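
For tenant isolation and quota management, GPU consumption can be capped per namespace with a standard ResourceQuota on the extended resource. A minimal sketch; the namespace and limit are assumptions:

```python
from kubernetes import client

# Sketch: cap the total GPUs a tenant namespace may request. Extended
# resources are quota'd under the "requests." prefix.
gpu_quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="gpu-quota", namespace="ml-team-a"),
    spec=client.V1ResourceQuotaSpec(hard={"requests.nvidia.com/gpu": "4"}),
)

if __name__ == "__main__":
    # Applying it would use
    # client.CoreV1Api().create_namespaced_resource_quota("ml-team-a", gpu_quota).
    print(client.ApiClient().sanitize_for_serialization(gpu_quota))
```

The same per-namespace quota objects can feed cost tracking, since GPU requests are then attributable to a tenant.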