Senior Systems Software Engineer, Kubernetes Scale - DGX Cloud
This role drives performance and scale characterization for NVIDIA's DGX Cloud software stack, focusing on Kubernetes and NVIDIA components such as GPU Operator, DCGM, and NIM. The engineer will diagnose complex distributed-systems issues, build automated testing and monitoring tools, and contribute to open-source communities to optimize large-scale AI infrastructure. The work includes deep performance analysis from the orchestration layer down to the hardware level, with a focus on cost efficiency and scalability.