Staff Cloud SRE – AI/ML Platform & GPU Compute
In this role, you will build and scale the reliability foundations of Wayve's AI cloud platform, including the Model Development Platform and GPU Compute infrastructure. Responsibilities include defining service level objectives (SLOs), improving capacity planning, leading incident response, and strengthening observability and automation. The position suits someone with experience in site reliability engineering (SRE), large-scale cloud systems, and GPU-backed environments.