Staff Cloud SRE – AI/ML Platform & GPU Compute
This role builds and scales the reliability foundations of Wayve's AI cloud platform, including the Model Development Platform and the GPU Compute platform. Responsibilities include defining service level objectives (SLOs), improving capacity planning, leading incident response, and designing observability systems, all with the aim of keeping the cloud infrastructure resilient and performant.