Staff ML Performance Engineer (Inference Optimisation)
As a Staff ML Performance Engineer, you will optimize machine learning inference on edge devices and GPUs, with a focus on large transformer-based models. You will collaborate with model developers, profile bottlenecks, implement optimizations, and build robust benchmarking and testing systems so that performance gains hold across devices and software releases.