Test & AI Evaluation Lead

Paddington Robotics
Harwell Oxford, United Kingdom
Job Type
Permanent
Work Pattern
Full-time
Work Location
Hybrid
Seniority
Lead
Security Clearance
Required
Posted
9 Apr 2026 (Last month)

Salary: Competitive depending on experience

Location: 2-3 days on-site at our Harwell office, with travel to client sites when required

Contract type: Full-time, permanent - 37.5 hours

A note from the Founders

Oxford Dynamics is at an inflection point.

We operate in some of the most complex and high‑stakes environments in the world - defence, national security, AI and robotics. The decisions we make now will define not just how fast we grow, but who we become.

You will work closely with the whole team. You will be trusted with judgment calls. You will influence the business. And you will see the impact of your work every day in what we do.

If you are excited by ownership, pace and purpose - and by building something that genuinely matters - we would love to hear from you.

Who We Are

Founded in 2020, Oxford Dynamics (OD) is a fast‑growing UK deep‑tech company developing AI and robotic systems designed to operate in mission‑critical environments.

Our flagship AVIS (A Very Intelligent System) AI framework fuses multi‑modal data - text, imagery, telemetry and sensor feeds - enabling operators to interrogate complex information at speed and make better decisions under pressure. Our STRIDER robotic platform performs autonomous tasks in hazardous environments, protecting people while extending operational reach.

Our ambition is simple but demanding: to converge AI and robotics so machines can sense, understand and act in complex, real‑world environments.

We work with defence and security organisations internationally to help protect nations, infrastructure and lives.

What you will be doing here / why this role matters

Oxford Dynamics is a small team that relies on a collaborative and positive approach, so the right attitude for this role is as important as experience. We are at an important stage in our growth, and as a Senior AI Generative Robotics Engineer you will be an essential part of our success.

You’ll work at the cutting edge of agentic and generative AI, building systems that move beyond lab demos and into real-world deployment at pace. At Oxford Dynamics, you’ll have the freedom to experiment in a fast-moving environment, the responsibility to deliver, and the opportunity to shape how multi-agent AI systems operate in complex, constrained, and high-trust environments.

If you’re excited by agent orchestration, VLLMs, and deploying AI where it matters, this role is built for you!

Role Summary

We're hiring a Test & AI Evaluation Lead to own how Oxford Dynamics validates its AI-driven, mission-critical systems - from multi-agent orchestration and LLM outputs through to cloud infrastructure and real-time user-facing applications.

You'll design and lead test approaches where correctness, resilience, and security matter as much as feature velocity. Working embedded with AI, Backend, Frontend, and DevOps, you'll shape how we validate agent behaviours, data pipelines, and end-to-end operational workflows - from research prototypes through to production deployments for Defence and Security customers. Quality is built in from day one, not inspected at the end.

Key Responsibilities

Test Strategy & Leadership

  • Define and own the end-to-end test strategy across AI, backend, frontend, and infrastructure layers.
  • Establish testing standards appropriate for agentic AI systems, including non-deterministic behaviour and probabilistic outputs.
  • Ensure testing aligns with mission-critical, safety-conscious, and security-first delivery expectations.
  • Act as the primary quality authority across projects, advising engineering and product leadership on risk and readiness.

AI & Data-Focused Testing

  • Design approaches for testing multi-agent workflows, including orchestration logic, memory/state handling, and tool integrations.
  • Define validation strategies for LLM outputs, including groundedness, hallucination detection, task success rates, and regression testing.
  • Work with AI Engineers to embed evaluation metrics and pass/fail thresholds into pipelines.
  • Validate data ingestion, transformation, and inference pipelines across structured and unstructured data sources.

Automation & Tooling

  • Drive a test-automation-first mindset, integrating tests into CI/CD pipelines (GitHub Actions, Argo CD).
  • Oversee automated testing across API and service layers, UI (E2E and accessibility), and infrastructure and deployment workflows.
  • Select, implement, and evolve testing tools and frameworks appropriate to modern cloud-native and AI systems.

Non-Functional Testing

  • Own performance, scalability, reliability, and resilience testing for distributed systems.
  • Coordinate security testing activities in line with secure-by-design principles (e.g. IAM, secrets handling, data boundaries).
  • Validate backup, disaster recovery, and failover scenarios alongside DevOps and Backend teams.

Delivery & Collaboration

  • Embed with delivery teams to ensure testing is planned early and executed continuously.
  • Work closely with Product and Engineering to define clear acceptance criteria and definition of done.
  • Provide clear, decision-ready quality reporting to technical and non-technical stakeholders.
  • Support customer-facing demonstrations, trials, and operational readiness assessments.
