Research Scientist, Frontier Red Team (Autonomy)
1 month ago
London
We believe that developing autonomy evals is one of the best ways to study increasingly capable and agentic models. Lead the end-to-end development of autonomy evals and research, from risk and capability modeling through designing, implementing, and regularly running these evals. We are looking fo...