Research Scientist, Autonomous Agents — Reward Modelling
6 days ago
Experience with open‑ended learning, RL, and frontier methods for training LLMs (RLHF, RLAIF, multi‑turn RL, multi‑agent interactions, reward function design and modelling, etc.). This role will focus on research into developing next-ge