Research Scientist, Autonomous Agents — Reward Modelling
6 days ago
Experience with open-ended learning, RL, and frontier methods for training LLMs (RLHF, RLAIF, multi-turn RL, multi-agent interactions, reward function design and modelling, etc.).