Research Scientist, Autonomous Agents — Reward Modelling
47 minutes ago
Experience with open‑ended learning, RL, and frontier methods for training LLMs (RLHF, RLAIF, multi‑turn RL, multi‑agent interactions, reward function design and modelling, etc.).