Senior Research Scientist - Reinforcement Learning, MoEs
2 months ago
Depth in implementing and post‑training MoEs/LLMs/VLMs/diffusion models, with a track record of shipped research or publications in MoEs, RL, or agents. Build reward models and learning loops: RLHF/RLAIF, preference modeling, DPO/IPO‑style objectives, offline/online RL, curriculum learning, and