Machine Learning Engineer Intern - LLMs
5 days ago
Paris
The project will begin with a research phase, in which the intern will review existing methods for LLM alignment, including RLHF, Direct Preference Optimization (DPO), and constrained RL. Once that foundation is in place, the intern will experiment with different training techniques, prompt ...