Senior Product Engineer - Training Platform
3 days ago
San Francisco
Multinode training: Lets customers run training jobs across multiple compute nodes so they can train large models like GLM 4. Familiarity or experience with the open-source training stack and frameworks (NCCL, PyTorch, Megatron, NemoRL, VeRL, Axolotl, HF Traine