Research Engineer
2 days ago
Alameda
Research Engineer, Foundation Models

About the Opportunity

We are seeking a Research Engineer to help advance the next generation of large-scale AI systems. This role sits at the intersection of research and engineering, focusing on the development, training, evaluation, and deployment of state-of-the-art machine learning models. You will work across the full model lifecycle, from building large-scale datasets and training infrastructure to experimenting with new model architectures and inference techniques. This is an opportunity to contribute directly to cutting-edge work in large language models, reinforcement learning, long-context systems, and scalable AI infrastructure.

Responsibilities

• Develop and optimize training, evaluation, and deployment pipelines for large-scale AI models
• Improve inference efficiency, latency, and throughput across advanced model architectures
• Design and maintain research and production frameworks used for model development
• Train and scale foundation models across large distributed GPU environments
• Build and manage large-scale data processing, collection, and curation pipelines
• Create high-quality datasets to improve model performance and targeted capabilities
• Research, prototype, and benchmark novel model architectures and training approaches
• Contribute to experimentation in areas such as reinforcement learning, long-context modeling, reasoning systems, and inference optimization
• Collaborate closely with researchers and engineers to transition ideas from experimentation to production

Qualifications

Required

• Strong software engineering and systems development experience
• Deep understanding of modern machine learning and deep learning techniques
• Experience training, fine-tuning, or evaluating large language models
• Familiarity with distributed computing and large-scale infrastructure
• Experience building and maintaining data pipelines and ETL workflows
• Ability to design experiments, analyze results, and iterate on research directions
• Strong problem-solving skills and a research-oriented mindset

Preferred

• Experience working with large GPU clusters and distributed training frameworks
• Background in model optimization, inference systems, or AI infrastructure
• Contributions to machine learning research, open-source projects, or published work
• Experience with reinforcement learning, long-context models, or large-scale data systems

What We Value

• Ownership and accountability
• Strong collaboration and communication skills
• Bias toward execution and practical problem-solving
• Intellectual curiosity and continuous learning
• High standards for technical excellence and product quality
• Ability to thrive in fast-moving, high-impact environments

Compensation & Benefits

• Competitive base salary and equity package
• Comprehensive medical, dental, and vision coverage
• 401(k) program with employer matching
• Flexible paid time off policy
• Relocation assistance and visa sponsorship, where applicable
• Opportunity to work alongside a highly talented and mission-driven team
• Access to cutting-edge infrastructure and research resources

Keywords: Machine Learning, Artificial Intelligence, Deep Learning, Large Language Models, LLMs, Foundation Models, Generative AI, Applied AI, AI Research, Research Engineering, Model Training, Distributed Training, Pretraining, Fine-Tuning, Post-Training, Reinforcement Learning, RLHF, Reinforcement Learning from Human Feedback, Inference Optimization, Model Serving, Model Evaluation, Long Context Models, Reasoning Models, AI Infrastructure, GPU Clusters, High Performance Computing, HPC, Distributed Systems, CUDA, PyTorch, JAX, TensorFlow, Neural Networks, Transformer Models, Retrieval Augmented Generation, RAG, Synthetic Data, Data Engineering, Data Pipelines, ETL, Data Processing, Web Crawling, Data Collection, Feature Engineering, MLOps, ML Systems, Scalable Systems, Parallel Computing, Model Architecture Design, Experimentation, Research Scientists, Research Engineers, Software Engineering, Backend Engineering, Performance Optimization, Production ML, AI Agents, Agentic AI, Autonomous Systems, Prompt Engineering, Multi-Agent Systems, Vector Databases, Embeddings, Quantization, Model Compression, Infrastructure Engineering, Cloud Computing, Kubernetes, Python, C++, Open Source AI, Frontier Models, Applied Research, Statistical Learning, Computer Science, Algorithms, Large Scale Computing, Model Alignment, AI Safety, Training Infrastructure, Compute Optimization, Inference Systems, Foundation Model Research, Machine Learning Infrastructure, AI Platform Engineering, Systems Engineering, Data Infrastructure, Production Systems, Scalable AI Systems, Research & Development, Advanced AI Systems, Emerging Technologies, Distributed Computing, GPU Optimization, AI Product Development