Senior ML Research Engineer
Contract Type: 6-month contract, outsourced via agency on an hourly rate
Location: Egham
Hybrid: 3 days onsite (minimum) and 2 days working from home
Rate: Very much dependent on level of experience

Key responsibilities include:
• Performance Optimization: Profile and debug performance bottlenecks at the OS, runtime, and model levels.
• Model Deployment: Work across the stack, from model conversion, quantization, and optimization to runtime integration of AI models on-device.
• Toolchain Evaluation: Compare deployment toolchains and runtimes for latency, memory, and accuracy trade-offs.
• Open-Source Contribution: Enhance open-source libraries by adding new features and improving capabilities.
• Experimentation & Analysis: Conduct rigorous experiments and statistical analysis to evaluate algorithms and systems.
• Prototyping: Lead the development of software prototypes and experimental systems with high code quality.
• Collaboration: Work closely with a multidisciplinary team of researchers and engineers to integrate research findings into products.

We do not require a PhD holder this time, which is unusual for the AI team.

We're looking for someone with:
• Technical Expertise: Strong OS fundamentals (memory management, multithreading, user/kernel mode interaction) and expertise in ARM CPU architectures.
• Programming Skills: Expert proficiency in Python and Rust; knowledge of C and C++ is desirable.
• AI Knowledge: Solid understanding of machine learning and deep learning fundamentals, including architectures and evaluation metrics.
• Problem-Solving: Strong analytical skills and the ability to design and conduct rigorous experiments.
• Team Player: Excellent communication and collaboration skills, with a results-oriented attitude.

Desirable Skills:
• Experience with ARM 64-bit architecture and CPU hardware architectures.
• Knowledge of trusted execution environments (confidential computing).
• Hands-on experience with deep learning model optimization (quantization, pruning, distillation).
• Familiarity with lightweight inference runtimes (ExecuTorch, llama.cpp, Candle).