AI Data Scientist
8 days ago
Barcelona
We’re looking for a driven AI Data Scientist to work alongside world-leading experts in quantum computing and AI, contributing to cutting-edge projects that redefine the limits of Generative AI.

Location: Madrid or Barcelona (Hybrid)
Contract: Fixed Term contract ending 30th June 2026

What You’ll Do
• Design and lead comprehensive evaluation strategies for our Agentic AI and Retrieval-Augmented Generation (RAG) systems, translating complex business needs into measurable success metrics.
• Shape system design by bringing a data- and evaluation-first perspective to retrieval, orchestration, tool usage, and memory components, solving high-impact, real-world problems.
• Develop multi-step evaluation frameworks that reflect real-world performance across components such as retrieval, reasoning, and tool use in both cloud and edge environments.
• Go beyond model benchmarks by defining rigorous, outcome-focused metrics that measure reasoning, factual accuracy, robustness, and user success.
• Build and maintain reproducible evaluation pipelines, including datasets, test suites, configurations, and automated regression tracking.
• Curate and generate high-quality datasets, including synthetic and adversarial examples, to strengthen coverage and system robustness.
• Implement and refine LLM-as-a-judge evaluations, ensuring alignment with human judgment and fairness across tasks.
• Perform deep error analyses, identifying failure patterns and translating them into actionable insights for engineers and researchers.
• Collaborate closely with ML teams to create a data flywheel, where evaluation continuously informs prompt design, data generation, training, and deployment.
• Monitor operational metrics (latency, cost, reliability) to ensure evaluations mirror production and customer realities.
• Champion best practices in code quality, documentation, version control, and reproducibility within ML pipelines.
• Mentor and collaborate, contributing to a culture of learning, experimentation, and continuous improvement.

What You’ll Bring
• Master’s or PhD in Computer Science, Machine Learning, Data Science, Physics, Engineering, or a related technical discipline.
• 3+ years (mid-level) or 5+ years (senior) of experience as a Data Scientist, ML Engineer, or Research Scientist in applied AI/ML projects in production.
• Demonstrated expertise in evaluating machine learning systems, ideally in LLMs, RAG pipelines, or multi-agent architectures.
• Proven ability to design and operationalize evaluation methodologies beyond static benchmarks, measuring reasoning, robustness, and task success.
• Strong background in dataset creation and curation, including synthetic data generation.
• Hands-on experience with agentic AI, retrievers and vector databases, and orchestration frameworks such as LangGraph or LlamaIndex.
• Solid engineering foundations with proficiency in Python, Docker, Git, and scalable, modular ML codebases.
• Familiarity with key frameworks and libraries: PyTorch, HuggingFace, LangGraph, LlamaIndex, Pandas, etc.
• Experience with cloud platforms (preferably AWS).
• Strong communication and problem-solving skills, with fluency in English.

By applying to this role you understand that we may collect your personal data and store and process it on our systems. For more information please see our Privacy Notice.