AI Platform Engineer
1 day ago
Barcelona
🔹 AI Platform / MLOps Engineer 🔹

📌 About the role
We are looking for an AI Platform / MLOps Engineer to join a fast-growing AI team within an international technology environment. In this role, you will be responsible for operating, scaling, and improving AI/ML systems in production, ensuring that training, inference, pipelines, and platform services are reliable, observable, secure, and cost-efficient.

You will work at the intersection of MLOps, DevOps, Cloud Engineering, and AI platform architecture, supporting the full lifecycle of AI systems, from model training environments to production inference, CI/CD automation, monitoring, and cost optimisation.

This is a hands-on role for someone with a strong platform engineering mindset; solid experience in AWS, infrastructure, automation, and ML tooling; and a passion for building production-grade AI systems. If you enjoy making AI systems scalable, reliable, observable, and ready for real-world usage, this could be a great fit.

💻 What you’ll do
🔹 Operate and scale AI/ML platforms end-to-end, including training, inference, pipelines, and production environments
🔹 Build and maintain robust ML infrastructure using tools such as AWS SageMaker, MLflow, feature stores, and related ML platform components
🔹 Design and implement CI/CD pipelines for ML models, AI workloads, and platform services
🔹 Set up and optimise training and inference environments for reliability, scalability, and performance
🔹 Implement observability, monitoring, alerting, and cost-control mechanisms for AI workloads
🔹 Support production deployments of ML/AI systems with a strong focus on automation and operational excellence
🔹 Work with DevOps and platform tooling such as AWS, Terraform, Kubernetes, Docker, and GitHub Actions or similar CI/CD tools
🔹 Collaborate with AI Engineers, Data Scientists, Data Engineers, and Tech Leads to ensure AI solutions are production-ready
🔹 Contribute to best practices around MLOps, model versioning, experiment tracking, deployment, monitoring, and governance
🔹 Work with LLM and agentic tooling ecosystems such as LangChain, Langfuse, LangSmith, or similar platforms
🔹 Troubleshoot production issues related to infrastructure, pipelines, inference performance, latency, reliability, and cost

💡 Must Have
🔹 Solid background in Platform Engineering, DevOps, Cloud Engineering, MLOps, or ML Platform Engineering
🔹 Hands-on experience with AWS and cloud-native services
🔹 Experience with Infrastructure as Code, especially Terraform
🔹 Strong experience building and maintaining CI/CD pipelines
🔹 Experience with ML platform tooling such as SageMaker, MLflow, feature stores, or similar tools
🔹 Understanding of ML/AI workflows: training, inference, model deployment, pipelines, monitoring, and lifecycle management
🔹 Experience setting up and managing production environments for AI/ML workloads
🔹 Strong understanding of observability, monitoring, alerting, scalability, and cost optimisation
🔹 Familiarity with containerisation and orchestration tools such as Docker and Kubernetes
🔹 Experience with LLM / agentic tooling such as LangChain, Langfuse, LangSmith, or similar frameworks/platforms
🔹 Strong automation mindset and the ability to build reliable, repeatable, production-grade systems
🔹 Strong problem-solving skills and an ownership mindset
🔹 Fluent English and Spanish

✨ Nice to Have
🔹 Experience with data pipelines or data engineering workflows
🔹 Experience with Amazon Bedrock, vector databases, or LLM infrastructure
🔹 Experience with model monitoring, drift detection, evaluation pipelines, or AI observability platforms
🔹 Experience with workflow orchestration tools such as Airflow, Prefect, or similar
🔹 Knowledge of security, governance, and compliance practices for AI/ML platforms
🔹 Experience working in Agile / Scrum environments
🔹 Previous experience in travel, aviation, digital platforms, or large-scale enterprise environments

🏢 Hybrid model: 2 days onsite per week

🌍 Why join this project?
🤝 People first: a diverse and inclusive culture in an international environment.
🚀 Modern cloud platforms and large-scale, global projects.
🧠 Be part of a high-impact environment where AI systems are moving from experimentation to production.
⚙️ Strong focus on engineering quality, automation, reliability, and scalability.
☁️ Hands-on exposure to AWS, MLOps, LLM tooling, and production AI infrastructure.
📈 Opportunity to shape how AI platforms are built, deployed, monitored, and scaled.
😁 High team stability and a collaborative culture.
🎓 €1,200 per year training budget and continuous learning opportunities.
💰 Flexible compensation model.
🩺 Private health insurance and benefits package.
⚡ Flexible working hours and a hybrid model.
🏋️ Wellhub: fitness, wellness, and mental health support.
⚽ Football and paddle tennis teams sponsored by Capitole.
🥳 Team buildings, global events, and strong tech communities.

✨ Want to know more about us? Click ___ and discover all the details.
🔍 Curious about our culture? Check out what people are saying about us on ___.

💬 We know that not every candidate will meet 100% of the requirements. If your profile doesn’t match perfectly but you believe you can add value, we’d still love to hear from you.

👉 Ready for the challenge? Apply now and help build scalable, reliable, production-ready AI platforms.

Empowering People, Unlocking Innovation.

Information Security Notice
• The employee will have access to confidential information related to Capitole and the assigned project.
• Compliance with internal security and information protection policies is mandatory.
• NDA signature required.