Senior Data Engineer
2 days ago
City of London
Mercuria is a major player in the physical and financial global commodity markets, with key trading centers in London, Geneva, Houston, Singapore, Shanghai, and Beijing. We actively trade in all major commodity asset classes - from crude and refined oil products to power & gas, LNG, coal, and emissions, through to freight, metals, and agricultural products. We operate a globally diversified technology team across hubs such as Geneva, London, Houston, and Singapore, while working closely with strategic co-development centers in Bucharest, Bangalore, and Hyderabad. All teams follow an Agile delivery model in partnership with our business stakeholders to deliver multi-asset-class commodity systems with a focus on automation, user experience, optimization, innovation, and control.

The Role

This is a fantastic opportunity to join one of the largest integrated energy and commodity trading companies in the world. We are looking for a Senior Data Engineer with strong technical expertise in Databricks, data engineering, and cloud-native analytics platforms. You will contribute to the development and expansion of our global analytics platform, which supports Front Office Trading across commodities, by building scalable, secure, and efficient data solutions. You will work alongside data scientists, ML engineers, and business stakeholders to understand requirements, design and build robust data pipelines, and deliver end-to-end analytics and ML/AI capabilities.

Key Responsibilities

• Design, build, and maintain scalable data pipelines and Delta Lake architectures in Databricks on AWS.
• Develop and enhance the Front Office data warehouse to ensure performance, reliability, and data quality for trading analytics.
• Partner with data scientists and quants to prepare ML-ready datasets and support the development of production-grade ML/AI pipelines.
• Implement and maintain CI/CD pipelines, testing frameworks, and observability tools for data engineering workflows.
• Contribute to MLOps practices, including model tracking, deployment, and monitoring using MLflow and Databricks tools.
• Participate in code reviews, data modeling sessions, and collaborative solutioning across cross-functional teams.
• Ensure compliance with data governance, security, and performance standards.
• Stay current with Databricks platform enhancements and cloud data technologies to apply best practices and recommend improvements.
Skills and Experience

• BS/MS in Computer Science, Software Engineering, or an equivalent technical discipline.
• 8+ years of hands-on experience building large-scale distributed data pipelines and architectures.
• Expert-level knowledge of Apache Spark, PySpark, and Databricks, including experience with Delta Lake, Unity Catalog, MLflow, and Databricks Workflows.
• Deep proficiency in Python and SQL, with proven experience building modular, testable, reusable pipeline components.
• Strong experience with AWS cloud services, including S3, Lambda, Glue, API Gateway, IAM, EC2, and EKS, and with integrating AWS-native components with Databricks.
• Advanced skills in Infrastructure as Code (IaC) using Terraform for provisioning data infrastructure, including permissions, clusters, jobs, and lakehouse resources.
• Proven experience building MLOps pipelines, tracking the model lifecycle, and integrating with modern ML frameworks (e.g., scikit-learn, XGBoost, TensorFlow).
• Exposure to streaming data pipelines (e.g., Kafka, Structured Streaming) and real-time analytics architectures is a strong plus.
• Experience implementing robust DevOps practices for data engineering: versioning, testing frameworks, deployment automation, and monitoring.
• Familiarity with data governance, access control, and regulatory compliance requirements in financial or trading environments.
• Excellent communication and problem-solving skills, with a strong sense of ownership and the ability to work in agile cross-functional teams.
• Experience in commodity trading markets (e.g., power, gas, crude, freight) is a significant advantage.
• Certifications in Databricks, AWS, or relevant big data technologies are preferred.