Senior MLOps Engineer
2 days ago
Chazo
Job Description: Senior MLOps Engineer
Location: Remote from Spain (Spanish employment contract)

We are seeking an experienced MLOps Engineer with expertise in Google Cloud Platform (GCP) to design, build, and optimize end-to-end AI, ML, and data engineering pipelines. This role involves deploying machine learning models, LLMs, and traditional AI models, as well as managing data processing workflows in a GCP-first environment. The ideal candidate will have experience working with Google Kubernetes Engine (GKE), Apache Spark, Dataproc, Terraform, Vertex AI, and Airflow (Cloud Composer) to ensure scalable and efficient AI/ML operations. While Amazon Web Services (AWS) experience is a plus, it is not required.

Requirements:
• 4-year degree preferred; relevant experience will be considered
• 3+ years of MLOps/DevOps/Data Engineering experience, with expertise in Google Cloud Platform (Vertex AI, Dataproc, BigQuery, Cloud Functions, Cloud Composer, GKE)
• Hands-on experience building AI/ML pipelines and data engineering workflows using Apache Airflow (Cloud Composer), Spark, Databricks, and distributed data processing frameworks (a minimal Airflow sketch follows this list)
• Experience working with LLMs and traditional AI/ML models, including fine-tuning, inference optimization, quantization, and serving
• Proficiency in CI/CD for ML, version control (Git), and workflow orchestration (Airflow, Kubeflow, MLflow)
• Strong experience with Terraform for infrastructure automation
• Strong knowledge of Apigee for deploying, managing, and securing machine learning APIs at scale
• Production-ready AI/ML solutions: proven ability to build, deploy, and maintain AI models in real-world production environments
• Programming skills: proficiency in Python and familiarity with Bash, Scala, or Terraform scripting
• Experience with security best practices for ML models, including IAM, data encryption, and model governance
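To give a concrete sense of the pipeline and orchestration work described above, the following is a minimal, illustrative sketch of a Cloud Composer (Airflow) DAG that stages data and submits a Vertex AI custom training job via the google-cloud-aiplatform SDK. The project, region, bucket, image, and task names are placeholder assumptions for illustration, not details taken from this posting.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract_to_gcs():
        # Placeholder step: pull raw data and land it in a GCS bucket
        # (the bucket name is an assumption).
        print("extracting raw data to gs://example-bucket/raw/")


    def train_on_vertex_ai():
        # Placeholder step: submit a Vertex AI custom training job using the
        # google-cloud-aiplatform SDK; project, region, bucket, and container
        # image names are assumptions.
        from google.cloud import aiplatform

        aiplatform.init(
            project="example-project",
            location="europe-west1",
            staging_bucket="gs://example-bucket/staging",
        )
        job = aiplatform.CustomContainerTrainingJob(
            display_name="example-training-job",
            container_uri="europe-docker.pkg.dev/example-project/train/trainer:latest",
        )
        job.run(replica_count=1, machine_type="n1-standard-4", sync=True)


    with DAG(
        dag_id="example_ml_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_to_gcs", python_callable=extract_to_gcs)
        train = PythonOperator(task_id="train_on_vertex_ai", python_callable=train_on_vertex_ai)
        extract >> train

In practice the plain PythonOperator steps would often be replaced by the Google provider's Vertex AI and GCS operators; the sketch only shows the overall DAG shape.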
Bonus Qualifications/Experience:
• Experience with multi-cloud AI/ML solutions
• Familiarity with AWS AI/ML services (SageMaker, EMR, Lambda, EKS, DynamoDB)
• Knowledge of feature stores (Feast, Vertex AI Feature Store, AWS Feature Store)
• Understanding of AIOps and ML observability tools
• Experience with real-time AI inference pipelines and low-latency model serving
• GitLab CI/CD, with a focus on CI/CD for GCP deployments
• Experience working with PHI/PII in HIPAA- and/or GDPR-compliant environments

Responsibilities:
• Build, deploy, and automate AI and ML pipelines on Google Cloud Platform (GCP) using tools such as Vertex AI, BigQuery, Dataproc, Cloud Functions, and GKE
• Deploy, optimize, and scale Large Language Models (LLMs) and other AI/ML models using platforms such as Hugging Face Transformers, the OpenAI API, Google Gemini, Meta Llama, TensorFlow, and PyTorch
• Design and manage data ingestion, transformation, and processing workflows using Apache Airflow (Cloud Composer), Spark, Databricks, and ETL pipelines
• Deploy AI/ML models and data services using Docker, Kubernetes (GKE), Helm, and serverless architectures including Cloud Run
• Automate and manage ML/AI deployments using Infrastructure as Code tools such as Terraform and CI/CD pipelines with GitHub Actions or GitLab
• Develop scalable, fault-tolerant ML pipelines to train, deploy, and monitor models in production environments
• Deploy AI models using TensorFlow Serving, TorchServe, FastAPI, Flask, and GCP-native serverless technologies such as Cloud Run (a minimal serving sketch follows this list)
• Implement monitoring, drift detection, and performance tracking for AI/ML models using MLflow, Prometheus, Grafana, and Vertex AI Model Monitoring
• Ensure security, governance, access control, and compliance best practices across AI and ML workflows
• Design cloud-native architectures with GCP as the core platform, utilizing its AI/ML and data engineering tools
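To make the model-serving responsibility above more concrete, here is a minimal, illustrative FastAPI app of the kind that could be containerized and deployed to Cloud Run or GKE. The model artifact, request schema, and endpoint paths are assumptions for illustration, not requirements from this posting.

    from fastapi import FastAPI
    from pydantic import BaseModel
    import joblib

    app = FastAPI(title="example-model-service")
    # Assumed model artifact baked into the container image at build time.
    model = joblib.load("model.joblib")


    class PredictRequest(BaseModel):
        features: list[float]


    @app.post("/predict")
    def predict(req: PredictRequest):
        # scikit-learn style prediction on a single row of features.
        prediction = model.predict([req.features])[0]
        return {"prediction": float(prediction)}


    @app.get("/healthz")
    def healthz():
        # Liveness endpoint for Cloud Run / GKE probes.
        return {"status": "ok"}

On Cloud Run, an app like this would typically be run with uvicorn, listening on the port provided by the container's PORT environment variable.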