Kafka Administrator
2 days ago
Manchester
Role Overview

We are seeking experienced Kafka Administrators / DevOps Engineers to join the Streaming CoE team responsible for delivering, operating, and optimizing enterprise-scale Kafka-based streaming platforms. The successful candidates will play a key role in building and supporting highly available, secure, and scalable Streaming-as-a-Service solutions that enable real-time data ingestion and event-driven architectures across critical business workloads. This role requires strong expertise in Apache Kafka, Kubernetes, DevOps automation, Cloud Platforms, and Observability tooling, along with hands-on experience managing large-scale streaming environments.

Key Responsibilities

Kafka Platform Administration
• Design, deploy, configure, and manage enterprise Kafka clusters.
• Administer Kafka components including:
    • Kafka Brokers
    • Kafka Connect
    • Schema Registry
    • Kafka Streams
    • ZooKeeper / KRaft
• Manage topics, partitions, replication, retention policies, and consumer groups.
• Perform Kafka cluster upgrades, migrations, scaling, and failover activities.
• Monitor cluster health and optimize performance, throughput, and reliability.
• Troubleshoot Kafka-related issues including:
    • Consumer lag
    • ISR issues
    • Leader election problems
    • Replication delays
    • Connectivity and performance bottlenecks
• Implement and maintain Kafka security controls including SSL/TLS, ACLs, authentication, authorization, and encryption.

DevOps & Automation
• Build and maintain CI/CD pipelines using:
    • Jenkins
    • GitHub Actions
    • Azure DevOps
• Automate platform deployments and configuration management.
• Develop Infrastructure-as-Code solutions using Terraform and Ansible.
• Support GitOps practices and automated deployment workflows.

Kubernetes & Platform Engineering
• Deploy and manage Kafka workloads on Kubernetes platforms.
• Create and maintain Helm charts.
• Support ArgoCD-based GitOps deployment models.
• Manage ConfigMaps, Secrets, certificates, and application configurations.
• Implement rolling upgrades, scaling strategies, and lifecycle management.
• Ensure platform stability, resilience, and operational excellence.

Cloud Infrastructure
• Support Kafka deployments on one or more cloud platforms:
    • AWS
    • Microsoft Azure
    • Google Cloud Platform (GCP)
• Work closely with cloud engineering teams to optimize infrastructure and platform performance.
• Support cloud-native deployment and operational practices.

Monitoring & Reliability Engineering
• Implement and maintain observability solutions using:
    • Prometheus
    • Grafana
    • Dynatrace
    • Confluent Control Center
• Create monitoring dashboards, alerts, and operational reports.
• Drive SRE best practices and platform reliability improvements.
• Participate in incident management, root cause analysis, and continuous improvement initiatives.

Required Skills & Experience

Kafka Administration (Must Have)
• Strong hands-on experience with Apache Kafka.
• Deep understanding of:
    • Brokers
    • Topics
    • Partitions
    • Replication
    • Consumer Groups
    • Kafka Connect
    • Schema Registry
    • Kafka Streams
    • ZooKeeper / KRaft
• Experience with Kafka performance tuning and capacity planning.
• Experience managing enterprise-scale Kafka clusters.
• Strong troubleshooting and operational support experience.
• Knowledge of Kafka security best practices.