Site Reliability Engineer (Kubernetes / Multi-Cloud)
Hereford
Company Description

Synalogik develops technology that enables organisations to work effectively with complex and disparate data sources. Through automated workflows, our platform collects, processes, and analyses data from a wide range of trusted datasets, allowing users to handle large volumes of information quickly and reliably. Our flagship product, Scout®, helps people make better decisions by bringing together useful information from different sources, supported by systems that are built to be reliable and scalable.

Scout® has become the compliance and investigation platform for a third of the UK gambling market, UK government agencies, and insurance, legal, and banking companies. Despite our long client list, this is just the beginning for Synalogik. A raft of new innovations is in the works, and there is the chance to join a rapidly expanding team, work with our data science and industry experts, have your ideas supported, and see them in products used by Tier 1 businesses.

Role Description

We are looking for a Site Reliability Engineer (SRE) to join an established and growing SRE team supporting Kubernetes-based platforms running across Azure and AWS. This role focuses on maintaining reliable, scalable, and observable systems, working closely with engineering teams to ensure services run smoothly in production. You will contribute to the operation of managed Kubernetes platforms (AKS/EKS), supporting best practices in monitoring, automation, and incident response, while continuing to develop your expertise in cloud-native technologies.

Key Responsibilities

Site Reliability Engineering
• Participate in incident response, troubleshooting, and post-incident reviews
• Help reduce operational toil through automation and process improvements
• Contribute to improving system availability, performance, and scalability
• Maintain and improve runbooks and operational documentation
• Participate in the 24/7 on-call rota

Kubernetes & Container Platforms
• Support the deployment and operation of AKS and EKS clusters
• Assist with cluster upgrades, scaling, and maintenance
• Work with autoscaling tools (Cluster Autoscaler, KEDA, Karpenter)
• Help improve workload reliability and performance

Cloud Infrastructure (Azure & AWS)
• Contribute to infrastructure provisioning using Terraform
• Support networking, identity, compute, and storage services
• Assist with maintaining secure and scalable environments

Observability & Monitoring
• Work with Prometheus, Grafana, OpenTelemetry, Azure Monitor, and CloudWatch
• Build dashboards, alerts, and logging/tracing pipelines
• Support monitoring aligned to SLIs/SLOs

Security & Compliance
• Implement RBAC/IAM, secrets management, and network security controls
• Support compliance and security requirements

CI/CD & Automation
• Support CI/CD pipelines (Jenkins / Argo)
• Contribute to GitOps workflows (ArgoCD / Flux)
• Assist in automating deployments and operations

Collaboration
• Work closely with software engineers and product owners
• Contribute to team discussions, reviews, and planning

Essential Skills & Experience

Kubernetes
· Hands-on Kubernetes experience (AKS, EKS, or similar)
· Understanding of cluster architecture, networking, and scaling

Cloud
· Experience with Azure and/or AWS
· Familiarity with networking, IAM, and core services

Infrastructure as Code
· Experience with Terraform

Observability
· Familiarity with monitoring/logging tools (Prometheus, Grafana, Loki)

Technical Skills
· Helm Charts / Kustomize creation and maintenance
· Linux fundamentals
· Containers (Docker)

Desirable Skills
· Exposure to both Azure and AWS
· GitOps tools (ArgoCD / Flux)
· Autoscaling tools (KEDA, Karpenter)
· Service mesh exposure

Personal Attributes
· Problem-solving mindset
· Willingness to learn
· Good communication skills
· Proactive and dependable

Qualifications
· Relevant certifications (desirable)
· 2–4 years of experience in cloud/SRE/platform roles

Position Details

Location – Hybrid (Hereford based) or Remote
Employment Type – Full Time
Salary – £45,000 to £55,000

Benefits
· Work-life balance is important; you'll get 24 days' annual leave, increasing by 1 day each year up to 30 days
· You will have the flexibility to work from home and to work flexibly around core hours
· Monthly one-on-ones
· A competitive pension scheme into which the company will contribute 5% of qualified earnings
· Private medical and dental healthcare insurance
· Life insurance
· Use of the latest technology, including a top-of-the-range MacBook Pro or Air

Synalogik is an equal opportunity employer that is committed to diversity and inclusion in the workplace. We prohibit discrimination and harassment of any kind based on race, colour, sex, religion, sexual orientation, national origin, disability, genetic information, pregnancy, or any other protected characteristic. This policy applies to all employment practices within our organisation, including hiring, recruiting, promotion, termination, redundancy, leave of absence, compensation, benefits, training, and apprenticeship. Synalogik makes hiring decisions based solely on qualifications, merit, and business needs at the time.

You must be willing to undergo SC Clearance.