HPC Engineer - Contract via Umbrella - Cambridge/Hybrid
7 days ago
Cambridge
Location: Cambridge, hybrid (ideally 3 days onsite)
Market rate

Description

We're looking for an HPC Engineer to join our team in the United Kingdom on a hybrid working model (ideally 3 days onsite). In this role, you will help build and operate industry-leading high-performance computing (HPC) capabilities, including application build frameworks, containerized applications and cloud-based services. You will work closely with the scientific community to deliver high-quality HPC services, using automation, infrastructure-as-code and DevOps practices to ensure scalability, reliability and performance in a rapidly evolving HPC landscape.

Responsibilities

• Design, implement and maintain robust platform infrastructure using Infrastructure as Code (IaC) tools such as Terraform
• Develop, deliver and operate research computing services and applications
• Take a Site Reliability Engineering approach to HPC services, managing development, deployment, monitoring and incident response end-to-end
• Solve complex technical problems related to HPC services and user workflows
• Drive innovative computational solutions and exploit emerging technologies
• Administer large-scale cluster and server computing environments and related software (e.g., Slurm, LSF, Grid Engine)
• Apply DevOps practices and agile methodologies to HPC operations
• Manage virtualized private cloud resources (e.g., OpenStack)
• Implement and administer large-scale parallel filesystems (e.g., Weka, GPFS, Lustre)
• Use configuration management tools (e.g., Ansible, Salt, Puppet) for IT operations

Requirements

• 10+ years of experience operating or engineering large-scale computing environments (HPC, HTC or BC)
• Strong understanding of Linux system administration, the TCP/IP stack and storage subsystems
• Experience with high-speed networks (e.g., InfiniBand)
• Proven experience with configuration management and automation frameworks
• Hands-on experience with DevOps processes and agile methodologies
• Experience in developing and managing relationships with third-party suppliers
• Scientific degree and/or experience in computationally intensive scientific data analysis
• Experience with public cloud infrastructure (AWS, Azure, GCP)
• Experience managing virtualized private cloud environments (e.g., OpenStack)
• Familiarity with container technologies (LXD, Singularity, Docker, Kubernetes)
• Development experience with programming languages and tools (Java/C++, Python/Ruby/Perl, SQL)
• Experience with HashiCorp tools (Terraform, Vault, Consul, Nomad)