Staff Network Engineer (Menlo Park, CA or Durham, NC) #4507
Raleigh
Job Description

Our mission is to detect cancer early, when it can be cured. We are working to change the trajectory of cancer mortality and bring stakeholders together to adopt innovative, safe, and effective technologies that can transform cancer care. We are a healthcare company, pioneering new technologies to advance early cancer detection. We have built a multi-disciplinary organization of scientists, engineers, and physicians and we are using the power of next-generation sequencing (NGS), population-scale clinical studies, and state-of-the-art computer science and data science to overcome one of medicine’s greatest challenges.

GRAIL is headquartered in the Bay Area of California, with locations in Washington, D.C., North Carolina, and the United Kingdom. It is supported by leading global investors and pharmaceutical, technology, and healthcare companies. For more information, please visit grail.com.

As a Staff Network Engineer at GRAIL, you will be a hands‑on technical leader responsible for building, operating, and evolving our cloud and hybrid network infrastructure. You’ll spend a significant portion of your time designing, implementing, and troubleshooting secure, scalable, and highly available network solutions in AWS (centered on Amazon VPC), while also owning critical on‑prem and data center networking (Juniper/Aruba) and Palo Alto firewalls. You will both execute (design, configure, implement, monitor, and debug) and provide architectural leadership, standards, and mentorship across teams. A key part of the role includes robust monitoring, logging, dashboarding, and capacity planning to ensure reliable, predictable network performance.

This is a hybrid role based in either Menlo Park, CA (moving to Sunnyvale, CA in Fall 2026) or Durham, NC. Our current flexible work arrangement policy requires that a minimum of 60%, or 24 hours, of your total work week be on-site.
Your specific schedule, determined in collaboration with your manager, will align with team and business needs and could exceed the 60% requirement for the site.

Responsibilities

Staff Network Engineering - AWS and Hybrid Cloud

• AWS VPC Engineering
  • Design, build, and maintain Amazon VPCs including CIDR planning, subnet design (public/private), route tables, Internet Gateways (IGW), NAT gateways, and VPC endpoints (Interface/Gateway).
  • Configure and manage security controls such as Security Groups, NACLs, AWS Network Firewall, and AWS WAF for defense‑in‑depth across environments.
• Hybrid Connectivity
  • Implement and support hybrid connectivity using AWS Direct Connect, Site‑to‑Site VPNs, and AWS Transit Gateway for scalable VPC‑to‑VPC and on‑prem connectivity.
• Traffic Management & DNS
  • Configure Amazon Route 53 for internal and external DNS, routing policies, health checks, and failover.
  • Deploy and manage Elastic Load Balancing (ALB/NLB/GLB) to provide high availability, SSL termination, path‑based routing, and/or TCP/UDP load balancing.
• On‑Prem & Data Center Networking
  • Operate and troubleshoot on‑prem and data center networks using Juniper and Aruba platforms (switching, routing, VLANs, VRFs, BGP/OSPF).
  • Configure, manage, and tune Palo Alto Networks firewalls, including security policies, NAT, VPN, and content inspection.
• Monitoring, Logging & Dashboards
  • Design and implement end‑to‑end monitoring, alerting, and dashboards for network health, performance, and security, leveraging tools such as:
    • VPC Flow Logs, CloudWatch metrics/logs, and Route 53 health checks.
    • Firewall logs and on‑prem device telemetry.
  • Build and maintain dashboards for:
    • Link utilization, latency, packet loss, and error rates (DX, VPN, TGW, campus links).
    • Load balancer health, connection metrics, and capacity.
    • DNS performance and resolution issues.
  • Establish actionable alerting thresholds and runbooks to support rapid incident triage and resolution.
• Capacity Planning & Performance
  • Perform ongoing capacity planning for AWS networking (VPCs, TGW, DX, VPN, load balancers) and on‑prem links, forecasting growth and identifying bottlenecks.
  • Analyze traffic patterns and utilization data to right‑size connectivity, optimize routing, and plan upgrades before they become constraints.
  • Run performance tests and baselines (throughput, latency, failover behavior) and tune configurations accordingly.
• Incident Response & Troubleshooting
  • Lead network‑related incident response, including real‑time troubleshooting across layers (DNS, TCP/IP, TLS, HTTP, internal app protocols).
  • Drive root‑cause analysis (RCA) and implement corrective and preventive actions (runbooks, automation, design changes).
• Architecture & Design (Significant Component)
  • Own end‑to‑end network architecture for multi‑account, multi‑region AWS environments, ensuring scalability, reliability, observability, and security.
  • Develop and maintain network reference architectures and patterns for:
    • Isolated and regulated environments.
    • Service‑to‑service connectivity using PrivateLink, VPC peering, and/or VPC Lattice.
    • Ingress/egress patterns through ELB, Global Accelerator, and centralized egress VPCs.
  • Design application connectivity, segmentation, and zero‑trust network patterns in partnership with Security and Platform teams.
  • Evaluate and introduce advanced AWS networking capabilities (e.g., AWS App Mesh, Amazon VPC Lattice, AWS Global Accelerator) where they provide clear operational or performance benefits.
  • Ensure architectural designs explicitly include observability and capacity planning requirements (telemetry, KPIs, SLOs).
• Automation, Tooling & Governance
  • Build and maintain infrastructure‑as‑code for network components (e.g., Terraform/CloudFormation modules for VPCs, TGWs, Direct Connect, routing, firewall rules).
  • Integrate network provisioning and configuration into CI/CD pipelines to support safe, auditable, and repeatable deployments.
  • Automate generation and updates of network monitoring, logging, and dashboard configurations where possible.
  • Define and codify network standards, guardrails, and best practices for AWS and on‑prem networking, including monitoring and capacity baselines.
  • Partner with Security and Compliance to ensure designs and implementations meet regulatory and internal policy requirements, including logging and retention requirements.
• Collaboration & Leadership
  • Act as the primary subject matter expert for AWS networking, hybrid connectivity, and network observability, providing guidance to platform, SRE, security, and application teams.
  • Mentor other engineers on networking fundamentals, AWS networking, performance troubleshooting, and effective monitoring/dashboards.
  • Lead and review technical designs, RFCs, and architectural decisions for network‑related projects.
  • Communicate complex networking concepts, trade‑offs, and capacity risks to both technical and non‑technical stakeholders.

These responsibilities summarize the role’s primary responsibilities and are not an exhaustive list.
They may change at the company’s discretion.

Required Qualifications

• 10+ years of experience in network engineering, with at least several years in a senior/staff or architecture‑oriented role.
• Deep, hands‑on experience with AWS networking:
  • Amazon VPC (CIDR design, subnets, IGW/NAT, route tables, endpoints).
  • Security Groups and NACLs.
  • AWS Transit Gateway, Site‑to‑Site VPN, and AWS Direct Connect.
  • Route 53 and ELB (ALB/NLB/GLB).
• Strong enterprise/data center networking experience:
  • Juniper and/or Aruba networking platforms.
  • Routing/switching (BGP, OSPF, VLANs, VRFs, link aggregation, redundancy protocols).
• Hands‑on experience with Palo Alto Networks firewalls (policy management, NAT, VPN, content inspection).
• Demonstrated experience setting up monitoring, logging, and dashboards for network infrastructure (cloud and on‑prem), and using this data for incident response and capacity planning.
• Proven track record building and operating secure, highly available, and scalable network infrastructures in production.
• Solid understanding of network security principles, segmentation, and zero‑trust concepts.
• Strong troubleshooting skills across layers (DNS, TCP/IP, TLS, HTTP, internal app protocols).
• Excellent communication skills and experience working in cross‑functional, fast‑moving environments.

Preferred Qualifications

• Experience in healthcare, life sciences, or other highly regulated or security‑sensitive environments.
• Experience with:
  • AWS Network Firewall, AWS WAF.
  • AWS App Mesh and/or Amazon VPC Lattice.
  • AWS Global Accelerator and edge networking patterns.
• Proficiency with infrastructure‑as‑code (e.g., Terraform, CloudFormation) and automation/scripting (Python, Bash, PowerShell, etc.).
• Experience designing SLOs, KPIs, and alerting strategies for network reliability and performance.
• Familiarity with SD‑WAN, SASE, and/or Zero Trust Network Access (ZTNA) solutions.