Director of Site Reliability Engineering
2 days ago
Houston
Job Description

This Jobot Job is hosted by: Merwan Zattam

Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume.

Salary: $220,000 - $260,000 per year

A bit about us:

We are a mission-driven organization dedicated to making AI adoption safe and secure for enterprises worldwide. As the leading provider of Security for AI, our platform protects agentic, generative, and predictive AI applications across the entire lifecycle, safeguarding intellectual property, ensuring compliance, and enabling organizations to innovate with confidence.

Our team was founded by cybersecurity and machine learning veterans who experienced a real adversarial AI attack firsthand. That moment led to the creation of a new category focused entirely on protecting machine learning systems from threats such as prompt injection, adversarial manipulation, model theft, and supply chain compromise.

Backed by strategic investors including Microsoft’s Venture Fund (M12), Moore Strategic Ventures, Booz Allen Ventures, IBM Ventures, and Capital One Ventures, we combine patented technology with industry-leading research to defend the world’s most critical AI systems. Recognized by Gartner as a “Cool Vendor for AI Security” and trusted by Fortune 500 organizations, government agencies, and enterprises across highly regulated industries, we are shaping the future of AI security in real time.

With strong product–market fit and rapid growth, this is an opportunity to join a generational company at a true inflection point, where the mission is bold, the bar is high, and the room for impact and growth is unmatched.

Why join us?

Top Benefits of Working Here

Job Details

Director of Site Reliability Engineering
Remote – United States

We are seeking a Director of Site Reliability Engineering to lead the broader Platform Engineering organization with a strategic focus on building a world-class SRE function.
Reporting to the VP of Engineering, you will be responsible for the reliability, scalability, and operational excellence of the mission-critical AI security platform used by enterprises and government organizations worldwide. In this senior leadership role, you will define the SRE strategy, mentor and scale a high-performing team, and implement the systems, practices, and culture required to support rapid growth.

You will work at the intersection of cutting-edge AI security technology and enterprise-grade infrastructure, ensuring the platform delivers the always-on performance our customers depend on. Your work will directly strengthen the security posture of organizations protecting their most valuable AI assets, from financial institutions and healthcare providers to government and Fortune 500 enterprises.

What You’ll Do

Build and Lead the SRE Function
- Define and execute the SRE strategy and roadmap, positioning reliability as a core product feature
- Build, mentor, and scale a high-performing SRE and Platform Engineering team
- Establish SRE principles, culture, and best practices across engineering
- Create clear career development paths and raise the bar for hiring and excellence

Drive Platform Reliability & Operational Excellence
- Own reliability, availability, latency, and performance across multi-cloud, multi-region deployments (AWS, Azure, GCP)
- Set and achieve SLOs/SLIs aligned with business objectives
- Architect multi-region resiliency: automated failover, graceful degradation, and disaster recovery
- Build robust observability: distributed tracing, metrics, logging, and actionable alerting
- Lead incident management: on-call processes, incident command, blameless post-mortems, and systematic remediation

Enable Developer Velocity & Platform Excellence
- Own CI/CD pipelines and deployment infrastructure for safe, fast, reliable delivery
- Build internal developer platforms and tooling that reduce toil and improve productivity
- Implement progressive delivery (canaries, 
feature flags, automated rollbacks)
- Partner with engineering teams to embed reliability requirements and design patterns early in development

Security, Compliance & Enterprise Requirements
- Ensure alignment with standards such as FedRAMP, SOC 2, ISO 27001, and other regulatory requirements
- Build and support air-gapped and on-premises deployment capabilities
- Implement infrastructure security controls, secrets management, and audit logging
- Support customer-facing SLAs and maintain trust with enterprise and government clients

Scale & Optimize the Platform
- Lead capacity planning and performance engineering for platform growth
- Drive chaos engineering and resilience testing to validate system behavior under failure
- Optimize cost while maintaining reliability and performance
- Automate operational workflows to eliminate toil and improve efficiency

What You Bring

Leadership & Experience
- 8+ years in infrastructure, platform engineering, or SRE roles
- 4+ years in engineering leadership
- Experience supporting mission-critical, always-on systems at enterprise scale
- Strong people leadership and a track record of building high-performing teams

Technical Expertise
- Deep knowledge of cloud infrastructure (AWS, Azure, GCP) and multi-region systems
- Strong experience with Kubernetes, Docker, and infrastructure-as-code (Terraform, Pulumi, CloudFormation)
- Proven ability to build and operate large-scale distributed systems
- Expertise in observability tooling (Prometheus, Grafana, Datadog, New Relic, ELK/EFK, distributed tracing)
- Proficiency in Python, Go, or similar languages
- Understanding of databases, data pipelines, message queues, and caching systems

Strategic & Operational Skills
- Experience driving SRE strategy, SLOs/SLIs, error budgets, and incident management
- Ability to partner across engineering, product, security, and customer success
- Strong communication skills across technical and non-technical audiences
- Pragmatic problem-solving and sound decision-making

Bonus Experience
- Background 
in cybersecurity or AI/ML infrastructure
- Familiarity with compliance frameworks (FedRAMP, SOC 2, ISO 27001, NIST)
- Experience supporting air-gapped or on-premises deployments
- Hands-on experience with chaos engineering and game day exercises
- Open-source contributions or SRE community leadership

Why This Opportunity Stands Out
- Impact: Define reliability strategy for a category-leading AI security platform
- Growth: Build and scale the SRE function from the ground up in a fast-growing, well-funded environment
- Mission: Work on technology that is shaping the future of secure AI adoption
- Team: Join a world-class engineering organization with deep roots in security, ML, and distributed systems
- Innovation: Solve novel problems at the intersection of AI, security, and infrastructure
- Flexibility: Fully remote role with competitive compensation, equity, and benefits

Location & Work Environment
This is a fully remote position within the United States. We value flexibility, ownership, collaboration, and excellence. The team operates across time zones with a blend of async communication, regular syncs, and purposeful in-person gatherings.

Equal Opportunity
We are an equal opportunity employer and do not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any legally protected status. We are committed to fostering an inclusive environment where all team members can thrive. If you need accommodations during the application or interview process, please let us know.

Interested in hearing more? Easy Apply now by clicking the "Apply Now" button.

Jobot is an Equal Opportunity Employer.
We provide an inclusive work environment that celebrates diversity and all qualified candidates receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity, religion, national origin, age (40 and over), disability, military status, genetic information or any other basis protected by applicable federal, state, or local laws. Jobot also prohibits harassment of applicants or employees based on any of these protected categories. It is Jobot’s policy to comply with all applicable federal, state and local laws respecting consideration of unemployment status in making hiring decisions.

Sometimes Jobot is required to perform background checks with your authorization. Jobot will consider qualified candidates with criminal histories in a manner consistent with any applicable federal, state, or local law regarding criminal backgrounds, including but not limited to the Los Angeles Fair Chance Initiative for Hiring and the San Francisco Fair Chance Ordinance.

Information collected and processed as part of your Jobot candidate profile, and any job applications, resumes, or other information you choose to submit is subject to Jobot's Privacy Policy, as well as the Jobot California Worker Privacy Notice and Jobot Notice Regarding Automated Employment Decision Tools which are available at jobot.com/legal.

By applying for this job, you agree to receive calls, AI-generated calls, text messages, or emails from Jobot, and/or its agents and contracted partners. Frequency varies for text messages. Message and data rates may apply. Carriers are not liable for delayed or undelivered messages. You can reply STOP to cancel and HELP for help. You can access our privacy policy here: jobot.com/privacy-policy

Company Description

Jobot is on a mission to connect good people with good jobs.
By combining AI-powered technology with the expertise of Jobot Pros, our experienced recruiters, we help you find career opportunities that align with your goals and values.

Founded in 2018 and employee-owned since 2024, Jobot is committed to fostering a culture of kindness, respect, innovation, and connection. As an industry leader, we’ve been recognized as a top workplace by Forbes, Fortune, USA Today, and Staffing Industry Analysts (SIA).

Ready to find a good job? Create your profile today at Jobot.com.