Lead DevSecOps Engineer (Blockchain/Canton/DAML)
19 hours ago
Jersey City
As a Lead DevSecOps Engineer, you will play a key role in driving the delivery and operations of DTCC’s Web3 Platform, with a primary focus on leading infrastructure development and operational excellence for the Canton/DAML-based distributed ledger environment. You will work closely with engineering teams to establish and maintain DevSecOps best practices, including CI/CD automation, infrastructure reliability, observability, and secure deployment patterns across both cloud and DLT components. In this role, you will serve as the Canton/DAML infrastructure and operations subject matter expert (SME)—guiding platform design decisions, defining production-grade deployment patterns, and ensuring that Canton/DAML services are scalable, resilient, secure, and compliant. You will collaborate with in-house engineering teams and external vendor partners to integrate platform components and deliver production-ready Web3 capabilities, supporting globally distributed teams through hands-on troubleshooting, environment improvements, tooling enhancements, and automation initiatives. 
Primary Responsibilities
• Lead the design, build, and operationalization of Canton/DAML infrastructure (e.g., participant nodes and supporting services), ensuring production readiness, resilience, scalability, and security.
• Own the Canton/DAML environment strategy across development, test, staging, and production, standardizing environment configurations, release processes, and operational runbooks.
• Develop and maintain Infrastructure-as-Code (IaC) for blockchain/Web3 platform deployments, including network topology, identity and access controls, encryption/key management integrations, compute, storage, and secrets management.
• Build and evolve CI/CD pipelines for blockchain and associated application workloads, including automated validation/testing, security scanning, artifact promotion, and controlled releases with rollback strategies.
• Define and implement observability standards across platform services (metrics, logs, traces, dashboards, and alerting) in support of SLOs/SLAs and rapid incident response.
• Establish high availability (HA) and disaster recovery (DR) patterns for platform infrastructure (multi-zone/region design where applicable), including backup/restore, upgrade strategies, and operational readiness.
• Partner with architecture, risk, and security teams to ensure platform deployments align with enterprise security and compliance controls, including certificate management, IAM integration, least-privilege access, and auditability.
• Provide hands-on L3/L4 support for production platform operations, including performance tuning, incident triage, root cause analysis, and continuous improvement initiatives.
• Coordinate with internal teams and vendor partners to support platform upgrades, configuration changes, and operational improvements, ensuring minimal disruption and strong change management practices.
Talents Needed for Success
• Hands-on experience operating and scaling Canton/DAML in production or production-like environments (infrastructure, deployments, upgrades, monitoring, and incident response).
• Demonstrated ability to act as a technical lead / SME for DLT infrastructure, defining deployment standards, operational processes, and reliability practices.
• Experience building secure CI/CD and IaC patterns for complex distributed systems (DLT, microservices, event-driven platforms).
Technical Requirements
• Canton/DAML: Experience with Canton and DAML application/platform lifecycle, including deployment architecture, environment management, security controls, and observability.
• Azure DLT / Web3: Experience supporting Azure-based DLT stacks (e.g., Canton, Besu) and integrating with enterprise cloud services.
• Security + Identity: Strong understanding of IAM, certificate-based auth, secrets management, encryption/key vault patterns, and network segmentation for secure DLT operations.
• Reliability Engineering: Proven experience implementing HA/DR, SRE practices, and performance optimization for distributed platforms.
Nice to Have
• Experience with DAML application build/release workflows and supporting developer enablement (tooling, templates, automation).
• Experience with distributed ledger operational concerns (latency, throughput, node lifecycle management, certificate rotation, topology changes).
• Background in SRE practices, including error budgets, capacity planning, and resilience testing (chaos testing, failure injection).