Lead Software Platform Engineer - Foundational Platform & Interoperability
1 day ago
Raleigh
The Lead Software Platform Engineer for Foundational Platform & Interoperability plays a critical role in designing and delivering the enterprise abstractions, automation frameworks, and declarative interoperability capabilities that power Truist's next-generation software delivery ecosystem. This role focuses on building reusable platform services, not one-off tooling: it establishes the manifest-driven, policy-as-code, orchestration, and lifecycle automation foundations that enable teams to integrate, deliver, and operate software with consistency, resilience, and auditability.

The ideal candidate is a hands-on engineer who applies a software engineering mindset to platform problems, builds scalable internal products, and thrives on creating the connective tissue that makes large, federated engineering organizations feel unified. They will help establish and evolve core interoperability patterns (such as automated change controls, standardized integration contracts, golden-path onboarding, asset lifecycle intelligence, CI/CD abstraction layers, and cross-platform agentic automation) so that teams can build and ship with velocity while meeting regulatory expectations.

This role is central to Truist's modernization strategy: turning platform capabilities into consumable products, reducing cognitive load for engineers, and enabling a fully integrated ecosystem where services, systems, and teams interoperate through clear contracts, automated governance, and platform-led consistency.

For this opportunity, Truist will not sponsor an applicant for work visa status or employment authorization, nor will we offer any immigration-related support for this position (including, but not limited to, H-1B, F-1 OPT, F-1 STEM OPT, F-1 CPT, J-1, TN-1 or TN-2, E-3, O-1, or future sponsorship for U.S. lawful permanent residence status).

This position is office-centric 5 days a week in one of our Truist hub locations.
ESSENTIAL DUTIES AND RESPONSIBILITIES

Following is a summary of the essential functions for this job. Other duties may be performed, both major and minor, which are not mentioned below. Specific activities may change from time to time.

• Performs problem tracking, diagnosis and root-cause analysis, replication, troubleshooting, and resolution for highly complex issues.
• In this capacity, oversees others who perform programming and debugging activities.
• Responds to issues in a timely manner by receiving and investigating incidents or service tickets.
• Provides technical consultation on extremely challenging or unusual situations.
• May lead large, complex projects related to improving processes or support capabilities.
• May engage and manage external vendors.
• Interprets internal/external business challenges and recommends best practices.
• Uses sophisticated analytical thought to exercise judgment and identify innovative solutions.
• Mentors less experienced teammates to build technical expertise.

• Bachelor's degree and eight years of experience in development or production support, or an equivalent combination of education and work experience.
• Deep specialized and/or broad functional knowledge.
• Sound understanding of business and organizational strategies and processes.
• Ability to interpret internal and external business challenges and recommend best practices.
• Ability to lead complex projects.
• Sophisticated analytical skills and the ability to solve complex technical and business problems.

• Master's degree in Computer Science, Engineering, Data Science, or related field, with 10+ years of experience building cloud-native platforms, automation frameworks, or distributed systems at enterprise scale.
• Strong hands-on programming experience in:
    • Java (core platform services, integration frameworks)
    • Python (automation, orchestration, agentic workflows)
    • Node.js/TypeScript (API gateways, developer-facing services, UI integration)
• Expertise in modern API design & interoperability standards, including:
    • Designing and documenting RESTful and gRPC APIs using OpenAPI/Swagger and protobuf
    • Consistent versioning, backward compatibility, and contract governance
    • Authentication/authorization patterns (OAuth 2.1/OIDC, JWT), policy enforcement (RBAC/ABAC), rate limiting, quotas, and zero-trust boundaries
    • Strong API hygiene: idempotency, request validation, pagination, content negotiation, and RFC 7807 error standards
• Advanced data access and persistence experience, including:
    • Relational databases (PostgreSQL, MySQL, SQL Server) and NoSQL systems (MongoDB, DynamoDB, Cosmos DB)
    • Schema migration tooling (Flyway, Liquibase)
    • Transaction modeling, indexing strategies, read/write separation, query optimization
    • Distributed caching patterns with Redis/Memcached (read-through, write-through/write-behind, cache-aside, TTL strategy, stampede prevention)
• Deep expertise in messaging, streaming, and integration patterns, including:
    • Kafka, RabbitMQ, SQS/SNS, Azure Service Bus
    • Event-driven architecture, Sagas, Outbox patterns, Debezium CDC
    • Schema governance using Schema Registry / Avro
    • Semantic delivery guarantees (exactly-once, at-least-once)
• Platform-grade resiliency & reliability engineering, including:
    • Timeouts, retries (jittered backoff), circuit breakers, bulkheads, hedging, load shedding
    • Libraries and mesh capabilities: Resilience4j, Polly, Envoy
    • Multi-AZ/region patterns, graceful degradation, compensation workflows
    • Defining SLIs/SLOs, contributing to error budgets, performing chaos testing/fault injection
• Cloud, runtime, and platform orchestration expertise, including:
    • OpenShift, Kubernetes, and container-native service design
    • AWS event-driven and serverless patterns (Lambda, EventBridge, SNS/SQS)
    • Multi-tenant platform architectures and service mesh–driven interoperability
• Deep observability, telemetry, and operational analytics, including:
    • Structured logging with correlation/trace IDs and PII-safe filtering
    • Full-stack instrumentation with OpenTelemetry (metrics, traces, logs)
    • Metrics pipelines using Prometheus/Grafana and RED/USE methodology
    • Splunk and Dynatrace for distributed tracing, APM, log analytics, dashboards, alerting, and service health visualization
    • SLO-driven operational dashboards and runbooks
• Comprehensive automated testing expertise, including:
    • Unit, integration (Testcontainers), contract (Pact), functional/UI (Playwright/Cypress)
    • Mutation testing (PIT, Stryker)
    • Performance testing (k6, Gatling, Locust)
    • Security scanning (SAST/DAST/dependency scanning)
    • Smoke/canary testing and deterministic CI pipelines
    • Test data management (seeded data, anonymized snapshots)
• Modern frontend engineering experience (preferred but not required), including:
    • TypeScript and React (preferred), Angular, or Vue
    • State management (Redux, Zustand, NgRx, Pinia)
    • OAuth 2.1/OIDC/PKCE flows and secure client-side API consumption
    • Performance best practices: code splitting, lazy loading, Lighthouse budgets, WCAG 2.2 AA accessibility
    • Component libraries (Storybook) and visual regression testing