Infrastructure Architect - AWS
Atlanta
INFRASTRUCTURE ARCHITECT - AWS
DIRECT HIRE
ATLANTA, GA (Hybrid)
USC/GC - NO SPONSORSHIP PROVIDED

Job Summary
The Infrastructure Architect will lead the strategic design and implementation of enterprise-grade cloud infrastructure solutions on AWS. This senior role architects scalable, secure, and highly available infrastructure supporting multi-tenant SaaS applications, comprehensive monitoring systems, and mission-critical operations. The position requires deep AWS expertise, multi-tenant architecture proficiency, and the ability to integrate AWS Marketplace and third-party solutions while delivering robust, cost-effective infrastructure at scale.

Essential Duties and Responsibilities
• Define infrastructure architecture strategy and roadmap aligned with business objectives
• Design scalable, secure AWS infrastructure following Well-Architected Framework and SaaS Lens principles
• Architect multi-tenant SaaS infrastructure with tenant isolation, resource segregation, and automated provisioning
• Develop production-grade Infrastructure as Code using Terraform, AWS CDK, and CloudFormation
• Create reusable, modular infrastructure components and tenant-aware templates
• Design CI/CD pipelines using GitLab CI/CD for infrastructure deployment with automated testing and security scanning
• Implement deployment strategies (blue-green, canary, rolling) for infrastructure and containerized workloads (ECS/EKS)
• Design enterprise monitoring and observability using Nagios, CloudWatch, Prometheus, and Grafana
• Establish tenant-aware monitoring with APM, distributed tracing (X-Ray), log aggregation, and intelligent alerting
• Architect solutions leveraging AWS Marketplace products for monitoring, security, and infrastructure management
• Establish secure integration patterns with third-party SaaS platforms (OAuth, SAML, API gateways)
• Design tenant isolation strategies across compute, storage, network, and database layers
• Implement metering, usage tracking, and cost allocation systems for multi-tenant environments
• Establish tenant tiering (free/standard/premium) with resource quotas, SLAs, and performance isolation
• Establish infrastructure standards, governance frameworks, and policy-as-code (Config, SCPs, OPA)
• Design disaster recovery, backup strategies, and business continuity procedures
• Design multi-account strategies using AWS Organizations, Control Tower, and Landing Zone
• Lead technical architecture reviews and recommend optimal AWS infrastructure solutions
• Mentor engineering teams on AWS best practices, infrastructure patterns, and DevOps practices
• Lead incident response, post-mortems, and continuous improvement initiatives
• Document architectural decisions, infrastructure designs, and operational procedures
• Work effectively across global teams and different time zones

QUALIFICATIONS AND BACKGROUND

Education
Required: Bachelor's degree in Computer Science or Information Systems
Preferred: Master's degree in Computer Science or Information Systems

Certifications
Required: AWS Certified DevOps Engineer - Professional

Experience
Required:
• 10+ years in infrastructure architecture and cloud engineering
• Extensive hands-on AWS infrastructure expertise across core services
• Deep expertise with Terraform, AWS CDK, and CloudFormation Templates (CFT)
• Proven track record architecting production-grade AWS infrastructure for enterprise environments
• Expert-level infrastructure CI/CD pipeline design using GitLab CI/CD
• Strong proficiency in SaaS infrastructure and multi-tenant design principles
• Experience architecting large-scale infrastructure for multi-tenant SaaS platforms
• Proven success integrating AWS Marketplace products and third-party SaaS platforms
• Experience with enterprise monitoring and observability platforms, including Nagios
Preferred:
• Experience building/operating SaaS infrastructure at scale
• Multi-cloud experience (Azure, GCP)
• AWS SaaS Factory program participation
• AWS Marketplace seller/ISV experience
• Advanced FinOps and infrastructure cost optimization expertise

Skills

AWS Cloud Infrastructure (Expert Level):
• Compute: EC2, ECS/Fargate, EKS, Lambda, Auto Scaling, Batch
• Storage: S3, EBS, EFS, FSx, Storage Gateway, Backup
• Database: RDS, Aurora, DynamoDB, Redshift, ElastiCache, DocumentDB
• Networking: VPC, Route 53, CloudFront, API Gateway, Direct Connect, Transit Gateway, PrivateLink, VPN
• Security: IAM, KMS, Secrets Manager, Cognito, GuardDuty, Security Hub, WAF, Shield, Macie
• Management: CloudWatch, CloudTrail, Systems Manager, Config, Control Tower, Organizations, Service Catalog

Infrastructure as Code:
• Terraform (advanced modules, state management, workspaces, complex architectures)
• AWS CDK (TypeScript/Python with constructs, patterns, custom resources)
• CloudFormation (nested stacks, StackSets, custom resources, drift detection)
• Policy-as-code (AWS Config Rules, Service Control Policies, OPA)
• Git version control with GitFlow and trunk-based development

SaaS & Multi-Tenant Infrastructure:
• Tenant isolation patterns (VPC isolation, account-level, database-level, row-level security)
• Identity & access management (Cognito, tenant-aware IAM, RBAC)
• API Gateway (usage plans, tenant routing, rate limiting, throttling)
• Metering, billing, usage tracking, cost allocation tags
• Resource pooling, capacity planning, workload management
• Well-Architected SaaS Lens implementation

CI/CD & Infrastructure Automation:
• GitLab CI/CD (advanced pipelines, runners, security scanning, artifact management)
• AWS Developer Tools (CodePipeline, CodeBuild, CodeDeploy, CodeArtifact)
• Containers (Docker, ECS/Fargate, EKS, ECR, Helm charts, Kubernetes operators)
• Deployment strategies (blue-green, canary, rolling updates, feature flags)
• GitOps practices and infrastructure drift detection
• Infrastructure testing frameworks

Monitoring & Observability:
• Nagios (Core/XI, NRPE, NCPA, custom plugin development)
• CloudWatch (Metrics, Logs, Alarms, Dashboards, Synthetics, Insights), X-Ray
• Prometheus, Grafana, Managed Prometheus, Managed Grafana
• Log management (CloudWatch Logs Insights, OpenSearch Service)
• Distributed tracing (X-Ray, OpenTelemetry)
• Alerting (SNS, EventBridge, PagerDuty, Opsgenie)
• APM tools integration (Datadog, New Relic, Dynatrace)
• Tenant-aware monitoring with isolated metrics and dashboards

Security & Compliance:
• IAM policies, roles, SCPs, permission boundaries
• Encryption (KMS, at-rest, in-transit), secrets management, data masking
• Zero-trust architectures and least privilege principles
• Compliance frameworks (HIPAA, PCI-DSS, SOC 2, GDPR, ISO 27001)
• Security scanning, vulnerability management, AWS Security Hub
• Network security (security groups, NACLs, WAF, Shield)

Infrastructure Optimization:
• Cost optimization (Cost Explorer, Trusted Advisor, Compute Optimizer, Savings Plans, RI, Spot)
• Performance tuning and capacity planning
• High availability and fault tolerance design
• Disaster recovery and backup strategies (AWS Backup, cross-region replication)
• Multi-region and multi-account architectures

Development & Automation:
• Scripting (Python, Bash, PowerShell, Go)
• Git workflows and version control
• REST APIs, GraphQL integration
• AWS CLI and SDKs (boto3)
• Configuration management (Ansible, Chef, Puppet - preferred)

Competencies

Strategic & Technical Leadership:
• Strong executive presence communicating complex technical concepts to non-technical stakeholders
• Experience managing infrastructure budgets, resource allocation, and vendor relationships

Analytical & Decision-Making:
• Strong analytical and decision-making skills under pressure
• Data-driven with ability to evaluate infrastructure trade-offs
• Detail-oriented with focus on reliability, scalability, and security
• Drive to uncover root causes and implement solutions