Azure Site Reliability Engineer
20 hours ago
Richmond
Senior Azure Support Engineer • Location: Richmond-Upon-Thames, • This is a hybrid role., • You must be British for Security Clearance reasons We are hiring for a Senior Azure Support Engineer to join a growing business and be responsible for the maintenance and support of the Azure Cloud Environment that hosts SaaS web-based applications. • The environment operates hundreds of Single Tenant (ST) and Multi-Tenant (MT) deployments across Azure, each with their own servers, database, and storage., • This role exists to keep every deployment reliable resolving ensuring uptime and building automation., • Investigating and fixing complex application and infrastructure issues., • Monitoring capacity, performance, and error budgets across all deployments., • Designing automation and tooling to improve reliability and reduce manual work. Technical Skills required for the Senior Azure Support Engineer • 3+ years in third-line support, SRE, or cloud operations for enterprise SaaS., • Proven track record in incident resolution and root cause analysis., • Experience working with both multi-tenant and single-tenant cloud architectures., • Strong background in supporting C#/ .NET Core/ MVC web applications with SQL Server backends and Azure Blob Storage., • Advanced Azure diagnostics (Application Insights, Log Analytics, Kusto Query Language)., • Proficient in SQL for investigation and remediation., • Scripting and automation skills in PowerShell and/or C#., • Understanding of Azure components: App Services, VMs, SQL DB, Blob Storage, scaling strategies., • Experience in capacity planning, SLOs, and error budget management, • Azure Monitor, Application Insights, Log Analytics, Azure Data Explorer (KQL), Azure Functions, Logic Apps, PowerShell, C#, SQL Server Management Studio, Azure Storage Explorer, Power BI (for reporting). The Senior Azure Support Engineer responsibilities and tasks: • Monitor ST and MT environments for server performance, response times, error rates, etc., • Detect and resolve database issues, stalled file processing, or misplaced storage objects., • Use Azure diagnostics and telemetry to troubleshoot and resolve complex incidents., • Provide third-line support for escalated customer cases, collaborating with development., • Maintain uptime, performance, and scalability across all ST and MT deployments., • Define and track service-level objectives (SLOs)., • Perform capacity planning for servers, databases, and storage, scaling resources., • Identify systemic patterns causing downtime and implement fixes at scale., • Build PowerShell scripts and automation (Azure Functions, Logic Apps), • Automate environment health checks and reporting.