EDP Platform Engineer / Databricks Administrator
2 days ago
Washington
Job Description
Custom Software Systems, Inc. (CSS) is seeking an experienced EDP Platform Engineer / Databricks Administrator to serve as the hands-on technical owner of the agency’s Databricks platform supporting the Enterprise Data Platform (EDP). This role is responsible for end-to-end platform operations, security, and governance, ensuring the environment is compliant, reliable, cost-efficient, and capable of supporting secure analytics and AI/ML workloads at scale. The ideal candidate will partner closely with development, cloud, and governance teams to maintain and enhance Databricks environments across the SDLC, drive automation and best practices, and ensure seamless, secure deployments from development through production.

Responsibilities
• Administer Databricks account and workspaces across SDLC environments; standardize configuration, naming, and operational patterns.
• Configure and maintain clusters/compute, job compute, SQL warehouses, runtime versions, libraries, repos, and workspace settings.
• Implement platform monitoring/alerting, operational dashboards, and health checks; maintain runbooks and operational procedures.
• Provide Tier 2/3 operational support: troubleshoot incidents, perform root-cause analysis, and drive remediation and preventive actions.
• Manage change control for upgrades, feature rollouts, configuration changes, and integration changes; document impacts and rollback plans.
• Enforce least privilege across platform resources (workspaces, jobs, clusters, SQL warehouses, repos, secrets) using role/group-based access patterns.
• Configure and manage secrets and secure credential handling (secret scopes / key management integrations) for platform and data connectivity.
• Enable and maintain audit logging and access/event visibility; support security reviews and evidence requests.
• Administer Unity Catalog governance: metastores, catalogs/schemas/tables, ownership, grants, and environment/domain patterns.
• Configure and manage external locations, storage credentials, and governed access to cloud object storage.
• Partner with governance stakeholders to support metadata/lineage integration, classification/tagging, and retention controls where applicable.
• Coordinate secure connectivity and guardrails with cloud/network teams: private connectivity patterns, egress controls, firewall/proxy needs.
• Configure cloud integrations required for governed data access and service connectivity (roles/permissions, endpoints, storage integrations).
• Implement cost guardrails: cluster policies, auto-termination, scheduling, workload sizing standards, and capacity planning.
• Produce usage/cost insights and optimization recommendations; address waste drivers (idle compute, oversized clusters, inefficient jobs).
• Automate administration and configuration using APIs/CLI/IaC (e.g., Terraform) to reduce manual drift and improve repeatability.
• Maintain platform documentation: configuration baselines, security/governance standards, onboarding guides, and troubleshooting references.
• Design and implement backup and disaster recovery procedures for workspace configurations, notebooks, Unity Catalog metadata, and job definitions; maintain recovery runbooks and perform periodic DR testing aligned to RTO/RPO objectives.
• Monitor and optimize platform performance, including SQL warehouse query tuning, cluster autoscaling configuration, Photon enablement, and Delta Lake optimization guidance (OPTIMIZE, VACUUM, Z-ordering strategies).
• Administer Delta Live Tables (DLT) pipelines and coordinate with data engineering teams on pipeline health, data quality monitoring, failed job remediation, and pipeline configuration best practices.
• Manage third-party integrations and ecosystem connectivity, including BI tool integrations (e.g., Power BI) and external metadata catalog integrations.
• Implement Databricks Asset Bundles (DABs) for standardized deployment patterns; automate workspace resource deployment (jobs, pipelines, dashboards) across SDLC environments using bundle-based CI/CD workflows.
• Conduct capacity planning and scalability analysis, including forecasting concurrent user/workload growth, platform scaling strategies, and proactive resource allocation during peak usage periods.
• Facilitate user onboarding and enablement, including new user/team onboarding procedures, training coordination, workspace access provisioning, and creation of self-service documentation/guides.

Clearance
• Must be clearable.

Citizenship
• US Citizenship Required

Qualifications
• 7+ years in cloud/data platform administration and operations, including 4+ years supporting Databricks or similar platforms.

Knowledge, Skills & Abilities
• Experience with the Scrum framework, Agile engineering, Lean methodologies, or DevOps.
• Experience with one or more of the following: system development, software development, hardware development, or mission support.
• Experience working with DevOps CI/CD related technologies (Azure DevOps, Git, Jenkins, Puppet, Docker, Confluence, SonarLint, and JUnit).
• Ability to work at the conceptual level and with program leads, customers, and internal teams to ensure successful system development, integration, and deployment.
• Hands-on experience administering Databricks (workspace administration, clusters/compute policies, jobs, SQL warehouses, repos, runtime management) and expertise using Databricks CLI.
• Strong Unity Catalog administration: metastores; catalogs/schemas; grants; service principals; external locations; storage credentials; governed storage access.
• Identity & Access Management proficiency: SSO concepts, SCIM provisioning, group-based RBAC, service principals, least-privilege patterns.
• Security fundamentals: secrets management, secure connectivity, audit logging, access monitoring, and evidence-ready operations.
• Cloud platform expertise (AWS): IAM roles/policies, object storage security patterns, networking basics (VPC concepts), logging/monitoring integration.
• Automation skills: scripting and/or IaC using Terraform/CLI/REST APIs for repeatable configuration and environment promotion.
• Experience implementing data governance controls (classification/tagging, lineage/metadata integrations) in partnership with governance teams.
• CI/CD practices for jobs/notebooks/config promotion across SDLC environments.
• Understanding of lakehouse concepts (e.g., Delta, table lifecycle management, separation of storage/compute).
• SQL proficiency and data engineering fundamentals for troubleshooting query performance issues, understanding ETL/ELT workflow patterns, and debugging data pipeline failures; basic Python/Scala familiarity for notebook/code issue diagnosis.
• Experience with compliance and regulatory frameworks (FedRAMP, HIPAA, SOC 2, or similar), including implementation of data residency requirements, retention policies, and audit-ready evidence collection.
• Hands-on experience with AWS security and networking services, including PrivateLink, Secrets Manager/Systems Manager integration, CloudWatch/CloudTrail integration, S3 bucket policies, cross-account access patterns, and KMS encryption key management.
• Experience administering Databricks serverless compute, Workspace Git integrations (GitLab), Databricks Asset Bundles (DABs) for deployment automation, and modern workspace features supporting DevOps workflows.
• SLA/SLO management and stakeholder communication skills; ability to define platform service levels, produce operational reports, translate technical issues to business stakeholders, and manage vendor relationships (Databricks account teams).

Certificates
• At least one of the following certifications or their equivalent:
o Cloud-related (e.g., DevOps, Security, and/or ML)
o Databricks Platform Administrator/Databricks AWS Platform Architect
o Databricks Certified Data Engineer Associate/Professional
o AWS Certified Solutions Architect Associate or Professional

Education
• Bachelor’s degree in Engineering, Computer Science, Information Systems or IT related discipline, or equivalent practical experience.

Compensation & Benefits
• Wage Range: Negotiable
• General Benefits: Custom Software Systems, Inc. offers our employees a competitive benefits package that may include:
o Health insurance plans
o Health Savings Account (HSA)
o Dental
o Vision
o Long-term disability
o Short-term disability
o Basic term life insurance
o Supplemental term life insurance for employees, spouses, and dependents
o Simple IRA
o Parking/Commuting expense reimbursement
o Training/Education

Company Description
Company Background: Headquartered in Leesburg, Virginia, Custom Software Systems, Inc. (CSS) is a Woman-Owned (WOSB) and HUBZone certified small business. Built on a foundation of trusted client partnerships, CSS has fostered a stakeholder-centric yet disciplined approach to IT solutions development. This ensures our ability to consistently meet or exceed our customers' expectations.

Benefits: CSS is a very employee-oriented company and knows that well-trained, professional associates are what make our company great. We offer a competitive benefits package that includes: paid holidays and paid time off; medical insurance that includes vision; dental insurance; company-paid long- and short-term disability and life insurance; a Simple IRA plan (similar to 401k); parking and commuter reimbursement. We also work with our employees on training and professional certification plans that benefit the employee, the client, and CSS: a win-win-win strategy.

Equal Opportunity Employer: CSS provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran in accordance with applicable federal, state and local laws. CSS complies with applicable state and local laws governing non-discrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including, but not limited to, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.

CSS expressly prohibits any form of unlawful employee harassment based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. Improper interference with the ability of CSS employees to perform their expected job duties is absolutely not tolerated.