Data Governance Architect
1 day ago
Cambridge
Job Summary

We are seeking a Data Governance Architect to design and implement enterprise data governance capabilities across modern pharma data platforms. The ideal candidate has deep experience in pharmaceutical or life sciences data environments, strong knowledge of Databricks Lakehouse and Medallion Architecture, and hands-on experience with Collibra and Unity Catalog for metadata management, business glossary, data cataloging, stewardship workflows, lineage, and governance operating models. This role will define the technical and information architecture required to govern data across R&D, clinical, regulatory, manufacturing, quality, commercial, and enterprise data domains.

*Must be onsite 3 days a week in Cambridge, MA.

Key Responsibilities

Design and implement enterprise data governance architecture across pharma data platforms, including Databricks, Collibra, cloud storage, BI, and downstream analytics systems.
Define governance patterns for Databricks Lakehouse, including Bronze, Silver, and Gold layers, data ownership, metadata standards, lineage capture, access control, data quality, retention, and certification.
Partner with data engineering teams to embed governance controls into Medallion Architecture pipelines, including ingestion standards, transformation rules, validation checkpoints, and curated data product governance.
Lead Collibra architecture and configuration for data catalog, business glossary, data domains, assets, policies, stewardship workflows, lineage, and data quality integration.
Define metadata models connecting business terms, data products, datasets, reports, pipelines, data owners, data stewards, and regulatory classifications.
Architect governance solutions using Databricks Unity Catalog, Collibra, cloud IAM, privacy tools, security controls, and data quality frameworks.
Establish architecture standards for sensitive and regulated pharma data, including GxP, HIPAA, GDPR, clinical trial data, patient data, manufacturing quality data, and proprietary R&D data.
Create reference architectures, solution blueprints, implementation patterns, data governance standards, and technical design documents.
Advise on integration between Collibra and Databricks for metadata harvesting, lineage, catalog synchronization, stewardship workflows, and governance policy enforcement.
Support enterprise data product strategy, including data mesh, domain ownership, certified datasets, data marketplaces, and reusable governed data assets.
Collaborate with enterprise architecture, data platform, security, privacy, compliance, quality, and business data domain teams.

Required Qualifications

8+ years of experience in data architecture, data governance architecture, information architecture, data management, or enterprise data platforms.
Experience working in pharma, biotech, life sciences, healthcare, or another regulated industry.
Strong understanding of Databricks Lakehouse Platform, Unity Catalog, Delta Lake, access controls, lineage, workspace patterns, and data platform governance.
Hands-on experience with Medallion Architecture, including Bronze, Silver, and Gold data layers.
Hands-on experience with Collibra, including operating model design, data catalog, glossary, workflows, stewardship, lineage, policies, and asset model configuration.
Strong knowledge of data governance capabilities, including metadata management, lineage, data quality, master/reference data, access management, privacy classification, data lifecycle, and policy management.
Experience designing governance frameworks for regulated data, including GxP, 21 CFR Part 11, HIPAA, GDPR, patient data, clinical data, and quality data.
Ability to translate business, regulatory, compliance, and privacy requirements into scalable technical architecture.
Experience with cloud data platforms, preferably AWS, Azure, or GCP.
Excellent documentation, communication, and stakeholder engagement skills.

Preferred Qualifications

Experience integrating Collibra with Databricks, Unity Catalog, cloud platforms, BI tools, ETL tools, or data quality tools.
Experience with data quality platforms such as Great Expectations, Deequ, Informatica DQ, Collibra DQ, Soda, or Monte Carlo.
Experience with data marketplace, data mesh, domain-driven data ownership, and governed data product models.
Experience with pharma data domains such as clinical trials, genomics, regulatory submissions, pharmacovigilance, manufacturing, quality, supply chain, medical affairs, or commercial analytics.
Knowledge of semantic layers, ontology, master data management, reference data management, and knowledge graphs.
Certifications in Databricks, Collibra, CDMP, cloud architecture, or enterprise architecture are a plus.