Data Architect & Lead Data Engineer
23 hours ago
Paris
About Persistent:

We are an AI-led, platform-driven Digital Engineering and Enterprise Modernization partner, combining deep technical expertise and industry experience to help our clients anticipate what's next. Our offerings and proven solutions create a unique competitive advantage for our clients by giving them the power to see beyond and rise above. We work with many industry-leading organizations across the world, including 20 Fortune 50 companies and 4 of the 5 top banks in both the US and India, and numerous innovators across the healthcare ecosystem. Our disruptor's mindset, commitment to client success, and agility to thrive in a dynamic environment have enabled us to sustain our growth momentum.

Persistent has been recognized across top industry platforms for innovation, leadership, and inclusion. We have delivered 21 sequential quarters of growth, with $389.7M in Q1 FY26 revenue, up 3.9% Q-o-Q and 18.8% Y-o-Y. Our 25,000+ global team members, located in 18 countries, have been instrumental in helping market leaders transform their industries. We were named the fastest-growing Indian IT services brand in the "Brand Finance India 100" 2025 report, cited as a Leader in the ISG Provider Lens™ Digital Engineering Services 2025, and recognized in the Everest Group Talent Readiness for Next-generation Application Services PEAK Matrix® Assessment 2025. We are a fast-growing company, with over $1 billion in revenue, and plan to grow our presence in Kraków, Poland. If you're interested in working with bleeding-edge Data and AI technologies, join us. Read more - ___

About Position:
• Role: Data Architect & Lead Data Engineer
• Location: Paris La Défense 7
• Hybrid: 2 days per week in the office

Job Description
1. Architecture & Design
• Define the target-state Lakehouse architecture on AWS (S3 + Databricks + Delta Lake + Unity Catalog + Databricks SQL).
• Select patterns for ingestion (batch/stream/CDC), processing (ETL/ELT), serving (Databricks SQL/BI), DQ, lineage, and observability.
• Design for multi-environment isolation, workspace strategy, data domains, catalog/schema/table layout, and naming conventions.
• Establish SLA/SLO-driven designs for throughput, latency, concurrency, and cost guardrails.

2. Teradata Migration Leadership
• Lead discovery and rationalization: inventory databases, tables, views, macros, stored procedures, BTEQ/TPT/FastExport/FastLoad jobs, dependencies, schedules, and usage patterns.
• Define migration waves, coexistence strategy, backfill, and cutover/rollback plans.
• Drive code conversion from Teradata SQL/BTEQ to PySpark/Spark SQL/Delta, optimizing for Photon where applicable.
• Replace Teradata utilities with Databricks Workflows/Jobs, DLT (Delta Live Tables), or Spark-native alternatives.

3. Build Standards & Performance
• Codify data modeling (bronze/silver/gold layers) and Delta Lake best practices (OPTIMIZE, Z-ORDER, VACUUM, file sizing, partitioning, compression).
• Optimize Spark (joins, shuffles, broadcast, AQE, caching, skew handling) and cluster policies (autoscaling, pools, spot vs on-demand).
• Define the testing strategy (unit/integration/regression/row-count/delta-diff), data reconciliation, and performance benchmarking.

4. Security, Governance & Compliance
• Implement Unity Catalog for centralized governance: catalogs, schemas, table/column/row-level security, data masking, secrets, tokens.
• Integrate IAM, KMS, VPC controls, private links, SCIM, and audit logging; ensure compliance with organizational and regulatory standards.
• Enable lineage (Unity Catalog/OpenLineage), DQ (expectations/Great Expectations/Delta expectations), and change management.
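As a flavor of the code-conversion work described above (Teradata SQL → Spark SQL), here is a deliberately tiny, hypothetical Python sketch of rule-based dialect rewriting. The function name and the three rewrite rules are illustrative only; a real migration would rely on a full SQL parser and a much larger rule set, not regexes.

```python
import re

# Illustrative rewrite rules for a handful of Teradata-isms. These three are
# real Teradata constructs (SEL shorthand, ZEROIFNULL, NULLIFZERO), but the
# rule list and converter are a toy sketch, not a production converter.
REWRITE_RULES = [
    (re.compile(r"\bSEL\b", re.IGNORECASE), "SELECT"),                       # Teradata SELECT shorthand
    (re.compile(r"\bZEROIFNULL\(\s*(\w+)\s*\)", re.IGNORECASE), r"COALESCE(\1, 0)"),  # NULL -> 0
    (re.compile(r"\bNULLIFZERO\(\s*(\w+)\s*\)", re.IGNORECASE), r"NULLIF(\1, 0)"),    # 0 -> NULL
]

def teradata_to_spark_sql(sql: str) -> str:
    """Apply each mechanical rewrite rule in order and return Spark SQL."""
    for pattern, replacement in REWRITE_RULES:
        sql = pattern.sub(replacement, sql)
    return sql

print(teradata_to_spark_sql("SEL id, ZEROIFNULL(amount) FROM orders"))
# -> SELECT id, COALESCE(amount, 0) FROM orders
```

Mechanical rewrites like these are only the trivial tail of a migration; the bulk of the effort sits in BTEQ/macro logic, workload semantics, and reconciliation testing, as the bullets above indicate.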
5. DevOps, Operations & FinOps
• Establish CI/CD for notebooks/repos/packages using Git, a branching strategy, Databricks Repos, and pipelines (Azure DevOps/GitHub Actions/Jenkins).
• Define infrastructure-as-code (Terraform) for workspaces, clusters, UC objects, jobs, and policies.
• Set up monitoring and alerting (CloudWatch + Databricks metrics/logs), SLAs, auto-remediation, and runbooks/SOPs.
• Drive cost optimization: cluster sizing, Photon, job vs all-purpose clusters, spot usage, table layout, and workload scheduling.

6. Stakeholder Management & Leadership
• Partner with product owners, business SMEs, data stewards, and platform teams.
• Conduct architecture reviews, design walkthroughs, risk/RAID management, and executive updates.

Required Experience & Skills
• 10–15+ years in Data/Analytics; 5+ years with Spark/Databricks; 3+ end-to-end migrations or net-new Lakehouse builds (at least 1 Teradata → Databricks preferred).
• Deep expertise in:
  – Databricks: Workspaces, Jobs/Workflows, DLT, Unity Catalog, Delta Lake, Photon, Clusters/Policies, Repos, Databricks SQL.
  – Spark: PySpark/Scala/SQL, performance tuning (joins/shuffle/AQE/broadcast), caching, checkpointing, Structured Streaming.
  – Teradata: SQL dialect, BTEQ, TPT/FastLoad/FastExport, statistics/partitioning/PI, performance concepts, and workload management.
  – AWS: S3, IAM/KMS, VPC, PrivateLink, CloudWatch, Glue Catalog (interoperability), Lambda (nice-to-have), MSK/Kinesis (for streaming/CDC).
• Proven hands-on code conversion (Teradata SQL → Spark SQL/PySpark) and job orchestration replacement/design.
• Strong background in security and governance (RBAC/ABAC, tokens/secrets, audit), data modeling, DQ, lineage, testing, observability, and SLA-driven design.
• Experience with CI/CD (Git), IaC (Terraform), and DevOps for Databricks.

Benefits
• Focus on talent development, with quarterly assessment cycles and company-sponsored certifications
• Work with cutting-edge technologies
• Engagement initiatives such as project parties, flexible work hours,
and the Persistent Business Run
• Private medical and dental care at Medicover for the individual, partner, and family
• Life and accident insurance at Allianz covering the individual, partner, and family
• Pension Plans (PPK)
• Group personal accident and travel insurance
• Work-from-home allowance
• Reimbursement of expenses related to the home office
• Reimbursement of costs for purchasing glasses/contact lenses
• My Benefit Systems / Multi cafeteria / Multisport cards
• Reward and recognition awards

At Persistent, you will:
· Accelerate growth, both professionally and personally
· Impact the world in powerful, positive ways, using the latest technologies
· Enjoy collaborative innovation, with diversity and work-life wellbeing at the core
· Unlock global opportunities to work and learn with the industry's best

Let's unleash your full potential at Persistent - persistent.com/careers

"Persistent is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind."