Sr. Data Infrastructure Engineer
3 days ago
Waltham
Job Description

The Elevator Pitch
Join Evolv as a Senior Data Infrastructure Engineer in the Machine Learning & Sensors organization, responsible for building and operating the scalable, secure, and reliable data pipelines that power our AI/ML research and production systems. In this role, you will own the end‑to‑end data lifecycle, from collection on thousands to millions of edge devices, through cloud ingestion and processing, into a centralized data factory enabling model training, evaluation, and continuous improvement. Data is the backbone of our mission to deliver best‑in‑class AI‑based weapon detection systems. You will ensure that data flows seamlessly across geographies, devices, and cloud systems while meeting strict requirements for quality, privacy, security, and scale. This role is ideal for someone who thrives at the intersection of distributed systems, cloud pipelines, and ML‑driven data needs.

Success in the Role: What performance outcomes will you work toward in the first 6–12 months?
In the first 30 days:
• Develop a deep understanding of existing edge‑to‑cloud data pipelines and deployment environments.
• Review current data ingestion flows, governance policies, and cloud infrastructure.
• Assess pain points in data reliability, quality, and operational scalability.
• Build relationships with AI/ML, data science, field operations, and cloud engineering teams.
• Design and implement improvements to core ingestion, validation, and processing pipelines.
• Deploy scalable data pipelines with AWS‑based components (S3, EC2, Lambda, Glue, Step Functions, SageMaker integrations).
• Introduce automated validation workflows to detect corruption, missing metadata, or malformed data.
• Design and implement automated model evaluation, model training, and model improvement pipelines to speed up experiments.
• Own the entire lifecycle of mission‑critical data pipelines supporting AI/ML research and production.
• Architect next‑generation edge‑to‑cloud data systems that scale across millions of devices.
• Define and enforce data governance frameworks including retention, access control, privacy, and lineage.
• Enable ML teams to rapidly experiment through high‑quality, discoverable, versioned datasets.

The Work: What type of work will you be doing? What assignments, requirements, or skills will you be performing on a regular basis?
End‑to‑End Data Pipeline Ownership:
• Design, build, and maintain both research and production data pipelines spanning edge devices, cloud services, and centralized data platforms.
• Own the full data lifecycle: collection, ingestion, processing, obfuscation, versioning, access, retention, and retirement.
Edge‑to‑Cloud Data Flow:
• Develop resilient ingestion pipelines capable of handling variable connectivity and device heterogeneity.
• Support secure data transfer from the field to cloud storage systems.
• Collaborate with field ops to enhance data coverage, observability, and operational robustness.
Data Quality, Governance & Compliance:
• Implement privacy‑preserving transformations and obfuscation pipelines.
• Build automated cleaning/validation steps to remove duplicates, detect corruption, and validate metadata.
• Provide scalable data services for model training, evaluation, and research experimentation.
• Support continuous data refresh and retraining workflows.
• Integrate with data labeling services and annotation workflows.
• Build and optimize pipelines using AWS services (S3, EC2, SageMaker, Lambda, Glue, Step Functions).
• Partner with AI/ML engineers, scientists, and data scientists to understand data requirements.
• Translate feedback into automated improvements in data collection, labeling, and consumption.
• Design and manage data schema, data versioning and data factory updates.
• Architect systems that scale globally across millions of devices.
• Ensure the data platform remains flexible for research and reliable for production operations.
Qualifications:
Minimum Qualifications:
• Bachelor’s or Master’s degree in Computer Science, Data Engineering, Software Engineering, or related field.
• 2-3+ years of experience building production data pipelines and data platforms that support AI/ML models.
• Strong proficiency in Python, C++, and distributed data processing frameworks.
• Hands‑on experience with AWS services including S3, EC2, SageMaker, and Glue.
• Experience designing data systems that support large‑scale ML training and experimentation.
• Knowledge of data governance, access control, and lifecycle management.
• Experience collaborating with ML, data science, operations, and cloud teams.
Preferred Qualifications:
• Experience building pipelines spanning edge devices and cloud systems.
• Background working with large‑scale sensor, image, or IoT data.
• Familiarity with data labeling tools and annotation workflows.
• Experience implementing dataset versioning, lineage, and reproducibility systems.
• Understanding of privacy, compliance, or regulated data environments.
• Experience supporting global, multi‑region data platforms.

Example Problems You Will Own
• Design a resilient global ingestion pipeline aggregating sensor data from millions of devices.
• Build ML‑ready data services enabling easy discovery, versioning, and consumption of datasets.
• Implement automated validation and cleaning workflows that dramatically reduce bad data.
• Define and enforce lifecycle and governance policies across research and production datasets.

What is leadership like for this role? What is the structure and culture of the team?
• You will join our R&D organization, reporting directly to the VP of ML and Sensors. In this role, you will interface with cross-disciplinary teams of highly skilled and autonomous engineers with expertise in Electromagnetics, Computer Vision, and AI.
Our R&D organization includes more than 100 dedicated developers, engineers, scientists, managers, and directors, each bringing deep technical knowledge and a strong culture of collaboration and support.
• The team culture is based on building trust, collaboration, and ongoing development through kindness, authenticity, courage, drive, and fun!

Where is the role located?
This role is based at our headquarters in Waltham, Massachusetts. Due to the nature of our software-enabled hardware products, this position requires a minimum of 60% or 3 days per week on-site work.

What is the salary range?
The base salary range for this full-time position is $129,000 - $209,000. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Please note that the compensation details listed in this role posting reflect the base salary only, and do not include commission, equity, or benefits.

Benefits
At Evolv, we’re on a mission to help make public spaces safer through innovative security technology. So, we're looking for future teammates who embody our values, people who:
• Do the right thing, always;
• Put people first;
• Own it;
• Win together; and continue to
• Be bold, stay curious.
Our Benefits Include:
• Equity as part of your total compensation package
• Medical, dental, and vision insurance
• Health Savings Account (HSA)
• A 401(k) plan (and 2% company match)
• Flexible Paid Time Off (PTO): take the time you need to recharge, with manager approval and business needs in mind
• Quarterly stipend for perks and benefits that matter most to you
• Tuition reimbursement to support your ongoing learning and development

Evolv is committed to offering an inclusive and accessible experience for all job seekers, including individuals with disabilities. If you need a reasonable accommodation as part of the job application process, please connect with us at . Evolv participates in E-Verify for all employees after the completion of Form I-9.