Senior Data Engineer - Profitable AI Programmatic Advertising Platform
USA



Senior Data Engineer


We are seeking a Senior Data Engineer to join the Data Engineering team and take ownership of the data infrastructure and backend services powering the company’s core products. This role sits at the intersection of data platform development, cloud infrastructure, and production systems, including vector search at scale and the embedding pipelines that drive real-world media performance.


You will design, build, and operate the data infrastructure and services that power the products—from Databricks-based pipelines and vector search over 50M+ records to containerized services on AWS—while owning their reliability and performance in production. That means working heavily in Databricks, building and tuning vector search capabilities at scale, deploying and managing services on AWS and Kubernetes, and partnering with data science and product teams to ship new capabilities end to end.


You’ll work closely with data scientists, product managers, and client teams, and you’ll need to be as comfortable debugging a Kubernetes deployment or an AWS service as you are building a data pipeline in Databricks. The ideal candidate writes production-grade code, thinks in systems and scalability, and brings real backend engineering depth to a data-focused role.


Key Responsibilities

Data Infrastructure, Vector Search & Cloud Services:

  • Build and operate production data services and APIs on AWS, deploying and managing containerized applications on Kubernetes (EKS) with full ownership of reliability and performance
  • Implement and scale vector search infrastructure using Databricks Vector Index and Milvus to power audience matching, similarity retrieval, and AI-driven product features across 50M+ records and growing
  • Build and optimize data pipelines and ETL/ELT workflows in Python and SQL, integrating with Databricks and Snowflake where needed to support model-serving and feature delivery
  • Architect scalable, cost-effective cloud infrastructure on AWS (EKS, S3, RDS, Lambda, SQS/SNS) that supports real-time and batch workloads for campaign data, audience signals, and embedding generation


Cross-Functional Collaboration & Platform Reliability:

  • Serve as the data infrastructure and platform expert across data science, product, and client teams—translating product requirements into reliable, performant data services and pipelines
  • Own service reliability, monitoring, and incident response; surface infrastructure gaps and performance bottlenecks to inform the platform roadmap
  • Contribute to internal tooling, observability frameworks (logging, metrics, alerting), and engineering best practices across the team


Required Qualifications:

  • 5+ years in data engineering, backend engineering, or platform infrastructure roles. Bachelor’s degree required.
  • Strong hands-on experience with AWS (EKS, S3, RDS, Lambda, IAM, CloudFormation or Terraform) and deploying, scaling, and troubleshooting containerized applications on Kubernetes
  • Ability to write production-grade, testable, and well-documented backend service code
  • Strong proficiency in Python and SQL, with hands-on Databricks experience (Delta Lake, Jobs, Workflows) and practical experience building data services, working with vector databases (Milvus, Databricks Vector Index), and operating high-throughput systems on AWS
  • Experience with CI/CD workflows (GitHub Actions), Docker, Helm, and infrastructure-as-code (Terraform or CloudFormation)
  • Excellent communication skills; ability to translate technical findings for non-technical stakeholders


Nice to Have:

  • Experience with embedding generation, ML model serving, or building feature stores for production ML workloads
  • Working familiarity with Snowflake, PySpark, or dbt for analytics and transformation workloads
  • Experience and familiarity with adtech and digital advertising
  • Experience with event-driven architectures, streaming systems (Kafka, Kinesis), or real-time data processing at scale


Tech Stack:

  • Data Platform: Databricks (Delta Lake, Vector Index), Milvus, Snowflake
  • Cloud & Infrastructure: AWS (EKS, S3, RDS, Lambda, SQS), Kubernetes, Docker, Terraform
  • DevOps & Tooling: GitHub Actions, MLflow, Datadog
  • Languages: Python, SQL


Job-3565858


#LI-Remote
