Manager, Machine Learning Operations

Zefr
2 months ago
Marina Del Rey, CA, United States
Hybrid
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

Zefr is the global leader in brand suitability targeting and measurement across the world's largest platforms. As an official YouTube Measurement Program Partner, Meta for Business Partner, and TikTok for Business Partner, Zefr leverages patented machine learning and AI technology (Cognition AI) to offer brands and agencies precise and transparent brand safety and suitability activation and measurement solutions. Headquartered in Los Angeles, California, with additional locations worldwide, the company is powering the age of responsible marketing.

We are hiring a Manager of Machine Learning Operations to lead our ML Ops team, oversee the deployment and optimization of ML models at scale, and bridge the gap between research and production.

Key Responsibilities

  • Lead, mentor, and grow a team of Machine Learning Engineers, fostering a culture of innovation and continuous improvement.
  • Design and implement scalable ML infrastructure for model training, deployment, and serving.
  • Establish and enforce best practices for ML model lifecycle management, including versioning, testing, and monitoring.
  • Develop and maintain CI/CD pipelines for machine learning workflows.
  • Optimize model inference performance and reduce latency/cost across production systems.
  • Collaborate with ML Engineers and Data Scientists to productionize models efficiently.
  • Implement robust monitoring, alerting, and observability solutions for ML systems.
  • Drive technical decisions on ML Ops tooling, infrastructure, and architecture.
  • Ensure high availability and reliability of ML services at scale.
  • Manage project timelines, priorities, and resource allocation for the ML Ops team.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science or a related field with 5+ years of professional experience in ML Engineering or MLOps.
  • 2+ years of experience managing or leading engineering teams.
  • Deep expertise in ML model deployment, serving infrastructure, and production ML systems.
  • Hands-on experience with transformer architectures (e.g., BERT, ViT) for natural language and vision tasks.
  • Strong understanding of multimodal embedding techniques for integrating text, image, audio, and structured data.
  • Experience with LLMs such as Gemini, GPT, Claude, and Qwen.
  • Experience with ML experiment tracking, model versioning, and feature stores.
  • Strong understanding of CI/CD principles applied to ML workflows.
  • Experience optimizing model inference performance (ONNX, TensorRT, or similar).
  • Excellent leadership, communication, and stakeholder management skills.
  • Track record of building and scaling high-performing engineering teams.
  • Openness to new technologies and creative solutions.

Preferred Qualifications

  • Experience with ad tech and the digital advertising ecosystem.
  • Experience with multimodal LLM fine-tuning.

Benefits & Perks

  • Compensation: The anticipated base salary for this position is between $170,000 and $230,000, determined by job-related skills, experience, and relevant education or training.
  • Benefits: Flexible PTO; Medical, dental, and vision insurance with FSA options; Company-paid life insurance; Paid parental leave; 401(k) with company match; Professional development opportunities; 14 paid holidays; Flexible hybrid work schedule; "Summer Fridays" (shorter work days on select Fridays during the summertime); In-office lunches and lots of free food; Optional in-person and virtual events.

Required Skills

Project Management
Docker
AWS
Python
Kubernetes
ML Engineering
Model Deployment & Serving
GCP
SQL
CI/CD Pipeline Development
Transformer Architectures
ML Lifecycle Management
Monitoring & Observability
Machine Learning Operations
Terraform
Team Leadership