Staff Software Engineer - Data
Job Description
Position Overview
We are looking for a Staff Software Engineer to shape the future of our data platform with a focus on small data at scale. While many companies over-index on heavyweight distributed systems, we believe in the power of efficient, local-first, columnar engines like DuckDB to process and analyze data quickly, reliably, and cost-effectively.
As a Staff Software Engineer, you will set the technical direction for how our teams ingest, transform, and serve data, bridging the gap between lightweight embedded tools and cloud-scale systems. You’ll be hands-on in building pipelines while also mentoring engineers and establishing best practices across the organization.
Salary: $128,000 - $230,000 per year; this role is also eligible for bonus/commission, equity, and additional benefits.
About the Company: DoubleVerify
Key Responsibilities
- Architect and Build Data Pipelines
  - Design and implement data processing workflows using DuckDB, Polars, and Arrow/Parquet.
  - Balance small-data local pipelines with cloud data warehouse backends.
- Champion the Small Data Mindset
  - Advocate for efficient, vectorized, local-first approaches where appropriate.
  - Drive best practices for designing reproducible and testable data workflows.
- Collaborate Cross-Functionally
  - Partner with data science, professional services, and product engineering teams to define semantic data layers.
  - Provide technical leadership in how data is versioned, validated, and surfaced for downstream use.
- Operational Excellence
  - Establish standards for CI/CD, observability, and reliability in data pipelines.
  - Automate workflows and optimize data layout for performance and cost efficiency.
- Mentor & Lead
  - Serve as a thought leader in the organization, guiding engineers on when to use lightweight tools versus distributed platforms.
  - Mentor senior and mid-level data engineers to accelerate their growth.
Required Qualifications
- Deep expertise in SQL (window functions, CTEs, optimization).
- Strong Python skills with data libraries.
- Proficiency with DuckDB, including extensions and Parquet/Iceberg integration.
- Hands-on experience with columnar formats (Parquet, Arrow, ORC) and schema evolution.
- Expertise in Kubernetes and Helm.
- Cloud storage experience with AWS S3 and GCS.
- Experience with semantic layer frameworks such as CubeJS.
- Familiarity with CI/CD and infrastructure tooling, including GitHub Actions, Terraform, and Docker/Kubernetes.
- Track record of leading architecture decisions and mentoring teams.
- Ability to set standards for maintainability and developer experience.
Preferred Qualifications
- Experience with serverless and embedded analytics (e.g., DuckDB-Wasm in production).
- Exposure to data versioning technologies such as Delta Lake, Iceberg, or Hudi.
- Knowledge of ML/LLM data preparation workflows and vector database integrations.
- Previous experience building hybrid stacks combining local development with cloud warehouse production.
Benefits & Perks
- Eligibility for bonus/commission, equity, and additional company benefits.