Data Engineer - Observability Tooling

sepal
2 months ago
Los Angeles, CA, United States
Remote
Contract
Junior Level (1-3 years)

Job Description

Position Overview

At Sepal AI, we're pushing the boundaries of AI testing by developing some of the most challenging assessments grounded in real-world software systems. We are in search of a talented Data Engineer with over 3 years of experience and a keen systems perspective to join our team. You will play a crucial role in establishing evaluation environments for AI in high-throughput log analysis settings.

Key Responsibilities

  • Design and create analytical schemas and data pipelines utilizing high-performance tools such as BigQuery, ClickHouse, Snowflake, and Redshift.
  • Write and run complex, distributed queries over extensive log and telemetry datasets.
  • Develop and manage synthetic datasets that emulate real-world DevOps, observability, or cloud infrastructure logs.
  • Optimize distributed query execution plans to minimize timeouts and enhance scanning efficiency.

Required Qualifications

  • 3+ years of experience in data engineering or backend systems roles.
  • Deep expertise in analytical databases and OLAP engines, focusing on large-scale query optimization and performance tuning.
  • Experience with log ingestion pipelines (e.g., FluentBit, Logstash, Vector) and schema design for observability systems.
  • Strong SQL skills with the ability to diagnose performance issues and identify inefficient query patterns.

Preferred Qualifications

  • Experience with Python, Docker, or synthetic data generation.

Benefits & Perks

  • Salary: $50-85/hr depending on experience
  • Location: Remote, flexible hours
  • Schedule: 5-6 week project timeline

Required Skills

Docker
Redshift
Data Pipelines
ClickHouse
Python
Snowflake
SQL
Distributed Query Optimization
Log Ingestion (FluentBit, Logstash, Vector)
BigQuery