Data Engineer - Observability Tooling

sepal

2 months ago

Los Angeles, CA, United States

Remote

Contract

Junior Level (1-3 years)

Job Description

Position Overview

At Sepal AI, we're pushing the boundaries of AI testing by developing some of the most challenging assessments, grounded in real-world software systems. We are looking for a Data Engineer with 3+ years of experience and a strong systems perspective to join our team. You will play a key role in building evaluation environments for AI in high-throughput log analysis settings.

Key Responsibilities

Design and build analytical schemas and data pipelines on high-performance analytical databases such as BigQuery, ClickHouse, Snowflake, and Redshift.
Write and run complex, distributed queries over large log and telemetry datasets.
Develop and manage synthetic datasets that emulate real-world DevOps, observability, or cloud infrastructure logs.
Optimize distributed query execution plans to reduce timeouts and improve scan efficiency.

Required Qualifications

3+ years of experience in data engineering or backend systems roles.
Deep expertise in analytical databases and OLAP engines, focusing on large-scale query optimization and performance tuning.
Experience with log ingestion pipelines (e.g., FluentBit, Logstash, Vector) and schema design for observability systems.
Strong SQL skills with the ability to diagnose performance issues and identify inefficient query patterns.

Preferred Qualifications

Experience with Python, Docker, or synthetic data generation.

Benefits & Perks

Salary: $50-$85/hr, depending on experience
Location: Remote, flexible hours
Schedule: 5-6 week project timeline

Required Skills

Docker

Redshift

Data Pipelines

ClickHouse

Python

Snowflake

SQL

Distributed Query Optimization

Log Ingestion (FluentBit, Logstash, Vector)

BigQuery