Data Engineer - Observability Tooling
sepal · 2 months ago
Los Angeles, CA, United States
Remote
Contract
Junior Level (1-3 years)
Job Description
Position Overview
At Sepal AI, we're pushing the boundaries of AI testing by developing some of the most challenging assessments grounded in real-world software systems. We are looking for a Data Engineer with 3+ years of experience and a strong systems perspective to join our team. You will play a key role in building evaluation environments for AI in high-throughput log analysis settings.
Key Responsibilities
- Design and build analytical schemas and data pipelines using high-performance tools such as BigQuery, ClickHouse, Snowflake, and Redshift.
- Write and run complex, distributed queries over large log and telemetry datasets.
- Develop and manage synthetic datasets that emulate real-world DevOps, observability, or cloud infrastructure logs.
- Optimize distributed query execution plans to minimize timeouts and enhance scanning efficiency.
Required Qualifications
- 3+ years of experience in data engineering or backend systems roles.
- Deep expertise in analytical databases and OLAP engines, focusing on large-scale query optimization and performance tuning.
- Experience with log ingestion pipelines (e.g., FluentBit, Logstash, Vector) and schema design for observability systems.
- Strong SQL skills with the ability to diagnose performance issues and identify inefficient query patterns.
Preferred Qualifications
- Experience with Python, Docker, or synthetic data generation.
Benefits & Perks
- Salary: $50-$85/hr depending on experience
- Location: Remote, flexible hours
- Project timeline: 5-6 weeks
Required Skills
Docker
Redshift
Data Pipelines
ClickHouse
Python
Snowflake
SQL
Distributed Query Optimization
Log Ingestion (FluentBit, Logstash, Vector)
BigQuery