Software Engineer, Data Infrastructure & Acquisition - Houston, USA

Speechify
Posted about 2 months ago
Houston, TX, United States
Remote
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

The mission of Speechify is to make sure that reading is never a barrier to learning. Over 50 million people use Speechify’s text-to-speech products – including apps for iOS, Android, Mac, Chrome, and Web – to convert everything from PDFs and books to webpages into audio. Today, nearly 200 people work in a 100% distributed setting at Speechify, coming from leading companies and top academic programs. We’re hiring for the Data side of our AI team. In this role, you will be responsible for all aspects of data collection to support our model training operations, building high-quality datasets at petabyte scale and low cost through a tight integration of infrastructure, engineering, and research work.

Compensation: The United States base salary range for this full-time position is $140,000-$200,000, plus bonus and equity, depending on experience.

Key Responsibilities

  • Be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline.
  • Operate and extend the cloud infrastructure for our ingestion pipeline, currently running on GCP and managed with Terraform.
  • Collaborate closely with our Scientists to shift the cost/throughput/quality frontier, delivering richer data at larger scale and lower cost to power our next-generation models.
  • Collaborate with the AI Team and Speechify Leadership to craft the dataset roadmap for powering next-generation consumer and enterprise products.

Required Qualifications

  • BS/MS/PhD in Computer Science or a related field.
  • 5+ years of industry experience in software development.
  • Proficiency with bash/Python scripting in Linux environments.
  • Proficiency in Docker and Infrastructure-as-Code concepts with professional experience on at least one major Cloud Provider (we use GCP).
  • Ability to handle multiple tasks and adapt to changing priorities.
  • Strong communication skills, both written and verbal.

Preferred Qualifications

  • Experience with web crawlers and large-scale data processing workflows.

Benefits & Perks

  • A fast-growing environment where you can help shape the company and product.
  • An entrepreneurial-minded team that supports risk, intuition, and hustle.
  • A hands-off management approach so you can focus and do your best work.
  • An opportunity to make a big impact in a transformative industry.
  • Competitive salaries, a friendly and laid-back atmosphere, and a commitment to building a great asynchronous culture.
  • Opportunity to work on a life-changing product that millions of people use.
  • Build products that directly impact and support people with learning differences like dyslexia, ADD, low vision, concussions, autism, and more.
  • Work in one of the fastest-growing sectors of tech, at the intersection of artificial intelligence and audio.

Required Skills

Team collaboration
Terraform
Web crawlers
Linux
Docker
Infrastructure-as-Code
GCP
Bash
Python scripting
Data ingestion