Software Engineer II (Backend + Data pipelines)

Scribd

6 months ago

Portland, Oregon, United States

Hybrid

Full-time

Junior Level (1-3 years)

Job Description

Position Overview

We're seeking a Software Engineer II with strong backend development experience and a passion for solving complex data challenges at scale. In this role, you'll design, build, and optimize distributed systems that extract, enrich, and process metadata from diverse content types. You will collaborate closely with ML engineers, product managers, and cross-functional partners to integrate machine learning models and LLM-based services into production pipelines, addressing cutting‑edge generative AI and metadata enrichment problems at a global scale.

At Scribd, your base pay is one part of your total compensation package, determined within a competitive range based on location, experience, and role level. This opportunity also includes a comprehensive and generous benefits package.

Key Responsibilities

Design and build scalable systems to extract, enrich, and process metadata from millions of documents, images, and audio content.
Integrate LLM capabilities such as summarization, classification, extraction, and enrichment into metadata pipelines.
Collaborate with cross-functional teams, including ML engineers and product managers, to deliver efficient and reliable metadata solutions.
Optimize and refactor existing systems to improve performance, scalability, and reliability.
Ensure data accuracy, integrity, and quality through automated validation and monitoring processes.
Participate in code reviews, maintaining high-quality standards and best practices across the codebase.
Manage and maintain data pipelines, along with their associated security and infrastructure.

Required Qualifications

5+ years of professional software engineering experience.
Proficiency in Python, Scala, Ruby, or similar languages.
Experience designing and building distributed systems at scale.
Hands-on experience deploying and optimizing solutions using ECS, EKS, or AWS Lambda.
Familiarity with infrastructure-as-code tools like Terraform or similar.
Experience working with a public cloud provider (AWS, Azure, or Google Cloud).
Familiarity with data processing frameworks such as Spark or Databricks for large-scale workloads.
Proven ability to test, profile, and optimize systems for performance, scalability, and reliability.
Bachelor’s degree in Computer Science or equivalent professional experience.

Preferred Qualifications

Experience working with LLMs or integrating ML models into production systems.

Benefits & Perks

Healthcare Insurance Coverage (Medical/Dental/Vision): 100% paid for employees
12 weeks paid parental leave
Short-term/long-term disability plans
401k/RSP matching
Onboarding stipend for home office peripherals + accessories
Learning & Development allowance and programs
Quarterly stipend for Wellness, WiFi, etc.
Mental Health support & resources
Free subscription to the Scribd Inc. suite of products
Referral Bonuses and Book Benefit
Sabbaticals and company-wide events
Team engagement budgets, Vacation & Personal Days, and Paid Holidays (+ winter break)
Flexible Sick Time and Volunteer Day
Access to Employee Resource Groups fostering an inclusive workplace
Access to AI Tools: Free access to best‑in‑class AI tools to boost productivity

Required Skills

Scala

Ruby on Rails

HTTP APIs

Databricks

Distributed Systems

Airflow

Terraform

Data Pipelines

Spark

Machine Learning Integration

Python

AWS (Lambda, ECS, SQS, ElastiCache, SageMaker, CloudWatch)