Software Engineer II (Backend + Data Pipelines)
Job Description
Position Overview
We're seeking a Software Engineer II with strong backend development experience and a passion for solving complex data challenges at scale. In this role, you'll design, build, and optimize distributed systems that extract, enrich, and process metadata from diverse content types. You will collaborate closely with ML engineers, product managers, and other cross-functional partners to integrate machine learning models and LLM-based services into production pipelines, tackling cutting-edge generative AI and metadata enrichment problems for a global product.
At Scribd, your base pay is one part of your total compensation package and is determined within a competitive range based on location, experience, and role level. This opportunity also includes a comprehensive and generous benefits package.
Key Responsibilities
- Design and build scalable systems to extract, enrich, and process metadata from millions of documents, images, and audio files.
- Integrate LLM capabilities such as summarization, classification, extraction, and enrichment into metadata pipelines.
- Collaborate with cross-functional teams, including ML engineers and product managers, to deliver efficient and reliable metadata solutions.
- Optimize and refactor existing systems to improve performance, scalability, and reliability.
- Ensure data accuracy, integrity, and quality through automated validation and monitoring processes.
- Participate in code reviews, upholding high quality standards and best practices across the codebase.
- Manage and maintain data pipelines, along with associated security and infrastructure.
Required Qualifications
- 5+ years of professional software engineering experience.
- Proficiency in Python, Scala, Ruby, or similar languages.
- Experience designing and building distributed systems at scale.
- Hands-on experience deploying and optimizing solutions using ECS, EKS, or AWS Lambda.
- Familiarity with infrastructure-as-code tools such as Terraform.
- Experience working with a public cloud provider (AWS, Azure, or Google Cloud).
- Familiarity with data processing frameworks such as Spark or Databricks for large-scale workloads.
- Proven ability to test, profile, and optimize systems for performance, scalability, and reliability.
- Bachelor’s degree in Computer Science or equivalent professional experience.
Preferred Qualifications
- Experience working with LLMs or integrating ML models into production systems.
Benefits & Perks
- Healthcare Insurance Coverage (Medical/Dental/Vision): 100% paid for employees
- 12 weeks paid parental leave
- Short-term/long-term disability plans
- 401k/RSP matching
- Onboarding stipend for home office peripherals + accessories
- Learning & Development allowance and programs
- Quarterly stipend for Wellness, WiFi, etc.
- Mental Health support & resources
- Free subscription to the Scribd Inc. suite of products
- Referral Bonuses and Book Benefit
- Sabbaticals and company-wide events
- Team engagement budgets, Vacation & Personal Days, and Paid Holidays (+ winter break)
- Flexible Sick Time and Volunteer Day
- Access to Employee Resource Groups fostering an inclusive workplace
- Access to AI Tools: Free access to best‑in‑class AI tools to boost productivity