Senior/Staff Software Engineer - Infra

LlamaIndex
9 months ago
San Francisco, California, United States
Hybrid
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

Join us and help shape the future of AI by architecting next-generation knowledge systems. At LlamaIndex, our Infra team builds and scales the core infrastructure powering a high-volume data platform for AI applications. We value integrity, innovation, drive, and technical expertise while fostering a collaborative environment that elevates and empowers every team member. Located in downtown San Francisco, we offer a hybrid-friendly culture where impact, growth, and real results are at the heart of our mission.

Key Responsibilities

  • Collaborate with engineering teams to build and maintain foundational systems that empower developers and support rapid growth.
  • Design and implement scalable infrastructure solutions for various deployment models including SaaS, single-tenant, and private deployments.
  • Manage and optimize cloud resources and Kubernetes clusters for cost-effectiveness and performance.
  • Streamline release and deployment processes to improve efficiency and reliability.
  • Ensure compliance with relevant regulations and implement robust security measures across different environments.

Required Qualifications

  • 5+ years of engineering experience.
  • Experience on Platform or Infrastructure teams with significant projects involving infrastructure components (e.g., Terraform/CDKTF, Kubernetes, test infrastructure, release management, observability).
  • Proven ability to optimize cloud resource utilization.
  • Skilled in tuning Kubernetes clusters and cloud resources for cost and performance efficiency.
  • A commitment to building and shaping LlamaIndex’s evolving engineering culture.
  • Ability to balance speed and pragmatism to develop appropriate solutions at each stage of growth.

Preferred Qualifications

  • Experience building out infrastructure from the ground up at a fast-growing startup.
  • Familiarity with observability tools such as Prometheus, Grafana, and New Relic.
  • Knowledge of GitOps tools like ArgoCD and Flux for continuous deployment.
  • Experience with security compliance and audits in cloud environments (e.g., SOC 2).
  • Proficiency with Python, Postgres, and multi-cloud deployments.

Benefits & Perks

  • Competitive base salary and equity compensation
  • Comprehensive medical/dental/vision coverage for you and your family
  • Unlimited paid time off policy
  • Daily catered lunch and snacks in the San Francisco office

Required Skills

Observability Tools (Prometheus, Grafana, New Relic)
Cloud Infrastructure Management
Python
Release Management
Terraform/CDKTF
Kubernetes
Security Compliance
Scalable System Design
GitOps (ArgoCD, Flux)
Cost Optimization