Senior/Staff Software Engineer - Infra
LlamaIndex
9 months ago
San Francisco, California, United States
Hybrid
Full-time
Senior Level (5+ years)
Job Description
Position Overview
Join us and help shape the future of AI by architecting next-generation knowledge systems. At LlamaIndex, our Infra team builds and scales the core infrastructure powering a high-volume data platform for AI applications. We value integrity, innovation, drive, and technical expertise while fostering a collaborative environment that elevates and empowers every team member. Located in downtown San Francisco, we offer a hybrid-friendly culture where impact, growth, and real results are at the heart of our mission.
Key Responsibilities
- Collaborate with engineering teams to build and maintain foundational systems that empower developers and support rapid growth.
- Design and implement scalable infrastructure solutions for various deployment models including SaaS, single-tenant, and private deployments.
- Manage and optimize cloud resources and Kubernetes clusters for cost-effectiveness and performance.
- Streamline release and deployment processes to improve efficiency and reliability.
- Ensure compliance with relevant regulations and implement robust security measures across different environments.
Required Qualifications
- 5+ years of engineering experience.
- Experience on Platform or Infrastructure teams with significant projects involving infrastructure components (e.g., Terraform/CDKTF, Kubernetes, test infrastructure, release management, observability).
- Proven ability to optimize cloud resource utilization, including tuning Kubernetes clusters and cloud resources for cost and performance.
- A commitment to building and shaping LlamaIndex’s evolving engineering culture.
- Ability to balance speed and pragmatism to develop appropriate solutions at each stage of growth.
Preferred Qualifications
- Experience building out infrastructure from the ground up at a fast-growing startup.
- Familiarity with observability tools such as Prometheus, Grafana, and New Relic.
- Knowledge of GitOps tools like ArgoCD and Flux for continuous deployment.
- Experience with security compliance and audits in cloud environments (e.g., SOC 2).
- Proficiency with Python, Postgres, and multi-cloud deployments.
Benefits & Perks
- Competitive base salary and equity compensation
- Comprehensive medical/dental/vision coverage for you and your family
- Unlimited paid time off policy
- Daily catered lunch and snacks in the San Francisco office
Required Skills
Observability Tools (Prometheus, Grafana, New Relic)
Cloud Infrastructure Management
Python
Release Management
Terraform/CDKTF
Kubernetes
Security Compliance
Scalable System Design
GitOps (ArgoCD, Flux)
Cost Optimization