Senior Cloud Support Engineer

Crusoe
4 days ago
San Francisco, CA
Remote
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

Crusoe’s mission is to accelerate the abundance of energy and intelligence. At Crusoe Cloud, we’re driving the AI revolution with sustainable, low-cost GPU compute power. As a Senior Cloud Support Engineer, you will be the primary point of contact for technical support—empowering customers to leverage cutting-edge technology for advancements in AI/ML, physics simulations, and computational biology while contributing to a more sustainable future. This full-time role is a unique opportunity to drive meaningful innovation and work on complex challenges alongside a talented team.

Key Responsibilities

  • Customer Support: Provide exceptional technical support to customers via Zendesk, meeting SLAs and maintaining high CSAT (95%+).
  • On-Call Rotation: Participate in a 24/7 on-call rotation to ensure timely resolution of critical issues.
  • Troubleshooting: Diagnose and resolve issues related to VMs, hardware failures, and scaling tests using CLI and internal tools.
  • Alert Triage & Maintenance: Manage alert triage, plan for maintenance windows, and conduct node delivery testing.
  • Collaboration: Work closely with SRE, Networking, and Storage teams from initial triage through root cause analysis (RCA) delivery.
  • Global Teamwork: Adhere to global collaboration and handoff processes for ticketing and on-call procedures.
  • Knowledge Sharing: Develop onboarding materials, knowledge base documentation, and standard operating procedures (SOPs).

Required Qualifications

  • Education/Experience: Bachelor’s degree in IT, Computer Science, or Engineering, or 4+ years of equivalent technical experience.
  • Linux Proficiency: Strong command-line skills in Linux environments.
  • Version Control: Proficiency with Git for code management and collaboration.
  • Customer Support Experience: 5+ years in customer support, ideally within cloud, storage, or networking environments.
  • Cloud Technologies: Experience with container orchestration (e.g., Kubernetes), workload management (e.g., Slurm), infrastructure as code (e.g., Terraform), and monitoring tools (e.g., Grafana).
  • Public Cloud Knowledge: Familiarity with platforms such as AWS, Azure, or GCP.
  • Communication Skills: Excellent communication and customer service skills with the ability to prioritize competing escalations.
  • HPC Knowledge: Understanding of HPC technologies including InfiniBand, RDMA, RoCE, and SDN.

Preferred Qualifications

  • Certifications: CKA, CKAD, CKS, KCNA, AWS Machine Learning – Specialty, Data Analytics – Specialty, Solutions Architect – Professional, Developer – Associate, NVIDIA AI Infrastructure and Operations, among others.
  • Cloud Expertise: Deep understanding of specific cloud platforms and services.
  • Automation Skills: Experience with automation tools and scripting languages.
  • Problem-Solving: Proven ability to analyze complex technical issues and develop effective solutions.
  • Collaboration & Mentorship: Demonstrated skill in mentoring, training, and onboarding colleagues.
  • Passion for Sustainability: A strong interest in contributing to a more sustainable future through technology.

Benefits & Perks

  • Industry competitive pay
  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance options including HDHP and PPO plans, plus vision and dental coverage for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance and short-term/long-term disability
  • Teladoc services
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Subscription to the Calm app
  • MetLife Legal services
  • Company-paid commuter benefit of $200 per pay period

Compensation

Compensation will range from $125,000 to $151,000, plus bonus. Restricted Stock Units are included in all offers. Salary will be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and market alignment.

Required Skills

AWS
Customer Support
Linux CLI
HPC Technologies
GCP
Slurm
Grafana
Kubernetes
Git
InfiniBand
RoCE
Azure
RDMA
Software Defined Networking (SDN)
Terraform