Engineering - SRE Platforms - Software Engineer - Vice President - Dallas
Goldman Sachs
Posted 4 months ago
Dallas, TX, United States
On-site
Full-time
Senior Level (8+ years)
Job Description
Position Overview
Goldman Sachs is seeking a talented and motivated Site Reliability Engineering Manager to join our team. As a leader within the firm’s Technology division, you will oversee the Site Reliability Engineering (SRE) function and ensure the stability and reliability of critical applications and infrastructure. You will manage a team of site reliability engineers who work closely with developers, infrastructure engineers, and operations teams to build and maintain highly available systems.
- CORPORATE TITLE: Vice President
- OFFICE LOCATION(S): Dallas
- JOB FUNCTION: Software Engineering
- DIVISION: Engineering Division
Key Responsibilities
- Manage a team of Site Reliability Engineers responsible for ensuring the reliability, availability, and performance of critical applications and infrastructure.
- Develop and implement best practices for Site Reliability Engineering, including incident management, monitoring, automation, and capacity planning.
- Collaborate with development teams to design and build highly available and scalable systems.
- Work with infrastructure teams to ensure that critical infrastructure components are operating optimally and can support business needs.
- Develop and maintain Service Level Agreements (SLAs) and Service Level Objectives (SLOs) to ensure that critical systems meet business requirements.
- Manage and prioritize workload for the SRE team, ensuring alignment with business priorities.
- Develop and maintain relationships with key stakeholders across the organization to ensure that the SRE function is aligned with business goals.
Required Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in Site Reliability Engineering, with at least 3 years in a management role.
- Strong leadership skills with the ability to manage a team of engineers.
- Experience with cloud computing platforms such as AWS or Azure.
- Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation.
- Experience with containerization technologies such as Docker and Kubernetes.
- Strong problem-solving skills with the ability to troubleshoot complex issues.
Benefits & Perks
- Healthcare & Medical Insurance: A wide range of health and welfare programs tailored to your office location including medical, dental, and various insurance coverages.
- Holiday & Vacation Policies: Competitive vacation policies with generous time-off entitlements to help you recharge.
- Financial Wellness & Retirement: Support for retirement planning, education, and financial wellness resources including live financial education sessions.
- Health Services: Comprehensive medical advocacy, counseling, and on-site health centers (in select offices) to address your critical health needs.
- Fitness: Access to on-site fitness centers or reimbursement for fitness club memberships to encourage a healthy lifestyle.
- Child Care & Family Care: On-site child care, emergency backup care, and supportive programs for parents including counseling and transitional services.
Required Skills
Automation
Monitoring
Site Reliability Engineering
Incident Management
Infrastructure as Code (Terraform/CloudFormation)
Service Level Objectives
Team Leadership
Containerization (Docker/Kubernetes)
Capacity Planning
Cloud Computing (AWS/Azure)