Site Reliability Engineer

COCC
Posted 3 months ago
Rocky Hill, CT, United States
Hybrid
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

COCC, an industry-leading fintech provider recognized among American Banker’s FinTech 100 and the Inc. 5000 fastest-growing companies, delivers innovative and comprehensive technology solutions across the Northeastern United States. Designated a Top Workplace in Connecticut and a nationally Certified Great Place to Work, we believe that our employees are the cornerstone of our success.

We are seeking an experienced Infrastructure Engineer with strong Kubernetes expertise across both public cloud and on-premises environments. The ideal candidate will demonstrate a deep understanding of network protocols (TCP/IP, HTTP/HTTPS, DNS), proficiency with Unix/Linux systems, and expertise in scripting languages (Python, Bash) for automation.

Salary: $130K-$170K

Key Responsibilities

  • Manage and support Kubernetes clusters (on-premises and/or cloud) across production, staging, and development environments.
  • Ensure stability, scalability, and high availability of Kubernetes platforms.
  • Implement Kubernetes-native security controls (RBAC, NetworkPolicies, Pod Security Standards).
  • Diagnose and resolve complex issues related to Kubernetes, container runtimes, and workloads.
  • Manage cluster and infrastructure configurations using tools such as Terraform, Helm, and Ansible.
  • Build, maintain, and troubleshoot CI/CD pipelines for Kubernetes deployments (preferably with GitLab, GitHub Actions, or similar).
  • Implement and maintain Kubernetes monitoring and alerting systems (e.g., Prometheus, Grafana, Loki, ELK, OpenTelemetry).

Required Qualifications

  • Education: Bachelor’s degree in Computer Science, or equivalent work experience and/or certifications
  • Kubernetes expertise in public clouds and private on-premises deployments
  • Understanding of network protocols (TCP/IP, HTTP/HTTPS, DNS)
  • Comfortable with scripting languages (Python, Bash) for automation
  • Proficiency with Unix/Linux systems, including performance and application troubleshooting
  • Ability to clearly define problems and implement innovative long-term solutions
  • A drive for automating processes and streamlining operations
  • Experience responding to and being involved with production incidents

Preferred Qualifications

  • Familiarity with GitLab CI or equivalent
  • Experience with Terraform infrastructure as code
  • Experience with Prometheus/Grafana/Loki or equivalent monitoring solutions
  • Understanding of Windows system administration

Benefits & Perks

  • Hybrid schedules and ample paid time off allowing work/life balance and flexibility
  • Customized training and onboarding to support you in your first year at COCC
  • Robust employee development programs aligned with career pathing objectives
  • Cutting-edge training and educational resources from vendors like SANS, Pluralsight, and CBT Nuggets
  • Generous PTO offerings, benefits, and competitive compensation
  • On-site fitness centers, wellness incentives, and lifestyle spending accounts
  • Tuition reimbursement
  • One-on-one career coaching
  • DEIB initiatives championing inclusion and encouraging you to bring your whole self to work
  • Financial planning assistance with certified professionals
  • Peer recognition programs

Required Skills

Helm
CI/CD pipelines
Kubernetes
Bash scripting
Python scripting
Terraform
Ansible
Grafana
Prometheus
Unix/Linux systems
Network protocols (TCP/IP, HTTP/HTTPS, DNS)