ML Engineer (LLM Deployment & GPU)

IntraEdge
6 months ago
Phoenix, Arizona, United States
On-site
Full-time
Junior Level (1-3 years)

Job Description

Position Overview

We are looking for an ML Engineer with hands-on experience in deploying large language models (LLMs) on GPU infrastructure. This role combines ML engineering with DevOps, focusing on scalable deployments, API integration, and optimization of LLM performance.

Key Responsibilities

  • Deploy and optimize LLMs on GPU-based infrastructure.
  • Build and manage APIs for model serving (Python-based).
  • Implement CI/CD, monitoring, and scaling for ML models.
  • Collaborate on prompt engineering and model optimization.
  • Manage containerized workloads (Docker/Kubernetes).

Required Qualifications

  • 4–5 years of ML/DevOps engineering experience.
  • Strong skills in Python, API development, and LLM architecture.
  • Experience with GPU deployments and cloud platforms (AWS/GCP/Azure).
  • Familiarity with prompt engineering and inference optimization.
  • 5 years of machine learning experience (required).

Benefits & Additional Information

  • Pay: $60.00 - $75.00 per hour
  • Job Type: Full-time
  • Expected Hours: 40 per week
  • Location: In person in Phoenix, AZ 85003; candidates must be able to relocate before starting work

Required Skills

GPU Infrastructure
Model Optimization
ML Engineering
DevOps
Python
LLM Deployment
Kubernetes
Cloud Platforms
API Integration
Prompt Engineering
Docker
CI/CD