ML Engineer (LLM Deployment & GPU)
IntraEdge
6 months ago
Phoenix, Arizona, United States
On-site
Full-time
Junior Level (1-3 years)
Job Description
Position Overview
We are looking for an ML Engineer with hands-on experience deploying large language models (LLMs) on GPU infrastructure. This role combines ML engineering with DevOps, focusing on scalable deployments, API integration, and LLM performance optimization.
Key Responsibilities
- Deploy and optimize LLMs on GPU-based infrastructure.
- Build and manage APIs for model serving (Python-based).
- Implement CI/CD, monitoring, and scaling for ML models.
- Collaborate on prompt engineering and model optimization.
- Manage containerized workloads (Docker/Kubernetes).
Required Qualifications
- 4–5 years of ML/DevOps engineering experience.
- Strong skills in Python, API development, and LLM architecture.
- Experience with GPU deployments and cloud platforms (AWS/GCP/Azure).
- Familiarity with prompt engineering and inference optimization.
- Machine Learning: 5 years of experience (Required).
Benefits & Additional Information
- Pay: $60.00 - $75.00 per hour
- Job Type: Full-time
- Expected Hours: 40 per week
- Location: In person in Phoenix, AZ 85003; candidates must be able to relocate before starting work.
Required Skills
GPU Infrastructure
Model Optimization
ML Engineering
DevOps
Python
LLM Deployment
Kubernetes
Cloud Platforms
API Integration
Prompt Engineering
Docker
CI/CD