Job Title: DevOps Engineer (LLM & AI Infrastructure)
Experience Required: 3–4 Years
Role Overview:
We are looking for a DevOps Engineer who can bridge infrastructure engineering and modern AI workflows at scale. This role focuses on building, managing, and optimizing environments for Large Language Models (LLMs), image models, and high-performance GPU workloads.
Key Responsibilities:
Design, deploy, and manage scalable infrastructure for AI/ML workloads
Work with GPU-based systems and optimize performance for model training and inference
Build and maintain CI/CD pipelines for ML models and backend services
Deploy and manage containerized applications using Docker and Kubernetes
Orchestrate distributed systems and model-serving pipelines
Optimize CUDA environments and GPU utilization
Manage cloud and on-prem compute clusters for AI workloads
Collaborate with ML engineers to productionize LLMs and LoRA-based fine-tuned models
Monitor system performance, logs, and reliability across services
Required Skills:
Strong proficiency in Python
Solid understanding of DevOps principles and infrastructure automation
Hands-on experience with Docker and Kubernetes
Experience with GPU systems, CUDA, and high-performance computing
Understanding of how Large Language Models (LLMs) work
Familiarity with LoRA (Low-Rank Adaptation) and model fine-tuning concepts
Knowledge of model deployment and inference pipelines
Experience with cloud platforms (AWS / GCP / Azure)
Good to Have:
Experience with model serving frameworks (Triton, TorchServe, vLLM, etc.)
Familiarity with Replicate or similar model hosting platforms
Knowledge of distributed training and inference optimization
Exposure to vector databases and retrieval systems