Candidate Skills:
AI Architecture, GenAI, Python, Spark, Kubernetes, MLOps, LLM, RAG, Cloud, Terraform
Job Description:
We are looking for a Platform Architect (AI/GenAI) to design and lead scalable, enterprise-grade AI/ML and LLM platforms. This role focuses on building robust data pipelines, optimizing LLM performance, and architecting cloud-native AI systems with strong MLOps and DevOps practices.

Key Responsibilities:

AI/ML & Data Platform Architecture:
- Design and build scalable data pipelines for AI/ML workloads
- Collaborate with cross-functional teams to integrate and optimize ML models
- Develop NLP solutions (classification, sentiment analysis, topic modeling)

LLM & GenAI Systems:
- Design and operate LLM inference architectures (GPU optimization, quantization)
- Build RAG-based systems, chatbots, and semantic search solutions
- Perform prompt engineering and fine-tuning of LLMs
- Implement model optimization techniques (quantization, distillation)

MLOps & DevOps:
- Implement CI/CD pipelines (GitHub Actions, Jenkins)
- Manage Infrastructure as Code (Terraform, CloudFormation, Pulumi)
- Automate model training, deployment, and monitoring workflows
- Track experiments using MLflow or DVC

Cloud & Infrastructure:
- Deploy and manage systems on AWS / Azure / GCP
- Work with Kubernetes, Docker, Helm, and service mesh
- Ensure secure, scalable, and high-performance cloud architectures

Model Evaluation & Optimization:
- Design and execute A/B testing frameworks
- Optimize models for performance, cost, and scalability

Required Skills:

Technical Skills:
- Advanced Python programming
- Strong experience with Apache Spark and large-scale data processing
- Proficiency in SQL and data querying
- Experience with ML frameworks (PyTorch, TensorFlow, Hugging Face)

Software Engineering:
- Strong system design and architecture skills
- Experience with microservices and distributed systems
- Knowledge of Go or Rust (preferred)
- Ability to create Low-Level Design (LLD)

DevOps & MLOps:
- Hands-on with Terraform / CloudFormation / Pulumi
- Experience with CI/CD pipelines (GitHub Actions, Jenkins)
- Expertise in Docker, Kubernetes, Helm, service mesh
- Experience with model serving (TorchServe, TensorFlow Serving, vLLM, FastAPI)

GenAI & LLM Expertise:
- Experience with RAG, vector databases, and LLM applications
- Strong knowledge of prompt engineering and model optimization
- Hands-on experience building chatbots, recommendation systems

Cloud & Security:
- Strong experience with AWS / Azure / GCP
- Knowledge of cloud networking and security best practices

Good to Have:
- Experience in healthcare AI domain
- Knowledge of advanced LLM optimization techniques
- Exposure to translation systems and multilingual AI

Soft Skills:
- Strong leadership and architecture mindset
- Excellent problem-solving and analytical skills
- Ability to mentor teams and drive technical strategy