Lead System Engineer - Microsoft Azure with Gen AI

EPAM Systems, Inc. - Pune, Maharashtra

Job details

Benefits

Health insurance
Paid time off

Qualifications

CI/CD
Azure
Cloud architecture
Law
Kubernetes
Enterprise Software
DevOps
System architecture
English
Windows
Master's degree
OS Kernels
Docker
Continuous improvement
Terraform
Continuous integration
Scripting
Computer networking
Linux
AI
Leadership
Communication skills

Full job description

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are hiring a Lead Cloud Engineer to design, build, and drive enterprise-scale Microsoft Azure platforms with deeply integrated AI capabilities, including advanced Generative AI and Agentic AI solutions.

In this role, you will serve as a technical authority responsible for shaping cloud architecture strategy, leading DevOps and platform engineering transformations, and pioneering the adoption of next-generation AI technologies across the organization. You will collaborate closely with executive stakeholders, engineering teams, and business units to deliver secure, scalable, and resilient systems that power mission-critical workloads and unlock new AI-driven business value.

Responsibilities

Architect end-to-end cloud and AI solutions on Azure, ensuring alignment with enterprise strategy, compliance standards, and long-term scalability goals
Design highly scalable, secure, and resilient systems capable of supporting mission-critical workloads across multiple business units and geographies
Lead DevOps transformation initiatives and establish modern platform engineering practices that improve developer productivity and operational efficiency
Define and enforce best practices for CI/CD pipelines, Infrastructure as Code, security controls, and governance frameworks across the engineering organization
Develop multi-region, high-availability architectures with robust disaster recovery, failover strategies, and performance optimization
Spearhead the adoption of AI-driven solutions across business units by identifying high-impact use cases and guiding implementation from concept to production
Design and deliver enterprise-grade Generative AI platforms, integrating LLMs, agentic workflows, and retrieval-augmented generation capabilities at scale
Drive AI strategy and adoption by partnering with leadership to define roadmaps, evaluate emerging tools, and align AI investments with business outcomes
Foster a culture of innovation through proof-of-concept initiatives, technology evaluations, and the introduction of emerging cloud and AI technologies
Collaborate with stakeholders, architects, and leadership teams to translate complex business requirements into actionable technical designs
Mentor and guide engineering teams, providing technical leadership, code reviews, and architectural guidance to elevate overall team capability
Own the lifecycle of large-scale enterprise systems, including design, deployment, monitoring, optimization, and continuous improvement

Requirements

8+ years of progressive experience in cloud engineering, systems architecture, and large-scale platform delivery
At least 1 year of relevant leadership experience
Expert-level proficiency in Microsoft Azure, including compute, storage, identity, networking, and platform services
Deep expertise in Kubernetes, Azure Kubernetes Service (AKS), and Docker container orchestration for production workloads
Advanced knowledge of cloud networking, security architecture, and governance frameworks across enterprise environments
Skills in large-scale Infrastructure as Code design and implementation using Terraform, Bicep, and ARM templates
Background in DevOps maturity models, CI/CD pipeline design, and platform engineering practices that enable self-service developer experiences
Competency in OS administration across Windows and Linux environments, along with proficiency in scripting languages for automation
Expertise in Generative AI fundamentals, Agentic AI concepts (multi-agent systems, orchestration), and Agentic Workflows for enterprise use cases
Understanding of Retrieval-Augmented Generation (RAG) architectures at scale, including vector databases, embeddings, and prompt engineering
Hands-on experience with Azure AI Foundry and LLM integrations such as OpenAI and Claude for production-grade applications
Familiarity with AI orchestration frameworks, including Semantic Kernel (preferred), LangChain/LangGraph, and CrewAI
Strong architecture and leadership skills with a proven track record of owning and evolving large-scale enterprise systems end-to-end
Proficiency in English communication (B2 level or higher)

We offer

Opportunity to work on technical challenges that may have an impact across geographies
Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
Opportunity to share your ideas on international platforms
Sponsored Tech Talks & Hackathons
Unlimited access to LinkedIn Learning solutions
Possibility to relocate to any EPAM office for short and long-term projects
Focused individual development
Benefit package:
- Health benefits
- Retirement benefits
- Paid time off
- Flexible benefits
Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)
