EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are hiring a Lead Cloud Engineer to design, build, and drive enterprise-scale Microsoft Azure platforms with deeply integrated AI capabilities, including advanced Generative AI and Agentic AI solutions.
In this role, you will serve as a technical authority responsible for shaping cloud architecture strategy, leading DevOps and platform engineering transformations, and pioneering the adoption of next-generation AI technologies across the organization. You will collaborate closely with executive stakeholders, engineering teams, and business units to deliver secure, scalable, and resilient systems that power mission-critical workloads and unlock new AI-driven business value.
Responsibilities
- Architect end-to-end cloud and AI solutions on Azure, ensuring alignment with enterprise strategy, compliance standards, and long-term scalability goals
- Design highly scalable, secure, and resilient systems capable of supporting mission-critical workloads across multiple business units and geographies
- Lead DevOps transformation initiatives and establish modern platform engineering practices that improve developer productivity and operational efficiency
- Define and enforce best practices for CI/CD pipelines, Infrastructure as Code, security controls, and governance frameworks across the engineering organization
- Develop multi-region, high-availability architectures with robust disaster recovery, failover strategies, and performance optimization
- Spearhead the adoption of AI-driven solutions across business units by identifying high-impact use cases and guiding implementation from concept to production
- Design and deliver enterprise-grade Generative AI platforms, integrating LLMs, agentic workflows, and retrieval-augmented generation capabilities at scale
- Drive AI strategy and adoption by partnering with leadership to define roadmaps, evaluate emerging tools, and align AI investments with business outcomes
- Foster a culture of innovation through proof-of-concept initiatives, technology evaluations, and the introduction of emerging cloud and AI technologies
- Collaborate with stakeholders, architects, and leadership teams to translate complex business requirements into actionable technical designs
- Mentor and guide engineering teams, providing technical leadership, code reviews, and architectural guidance to elevate overall team capability
- Own the lifecycle of large-scale enterprise systems, including design, deployment, monitoring, optimization, and continuous improvement
Requirements
- 8+ years of progressive experience in cloud engineering, systems architecture, and large-scale platform delivery
- At least 1 year of relevant leadership experience
- Expert-level proficiency in Microsoft Azure, including compute, storage, identity, networking, and platform services
- Deep expertise in Kubernetes, Azure Kubernetes Service (AKS), and Docker container orchestration for production workloads
- Advanced knowledge of cloud networking, security architecture, and governance frameworks across enterprise environments
- Skills in large-scale Infrastructure as Code design and implementation using Terraform, Bicep, and ARM templates
- Background in DevOps maturity models, CI/CD pipeline design, and platform engineering practices that enable self-service developer experiences
- Competency in OS administration across Windows and Linux environments, along with proficiency in scripting languages for automation
- Expertise in Generative AI fundamentals, Agentic AI concepts (multi-agent systems, orchestration), and Agentic Workflows for enterprise use cases
- Understanding of Retrieval-Augmented Generation (RAG) architectures at scale, including vector databases, embeddings, and prompt engineering
- Hands-on experience with Azure AI Foundry and LLM integrations such as OpenAI and Claude for production-grade applications
- Familiarity with AI orchestration frameworks, including Semantic Kernel (preferred), LangChain/LangGraph, and CrewAI
- Strong architecture and leadership skills with a proven track record of owning and evolving large-scale enterprise systems end-to-end
- Proficient communication skills in English (B2 level or higher)
We offer
- Opportunity to work on technical challenges that may have an impact across geographies
- Vast opportunities for self-development: online university, global knowledge-sharing opportunities, and learning through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn Learning solutions
- Possibility to relocate to any EPAM office for short and long-term projects
- Focused individual development
- Benefit package:
  - Health benefits
  - Retirement benefits
  - Paid time off
  - Flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)