EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a skilled and passionate Lead Systems Engineer with Data DevOps/MLOps expertise to drive innovation and efficiency across our data and machine learning operations.
Responsibilities
- Design, deploy, and manage CI/CD pipelines for seamless data integration and ML model deployment
- Establish robust infrastructure for processing, training, and serving machine learning models using cloud-based solutions
- Automate critical workflows such as data validation, transformation, and orchestration for streamlined operations
- Collaborate with cross-functional teams, including data scientists and engineers, to integrate ML solutions into production environments
- Improve model serving, performance monitoring, and reliability in production ecosystems
- Ensure data versioning, lineage tracking, and reproducibility across ML experiments and workflows
- Identify and implement opportunities to improve scalability, efficiency, and resilience of the infrastructure
- Enforce rigorous security measures to safeguard data and ensure compliance with relevant regulations
- Debug and resolve technical issues in data pipelines and ML deployment workflows
Requirements
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
- 8+ years of experience in Data DevOps, MLOps, or related disciplines
- Expertise in cloud platforms such as Azure, AWS, or GCP
- Skills in Infrastructure as Code tools like Terraform, CloudFormation, or Ansible
- Proficiency in containerization and orchestration technologies such as Docker and Kubernetes
- Hands-on experience with data processing frameworks including Apache Spark and Databricks
- Proficiency in Python, including familiarity with libraries such as Pandas, TensorFlow, and PyTorch
- Knowledge of CI/CD tools such as Jenkins, GitLab CI/CD, and GitHub Actions
- Experience with version control systems and MLOps platforms including Git, MLflow, and Kubeflow
- Understanding of monitoring and alerting tools like Prometheus and Grafana
- Strong problem-solving and independent decision-making capabilities
- Effective communication and technical documentation skills
Nice to have
- Background in DataOps methodologies and tools such as Airflow or dbt
- Knowledge of data governance platforms like Collibra
- Familiarity with Big Data technologies such as Hadoop or Hive
- Certifications in cloud platforms or data engineering tools
We offer
- Opportunity to work on technical challenges that may have an impact across geographies
- Vast opportunities for self-development: online university, global knowledge-sharing opportunities, and learning through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn learning solutions
- Possibility to relocate to any EPAM office for short- and long-term projects
- Focused individual development
- Benefit package:
  - Health benefits
  - Retirement benefits
  - Paid time off
  - Flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)