Master Data Management Developer:
We are looking for an MDM Data Engineer to join our AI & Data practice. The candidate will build and maintain end-to-end data pipelines for a global pharma customer. This is a hands-on engineering role that requires strong PySpark, AWS, and Reltio MDM skills.
KEY RESPONSIBILITIES
- Develop PySpark-based ETL pipelines for Pre-MDM and Post-MDM processing layers (L0–L4)
- Ingest data from multiple pharma data sources
- Integrate with Reltio Cloud MDM using Data Loader and REST APIs for data load and export
- Manage S3 folder structures, file handling, and Athena query objects
- Schedule, monitor, and troubleshoot jobs using Control-M
- Perform data quality checks and root cause analysis on pipeline failures
- Collaborate with BA and MDM configuration teams to implement change requests
REQUIRED SKILLS & EXPERIENCE
- 3–6 years of experience in data engineering
- PySpark – strong hands-on experience in production ETL development
- Python – REST API integration, scripting, automation
- SQL – advanced querying, data profiling, DQ validation
- AWS cloud services – S3 and Athena in particular
- Git / Bitbucket – version control and code review process
GOOD TO HAVE
- Experience with Reltio Cloud MDM – data load, export, tenant configuration
- Familiarity with Control-M or equivalent job scheduling tools
- Pharma domain knowledge – HCP/HCO, Sales, Claims, or Alignment data
- Understanding of MDM concepts – match & merge, survivorship, data governance