Experience: 8+ years
- Proven experience designing and implementing ETL pipelines in Databricks/Spark and Delta Lake.
- Strong knowledge of OMOP CDM and experience mapping datasets to OMOP; familiarity with CDISC SDTM is a plus.
- Expertise in data modelling, partitioning, performance tuning, and best practices for large clinical/RWD datasets.
- Experience with vocabulary services and terminology mapping (OHDSI/Athena, UMLS, or similar).
- Experience integrating AI/NLP components into data pipelines (entity extraction, mapping suggestions) is desirable.
- Familiarity with data testing frameworks (Great Expectations, Deequ), CI/CD, infrastructure as code, and orchestration tools (Databricks Jobs, Airflow).
- Good communication skills and experience working with domain experts to capture requirements.
Preferred
- Prior experience in pharma or clinical research environments.
- Knowledge of data governance, privacy regulations and secure handling of patient data.
- Experience with Unity Catalog, Databricks Delta Sharing, and cloud infrastructure (Azure/AWS).
Pay: ₹509,356.57 - ₹1,866,829.63 per year
Work Location: In person