Job Title: Data Engineer
Experience: 4+ Years
Work Shift: 2:00 PM – 10:00 PM
Domain Requirement: Pharmaceutical / Healthcare / Life Sciences
About the Role
We are seeking an experienced Healthcare Data Engineer / Clinical Data Scientist to support the development of high-quality real-world evidence (RWE) datasets. The ideal candidate will work closely with clinical Subject Matter Experts (SMEs) to implement clinical rules, engineer patient-level datasets, and integrate structured and unstructured healthcare data sources. This role requires strong expertise in healthcare data, clinical event algorithms, and large-scale data processing.
Key Responsibilities
Clinical Rule Implementation
- Translate SME-designed clinical rules into scalable, reproducible data pipelines operating against centralized healthcare data lakes.
- Implement protocol-driven clinical event definitions and business logic.
Structured & Unstructured Data Integration
- Engineer patient-level features using medical claims, pharmacy claims, EMR/EHR data, laboratory results, and NLP-derived outputs.
- Integrate structured and unstructured healthcare data to improve clinical data completeness and accuracy.
Disease-Specific Dataset Development
- Build and maintain disease-focused datasets, including:
  - Cohort identification and construction
  - Index date determination
  - Treatment sequencing
  - Clinical event labeling
  - Outcome tracking
Line of Therapy (LOT) Algorithm Development
- Design and implement line-of-therapy algorithms that address real-world treatment complexities, including:
  - Combination regimens
  - Treatment gaps
  - Dose modifications
  - Switching patterns
  - Off-label therapy usage
NLP Signal Integration
- Incorporate NLP-derived clinical signals such as:
  - Diagnosis mentions
  - Disease staging
  - Biomarker results
  - Disease progression indicators
- Combine NLP outputs with structured claims and EMR data to enhance dataset quality.
Data Quality Assurance
- Develop and maintain data quality validation frameworks and reports.
- Conduct sample-level audits to verify clinical logic and dataset accuracy.
- Identify anomalies and recommend corrective actions.
Cross-Functional Collaboration
- Partner closely with clinicians, epidemiologists, data scientists, and SMEs.
- Participate in iterative reviews to refine clinical rules, identify logic gaps, and improve data outputs.
Required Qualifications
- 4+ years of experience in Data Science, Health Data Engineering, Biostatistics, Real-World Data (RWD), Health Economics & Outcomes Research (HEOR), or related life sciences domains.
- Strong proficiency in SQL and in either Python (preferred) or R.
- Hands-on experience working with:
  - Medical Claims Data
  - Pharmacy Claims Data
  - EMR/EHR Data
- Experience building patient-level healthcare datasets at scale.
- Familiarity with NLP-generated outputs and integrating unstructured clinical information into structured datasets.
- Proven experience implementing clinical event algorithms from protocol-level specifications.
- Strong analytical, problem-solving, and data validation skills.
Preferred Qualifications
- Experience working with cloud-based data platforms such as Snowflake, Redshift, or Databricks.
- Familiarity with OMOP CDM or other healthcare data models.
- Experience with oncology datasets or specialty disease areas.
- Knowledge of Real-World Evidence (RWE) methodologies and healthcare analytics.
- Ability to work effectively within a pod-based, highly collaborative clinical environment.
Technical Skills
- SQL
- Python / R
- Healthcare Claims Data
- EMR/EHR Data
- NLP Integration
- Clinical Data Modeling
- Cohort Building
- Line of Therapy Algorithms
- Data Quality & Validation
- Snowflake / Databricks / Redshift (Preferred)
- OMOP CDM (Preferred)
Application Process
Interested candidates can send their updated resume to:
[email protected]
[email protected]
Pay: ₹70,000.00 - ₹90,000.00 per month
Work Location: Remote