About IHX
IHX is building India’s most trusted health tech infrastructure platform, powering real-time, consent-led claims and data exchange between insurers and 30,000+ hospitals across 1,200+ cities. With $1B+ in claims processed yearly, IHX sits at the center of the next-gen health insurance stack.
About the Role
We are building the next-generation automated health-insurance claims processing platform, leveraging AI/ML, deep learning, NLP, OCR, and LLM-powered intelligence. As a Lead Data Scientist, you will drive the design, development, deployment, and optimization of AI models that power large-scale claims decisioning across multiple regions. This is a high-impact leadership role where you will work independently, set technical direction, mentor a diverse team, and ensure reliable production performance of mission-critical models.
Key Responsibilities
Model Development, Deployment & Production Support
- Design, develop, train, and validate models used in automated health-claims processing.
- Own end-to-end machine learning pipelines, including data ingestion, feature engineering, modeling, validation, deployment, and monitoring.
- Monitor model performance, drift, SLAs, and stability in real-time production environments.
- Lead root-cause analysis, bug resolution, and continuous improvement of production models.
- Build state-of-the-art models spanning classical ML, deep learning, NLP, OCR, LLMs, transformers, and generative AI.
- Implement scalable model serving and continuous training strategies using modern MLOps tools.
Build Efficient, Scalable & High-Accuracy Models
- Optimize models for accuracy, latency, memory footprint, and infrastructure cost.
- Implement model compression, distillation, and quantization when required to meet SLAs.
- Ensure solutions perform reliably across heterogeneous real-world datasets and regions.
Implement End-to-End ML Pipelines
- Architect and implement automated ML pipelines covering structured, unstructured, and document-based data.
- Build feature engineering, model training, validation, and retraining workflows.
- Implement CI/CD for ML, model versioning, and automated retraining strategies.
- Work closely with engineering teams to operationalize ML using MLOps best practices.
Expertise in a Wide Range of AI Techniques
- Hands-on experience with classical ML models including tree-based models, linear models, clustering, and anomaly detection.
- Experience with deep learning architectures such as CNNs, RNNs, and Transformers.
- Strong background in NLP and LLM-based solutions for extraction, summarization, classification, and claim interpretation.
- Experience building OCR pipelines for document parsing, form extraction, and image understanding.
- Experience applying generative AI for reasoning, rule extraction, and claim scenario understanding.
- Ability to evaluate and select the most appropriate technique for each problem.
Work Independently on High-Scale Business Use Cases
- Own ML modules deployed across multiple geographies, regulations, and insurance ecosystems.
- Ensure scalability and robustness for high-volume claims processing workloads.
- Collaborate with product, engineering, and operations teams to translate business requirements into ML solutions.
Strong Technical Acumen
- Deep understanding of data structures, machine learning algorithms, and modern AI architectures.
- Proficiency in Python, ML frameworks such as PyTorch and TensorFlow, and cloud platforms including AWS, GCP, or Azure.
- Familiarity with distributed systems, microservices, APIs, and containerized deployments.
- Ability to conduct architecture reviews and guide engineering teams on ML integration.
- Experience building scalable data pipelines and feature stores.
- Define data quality standards, metadata tracking, and experiment management practices.
- Lead by example with strong individual contributions on critical projects.
- Write high-quality, production-ready Python code using frameworks such as PyTorch, Hugging Face, LangChain, or Ollama.
- Conduct rigorous model validation, interpretability analysis, and bias detection.
Team Leadership, Mentoring & Collaboration
- Lead, mentor, and inspire data scientists, ML engineers, and analysts.
- Foster a culture of ownership, experimentation, innovation, and continuous learning.
- Collaborate cross-functionally with product, engineering, quality, and operations teams.
- Demonstrate empathy, flexibility, and leadership in fast-paced environments.
Required Qualifications
- An engineering degree is mandatory.
- 8+ years of experience in data science or machine learning, including 3–5 years in a leadership role.
- Proven experience building and deploying ML models in production at scale.
- Strong foundation in statistics, machine learning fundamentals, optimization, and deep learning.
- Expertise in NLP, transformers, LLM fine-tuning, embeddings, computer vision, OCR, time-series modeling, and predictive modeling.
- Advanced proficiency in Python, SQL, ML frameworks, and cloud platforms.
- Demonstrated success leading teams and delivering enterprise-scale AI systems.
Preferred Qualifications
- Experience in health-insurance or health-claims processing ecosystems.
- Understanding of regulatory and compliance constraints in healthcare data.
- Knowledge of healthcare data standards such as HL7, FHIR, ICD, CPT, and SNOMED.
- Experience with MLOps tools including MLflow, Kubeflow, Airflow, Docker, and CI/CD pipelines.