Job Description: Senior AI/NLP Engineer (Medical Text Analytics)
We are hiring a Senior Developer.
- Experience: 2-3 years of software engineering experience with a strong focus on Python and Data/Machine Learning pipelines.
- Immediate joiners or candidates with a notice period of up to 15 days are accepted.
About the Role
We are seeking a Senior AI/NLP Engineer to optimize and enhance our automated clinical coding pipeline. Our system processes medical documents (PDFs/Text) to extract structured clinical entities and leverages semantic search and Large Language Models (LLMs) to accurately predict CPT and ICD-10 codes.
You will be responsible for refactoring the existing codebase, eliminating technical debt, optimizing vector search performance, and fine-tuning prompt engineering for local LLM deployments.
Required Qualifications
- NLP & Deep Learning: Proven experience working with NLP libraries (spaCy, NLTK) and Transformer models (Hugging Face Transformers, including the pipeline API).
- Vector Search: Hands-on experience building and optimizing semantic search engines using FAISS and SentenceTransformers.
- Local LLMs: Familiarity with deploying, running, and prompting open-source LLMs locally (e.g., using Ollama, vLLM).
- Data Processing: Strong proficiency with pandas, complex RegEx, and text manipulation. Experience extracting data from PDFs (e.g., fitz/PyMuPDF) is highly desirable.
- Database Skills: Solid understanding of SQL and hands-on experience integrating Python with MS SQL Server.
- Windows Ecosystem: Experience deploying Python scripts using Windows Task Scheduler or as Windows Services.
- Software Craftsmanship: Strong advocate for clean code, configuration management, version control (Git), and CI/CD practices.
Key Responsibilities
- Pipeline Optimization: Optimize and enhance the existing Python-based extraction and analysis pipelines. Refactor code to improve modularity, remove redundant legacy components, and manage configuration centrally.
- NLP & Semantic Search: Maintain and optimize FAISS-powered nearest-neighbor search systems using SentenceTransformers (all-MiniLM-L6-v2).
- LLM Integration: Manage and optimize local LLM deployments (e.g., Mistral via Ollama), focusing on prompt engineering to accurately extract clinical entities and reason about medical codes.
- Text Extraction: Support and improve text extraction from medical PDFs using PyMuPDF, with spaCy for sentence segmentation and header detection.
- Technical Debt Reduction: Identify and resolve hardcoded paths, duplicated logic, and inactive code paths across all pipeline modules.
- Database Integration: Optimize interactions with MS SQL Server databases using pyodbc for fetching chart/patient metadata and logging pipeline results.
- Deployment: Package and deploy Python scripts as scheduled tasks on Windows Server environments.
Preferred Qualifications
- Experience working with medical data, unstructured clinical notes, or medical coding standards (ICD-10, CPT, HL7).
- Experience with high-performance text processing libraries such as FlashText and RapidFuzz.
For more info, call: 9500049243
Mail us at: [email protected]
Pay: ₹200,000.00 - ₹700,000.00 per year
Work Location: In person