AI Engineer — LLM Systems (Clinical AI)
Location: On-site — New Delhi · Full-time · Immediate joiners preferred
We're building OneRx — a clinical decision-support platform that gives doctors evidence-grounded answers, trial matches, drug & pricing data, and regulatory information, every claim backed by a citation they can trust at the point of care. It's regulated medical software, so our bar for correctness isn't "good enough for a demo" — a confidently wrong answer is a patient-safety problem, not just a quality miss.
We're looking for a hands-on AI Engineer to own LLM systems end-to-end: agentic pipelines, the Python backend, retrieval and grounding, and, critically, the prompt and eval layer where clinical reliability is won or lost. This is a build-from-scratch, high-ownership role on a lean founding team, working directly with the founder on architecture and product.
What you'll do
- Design multi-agent / multi-step pipelines in LangGraph
- Write production Python — FastAPI services, REST APIs, background workers, streaming (SSE) paths
- Build and harden RAG over messy real-world clinical sources
- Own prompt engineering as a core discipline: system prompts, few-shot examples, chain-of-thought (CoT), and structured JSON/XML outputs, all designed, versioned, and tested
- Build the eval harness that keeps the system honest: ground-truth sets, regression suites, shadow-mode runs before enforcement, and accuracy scorecards
- Drive down hallucination so every clinical claim is grounded — including knowing when to refuse or block an answer rather than fill a gap
- Route requests across LLM providers to get the right cost/latency/quality trade-off for each task
- Engineer for determinism and cost — correct cache keys, token budgets, latency targets
- Containerize and ship with Docker on Azure
Must-haves
- Strong prompt engineering — hands-on command of system-prompt design, few-shot, CoT, structured outputs, prompt versioning, and an eval-driven workflow (promptfoo, LangSmith, or your own). This is the core of the role, not a bonus.
- Strong, production-grade Python — clean services, not notebooks
- LangGraph / LangChain — you've actually built agentic, multi-node pipelines
- RAG in practice — vector DBs (Pinecone / FAISS / Qdrant), retrieval, grounding, and getting citations right on imperfect sources
- FastAPI + REST API development
- LLM APIs — Anthropic, OpenAI / Azure OpenAI, Hugging Face
- Docker and containerized deployment
Nice to have
- vLLM or other serving frameworks; streaming (SSE/WebSockets)
- Postgres / MongoDB / Redis in production
- MLOps / observability tooling
- Experience in a domain where being wrong has real consequences and outputs must be auditable (healthcare, fintech, legal)
You'll fit if you
- Have 2–4 years building real AI/ML products (an exceptional 1-year portfolio counts)
- Can show shipped LLM systems (agents, RAG, or assistants) that real users actually used, ideally on GitHub
- Can show prompt craft: prompts, eval suites, or measurable quality gains
- Design before you code, and care that the thing is correct, not just that it runs
- Move fast from idea to working system and own ambiguity on a small team
Why join
- Build a high-impact clinical AI product from the ground up, where quality genuinely matters
- Direct line to the founder on architecture and product strategy
- Lean, execution-focused team; high ownership, fast iteration
How to apply
Email resume + GitHub/portfolio to [email protected]
Subject: AI Engineer Application – [Your Name]
Job Types: Full-time, Permanent
Pay: ₹30,000.00 - ₹40,000.00 per month
Benefits:
- Commuter assistance
- Food provided
- Health insurance
- Internet reimbursement
Application Question(s):
- Minimum 2 years as an AI developer
Experience:
- AI: 2 years (Required)
- Machine learning: 1 year (Required)
Work Location: In person