Role Overview
We are looking for an AI / Prompt Engineer to design, refine, and operationalise high-impact AI interactions across our learning, assessment, and competency platforms. This role will be central to developing structured prompts, fine-tuning model behaviour, and creating domain-aware AI workflows that support examination preparation, scenario-based training, automated feedback, and knowledge generation for maritime professionals.
Working closely with the Co-Founders and the Senior Full-Stack Developer, you will design, test, and continuously improve the prompt systems and AI pipelines that power our automated assessment evaluation, question generation, and English coaching products.
This is not a role for someone who treats prompt engineering as an afterthought or a workaround. It is a technical discipline in its own right, and we are looking for someone who approaches it with rigour, curiosity, and a systematic mindset. If you enjoy working at the frontier of applied AI and care deeply about the difference between an output that is merely plausible and one that is demonstrably correct, this role is for you.
Key Responsibilities
- Design, iterate, and maintain structured prompt systems that instruct AI models to evaluate maritime exam answers against STCW competency standards with consistency and accuracy.
- Build and version-control prompt libraries covering question generation, answer evaluation, feedback delivery, and English coaching dialogue flows.
- Develop evaluation frameworks to measure AI output quality across multiple dimensions — including factual accuracy, appropriate tone, domain relevance, and consistency across repeated runs.
- Work directly with the Co-Founders to extract and encode maritime domain knowledge into reusable, structured prompt components.
- Continuously research emerging AI model capabilities and test new approaches that could improve product performance or expand its scope.
- Identify, document, and systematically mitigate failure modes — including hallucination, misclassification, and prompt sensitivity issues — and maintain clear records of what was tried and why.
- Collaborate with the Full-Stack Developer to ensure AI pipelines integrate cleanly and efficiently into the product architecture.
- Maintain thorough documentation of all prompts, versions, test results, and rationale in shared team repositories.
Essential Requirements
- Bachelor's degree in Computer Science, Artificial Intelligence, Data Science, Linguistics, or a closely related field
- Minimum 2 years of hands-on commercial experience working with large language model APIs, such as OpenAI GPT, Anthropic Claude, or comparable models
- Demonstrated ability to design and refine structured, effective prompts for complex evaluation and generation tasks — not limited to simple chatbot or single-turn interactions
- Experience designing and running systematic evaluation protocols to assess AI output quality across multiple dimensions
- Proficiency in Python for scripting, data handling, and API integration
- Proven ability to work with non-technical subject matter experts and translate domain knowledge into precise prompt structures
- Strong analytical mindset — comfortable diagnosing AI failures critically and iterating systematically on the basis of evidence
- Strong written English for prompt authoring and technical documentation
Desirable / Advantageous
- Familiarity with retrieval-augmented generation (RAG) architectures or vector database platforms such as Pinecone or Weaviate
- Experience in educational assessment, competency evaluation, or regulated training contexts
- Knowledge of fine-tuning techniques or model evaluation methodologies
- Exposure to industries where AI output accuracy has genuine operational or safety consequences
- Contributions to AI or NLP communities — published research, open source projects, or an active presence on Hugging Face or similar platforms
- Certification or coursework in AI/ML from recognised providers such as DeepLearning.AI or Coursera
- Awareness of AI safety principles and responsible deployment practices in production systems
Job Type: Full-time
Pay: ₹300,000.00 - ₹600,000.00 per year
Experience:
- APIs: 2 years (Required)
- Python: 2 years (Required)
Work Location: In person