About the role
We are looking for a mid-level Generative AI Engineer to design, build, and deploy production-grade AI systems — spanning retrieval-augmented generation pipelines, multi-agent orchestration, structured document extraction, and conversational AI.
You will work closely with stakeholders and clients to translate business requirements into robust, observable AI solutions on Azure and AWS cloud platforms.
Key responsibilities
Design and implement end-to-end RAG pipelines — chunking strategies, embedding models, hybrid retrieval, and re-ranking — optimised for latency and accuracy.
Build multi-agent systems using orchestration frameworks (LangChain, LlamaIndex, Semantic Kernel), including tool use, memory management, and agent-to-agent communication.
Develop LLM-based structured extraction workflows from unstructured documents (PDFs, forms, contracts) using prompt engineering and fine-tuning where applicable.
Implement compliance guardrails, content filtering, and output validation layers to meet enterprise safety and regulatory requirements.
Integrate conversational AI interfaces and chatbot backends via FastAPI-based REST services with robust session and context management.
Explore and prototype multimodal AI capabilities (vision + text) for document understanding and content generation use cases.
Instrument AI systems with evaluation and observability tooling (LangSmith, Ragas, custom tracing) and iterate on quality metrics.
Build and maintain MLOps pipelines — model versioning, CI/CD for AI workloads, deployment on Docker/Kubernetes.
Engage directly with client stakeholders to gather requirements, present solution designs, and communicate trade-offs clearly.
Mentor junior engineers through code reviews and design discussions.
Required skills
RAG Systems: hybrid retrieval, re-ranking, chunking strategies
Agentic / Multi-Agent AI: LangChain, LlamaIndex, Semantic Kernel
LLM Fine-tuning: LoRA, PEFT, prompt engineering
Vector Databases: OpenSearch, Pinecone
ML Frameworks: PyTorch, TensorFlow, Scikit-learn
Python: primary engineering language
FastAPI / REST API development
Azure AI Stack: Azure OpenAI, AI Search, Databricks
AWS AI Stack: Bedrock, SageMaker, OpenSearch
MLOps / CI-CD for AI workloads
Docker / Kubernetes: containerised AI deployments
AI Evaluation & Observability: LangSmith, Ragas, tracing
Specialisation areas
Structured data extraction: documents, forms, contracts
Compliance & guardrails: content filtering, output validation
Conversational AI: chatbots, session and context management
Knowledge Graph integration: entity linking, graph-augmented retrieval
Multimodal AI: vision + text, document understanding
Qualifications
3–5 years of hands-on software engineering experience, with at least 2 years focused on AI/ML or GenAI system development.
Demonstrable experience shipping at least one production RAG or agentic AI system.
Strong Python engineering skills: well-structured, testable, production-ready code.
Solid understanding of transformer architectures, embedding models, and LLM inference.
Experience working with Azure or AWS AI services in production environments.
Ability to communicate technical decisions clearly to non-technical stakeholders.
Bachelor's or Master's degree in Computer Science or Engineering, or equivalent practical experience.
Nice to have
Experience with knowledge graph platforms (Neo4j, AWS Neptune) and graph-augmented RAG patterns.
Exposure to multimodal models (GPT-4o, Claude 3, Gemini) for vision + text pipelines.
Familiarity with data engineering patterns: Delta Lake, Spark, event-driven ingestion.
Contributions to open-source AI/ML projects or published technical writing.
Work Location: In person