Role Purpose
As an AI Engineer at Prevalent AI, you will independently design, build, optimize, and deploy production-grade Generative AI systems across our Exposure Management and Data Fabric platforms. You will own end-to-end AI components such as RAG pipelines, multi-agent workflows, LLM-backed APIs, guardrail-enforced inference flows, and cloud-native AI integrations, while collaborating closely with platform, backend, and product teams.
This role suits engineers with hands-on experience building and operating GenAI systems in production who can take ownership of the design decisions, performance tuning, and reliability of AI-driven features.
Key Accountabilities
- Design, build, and own production-ready GenAI systems, including RAG pipelines, embedding workflows, vector search architectures, tool-using agents, and LLM-integrated microservices.
- Create and manage MCP servers and associated tools, integrating and orchestrating them via AI agents.
- Develop and maintain FastAPI-based AI services integrated with LLMs, vector databases, cloud inference endpoints, and orchestration layers.
- Architect and implement agentic AI pipelines using frameworks such as LangChain, LangGraph, ADK, CrewAI, or other relevant agent-based frameworks for multi-step reasoning, tool orchestration, autonomous agents, and structured LLM workflows.
- Integrate and operate cloud-based AI services using Google ADK (Gemini / Vertex AI), AWS Bedrock, or Azure OpenAI, including model selection, endpoint configuration, and cost-aware inference.
- Apply advanced prompt engineering strategies (structured prompting, ReAct, CoT, few-shot, tool-calling) and systematically reduce hallucinations and failure modes.
- Implement and contribute to LLM fine-tuning workflows (LoRA, QLoRA, PEFT), including dataset preparation, training, evaluation, and deployment considerations.
- Design and enforce AI guardrails using frameworks such as NeMo Guardrails or Guardrails AI to ensure policy-compliant, safe, and explainable outputs.
- Lead model evaluation and optimization, focusing on latency, accuracy, robustness, hallucination mitigation, and cost efficiency.
- Own testing and deployment of AI services, including unit tests, integration tests, CI/CD pipelines, and environment-specific configurations (cloud/on-prem).
- Produce and maintain high-quality technical documentation covering prompts, workflows, vector schemas, architectural decisions, and API contracts.
- Collaborate with cross-functional teams to translate product requirements into scalable, reliable AI solutions and mentor junior engineers when needed.
Skills & Experience
Must have skills:
- Strong hands-on experience with LangChain and LangGraph for building and operating complex LLM workflows and agentic systems.
- Proven experience designing and deploying Retrieval-Augmented Generation (RAG) pipelines using embedding models and vector databases such as FAISS, Pinecone, Chroma, or equivalent.
- Solid backend engineering experience with FastAPI, including async APIs, dependency injection, authentication, and service observability.
- Practical experience with LLM fine-tuning approaches (LoRA, QLoRA, PEFT) and an understanding of when to fine-tune vs. prompt vs. retrieve.
- Advanced understanding of prompt engineering, including CoT, ReAct, tool calling, schema-based prompting, and prompt versioning strategies.
- Experience implementing AI safety and guardrails, including output validation, policy enforcement, and prompt injection mitigation.
- Hands-on exposure to cloud AI platforms such as Google ADK / Vertex AI, AWS Bedrock, or Azure OpenAI in production environments.
- Strong Python skills with experience using Transformers, Hugging Face, embedding models, and inference optimization techniques.
Good to have skills:
- Exposure to FastMCP or similar frameworks.
- Good understanding of LLM evaluation metrics, hallucination control strategies, and real-world failure patterns.