Generative AI Technical Architect

Prodapt - Chennai, Tamil Nadu

Job details

2 hours ago

Qualifications

Elasticsearch
Node.js
Law
Kubernetes
Salesforce
Master's degree
Databases
SQL
OS Kernels
Neo4j
Redis
SDKs
Metadata
AI
Graph databases
Python

Full job description

Overview:

We are looking for a hands-on **Generative AI Technical Architect** who will own the end-to-end architecture of enterprise-scale, knowledge-intensive, agentic AI systems. This is a high-impact role focused on building production-grade Retrieval-Augmented Generation (RAG), Corrective/Controllable-Augmented Generation (CAG), multi-agent frameworks, long-term memory systems, NL2SQL engines, and Small Language Model (SLM)-powered edge/agent deployments using modern ecosystems (LangChain, LlamaIndex, CrewAI, AutoGen, Haystack, DSPy, etc.).

Responsibilities:

Architect and own the enterprise GenAI platform with advanced RAG/CAG pipelines (hybrid search, re-ranking, query rewriting, hypothetical document embeddings (HyDE), parent-child retrieval, knowledge graph + vector fusion).
Design and scale multi-agent / agentic workflows (reasoning + acting, tool use, multi-agent collaboration, hierarchical agents, long-running agents with persistence).
Build production-grade long-term and short-term memory systems (vector stores with metadata filtering, session summarization, entity memory, reflection/memory consolidation).
Lead architecture of enterprise Knowledge Bases (ingestion pipelines, chunking strategies, metadata enrichment, incremental updates, multi-tenant KB isolation).
Own NL2SQL / Text-to-SQL architecture (schema linking, few-shot prompting, self-correction, execution feedback loops, SQL guardrails, multi-database support).
Design and deploy Small Language Models (SLMs) for on-device, low-latency, or cost-sensitive agent use cases (Phi-3, Gemma-2B, Mistral-7B, Llama-3.1-8B quantized, TinyLlama, MobileBERT variants).
Define the standard GenAI framework stack (LangChain / LlamaIndex / LangGraph / CrewAI / AutoGen / Microsoft Semantic Kernel / Haystack / DSPy) and create internal libraries/SDKs for the entire organization.
Build observability, tracing, and evaluation frameworks for RAG (RAGAS, TruLens, DeepEval), agents (AgentOps), and NL2SQL accuracy.
Establish governance: prompt injection defense, output sanitization, PII redaction, citation verification, hallucination detection, and enterprise guardrails.
Lead performance engineering: latency optimization (speculative decoding, caching, batching, query routing), cost optimization (SLM routing, fallback strategies), and multi-region deployment.
Drive GenAI platform roadmap, conduct architecture reviews, and mentor senior engineers building RAG/agent products.

Requirements:

1+ years building and shipping production RAG/CAG systems used by 100K+ daily active users.
Deep expertise in modern retrieval techniques: dense (bge, e5), sparse (BM25, SPLADE), hybrid, re-ranking (cross-encoders, Cohere Rerank, bge-reranker), sentence transformers, and late interaction models (ColBERT).
Proven track record designing and scaling agentic systems with tool calling, planning (ReAct, Plan-and-Execute, Reflexion), and multi-agent orchestration.
Hands-on experience with vector databases at scale (Pinecone, Weaviate, Milvus, Zilliz, Qdrant, pgvector, Redis, Vespa, Elasticsearch with vector support).
Expert in LangChain, LangGraph, LlamaIndex, CrewAI, AutoGen, and DSPy — including custom node creation, memory modules, and production deployment patterns.
Experience shipping production NL2SQL systems (accuracy >92% on Spider/BIRD benchmarks and on real enterprise schemas).
Deployed SLMs in production (quantized 4-bit/8-bit, ONNX/TensorRT-LLM export, edge deployment).
Strong skills in Python, async frameworks, FastAPI, graph databases (Neo4j, FalkorDB for knowledge graphs), and Kubernetes.

Preferred (Significant Advantage):

Previously defined the GenAI/RAG/agent stack for a unicorn or large enterprise (Jasper, Glean, Adept, Cresta, Moveworks, Salesforce Einstein, Microsoft Copilot team, etc.).
Contributions to LangChain, LlamaIndex, Haystack, or RAGAS open-source repositories.
Built enterprise knowledge bases processing 10M+ documents with sub-second retrieval latency.
Experience with controllable generation (CAG), guided generation (Outlines, Guidance, LMQL), and structured output enforcement.

If you have architected and shipped multiple enterprise RAG + Agent + NL2SQL + Memory systems that are live in production today, and you live and breathe LangChain/LlamaIndex every day — this is your role.
