We are seeking a highly skilled Senior AI Engineer with deep expertise in agentic frameworks, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) systems, MLOps/LLMOps, and end-to-end GenAI application development. In this role, you will design, develop, fine-tune, deploy, and optimize state-of-the-art AI solutions across diverse enterprise use cases, including AI copilots, summarization, enterprise search, and intelligent tool orchestration.
Develop and Fine-Tune LLMs: Adapt models (e.g., GPT-4, Claude, LLaMA, Mistral, Gemini) using instruction tuning, prompt engineering, and chain-of-thought prompting.
Build RAG Pipelines: Implement Retrieval-Augmented Generation solutions leveraging embeddings, chunking strategies, and vector databases like FAISS, Pinecone, Weaviate, and Qdrant.
Implement and Orchestrate Agents: Utilize protocols and frameworks like the Model Context Protocol (MCP), OpenAI Agents SDK, LangChain, LlamaIndex, Haystack, and DSPy to build dynamic multi-agent systems and serverless GenAI applications.
Deploy Models at Scale: Manage model deployment using Hugging Face, Azure Web Apps, vLLM, and Ollama, including handling local models with GGUF, LoRA/QLoRA, PEFT, and quantization methods.
Integrate APIs: Seamlessly integrate with APIs from OpenAI, Anthropic, Cohere, Azure, and other GenAI providers.
Ensure Security and Compliance: Implement guardrails, perform PII redaction, ensure secure deployments, and monitor model performance using advanced observability tools.
Optimize and Monitor: Lead LLMOps practices focusing on performance monitoring, cost optimization, and model evaluation.
Work with AWS Services: Use Amazon Bedrock, SageMaker, S3, Lambda, API Gateway, IAM, CloudWatch, and serverless computing hands-on to deploy and manage scalable AI solutions.
Contribute to Use Cases: Develop AI-driven solutions like AI copilots, enterprise search engines, summarizers, and intelligent function-calling systems.
Cross-functional Collaboration: Work closely with product, data, and DevOps teams to deliver scalable and secure AI products.
3-5 years of experience in AI/ML roles, focusing on LLM agent development, data science workflows, and system deployment.
Demonstrated experience in designing domain-specific AI systems and integrating structured/unstructured data into AI models.
Proficiency in designing scalable solutions using LangChain and vector databases.
Deep knowledge of LLMs and foundation models (GPT-4, Claude, Mistral, LLaMA, Gemini).
Strong expertise in Prompt Engineering, Chain-of-Thought reasoning, and Fine-Tuning methods.
Proven experience building RAG pipelines and working with modern vector stores (FAISS, Pinecone, Weaviate, Qdrant).
Hands-on proficiency in LangChain, LlamaIndex, Haystack, and DSPy frameworks.
Model deployment skills using Hugging Face, vLLM, and Ollama, including experience with LoRA/QLoRA, PEFT, and GGUF models.
Practical experience with AWS serverless services: Lambda, S3, API Gateway, IAM, CloudWatch.
Strong coding ability in Python or similar programming languages.
Experience with MLOps/LLMOps for monitoring, evaluation, and cost management.
Familiarity with security standards: guardrails, PII protection, secure API interactions.
Use Case Delivery Experience: Proven record of delivering AI Copilots, Summarization engines, or Enterprise GenAI applications.