Job Title: AI/ML Architect
Location: Mumbai (with potential travel to multiple global office locations including the USA, UAE, Saudi Arabia, and Singapore)
Experience Level: 10–12 years (6–8 years in engineering/data science roles and at least 2–3 years in a dedicated AI/ML architecture role)
About the Role
We are seeking a highly experienced Enterprise Data and AI Architect to design, implement, and scale enterprise-grade AI solutions across industries such as BFSI, Retail, Manufacturing, and Healthcare. This is a strategic, hands-on leadership role designed to transform legacy data silos into modern Data Mesh or Data Lakehouse architectures. The ideal candidate will bridge the gap between "Proof of Concept" (PoC) and full-scale enterprise production by deploying cutting-edge Machine Learning, Generative AI, and Agentic AI systems. You will collaborate closely with product teams, pre-sales, and executive stakeholders to translate complex business problems into secure, scalable, and autonomous AI solutions.
Key Responsibilities
AI Strategy & Solution Architecture
- Define and evolve the enterprise AI architecture to align with business, data, and technology strategies.
- Translate complex business challenges into scalable, AI-driven technical architectures.
- Design end-to-end AI and ML architectures spanning data ingestion, pre-processing, model training, deployment, and monitoring.
- Architect scalable, secure, and compliant AI deployment models (Cloud and Hybrid) and intelligent automation solutions.
- Develop reference architectures and reusable patterns for Generative AI, Agentic AI, predictive models, conversational systems, and intelligent automation.
- Evaluate and select the appropriate AI technologies, frameworks, cloud services, LLM orchestration frameworks, vector databases, and tooling.
Generative AI, LLM & Agentic AI Implementation
- Architect and build GenAI solutions and LLM-powered applications using RAG, fine-tuning, and prompt engineering.
- Design and implement Agentic AI systems capable of autonomous decision-making, multi-step task execution, and function-calling in production.
- Establish custom orchestration frameworks for multi-agent collaboration, memory management, planning, evaluation harnesses, and tool invocation.
- Work across major AI foundries and APIs, including Azure OpenAI, AWS Bedrock, Google Vertex AI, Anthropic Claude SDK, and OpenAI APIs.
- Design RAG pipelines using vector databases and meticulously tune embeddings to ensure high retrieval quality.
- Optimize prompts, implement response validation mechanisms, and systematically reduce hallucination risks.
- Write clean, maintainable code, applying Agile methodologies, Git version control, and sound testing practices.
Machine Learning Model Development
- Design, develop, deploy, and optimize advanced ML algorithms, leveraging techniques in supervised/unsupervised learning, deep learning, reinforcement learning, NLP, and computer vision.
- Select appropriate mathematical and statistical modeling techniques, handling comprehensive data preparation, wrangling, feature engineering, and dimensionality reduction.
- Train and deploy ML models utilizing distributed computing frameworks and tools like PyTorch, TensorFlow, MLflow, SageMaker, Azure ML, or Kubeflow.
Data Engineering, Cloud Governance & Integration
- Overhaul legacy data silos into modern Data Lakehouse or Data Mesh architectures to support predictive analytics and real-time business intelligence.
- Partner with data architects to define robust data pipelines, data governance, feature stores, and complex ETL/ELT processes.
- Design secure and performant integration patterns to connect AI agents and models dynamically with ERP, CRM, workflow engines, APIs, and microservices.
MLOps, LLMOps & Production Operations
- Define and implement comprehensive MLOps/LLMOps standards, including CI/CD pipelines, model versioning, lifecycle observability, and rollback processes.
- Establish rigorous monitoring frameworks spanning model drift detection, performance tracking, telemetry, and KPIs for production AI systems.
- Monitor autonomous agent performance, define evaluation metrics (task success rate, reasoning validation, error handling), and track decision audit trails.
- Drive continuous improvement of model performance, computational efficiency, and operational cost.
AI Governance, Security & Ethics
- Establish enterprise AI governance standards, ensuring compliance with data privacy policies (PII handling), role-based access, and IT security protocols (e.g., AWS IAM, GuardDuty).
- Implement "Responsible AI" guardrails tailored for regulated industries, including fairness checks, bias detection, and safety filters.
- Design human-in-the-loop mechanisms for agent oversight, define autonomy thresholds, and establish escalation protocols.
- Ensure complete traceability, automation audit trails, and explainability in AI-driven decisions.
Technical Leadership, Collaboration & Research
- Partner with business, product, engineering, and compliance teams to translate complex business problems into viable AI/ML approaches.
- Create and deliver high-level technical presentations to executive stakeholders using data visualization tools to communicate complex strategies and business impact.
- Provide technical guidance, architectural oversight, and mentorship to junior engineers.
Required Skills & Qualifications
Technical Core & Programming:
- Mandatory: Strong programming proficiency in Python and libraries such as Pandas, NumPy, and Scikit-learn.
- Solid mathematical and statistical foundation covering linear algebra, calculus, probability theory, and statistics.
- Databases: SQL for relational databases and NoSQL for unstructured data handling.
- Experience developing scalable microservices using RESTful APIs, FastAPI, and gRPC.
Machine Learning & Deep Learning:
- Proficiency with ML frameworks such as TensorFlow, PyTorch, and Scikit-learn.
- Hands-on experience implementing algorithms like decision trees, random forests, SVMs, and neural networks.
- Expertise in deep learning architectures (CNN, RNN, GAN).
- Expertise in Natural Language Processing (NLP) and recommendation systems.
- Hands-on experience with multimodal AI implementations, such as computer vision (OpenCV) and speech-to-text.
Generative AI, LLMs & Agentic Orchestration (Must Have):
- Extensive experience integrating with foundation LLMs (such as GPT, Claude, Gemini, LLaMA, Mistral, and DeepSeek).
- Proven ability in model fine-tuning and customization.
- Experience with LLM frameworks like LangChain and LlamaIndex.
- Proven expertise in designing multi-agent orchestration using LangGraph, CrewAI, AutoGen, or equivalent tools.
- Hands-on experience with AI foundries and Copilots: OpenAI, Azure AI Foundry, Microsoft Copilot Studio Agent Builder, AWS Bedrock, Vertex AI, Anthropic Claude, and low-code/no-code platforms.
Data Platforms, Vector Databases & Infrastructure:
- Mandatory: Data warehousing.
- Deep expertise in the AWS Ecosystem (S3, SageMaker, Redshift, Glue, Bedrock, EKS) alongside Azure or GCP exposure.
- Experience with large-scale data processing tools like Snowflake, Databricks, or Apache Spark.
- Implementation of RAG using vector databases (Pinecone, FAISS, Chroma, Azure AI Search, etc.).
- Moderate-level background in Infrastructure as Code (IaC) via Terraform or AWS CloudFormation.
- Proficiency in architectural patterns (microservices, event-driven, API-first design) and database knowledge (SQL/NoSQL).
MLOps & Observability Tools:
- Experience across the ML/LLM lifecycle using MLflow, Kubernetes, and evaluation/observability tooling.
- Proficiency in data visualization tools to analyze and present model insights.
Experience & Educational Requirements
Education:
- Bachelor’s or Master’s degree in Computer Science, AI, Data Science, Engineering, or a related technical field.
Minimum Experience Requirements:
- 10–12 years of overall professional experience.
- Minimum 4 years of experience in application development, engineering, or solution delivery roles.
- Minimum 2 years in dedicated AI architecture roles.
- Minimum 3 years of hands-on experience in AI/ML engineering, data science, or AI solution architecture.
- Minimum 2 years designing and implementing scaled Agentic AI and Generative AI solutions/platforms in active operations.
Preferred & "Good to Have" Qualifications:
- Certifications: AWS/Microsoft Solution/AI Architect, Agentic AI, RAG.
- Proven track record managing petabyte-scale data and enterprise-wide transformation or platform buildout programs.
- Strong understanding of AI compliance frameworks, enterprise data governance, privacy, security, and model risk management.
- Direct domain experience deploying AI in the BFSI, Manufacturing, Healthcare, or Retail sectors.
Pay: ₹3,000,000.00 - ₹3,500,000.00 per year
Work Location: In person