AI/ML Subject Matter Expert – Data Analytics & GenAI
Role Overview
We are seeking a highly experienced AI/ML Subject Matter Expert with strong Data Analytics and GenAI implementation expertise. The role requires a hands-on individual contributor who can design, build, and operationalize scalable analytics solutions, KPI computation frameworks, data validation pipelines, machine learning models, and governed GenAI capabilities over enterprise data.
The ideal candidate will bring deep expertise in Python, SQL, analytics engineering, machine learning, Retrieval-Augmented Generation (RAG), and Agentic AI frameworks. They should be comfortable working with complex business metrics, enterprise data sources, and open-source LLM technologies to deliver reliable, explainable, and production-ready analytical solutions.
Key Responsibilities
Analytics Engineering & KPI Development
The SME will design, develop, and maintain robust analytics code using Python, pandas, and NumPy to compute, validate, reconcile, and operationalize key business metrics including costing, margins, QBR metrics, and operational performance indicators.
Responsibilities will include building efficient data transformations, optimizing performance through vectorization and memory management, and implementing repeatable data pipelines with appropriate testing, logging, validation, and reconciliation controls.
SQL & Data Extraction
The SME will develop advanced SQL queries to extract, transform, and shape data from enterprise systems and cloud data warehouse platforms. This includes complex joins, aggregations, window functions, query optimization, and support for governed metric definitions.
Generative AI / Ask-the-Data Prototype
The SME will implement a governed GenAI prototype that enables users to ask questions over structured and semi-structured enterprise data. This includes:
- Using Llama-family or comparable open-source models through Ollama, llama.cpp, vLLM, or similar inference frameworks.
- Building Retrieval-Augmented Generation and Agentic AI pipelines across structured and semi-structured data.
- Designing chunking, embedding, retrieval, and reranking approaches.
- Implementing agentic workflows using frameworks such as CrewAI, LangGraph, AutoGen, LlamaIndex Agents, or equivalent production-ready agent orchestration frameworks.
- Producing structured responses such as tables, JSON, and drill-down-ready answers.
- Implementing guardrails for grounded responses, citations, traceability to source data, and safe handling of sensitive fields.
- Supporting evaluation of GenAI and agentic outputs for accuracy, groundedness, reliability, latency, and operational usability.
Machine Learning & Advanced Analytics
The SME will apply light-to-moderate machine learning techniques where appropriate, including anomaly detection, outlier identification, cost variance analysis, feed failure detection, simple forecasting, trend analysis, model evaluation, and error analysis.
Experimentation, Evaluation & Deployment
The SME will create reproducible experimentation workflows, including test question sets for LLM evaluation, accuracy and groundedness checks, latency profiling, and performance tuning.
The role will also involve packaging deliverables for deployment using Docker and configuration management, and producing clear technical documentation, runbooks, and handover materials.
Required Skills & Experience
At least 4 years of hands-on experience in data science, analytics engineering, machine learning engineering, or a closely related individual contributor role.
1. Expert-level Python skills, particularly with pandas and NumPy, including:
- data cleaning and transformation
- joins, merges, aggregations, and windowed calculations
- time-series data handling
- performance optimization, profiling, and memory management
2. Strong SQL expertise, including:
- complex joins
- aggregates
- window functions
- query tuning and optimization
3. Solid understanding of statistics and machine learning fundamentals, including:
- feature engineering
- model evaluation metrics
- overfitting and validation concepts
- scikit-learn or equivalent ML libraries
4. Practical GenAI implementation experience, including:
- Llama models or comparable open-source LLMs
- Ollama or similar local inference tools
- RAG and Agentic AI frameworks such as LangChain, LlamaIndex, LangGraph, CrewAI, AutoGen, or equivalent
- embeddings and vector stores such as FAISS, pgvector, Weaviate, or Pinecone
5. Strong engineering discipline, including:
- unit testing and data testing
- logging and error handling
- Git-based development workflows
- CI basics
- Docker and environment management
Preferred Qualifications
- Experience with Snowflake or comparable modern cloud data platforms.
- dbt experience, including modeling, testing, and documentation.
- Experience working with enterprise semantic layers or governed metric definitions.
- Experience building lightweight APIs using FastAPI or similar frameworks.
- Familiarity with enterprise security concepts such as RBAC, data masking, sensitive data handling, and audit logging.
Typical Technology Stack
Python, pandas, NumPy, SQL, scikit-learn, Jupyter, Git, Docker, FastAPI, LangChain, LlamaIndex, Ollama, Llama-family models, FAISS, pgvector, Weaviate, Pinecone, Snowflake or equivalent cloud data warehouse.
This role is best suited for a senior practitioner who can operate as a technical subject matter expert, collaborate with business and technology stakeholders, and independently drive solution development from prototype through deployment-ready deliverables.