About the Role
We are seeking a Machine Learning Engineer (MLE) to develop, deploy, and manage production-level Machine Learning and Generative AI solutions across underwriting, claims, sales, and customer engagement functions.
This role lies at the intersection of software engineering, applied machine learning, and emerging GenAI technologies, with a strong focus on delivering reliable ML features into production. You will collaborate closely with data scientists, MLOps engineers, and product teams to convert models and experimental work into scalable, monitored, and compliant services within a regulated BFSI environment.
What You’ll Do
- Design, build, and deploy machine learning and NLP models using Python and modern ML frameworks, ensuring scalability and reliability in production.
- Package models for training and inference using Docker and standardized ML templates; develop and deploy inference services using FastAPI and/or gRPC in line with platform best practices.
- Integrate ML services with downstream systems such as APIs, batch workflows, and real-time pipelines, supporting both online and offline inference scenarios.
- Develop, maintain, and automate end-to-end ML pipelines using tools like Airflow, Kubeflow Pipelines, or AWS Step Functions, covering feature engineering, model training, evaluation, and deployment.
- Work with model registries (e.g., MLflow, SageMaker Model Registry) to enable model versioning, approvals, controlled releases, and experiment tracking for governance and reproducibility.
- Contribute to GenAI and LLM-based solutions, including RAG applications using Hugging Face models and vector databases, along with supervised fine-tuning (SFT), prompt engineering, and prompt evaluation and versioning.
- Assist in building agentic AI workflows using frameworks such as LangChain, CrewAI, or similar, including tool integration, memory handling, and multi-step reasoning under senior guidance.
- Participate in A/B testing, canary releases, and shadow deployments for ML and GenAI systems, while monitoring performance metrics such as quality, latency, safety, and cost.
- Collaborate with data scientists, researchers, and platform/infrastructure teams to convert experiments into production-ready ML systems.
- Uphold high engineering standards by maintaining clear documentation, engaging in code reviews and design discussions, and continuously improving testing, CI/CD, and ML development processes.
Minimum Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related discipline, or equivalent hands-on industry experience.
- 3–5 years of professional experience as a Machine Learning Engineer or Software Engineer working on production ML systems, including delivering at least one ML or NLP model end-to-end.
- Strong expertise in Python, with experience developing and deploying services using FastAPI and/or gRPC, along with solid knowledge of data structures, algorithms, and core system concepts.
- Hands-on experience with machine learning workflows, including supervised learning or NLP models, feature engineering, training, fine-tuning, and evaluation, and with running models in production rather than only in experiments.
- Understanding of batch and real-time inference patterns and their impact on system design and performance.
- Experience with cloud platforms such as AWS, GCP, or Azure, along with practical knowledge of Docker, basic Kubernetes concepts, and CI/CD pipelines for ML or backend systems.
- Strong foundation in deep learning (CNNs, RNNs, Transformers), with hands-on experience training and fine-tuning models and a solid understanding of Transformer architectures for NLP and GenAI use cases.
- Practical experience with PyTorch and/or TensorFlow (PyTorch preferred for modern LLM applications).
- Familiarity with GPU-based training and inference, including CUDA basics and mixed precision techniques.
Preferred Qualifications
- Experience working with LLMs, RAG architectures, or conversational AI systems.
- Hands-on exposure to Supervised Fine-Tuning (SFT) or instruction-tuning approaches.
- Familiarity with agentic AI frameworks such as LangChain, CrewAI, AutoGen, or similar.
- Experience with vector databases or similarity-search libraries such as OpenSearch, Pinecone, or FAISS.
- Exposure to ML pipeline orchestration tools such as Airflow or Kubeflow.
- Understanding of model evaluation techniques, quality metrics, and data/model drift concepts.
- Experience with optimization techniques for GPU-based training and inference, such as quantization or distillation, and with deployment using ONNX or TorchScript.
Work Location: Remote