Job Description
We’re hiring a Senior AI/ML Engineer with 5-7 years of experience to lead the design, optimization, and deployment of advanced AI systems. This role goes beyond integration: you’ll architect, fine-tune, and scale LLMs, vision models, and speech models while guiding junior engineers and influencing the AI roadmap at GeekyAnts. You’ll work across core ML/DL, RAG systems, AI in Robotics/IoT, and inference optimization, ensuring production-grade reliability, explainability, and innovation.
Key Responsibilities
Architecture & System Design
- Architect and deploy end-to-end AI systems — from data pipelines to model serving.
- Design modular SDKs for multi-provider AI integration (OpenAI, Claude, Gemini, LLaMA).
- Lead decision-making on cloud vs self-hosted LLM deployment (Ollama, vLLM, TGI).
- Guide infrastructure design for scalability, observability, and cost efficiency using GPU clusters, Ray, or KServe.
- Collaborate with backend, MLOps, and infra teams to ensure high availability and low latency across AI workloads.
Core ML / DL Development
- Train and fine-tune models (CNN, RNN, Transformers) across text, vision, and speech domains.
- Implement LoRA / PEFT fine-tuning for custom LLMs, embedding models, and instruction-tuned variants.
- Work with open-source and proprietary model repositories (Hugging Face Hub, Hugging Face Spaces, Kaggle).
- Optimize model architectures for inference performance, quantization, and memory efficiency.
- Conduct A/B testing, cross-validation, and human evaluation on model outputs.
- Build internal evaluation benchmarks and dataset management pipelines for consistent model scoring and comparison.
Data & Dataset Engineering
- Curate, clean, and version-control datasets for text, image, and audio modalities.
- Build pipelines for data labelling, augmentation, and validation using Airflow / Prefect.
- Create and manage feature stores, embedding repositories, and dataset registries.
- Leverage open datasets (e.g., Common Crawl, LAION, OpenImages, LibriSpeech) and integrate custom enterprise datasets.
- Ensure data governance, bias checks, and PII anonymization using Presidio or custom filters.
AI Ops & Deployment
- Automate model workflows with MLflow, Kubeflow, or Vertex AI for experiment tracking and versioning.
- Lead model deployment with vLLM, TGI, or TorchServe, ensuring optimized GPU/TPU utilization.
- Set up continuous evaluation pipelines for model drift, bias, and quality decay using EvidentlyAI and Prometheus.
- Drive adoption of model registries and model cards for transparency and reproducibility.
Team & Technical Leadership
- Mentor and review the work of AI/ML Engineers I & II.
- Collaborate with product, design, and research teams to translate business needs into AI roadmaps.
- Lead POCs and experiments for emerging AI verticals (e.g., multimodal, video, robotics, IoT intelligence).
- Present internal demos, AI reports, and architectural documentation to leadership and clients.
Core Skills Required
- Programming: Expert-level Python, with a deep understanding of OOP, async, and design patterns.
- Frameworks: PyTorch, TensorFlow, Hugging Face Transformers, LangChain, LlamaIndex.
- Model Ops: MLflow, KServe, TorchServe, vLLM, TGI.
- Data Stack: Airflow / Prefect, pgvector, Milvus, Pinecone, FAISS, PostgreSQL.
- Infra: Docker, Kubernetes, Ray, GPU servers, Cloud AI (Vertex AI, Bedrock, Azure).
- Evaluation & Metrics: Familiarity with BLEU, ROUGE, and latency/throughput metrics for AI models.
- Security: Secure Vaults, Microsoft Presidio, Fairlearn / AIF360 awareness for data and bias governance.
Good-to-Have Skills
- Experience with distributed training, quantization, and mixed-precision optimization.
- Experience with model compression, distillation, or low-rank adaptation for efficiency.
- Contribution to open-source AI frameworks or Hugging Face Spaces.
- Research exposure in LLM alignment, prompt optimization, or multimodal reasoning.
- Understanding of AI cost governance, observability, and MLOps automation.
Soft Skills
- Leadership and mentorship mindset with strong communication skills.
- Strategic thinker with the ability to drive architectural decisions.
- Ownership-driven approach to solving complex AI problems.
- Strong documentation and cross-team collaboration habits.
What You’ll Build
- Enterprise-scale RAG and Agentic Systems across domains and modalities.
- Self-hosted AI stack for multi-modal intelligence (text, image, voice).
- Reusable AI SDKs, dataset registries, and model inference frameworks powering the GeekyAnts AI ecosystem.
- Open-source contributions and internal model spaces that expand GeekyAnts’ AI footprint.
Pay: ₹1,600,000.00 - ₹1,931,110.83 per year
Benefits:
- Cell phone reimbursement
- Flexible schedule
- Health insurance
- Paid sick time
- Provident Fund
Work Location: In person