AI Engineer — Voice & Language
4+ Years · Full-Time · Jaipur
About the Role
We're hiring a senior AI Engineer to design, build, and ship production AI systems, with a strong emphasis on Voice AI. You'll own the full lifecycle: architecture, training, deployment, and monitoring across language and voice modalities.
What You'll Do
- LLM & GenAI: Fine-tune and deploy LLMs; build RAG pipelines and agentic workflows (LangChain, LlamaIndex).
- Voice Pipelines: Architect real-time ASR → LLM → TTS pipelines with <300 ms latency.
- Voice Agents: Build production voice agents with turn-taking, barge-in handling, and emotion-aware dialogue.
- Speech Fine-Tuning: Adapt ASR/TTS models for domain-specific accents, terminology, and speaking styles.
- MLOps: Build reproducible ML pipelines (Kubeflow / MLflow); maintain CI/CD, monitoring, and model versioning.
- Inference Optimization: Apply quantization (GGUF, GPTQ), distillation, and hardware-aware inference (TensorRT, vLLM) to cut cost and latency.
- APIs & Services: Ship high-performance inference APIs in Python (FastAPI) or Go on Kubernetes.
- Data & Evaluation: Curate text + speech corpora; define eval harnesses covering WER, MOS, latency P95, and safety.
Requirements
- 4+ yrs ML/software engineering; 2+ yrs on production AI systems
- Strong Python; PyTorch or TensorFlow
- LLM fine-tuning: LoRA / QLoRA / PEFT
- End-to-end ML pipeline experience (train → serve)
- Cloud (AWS / GCP / Azure) + Docker / Kubernetes
- ASR & TTS integration in real-time streaming systems
- VAD, noise suppression, and barge-in handling
- Telephony APIs (Twilio, Vonage) or WebRTC experience
- Whisper / wav2vec fine-tuning for domain adaptation
- Audio-language models (AudioPaLM, Qwen-Audio, Gemini Audio)
- Speaker diarization (pyannote.audio) or voice biometrics
- Prosody control, SSML, expressive TTS synthesis
- Multilingual ASR/TTS and code-switching pipelines
- RLHF / Constitutional AI alignment
- Vector DBs (Pinecone, Weaviate, pgvector)
- Open-source contributions or published research
Tech Stack
- Core: Python, PyTorch, FastAPI / Go, Kubernetes, MLflow
- LLM & GenAI: OpenAI / HuggingFace, LangChain, LlamaIndex, vLLM, RAG / Agents
- Voice AI: STT, TTS, WebRTC / WebSockets, pyannote.audio, Twilio / Vonage
- Audio Processing: librosa / FFmpeg, Silero VAD, openWakeWord, SSML / Prosody, AEC / Noise Suppression