Role: Senior AI Engineer – Agentic Voice AI
Place of Posting: Gurgaon (Work from Office)
Years of Experience: 3-7 years
No. of Positions: 2
Job Summary
We are seeking a highly skilled AI Engineer with strong expertise in Agentic Voice AI systems to design, build, and deploy real-time conversational voice agents. The ideal candidate will have hands-on experience with Pipecat, Vapi, speech-to-text (STT), text-to-speech (TTS), LLM orchestration, and API-driven system integrations. The role carries end-to-end technical ownership, from architecture through production deployment.
Key Responsibilities
- Design and develop real-time, low-latency Agentic Voice AI systems for inbound and outbound calling use cases
- Build conversational pipelines using Pipecat (audio streaming, orchestration, event handling)
- Implement and manage voice agents using Vapi (call flows, tool calling, webhook handling)
- Develop agent logic with LLMs (OpenAI / Anthropic / Azure OpenAI / Gemini)
- Implement multi-step agent workflows (intent detection, tool execution, memory, and decision-making)
- Integrate STT services (Deepgram, Whisper, Google Speech, Azure Speech)
- Integrate TTS services (ElevenLabs, Azure TTS, Google TTS)
- Optimize audio streaming, turn-taking, barge-in handling, and silence detection
- Build and maintain REST and WebSocket APIs for real-time communication
- Integrate with third-party systems (CRM, ticketing systems, databases, internal tools)
- Implement secure authentication (OAuth, API keys, JWT)
- Handle webhook-based event systems and async workflows
- Deploy AI services using FastAPI / Flask
- Containerize applications using Docker
- Work with cloud platforms (AWS / GCP / Azure)
- Monitor performance, latency, and reliability of voice systems
- Implement logging, tracing, and error handling for production systems
- Analyze call logs and conversation transcripts to improve agent performance
- Implement prompt optimization, agent memory, and retrieval (RAG, where required)
- Optimize cost, latency, and response quality
Required Skills & Qualifications:
Core Technical Skills:
- Strong proficiency in Python
- Hands-on experience with Pipecat and/or Vapi
- Experience building Agentic AI systems (tool calling, planning, memory)
- Experience with LLM APIs (OpenAI, Anthropic, Gemini, etc.)
- Strong understanding of real-time systems and audio streaming
Experience with the following Voice AI stack:
- STT: Deepgram, Whisper, Google Speech, or Azure Speech
- TTS: ElevenLabs, Azure TTS, or Google TTS
- Telephony and voice APIs: Vapi, Twilio (SIP concepts preferred)
Backend & DevOps:
- FastAPI / Flask
- REST APIs & WebSockets
- Docker & basic CI/CD
- Cloud deployment experience
Good to Have (Preferred):
- Experience with RAG pipelines (LangChain, LlamaIndex, vector databases)
- Knowledge of call center workflows (IVR, escalation, human handoff)
- Experience with n8n or workflow orchestration tools
- Understanding of latency optimization for real-time AI systems
- Prior experience building AI calling bots or voice assistants
Soft Skills:
- Strong problem-solving and debugging skills
- Ability to own features end-to-end
- Clear communication with technical and non-technical stakeholders
- Startup mindset with bias toward execution