Conversational AI Architect / Principal Engineer (Convogent)

Aivar Innovations
Bengaluru, Karnataka

Job details

Qualifications

Telephony
Azure
Go
Law
PCI
Master's degree
AWS
IVR
Redis
Natural language processing
S3
AI
TypeScript
Python

Full job description

About the Role

Convogent is Aivar's voice AI platform — and voice AI in the enterprise is genuinely hard. We're not building chatbots. Our agents handle real financial transactions, healthcare conversations, and logistics dispatch for enterprise customers across the US. That means sub-second latency, regulated-industry compliance, ASR error tolerance, barge-in handling, and LLM orchestration that doesn't hallucinate when the stakes are real.

We need someone to own the conversational intelligence layer end-to-end — set the architecture bar, write code alongside the team, and partner directly with enterprise customers on deployment. This is the most senior individual contributor seat on Convogent. You'll architect the system, write the hardest parts yourself, and raise the level of everyone around you.

What You'll Own

Dialogue architecture — turn-taking, barge-in, interruption handling, fallback policies, multi-turn state management
LLM orchestration — prompt design, function calling, structured outputs, tool use, RAG over enterprise knowledge bases, hallucination guardrails
Voice stack design — streaming ASR, streaming TTS, latency budgeting across the full pipeline (target: <800ms perceived response)
Telephony integration — Twilio Programmable Voice, Media Streams, SIP-level integration, IVR replacement architectures
Eval and observability — conversation evals, regression suites across LLM providers, call-level observability, failure-mode analytics
Compliance — PII redaction, HIPAA- and PCI-DSS-compliant call recording, human-in-the-loop escalation paths
Multi-tenant platform design — scalability, cost-per-call economics, deployment automation
Technical leadership — architecture RFCs, mentoring, enterprise customer engagement, hiring

Must-Have Requirements

Technical Depth

4+ years building conversational AI in production — voice or chat agents serving real users at scale (not POCs, hackathons, or demos)
Telephony fluency — hands-on experience with Twilio (Programmable Voice, Media Streams) or equivalent (Vonage, Plivo, Genesys, Amazon Connect). SIP-level understanding is a plus.
Production LLM engineering — function calling, structured outputs, conversation state, eval frameworks. Real experience with GPT-4 / Claude / Bedrock / Llama — not just "explored."
Voice pipeline expertise — streaming ASR (Deepgram, AssemblyAI, Whisper-large, Riva) and streaming TTS (ElevenLabs, Polly, Azure Neural). You can design the latency budget end-to-end.
AWS-native delivery — Lambda, EKS, Bedrock, Connect, Polly, Transcribe. Real architectures shipped, not just "familiar with AWS."
Strong Python (and ideally Go or TypeScript) — production-grade async code, not notebook engineering

Architect-Level

Has written design docs that shaped decisions across multiple teams
Has owned a system end-to-end through at least one major scale event or production incident
Can hold a technical conversation with both a junior engineer and an enterprise CTO

Nice-to-Have

Experience deploying voice AI in fintech, healthcare, or logistics
Open-source contributions to conversational AI, NLP, or speech projects
Conference talks, papers, or design docs published externally
Multi-modal model experience (voice + vision)
Cost optimization for LLM inference at scale

Key Technologies You'll Work With

Python, FastAPI, Twilio (Programmable Voice, Media Streams), Deepgram / AssemblyAI / Whisper, ElevenLabs / Polly, Anthropic Claude, AWS Bedrock, OpenAI, Llama, LangChain / LangGraph, Strands Agents, OpenSearch / Pinecone / pgvector, Redis, Kafka, AWS (EKS, Lambda, Connect, S3, Transcribe), Prometheus / Grafana, OpenTelemetry

What Success Looks Like (First 90 Days)

Audit the current Convogent voice pipeline end-to-end; produce a latency and reliability baseline
Ship at least one architectural improvement that measurably moves a customer-facing metric (latency, containment rate, or cost-per-call)
Stand up the conversation evals framework with regression coverage across our top 3 LLM providers
Set the architecture bar — RFC template, design review cadence, and engineering standards adopted by the Convogent team
Engage directly with at least 2 enterprise customers on deployment or technical strategy

Why Aivar, Why Now

Real backing, real revenue — AWS Preferred Partner. Backed by Bessemer Venture Partners and Sorin Investments. Enterprise customers shipping in production across fintech, healthcare, and logistics.
Hard problems — Voice AI for regulated enterprise transactions is genuinely difficult. If you've been waiting for a conversational AI problem worth your seniority, this is it.
Startup speed, enterprise depth — Weeks not quarters. Direct customer access. No five-layer hierarchy between you and the decisions that matter.
AI-first team — We use AI tools daily to compress how we build. You'd join a team that expects you to ship faster with AI, not just build AI products.
Founding-team energy on Convogent — You'd shape the platform, the team, and the technical culture from a position of real influence.
