Convogent is Aivar's voice AI platform, and voice AI in the enterprise is genuinely hard. We're not building chatbots. Our agents handle real financial transactions, healthcare conversations, and logistics dispatch for enterprise customers across the US. That means sub-second latency, regulated-industry compliance, ASR error tolerance, barge-in handling, and LLM orchestration that doesn't hallucinate when the stakes are real.
We need someone to own the conversational intelligence layer end-to-end: set the architecture bar, write code alongside the team (including the hardest parts), and partner directly with enterprise customers on deployment. This is the most senior individual contributor seat on Convogent, and you'll be expected to raise the level of everyone around you.
- Dialogue architecture: turn-taking, barge-in, interruption handling, fallback policies, multi-turn state management
- LLM orchestration: prompt design, function calling, structured outputs, tool use, RAG over enterprise knowledge bases, hallucination guardrails
- Voice stack design: streaming ASR, streaming TTS, latency budgeting across the full pipeline (target: <800ms perceived response)
- Telephony integration: Twilio Programmable Voice, Media Streams, SIP-level integration, IVR replacement architectures
- Eval and observability: conversation evals, regression suites across LLM providers, call-level observability, failure-mode analytics
- Compliance: PII redaction, call recording (HIPAA, PCI-DSS), human-in-the-loop escalation paths
- Multi-tenant platform design: scalability, cost-per-call economics, deployment automation
- Technical leadership: architecture RFCs, mentoring, enterprise customer engagement, hiring
- 4+ years building conversational AI in production: voice or chat agents serving real users at scale (not POCs, hackathons, or demos)
- Telephony fluency: hands-on experience with Twilio (Programmable Voice, Media Streams) or equivalent (Vonage, Plivo, Genesys, Amazon Connect). SIP-level understanding is a plus.
- Production LLM engineering: function calling, structured outputs, conversation state, eval frameworks. Real experience with GPT-4 / Claude / Bedrock / Llama, not just "explored."
- Voice pipeline expertise: streaming ASR (Deepgram, AssemblyAI, Whisper-large, Riva) and streaming TTS (ElevenLabs, Polly, Azure Neural). You can design the latency budget end-to-end.
- AWS-native delivery: Lambda, EKS, Bedrock, Connect, Polly, Transcribe. Real architectures shipped, not just "familiar with AWS."
- Strong Python (and ideally Go or TypeScript): production-grade async code, not notebook engineering
- Has written design docs that shaped decisions across multiple teams
- Has owned a system end-to-end through at least one major scale event or production incident
- Can hold a technical conversation with both a junior engineer and an enterprise CTO
- Experience deploying voice AI in fintech, healthcare, or logistics
- Open-source contributions to conversational AI, NLP, or speech projects
- Conference talks, papers, or design docs published externally
- Multi-modal model experience (voice + vision)
- Cost optimization for LLM inference at scale
Python, FastAPI, Twilio (Programmable Voice, Media Streams), Deepgram / AssemblyAI / Whisper, ElevenLabs / Polly, Anthropic Claude, AWS Bedrock, OpenAI, Llama, LangChain / LangGraph, Strands Agents, OpenSearch / Pinecone / pgvector, Redis, Kafka, AWS (EKS, Lambda, Connect, S3, Transcribe), Prometheus / Grafana, OpenTelemetry
- Audit the current Convogent voice pipeline end-to-end; produce a latency and reliability baseline
- Ship at least one architectural improvement that measurably moves a customer-facing metric (latency, containment rate, or cost-per-call)
- Stand up the conversation evals framework with regression coverage across our top 3 LLM providers
- Set the architecture bar: RFC template, design review cadence, and engineering standards adopted by the Convogent team
- Engage directly with at least 2 enterprise customers on deployment or technical strategy
- Real backing, real revenue: AWS Preferred Partner. Backed by Bessemer Venture Partners and Sorin Investments. Enterprise customers shipping in production across fintech, healthcare, and logistics.
- Hard problems: Voice AI for regulated enterprise transactions is genuinely difficult. If you've been waiting for a conversational AI problem worth your seniority, this is it.
- Startup speed, enterprise depth: Weeks not quarters. Direct customer access. No five-layer hierarchy between you and the decisions that matter.
- AI-first team: We use AI tools daily to compress how we build. You'd join a team that expects you to ship faster with AI, not just build AI products.
- Founding-team energy on Convogent: You'd shape the platform, the team, and the technical culture from a position of real influence.