Senior Backend Engineering & Architecture - 100% Remote
Experience: 8-10 years
Location: 100% Remote
Job Type: Contract
Duration: 6+ Months
Budget: 1.2 LPM (lakhs per month)
Role Overview
As a Senior Python Developer, you will own the backend architecture of the platform: the
scalable, reliable foundation on which agents are created, deployed, and run. You will design
the core services, data layers, and execution infrastructure that make the Agent OS robust at
scale.
Key Responsibilities
- Design and own the backend architecture — services, APIs, data models, and the agent
execution / runtime engine.
- Build scalable, secure, high-performance Python services that power agent deployment
and runtime.
- Architect the systems that manage the full agent lifecycle: creation, versioning,
deployment, scaling, and monitoring.
- Design asynchronous, event-driven, and queue-based systems for reliable agent
execution at scale.
- Build robust integration layers connecting tools, connectors, databases, and the AI / ML
layer.
- Define and uphold standards for code quality, testing, observability, and security across
the backend.
- Optimize the platform infrastructure for performance, reliability, cost, and scalability.
Required Skills & Experience
- 8-10 years of backend development experience with expert-level Python.
- Hands-on experience with the backend of agent-builder / Agent OS platforms — the
infrastructure that lets users build, deploy, and run AI agents reliably at scale.
- Strong experience designing distributed systems and microservices architecture.
- Expertise with modern Python web frameworks (FastAPI, Django, or Flask) and async
programming (asyncio).
- Strong API design skills (REST / GraphQL) and experience with message queues, task
queues, and event systems (Kafka, RabbitMQ, Celery, or similar).
- Solid database design skills across SQL and NoSQL, with experience in caching and vector
stores.
- Experience with containers and orchestration (Docker, Kubernetes) and cloud platforms
(AWS / Azure / GCP).
- Strong grasp of system design, scalability, security, and production reliability.
Nice to Have
- Familiarity with LLM orchestration, RAG pipelines, or model-serving infrastructure.
- Experience with multi-tenant SaaS architecture and infrastructure-as-code.
- Exposure to observability and tracing for distributed, long-running workloads.