Role Overview
We are seeking an experienced Istio Service Mesh Administrator to design, implement, and operate an enterprise‑grade service mesh for large‑scale microservices running on AWS EKS. The role requires deep expertise in Istio architecture, Kubernetes networking, zero‑trust security, observability, and automation/GitOps, with hands‑on responsibility for reliability, performance, and secure service‑to‑service communication across multiple environments.
Key Responsibilities
Service Mesh Architecture & Design
- Design and manage Istio service mesh on AWS EKS for 300+ microservices.
- Architect scalable mesh topology across Dev, QA, Prod, and DR environments.
- Manage Istio control plane lifecycle, including upgrades and revision‑based deployments.
- Define namespace isolation, sidecar injection strategies, and mesh standards.
Traffic Management & Routing
- Implement advanced routing strategies.
- Configure retries, timeouts, circuit breaking, and fault injection.
- Manage ingress and egress traffic, including outbound policies (ALLOW_ANY / REGISTRY_ONLY).
Security & Zero Trust
- Implement and enforce mTLS across all services.
- Configure PeerAuthentication and AuthorizationPolicy.
- Integrate mesh security with AWS IAM and IRSA.
- Secure external communication using Istio Egress Gateway.
- Ensure namespace‑level and service‑level isolation.
Observability & Monitoring
- Implement distributed tracing using Jaeger or OpenTelemetry.
- Operate Prometheus, Grafana, and Kiali dashboards.
- Define SLIs, SLOs, and alerting strategies.
- Troubleshoot latency issues, 503 errors, and mTLS failures.
Kubernetes & AWS Integration
- Apply a strong understanding of Kubernetes internals.
- Manage high‑availability EKS clusters with autoscaling.
- Configure AWS Load Balancer Controller (ALB/NLB) for Istio Ingress Gateway.
- Optimize Envoy sidecar resource usage and overall cluster performance.
Reliability & Performance Engineering
- Implement resilience patterns.
- Optimize Envoy proxy performance and telemetry overhead.
- Troubleshoot high‑throughput production workloads.
- Support multi‑cluster / multi‑region service mesh (preferred).
Automation & DevOps
- Manage mesh configurations using GitOps tools (ArgoCD / Flux).
- Use Helm for deployment and versioning of mesh resources.
- Integrate CI/CD pipelines for automated policy rollouts.
- Automate mesh upgrades and rollout strategies.
Required Skills
- Kubernetes (5+ years), EKS preferred
- Istio (3+ years) in production
- Experience operating architectures with 200–300+ microservices
- Deep knowledge of Kubernetes networking & service discovery
- Hands‑on expertise with Envoy proxy and Istio control plane components
- Strong troubleshooting skills in production environments
- Strong understanding of AWS networking: VPC, subnets, route tables
Preferred Qualifications
- Experience with multi‑cluster service mesh architectures
- Experience integrating API Gateway with Istio
- Strong scripting skills (Bash / Python)