Design and manage AWS infrastructure (ECS/EKS, RDS/Aurora, S3, MSK/Kinesis, Lambda, CloudWatch)
Implement Infrastructure as Code (Terraform/CloudFormation) for automated deployments
Build and maintain CI/CD pipelines using GitHub Actions
Manage containerization and orchestration (Docker, ECR, ECS/EKS)
Ensure production observability (logging, metrics, tracing)
Define monitoring, alerting, and incident response processes
Enforce security and compliance (IAM, secrets management, encryption)
Operate and optimize data platforms (RDS/Aurora)
Lead capacity planning, cost optimization, and reliability improvements
Own disaster recovery and business continuity strategies
Collaborate with engineering teams on scalability and troubleshooting

Required Skills & Experience
5+ years of AWS experience in production environments
Expertise with ECS/EKS, EC2, S3, IAM
Strong experience with RDS/Aurora
Proficiency in Terraform and CI/CD pipelines
Strong Docker and Kubernetes knowledge
Experience with monitoring tools (Prometheus, Grafana)
Experience with IAM, VPC security, and secrets management
Proficiency in Python, Bash, or Go
Experience with PostgreSQL and cloud databases
Advanced Kubernetes or service mesh experience
Experience in fintech or regulated environments
Knowledge of cost optimization and chaos engineering
Strong communication and documentation skills
Ownership of incident response
Mentorship and coaching ability