Role Description:
Our mission at Booking.com is to create transformative, innovative, and personalized travel experiences for millions of customers all across the world. We want customers to have an amazing experience wherever and whenever they choose: mobile, web, and through partners and 3rd parties.
About the team - Cloud Ecosystems:
The Cloud Ecosystems group builds, operates and enables use of private and public cloud infrastructure within Booking.com. Compute capabilities are provided on platform instances that are privately owned and centrally managed by Booking.com. These platform instances, and the workloads running on them, are hosted both in Booking datacenters (“on-premises”) and on public cloud infrastructure (AWS and GCP).
The Cloud Ecosystems team in Bangalore is responsible for shaping the future of our internal cloud platforms by improving the security, governance, reliability and developer experience of our cloud environments. This includes building scalable internal platforms, strengthening cloud governance, and improving the paved-path experience for teams consuming cloud infrastructure. As part of this mission, we are strengthening our Google Cloud Platform (GCP) foundation by building a well-governed landing zone, improving security guardrails, introducing platform automation, and enabling safe adoption of GCP for AI, automation and business workloads.
Key Job Responsibilities and Duties:
The core premise for the Booking Platform Engineer lies in designing, building, and maintaining internal platforms, to provide developers with self-service tools and automation for deploying and running software efficiently and reliably. We code our way out of problems where operations are concerned, addressing availability, scalability, latency, and efficiency challenges within the vast infrastructure here at Booking.
You will impact millions of people all over the globe with your creative solutions
You will work in one of the biggest e-commerce companies in the world
You will solve exciting problems at scale by writing and deploying code across tens of thousands of servers
You will ensure an “everything as code” mindset for yourself and your team
You will have the opportunity to collaborate with many of the world’s leading SREs, SWEs and Platform Engineers
You will be free to launch your own ideas and solutions within our sophisticated production environment
Here are some of the tools and technologies we use to achieve this: Python, Go, Terraform, Kubernetes, GCP, AWS, Prometheus, Elasticsearch, Kafka and other modern platform tooling
Design, develop and implement scalable GCP platform foundations that improve security, governance and developer experience
Build and evolve the GCP landing zone including project structures, IAM models, networking standards and organization policies
Deliver a fully integrated, end-to-end, developer-focused platform experience that makes common cloud use cases simple and secure
Build centralized infrastructure provisioning workflows and self-service platform capabilities
Embed security, governance and compliance controls directly into infrastructure pipelines through automation
Define secure default configurations and preventative guardrails for GCP services
Improve identity and access management practices including least privilege access and strong authentication controls
Define project lifecycle management practices including ownership, lifecycle reviews and cleanup automation
Improve observability, logging and operational readiness of GCP environments
Partner with security teams to improve threat detection context and reduce operational noise through better platform standardization
Improve infrastructure change management by introducing automated pipelines instead of manual configuration
Embed operational resilience and service reliability directly into the platform lifecycle
Build monitoring and alerting to track the health of platform services and support incident response when needed
Advocate for engineering standards and platform best practices
Share the on-call rotation and be an escalation contact for platform incidents
Contribute to Booking.com's growth through interviewing, onboarding or other recruitment efforts.
Qualifications and Requirements:
8+ years of hands-on experience in software, platform or site reliability engineering within the technology sector, coupled with expertise in building, operating and maintaining sophisticated and scalable systems.
Solid experience in at least one programming language. We use Java, Python, Go, Ruby and Perl
Experience with Infrastructure as Code technologies
Knowledge of cloud computing fundamentals
Solid foundation in Linux administration and troubleshooting
Understanding of service level agreements (SLAs) and service level objectives (SLOs)
Additional experience in GCP / AWS solutions, Kubernetes, Networking, Security or Storage is desirable
Experience developing cloud landing zones, especially on GCP, is desirable
Experience with monitoring / observability technologies such as Prometheus, Graphite, Grafana, Kibana and Elasticsearch is a plus
Good interpersonal skills
Proficient command of the English language, both written and spoken