Role Description:
Our mission at Booking.com is to create transformative, innovative, and personalized travel experiences for millions of customers all across the world. We want customers to have an amazing experience wherever and whenever they choose: mobile, web, and through partners and 3rd parties.
About the team - Cloud Ecosystems:
The Cloud Ecosystems group builds, operates and enables use of private and public cloud infrastructure within Booking.com. Compute capabilities are provided on platform instances that are privately owned and centrally managed by Booking.com. These platform instances, and the workloads running on them, are hosted both in Booking datacenters (“on-premises”) and on public cloud infrastructure (AWS).
The Cloud Ecosystems team in Bangalore is responsible for shaping the future of our Kubernetes-based container orchestration platform, our Virtual Machine as a Service offering (spanning private data centers and AWS EC2), and various disaster recovery solutions that ensure Booking.com’s RTO/RPO-driven recoverability against large-scale risks such as ransomware attacks.
Key Job Responsibilities and Duties:
The core premise of the Booking Platform Engineer role lies in designing, building, and maintaining internal platforms that provide developers with self-service tools and automation for deploying and running software efficiently and reliably. We code our way out of problems where operations are concerned, addressing availability, scalability, latency, and efficiency challenges within the vast infrastructure here at Booking.
You will impact millions of people all over the globe with your creative solutions
You will work in one of the biggest e-commerce companies in the world
You will solve exciting problems at scale by writing and deploying code across tens of thousands of servers
You will maintain an “everything as code” mindset for yourself and your team
You will have the opportunity to collaborate with many of the world’s leading SREs, SWEs and Platform Engineers
You will be free to launch your own ideas and solutions within our sophisticated production environment
Here are some of the tools and technologies we use to achieve this: Python, Go, Puppet, Kubernetes, Elasticsearch, Prometheus, HAProxy, Cassandra, Kafka, etc.
What you’ll be doing:
Collaborate with the team to support the development of highly distributed, large-scale platforms that have a major impact on developers at Booking;
Deliver a fully integrated, end-to-end developer focused engineering experience to make common things simple and complex things possible;
Embed operational resilience and service reliability directly into the development lifecycle;
Build effective monitoring to track the health of your systems, and assist the on-call team in handling outages; learn to troubleshoot system health issues under guidance;
Build and run capacity tests to manage the growth of your systems;
Contribute to integrating Business Continuity and IT Disaster Recovery (DR) planning into the full engineering lifecycle;
Contribute to embedding security and privacy directly into service bootstrapping and delivery pipelines by automating controls and reducing manual decision points;
Be an advocate for engineering standards and best practices;
Participate in the on-call rotation to gain operational experience and exposure to production environments
What you’ll bring:
1-3 years of hands-on experience in software, platform or site reliability engineering within the technology sector, with a strong appetite for learning, a curious mindset, and the ambition to solve problems in a large-scale environment;
Solid experience in at least one programming language (we use Go, Python, Java, Ruby and Perl);
Experience with Infrastructure as Code technologies;
Knowledge of cloud computing fundamentals;
Knowledge of Linux administration and troubleshooting;
Strong problem-solving skills and a proactive mindset towards troubleshooting issues and identifying root causes;
Additional experience in AWS solutions, Kubernetes, Networking, Security or Storage is a plus;
Experience with monitoring / observability technologies like Prometheus, Graphite, Grafana, Kibana and Elasticsearch is a plus;
Good interpersonal skills
Proficient command of the English language, both written and spoken