MonotaRO Technologies India Private Limited is a global technology company and a subsidiary of MonotaRO Co., Ltd. Our mission is to empower enterprise clients with the smartest marketing platform, enabling seamless integration with our personalization engines and delivering cross-channel marketing capabilities. We are dedicated to enhancing customer engagement and experiences while focusing on increasing Lifetime Value (LTV) through consistent messaging across all channels. We are a world-class engineering team spanning front end (UI), back end (API / Java), and Big Data engineering to deliver compelling products.
You will be a key member of the data engineering team, responsible for shaping and delivering data products. You'll have the opportunity to build the next-generation data analytics tech stack using big data technologies. You will work closely with business stakeholders, product managers, and engineering teams to meet the data requirements of various initiatives.
Objectives
The objective is to design, build, and maintain the data infrastructure that enables organizations to collect, store, and process data efficiently. This infrastructure allows data to be accessed and used for analysis by data scientists, analysts, and business users, ultimately supporting informed decision-making and organizational improvements.
Responsibilities
Build large-scale batch and real-time data pipelines with data processing frameworks like Spark on AWS / GCP.
Manage data flows and set up automation between various data sources.
Implement continuous integration best practices and ensure data quality. Maintain data documentation and definitions.
Help drive optimization, testing, and tooling of the data products.
Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field. Exceptional performance on platforms like LeetCode or HackerRank can be considered in place of formal education.
Minimum 2 years of experience working on full life cycle Big Data production projects; 5 to 10 years is an added advantage.
Strong foundation in computer science principles, including data structures, algorithms, and software design.
Strong, demonstrable skills in at least one of the following programming languages – Python or Scala.
Should have experience with AWS services like EMR, Lambda, S3, and DynamoDB.
Should have experience with Databricks Notebooks and Jobs API.
Strong experience processing and analyzing Big Data using Spark, MapReduce, Hadoop, Sqoop, Apache Airflow, HDFS, Hive, and ZooKeeper.
Familiarity with containerization technologies like Docker, and with data pipeline and workflow management tools such as Apache Airflow.
Intermediate to advanced knowledge of SQL, and experience with relational and NoSQL databases, including Postgres, MySQL, Redshift, and Redis. Experience in SQL tuning, schema design, or analytical programming.
Comfortable working across a wide array of technologies in a fast-paced, results-oriented environment.
Proficient understanding of Git (code versioning tool).