Architect - Data Engineering

Altimetrik Corp - Chennai, Tamil Nadu

Job details

Qualifications

RDBMS
Data modelling
Azure
Oracle
Doctoral degree
Computer Science
Big data
Spark
NoSQL
Selenium
Git
Apache Hive
MongoDB
Java
Master's degree
Databases
SQL
Database design
AWS
Analysis skills
Database management
Bachelor's degree
Machine learning
PostgreSQL
Scala
Neo4j
Scripting
Angular
ETL
Regression analysis
Business Administration
Fraud prevention and detection
Data science
Redshift
Chef
AI
MBA
Jenkins
Communication skills
Graph databases
Data warehouse
Python
MySQL
Hadoop
Information Technology

Full job description

Job ID: 857992

18 - 20 Years

1 Opening

Chennai

Role description

Architect - Data Engineering

Job Overview: We are seeking a highly skilled Data Modeler with a strong background in big data technologies, particularly PySpark, and extensive experience in ETL processes. The ideal candidate will be responsible for designing, implementing, and maintaining robust data models that support our business needs and enhance our data analytics capabilities.

Key Responsibilities:

Data Modeling:

Design, develop, and optimize conceptual, logical, and physical data models.

Create and maintain data models to ensure data integrity and performance.

Collaborate with business stakeholders to understand data requirements and translate them into data models.

Big Data Technologies:

Utilize PySpark for large-scale data processing and transformation.

Implement and manage big data solutions on platforms such as Hadoop, Spark, and Hive.

Optimize and troubleshoot big data processing pipelines.

ETL Processes:

Develop, implement, and maintain ETL processes to ingest data from various sources.

Ensure ETL processes are efficient, scalable, and reliable.

Monitor ETL processes to ensure data quality and consistency.

Collaboration and Communication:

Work closely with data engineers, data scientists, and other stakeholders to ensure seamless data integration and utilization.

Document data models, ETL processes, and data pipelines for future reference and knowledge sharing.

Provide support and training to team members on data modeling and big data best practices.

Performance Tuning and Optimization:

Identify and implement opportunities for performance improvements in data models and ETL processes.

Monitor system performance and troubleshoot issues related to data processing and storage.

Qualifications:

Education: Bachelor of Engineering degree in Computer Science, Information Technology, Data Science, or a related field.

Experience:

3+ years of experience in data modeling and database design.

3+ years of experience with big data technologies such as PySpark, Hadoop, Spark, and Hive.

Proven experience with ETL tools and processes.

Technical Skills:

Proficiency in SQL and database management systems (e.g., MySQL, PostgreSQL, Oracle).

Strong programming skills in Python, particularly with PySpark.

Experience with data warehousing solutions (e.g., Redshift, Snowflake).

Familiarity with cloud platforms (e.g., AWS, Azure, GCP) is a plus.

Experience with continuous integration and automated testing tools such as PyTest, Jenkins, Artifactory, Git, Selenium, and Chef is desirable.

Experience in Graph processing technologies and graph databases such as GraphX and Neo4j is a plus.

Soft Skills:

Strong analytical and problem-solving skills.

Excellent communication and collaboration abilities.

Detail-oriented with a commitment to data quality.

6 or more years of work experience with a Bachelor's degree, 4 or more years of relevant experience with an advanced degree (e.g., Master's, MBA, JD, MD), or up to 3 years of relevant experience with a PhD.

Proven knowledge of successful design and development of data pipelines.

Experience in creating data-driven business solutions and solving data problems using a wide variety of technologies such as Hadoop, Hive, Spark, MongoDB, and NoSQL, as well as traditional data technologies (RDBMS).

Experience developing large-scale, enterprise-class distributed pipelines that require high availability, low latency, and strong data consistency.

Ability to program in one or more scripting languages such as Python and one or more programming languages such as Java or Scala.

Design and development skills with big data technologies such as Hadoop, Spark, Hive, Presto, and MapReduce.

Experience in implementing AI and ML methods is preferred, specifically techniques used in identity verification, fraud detection, or risk prediction scenarios such as Identity Graph, Decision Trees, Random Forests, Logistic Regression, Neural Networks, SVM, or Anomaly Detection algorithms.

Skills

Angular, Python, Hadoop, Data Engineering
