Role Overview:
We are seeking a highly skilled Senior Data Engineer with 5+ years of experience to join our team. This role is focused on designing and building robust data assets through high-performance data pipelines. You will be a key player in modernizing our data infrastructure, transitioning legacy codebases into clean, scalable architectures, and ensuring the highest standards of code quality through Test-Driven Development (TDD).
Key Responsibilities:
- Data Pipeline Development: Design, develop, and maintain complex ETL/ELT pipelines to build high-value data assets.
- Legacy Modernization: Lead the refactoring of legacy codebases, improving readability, maintainability, and performance.
- System Optimization: Perform deep code optimization using Spark SQL and PySpark to handle large-scale datasets efficiently.
- Quality Assurance: Implement a Test-Driven Development (TDD) approach, writing comprehensive unit tests to ensure functionality and catch bugs early in the lifecycle.
- Complex Problem Solving: Isolate and resolve difficult bugs, including performance bottlenecks, concurrency issues, and complex logic flaws.
- Cloud Architecture: Design and deploy solutions using the full AWS stack, and explain the trade-offs and benefits of specific services for various use cases.
Technical Requirements:
Core Programming & Data Engineering
- 5+ years of hands-on programming experience with Python and PySpark.
- Expertise in Boto3 and other Python frameworks and libraries, adhering strictly to Python best practices (PEP 8).
- Strong experience in Spark SQL and PySpark optimization techniques (e.g., partitioning, caching, broadcast joins).
Cloud & Infrastructure (AWS)
- Deep architectural knowledge of AWS services, including S3, EC2, Lambda, Redshift, and CloudFormation.
DevOps & Tools
- Advanced understanding of Git (branching strategies, PR reviews).
- Experience with JFrog Artifactory for dependency management and artifact storage.
- Proficiency in CI/CD pipelines and automated testing frameworks.
Professional Attributes:
- Analytical Mindset: Ability to debug complex, non-obvious issues in distributed systems.
- Clean Coder: Passion for writing "clean code" and mentoring junior engineers on maintainability.
- Architectural Thinking: Ability to explain the "why" behind choosing specific AWS components over others.