Data Engineer (GCM 4)
Role Summary:
Lead the cloud modernization of the client's big data pipelines by migrating Scala/Spark workloads to PySpark and Delta Lake, integrating with Oracle ADW/AIDP, and optimizing large-scale data processing on OCI or equivalent platforms such as AWS or GCP. Ensure high performance, security, and reliability across complex client data flows.
Core Skills
7+ years of experience working across cloud data platforms (Oracle ADW, Snowflake, Databricks, etc.)
Strong SQL expertise (CTEs, UDFs, MVs) and performance tuning
CI/CD automation using GitHub Actions; Infrastructure as Code experience
Big Data & Cloud Technologies
Apache Spark, Hadoop
Delta Lake
Oracle ADW/AIDP or an equivalent platform such as Databricks
Python, PySpark
Workflow orchestration using tools such as Airflow
Responsibilities
Modernize and rewrite ETL pipelines using PySpark and Delta Lake
Integrate data pipelines with Oracle ADW/AIDP and APEX services
Build scalable pipelines for structured and unstructured data
Implement data validation, logging, retries, and monitoring
Optimize workloads for performance and cost
Good to have
Oracle APEX Integration
PySQL / Python-based Oracle connectivity
Experience consuming Oracle APEX APIs
Understanding of ADW RBAC, policies, and governance
Pay: ₹160,000.00 - ₹180,000.00 per month
Work Location: In person