dmg::media is home to some of the world’s most recognised news brands and processes billions of data points every month to power audience insights, targeting strategies, and data-driven decision making across the business. With an ambition to become the world’s leading publisher in the sophisticated application of data, dmg::media continues to invest heavily in modern data technologies and scalable analytics platforms.
As a Senior Data Engineer, you will be central to building dmg::media's next-generation data platform. You will design, develop, and maintain scalable data pipelines, ETL processes, real-time data streams, APIs, and insight tools, delivering data products that drive measurable impact across both commercial and editorial functions. Working in a highly collaborative engineering environment, you will partner closely with data analysts, data scientists, and business stakeholders to build robust data solutions, optimise data architecture, and raise the quality bar for data products across the organisation.
- Plan, design, and implement scalable data solutions using GCP and Databricks technologies.
- Design, develop, and maintain modern data pipelines and real-time data streaming solutions.
- Manage and optimise ETL processes to ensure efficient data processing, transformation, and integration.
- Develop and maintain data marts to support defined business and data-driven use cases.
- Pre-process, clean, and transform structured and unstructured data, while creating and optimising queries as required.
- Establish and implement best practices, frameworks, and standards for data testing, validation, and quality assurance.
- Contribute to the development of data-driven products, including user journey analysis and editorial intelligence tools.
- Collaborate closely with cross-functional technology teams to align data initiatives with business and technical objectives.
- Perform root cause analysis on internal and external data issues to solve business problems and identify improvement opportunities.
- Build and maintain processes supporting data transformation, metadata management, dependency handling, and workload optimisation.
- Analyse, manipulate, and extract actionable insights from large and complex datasets across multiple disconnected sources.
- 5+ years of hands-on experience as a Data Engineer, designing and building scalable data pipelines and cloud-based data processing solutions.
- Strong hands-on expertise in Python and SQL, including data transformation, performance optimisation, data modelling, workflow automation and large-scale data processing.
- Experience working with Google Cloud Platform (GCP) services such as BigQuery, Cloud Storage and Cloud Functions, or equivalent cloud-native data platforms, is mandatory.
- Hands-on experience with Databricks, including notebooks, clusters, Spark/PySpark development, data transformation, and pipeline orchestration.
- Strong understanding of distributed data processing frameworks, particularly PySpark/Spark DataFrames, for large-scale data transformation and analytics workloads.
- Experience working with relational databases such as PostgreSQL, SQL Server, Oracle or similar platforms, including complex SQL development, query optimisation and data modelling.
- Proven experience developing and maintaining ETL/ELT pipelines, including ingestion from APIs, databases, cloud storage platforms and third-party data sources.
- Understanding of modern data architecture patterns such as Medallion Architecture (Bronze/Silver/Gold), Data Lake, Data Warehouse, and Lakehouse architectures.
- Experience working with version control systems and CI/CD pipelines using tools such as GitHub Actions, GitLab CI/CD, Azure DevOps or equivalent platforms.
- Experience working with modern data processing and analytics platforms such as Databricks, Snowflake, Redshift, Synapse Analytics or similar technologies is desirable.
- Experience designing and implementing reusable Python libraries, utility packages, frameworks, and automation solutions for data engineering workflows.
- Familiarity with cloud-based orchestration and workflow management tools such as Apache Airflow or Dataform, and with Azure components (Azure Data Lake, Azure SQL DW, Azure Synapse, etc.) or equivalent platforms, is good to have.
- Experience integrating and processing data from diverse sources, including APIs, cloud storage, structured databases and file formats such as JSON, CSV and Parquet.
- Exposure to containerisation and cloud-native deployment technologies such as Docker and Kubernetes would be an advantage.
- An understanding of web analytics data (such as Adobe Analytics and Google Analytics) and of ad tech such as Google Ad Manager (working with programmatic and affiliate partners) is a plus, but not essential.
- Strong problem-solving skills with the ability to independently analyse business requirements, propose scalable data solutions, and collaborate effectively with analysts and stakeholders.