The Talend Big Data Administrator oversees the Data Governance DQ platform. They are responsible for installing and configuring the Talend Administration Center (TAC) and Talend Data Catalog (TDC), managing environments, scheduling ETL/ELT jobs, ensuring high availability of distributed processing frameworks, and supporting big data components such as Spark and Cloudera edge nodes.
Requirements
Platform Administration & Server Management
- Install, configure, and upgrade Talend environments (TAC, TDC, Studio, JobServers, and runtimes) on the native big data platform (Cloudera).
- Manage physical and virtual execution servers, ensuring load balancing and fault tolerance across the cluster.
- Integrate Talend with the broader Hadoop ecosystem (e.g., HDFS, Hive, Spark) and cloud object storage (AWS S3, Object Storage on Cloudera).
- Manage metadata repositories, configure secure connections to big data clusters, and implement security protocols such as Kerberos or Apache Knox.
Job Execution & Scheduling
- Deploy, schedule, and monitor data integration, batch, and streaming tasks using the Job Conductor and Big Data Streaming Conductor.
- Set JVM parameters and context variables to optimize data pipeline performance and resource utilization at the execution level.
- Implement automated deployment pipelines and CI/CD integrations with tools such as Git and artifact repositories.
Monitoring & Troubleshooting
- Track job execution statistics, handle failures, and set up customizable alerts for pipeline errors.
- Troubleshoot complex distributed job failures, memory leaks, and performance bottlenecks across big data nodes.
Security & User Access
- Administer users, roles, and project authorizations.
- Define access rights to restrict unauthorized modification or execution of project artifacts.
Qualifications & Essential Skills
- Experience: Typically 4+ years in data platform administration or data engineering.
- Core Skills: Advanced knowledge of Talend Administration Center (TAC) and Talend Studio.
- Big Data & Cloud: Strong understanding of big data frameworks (Apache Spark, Hadoop, Kafka).
- Programming & Scripting: Proficiency in Java, Linux/Unix shell scripting, and writing complex SQL queries.
- Soft Skills: Excellent problem-solving abilities, strong communication, and the capacity to collaborate with architects and infrastructure teams.