Introduction: A Career at HARMAN Automotive
We’re a global, multi-disciplinary team that’s putting the innovative power of technology to work and transforming tomorrow. At HARMAN Automotive, we give you the keys to fast-track your career.
- Engineer audio systems and integrated technology platforms that augment the driving experience
- Combine ingenuity, in-depth research, and a spirit of collaboration with design and engineering excellence
- Advance in-vehicle infotainment, safety, efficiency, and enjoyment
About the Role
As a Data Engineer on the Innovation Team, you will design and build an end-to-end audio data collection and consumption platform that enables advanced analytics, audio information retrieval, and personalization capabilities across the Car Audio Division. You will establish standardized, secure, and scalable pipelines to collect, curate, and govern data from internal engineering systems, companion applications, and connected vehicle telemetry. You will collaborate closely with Data Scientists, ML Engineers, Embedded/DSP Engineers, and Audio Experts to ensure high-quality data products are consistently available for experimentation, training, validation, analysis, and production monitoring, while meeting strict privacy, security, reliability, and automotive compliance requirements. You will also serve as a platform consultant, advising other departments on data architecture, data contracts, ingestion design, and downstream consumption, while ensuring adherence to strict privacy, safety, and reliability standards.
What You Will Do
- Design and implement standardized data ingestion frameworks for batch and streaming sources (internal databases, user‑preference data, vehicle telemetry).
- Define and maintain data models and data contracts (schemas, semantics, versioning rules) for audio-related files, user preferences, and metadata, including quality guardrails (schema validation, anomaly detection, etc.)
- Develop and maintain data lakes / lakehouse architectures
- Design and implement a standardized API for data consumption.
- Prepare and curate high‑quality datasets for AI/ML model training, validation, experimentation, and statistical analysis.
- Collaborate directly with ML Engineers and Data Scientists to optimize data formats for training performance and storage efficiency
- Design cost‑aware policies for data retention, sampling, compression, and technology selection
- Optimize pipeline execution times and resource usage (batch vs streaming, compute sizing, caching strategies)
- Establish measurable KPIs for data cost efficiency, pipeline reliability, and performance
- Ensure compliance with internal policies, OEM requirements, and regulatory constraints
- Create clear documentation for pipelines, schemas, and architectural decisions
- Mentor other engineers on practical data engineering, performance tuning, and cost‑efficient design
What You Need to Be Successful
- 5+ years of hands‑on experience building and operating large‑scale data pipelines and platforms
- Strong proficiency in Python, PySpark, and SQL for data processing and automation
- Solid experience with cloud‑based data platforms (e.g., AWS S3, Azure, Databricks, object storage, distributed computing)
- Proven ability to automate repetitive tasks and improve data hygiene
- Proven experience implementing cost‑saving strategies in data platforms (storage tiering, compute optimization, pipeline tuning)
- Deep understanding of data modeling, partitioning, indexing, and performance optimization
- Experience handling large‑volume, high‑frequency, or unstructured data (audio, signals, logs, telemetry)
- Strong knowledge of data governance and privacy best practices
- Working knowledge of privacy‑preserving data handling (pseudonymization, anonymization, etc.), data‑quality checks, and data‑lineage tracking.
- Ability to work independently and collaborate effectively in cross‑functional global teams
- Strong communication and collaboration skills, with the ability to work effectively in an intercultural and cross‑functional team.
Bonus Points if You Have
- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Machine Learning, Signal Processing, or a closely related field.
- Experience working in audio, image, or signal processing domains
- Background working with automotive datasets or connected vehicle data
- Experience deploying ML models to production (cloud or edge environments).
- Experience with event-driven architectures (Kafka-like patterns, IoT ingestion patterns).
- Experience programming with C++ and embedded systems
- Familiarity with OTA update flows, artifact versioning, and safe rollback mechanisms.
- Experience building reproducible working environments (Docker, virtual environments)
- Experience with privacy-aware designs: PII protection, tokenization/pseudonymization, retention automation.
What Makes You Eligible
- Advanced English communication skills, as you will be part of a global team
- Willingness to work in an office in Bangalore.
What We Offer
- Flexible work environment, allowing for full-time remote work globally for positions that can be performed outside a HARMAN or customer location
- Access to employee discounts on world-class products (JBL, HARMAN Kardon, AKG, and more)
- Extensive training opportunities through our own HARMAN University
- Competitive wellness benefits
- Tuition reimbursement
- “Be Brilliant” employee recognition and rewards program
- An inclusive and diverse work environment that fosters and encourages professional and personal development.