You will build and deploy machine-learning models directly on plant data to cut energy consumption, improve equipment reliability, and tighten product quality across our cement operations. This is a hands-on modelling role embedded with process, operations and reliability teams. Your work will be measured in real, finance-validated savings (kcal/kg clinker, kWh/tonne, avoided downtime), not slide decks.
Key responsibilities:
Build, validate and deploy ML models for process optimisation (kiln / pyro-process control, grinding & separator efficiency), predictive maintenance on critical rotating equipment, and quality / clinker-factor optimisation.
Work with high-frequency sensor and time-series data from plant historians, DCS and IIoT systems; engineer meaningful features from noisy, real-world industrial signals.
Partner with plant operators and process engineers to encode domain knowledge into models, and to take models safely from advisory recommendations toward closed-loop control.
Establish rigorous baselines and quantify impact with finance-grade discipline; defend results under scrutiny.
Work with the MLOps / platform team to productionise models and monitor them in live operation.
Communicate findings clearly to non-technical plant leadership.
Required qualifications (must-have):
Bachelor's or Master's in Engineering (Chemical, Mechanical, Electrical, Industrial), Statistics, Computer Science, or a related quantitative field.
3–6 years building and deploying ML models, including demonstrable experience in a manufacturing or process-industry environment (cement, steel, refining, chemicals, power, glass, mining, or similar).
Strong applied skills in time-series analysis, sensor/signal data, anomaly detection, regression and forecasting, with a solid statistics foundation.
Strong, idiomatic Python for data science (NumPy, pandas, SciPy, scikit-learn, statsmodels) with clean, tested, production-quality code; strong SQL.
Deep command of classical / traditional machine learning — regularised regression (Ridge, Lasso, ElasticNet), tree-based ensembles (Random Forest, Gradient Boosting — XGBoost / LightGBM / CatBoost), SVM, k-NN and Naive Bayes — with sound feature engineering, cross-validation and hyperparameter tuning.
Proven ability to wrangle messy industrial data and engineer features that work in production.
Comfortable on the plant floor — explaining models to engineers and operators and earning their trust.
Preferred (strong pluses):
Hands-on experience with Industrial IoT (IIoT) and Operational Technology (OT) data: plant historians (OSIsoft PI / AVEVA, Aspen IP.21), OPC-UA, SCADA / DCS, time-series databases.
Domain exposure to cement or heavy/process manufacturing (pyroprocessing, grinding, combustion, quality control).
Experience working with data from SAP (ERP — especially PM / PP / production & maintenance modules) and Salesforce (SFDC).
Familiarity with Advanced Process Control (APC) concepts and closed-loop deployment.
Deep learning for time series; physics-informed or hybrid (data + first-principles) modelling.
Technical skills:
Programming & engineering: idiomatic, production-quality Python — NumPy, pandas and SciPy for vectorised data work; clean, modular code with unit tests (pytest); OOP; virtual environments & packaging; Jupyter; Git. Strong SQL; PySpark for large datasets a plus.
Classical machine learning: hands-on depth across regularised regression, tree-based ensembles (Random Forest, XGBoost / LightGBM / CatBoost), SVM, k-NN and Naive Bayes; unsupervised methods — k-means, DBSCAN, hierarchical clustering and PCA / dimensionality reduction.
Statistical & modelling rigour: hypothesis testing, regression diagnostics, feature engineering & selection, cross-validation, hyperparameter tuning, class-imbalance handling, and disciplined error analysis.
Time-series & anomaly detection: classical methods (ARIMA / SARIMA, exponential smoothing, state-space models) and libraries (statsmodels, sktime, tsfresh, Prophet); anomaly detection (Isolation Forest, One-Class SVM).
Core libraries: scikit-learn, statsmodels, XGBoost / LightGBM, matplotlib / seaborn.
Platform & tooling: cloud / lakehouse (Azure, AWS or Databricks); plant historian & OT connectors; Git-based workflows.
Pay: ₹600,436.05 - ₹1,975,141.93 per year
Work Location: In person