Data-efficient machine learning for decision-making in smart manufacturing

Mehta, Manan

Permalink

https://hdl.handle.net/2142/124153

Description

Title

Data-efficient machine learning for decision-making in smart manufacturing

Author(s)

Mehta, Manan

Issue Date

2024-03-15

Director of Research (if dissertation) or Advisor (if thesis)

Shao, Chenhui

Doctoral Committee Chair(s)

Shao, Chenhui

Committee Member(s)

Ferreira, Placid M
King, William P
Wang, Pingfeng

Department of Study

Mechanical Sci & Engineering

Discipline

Mechanical Engineering

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Smart manufacturing
machine learning
data-efficient learning
federated learning

Abstract

Recent advances in sensing, metrology, information, and communication technologies have promoted the evolution of data-driven manufacturing. Efficient utilization of 'big data' is crucial for enabling intelligent decision-making in areas such as quality management, process control, and machine health monitoring and prognostics. Although modern sensor and measurement technologies have enabled the acquisition of high volumes of spatial and temporal data, it is generally costly and resource-intensive for manufacturers to collect, store, label, and analyze these data. Additionally, directly applying vanilla machine learning algorithms to manufacturing data faces critical bottlenecks: these algorithms require large quantities of high-quality labeled training data, cannot deal with unbalanced, heterogeneous, and heteroscedastic data, and cannot adaptively improve models for cost-effective decision-making. This dissertation develops novel methodologies to enhance the data efficiency and learning performance of machine learning algorithms for a wide range of manufacturing applications such as surface metrology, machining, additive manufacturing, and rotating machinery. Specifically, advances are achieved in the context of two machine learning paradigms: multi-task learning (MTL) and federated learning (FL).

An adaptive sampling strategy is developed for MTL-based spatial modeling using Gaussian processes. The variance-based sampling strategy maximizes information gain from sequential measurements, thus improving data efficiency and cost-effectiveness through optimal information transfer across similar-but-not-identical manufacturing tasks. The effectiveness of this strategy is demonstrated using a surface shape prediction case study where it outperforms other state-of-the-art methods. A novel statistical framework is developed for multi-task response surface modeling with multi-resolution manufacturing data. This framework decomposes the response surface at each task into a task-specific trend and a residual local variability learned jointly across all tasks with a hierarchical Bayesian framework. The method is the first of its kind to account for multi-resolution data while learning multiple tasks together, thus enabling more accurate and robust modeling across an arbitrary number of design points, tasks, and data resolutions.

FL is demonstrated as a promising paradigm for manufacturers to train models collaboratively without directly sharing their sensitive data. FL can simultaneously alleviate two conflicting constraints, data availability and data privacy, which have hindered the widespread adoption of advanced machine learning methods in manufacturing. FL methods are developed for three manufacturing applications: fine-scale defect detection in laser powder bed fusion, fault classification in rotating machinery, and feature prediction and part qualification for 3D printed parts. In all three studies, FL performance is comparable to centralized learning that does not preserve privacy, and better than individual learning where manufacturers train their own models independently. A greedy agglomerative client clustering framework is developed to deal with the data heterogeneity issue in FL. This framework automatically identifies and clusters similar clients during FL training and has several advantages over current state-of-the-art clustered FL methods. Excellent quantitative and qualitative clustering results are demonstrated through extensive experiments on four machine learning datasets and an industrial fault classification dataset.
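
As a rough illustration of the variance-based sequential sampling idea mentioned in the abstract, the sketch below picks each new measurement location where a Gaussian process is currently most uncertain. It is not the dissertation's method: it uses a single-task GP on a 1-D domain with a fixed squared-exponential kernel, and the names `rbf_kernel`, `posterior_variance`, `select_next`, along with the length scale, noise level, and candidate grid, are all assumptions made for this example. With fixed hyperparameters the GP posterior variance depends only on where measurements were taken, so no measured values are needed here.

```python
import numpy as np

# Hypothetical kernel settings chosen only for this illustration.
LENGTH_SCALE = 0.2
SIGNAL_VAR = 1.0
NOISE_VAR = 1e-6


def rbf_kernel(a, b):
    """Squared-exponential kernel between two 1-D arrays of locations."""
    diff = a[:, None] - b[None, :]
    return SIGNAL_VAR * np.exp(-0.5 * (diff / LENGTH_SCALE) ** 2)


def posterior_variance(x_measured, x_candidates):
    """GP posterior variance at each candidate, given measured locations."""
    K = rbf_kernel(x_measured, x_measured) + NOISE_VAR * np.eye(len(x_measured))
    Ks = rbf_kernel(x_measured, x_candidates)
    # diag(Kss - Ks^T K^{-1} Ks), computed without forming the full matrix
    solved = np.linalg.solve(K, Ks)
    return SIGNAL_VAR - np.sum(Ks * solved, axis=0)


def select_next(x_measured, x_candidates):
    """Index of the candidate with the largest posterior variance."""
    return int(np.argmax(posterior_variance(x_measured, x_candidates)))


# Start with measurements at both ends of the domain, then add five more.
candidates = np.linspace(0.0, 1.0, 101)
measured = np.array([0.0, 1.0])
picked = []
for _ in range(5):
    idx = select_next(measured, candidates)
    picked.append(float(candidates[idx]))
    measured = np.append(measured, candidates[idx])

print("sequentially selected locations:", picked)
```

In a multi-task setting such as the one the abstract describes, the kernel would also encode similarity between tasks, so measurements from related tasks would reduce the variance at unmeasured locations in the target task; that extension is not shown here.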

Graduation Semester

2024-05

Type of Resource

Thesis

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois
