Hochschulübergreifendes Promotionszentrum Angewandte Informatik (PZAI)
Currently, process control in automation technology is mostly regulated by fixed process parameters, chosen as a compromise across several identically constructed systems, or by plant operators who are often guided by intuition based on decades of experience. Due to societal developments such as academization or an increased desire for self-actualization, some operators are unable to pass this knowledge on to the next generation. In contrast, the vision of Smart Factories includes intelligent machining processes that should ultimately lead to self-optimization and adaptation to uncontrollable variables. To consistently implement this vision of self-optimizing machines, a defined quality criterion must be monitored automatically and act as feedback for continual, autonomous and safe optimization. The term safe refers to compliance with process quality standards, which must be maintained at all times. In a very conservative field such as automation technology, no risks whatsoever may be taken through random experiments for data generation during production, since, for example, an unscheduled downtime leads to serious financial losses. Furthermore, machine-driven decisions may at no time pose a threat. Thus, decisions under uncertainty may only be taken where the amount of uncertainty can be considered uncritical. Additionally, industrial applications require guaranteed real-time reaction capability to ensure that actions can be taken in time whenever needed. Since economic aspects are often decisive in industry, necessary experiments under laboratory conditions should be avoided as far as possible, and the effort required for integration into a field application should be kept as low as possible.
The aim of this work is the scientific investigation of the integration of learning feedback for intelligent decision making in the control of industrial processes. A successful integration enables data-driven process optimization. To move closer to the vision of self-optimizing machines, safe optimization methods for industrial applications at the process level are investigated and developed, with particular attention to the given restrictions of the automation industry. This work addresses several fields, including technical, algorithmic and conceptual aspects. The algorithmic refinements are essential for enabling a wider use of safe optimization in industrial applications: they allow, for example, the automatic handling of the majority of hyper-parameters and the solution of complex problems through increased computational efficiency. Furthermore, the trade-off between exploration and exploitation of safe optimization in high-dimensional spaces is improved. To account for changing states perceived via sensor data, contextual Bayesian optimization is modified so that safety requirements are met and real-time capability is satisfied. A software application for industrial safe optimization is implemented within a real-time capable control system so that it can interact with other software modules to reach intelligent decisions. Further contributions cover recommendations regarding technical requirements, with a focus on edge control devices, and the conceptual inclusion of machine learning in industrial process control.
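As an illustration of the underlying principle, the following minimal sketch performs safe Bayesian optimization with a Gaussian process surrogate: starting from a known safe operating point, only candidates whose lower confidence bound satisfies a safety threshold are ever evaluated, and among them the most promising one is chosen. The toy objective, the threshold and all parameter values are assumptions made for illustration and do not reproduce the methods developed in this work.

```python
# Minimal sketch of safe Bayesian optimization with a Gaussian process surrogate.
# The toy objective, the safety threshold and all parameter values are
# illustrative assumptions; they do not reproduce the methods of this work.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):
    # Stand-in for a process quality criterion measured on the plant.
    return float(np.exp(-(x - 0.6) ** 2 / 0.05))

SAFETY_THRESHOLD = 0.2   # quality must never drop below this value
BETA = 2.0               # confidence scaling for the GP bounds

# Safe optimization requires a known safe starting point.
X = np.array([[0.5]])
y = np.array([objective(0.5)])

candidates = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1), alpha=1e-4)

for _ in range(15):
    gp.fit(X, y)
    mean, std = gp.predict(candidates, return_std=True)
    lower, upper = mean - BETA * std, mean + BETA * std

    # Safe set: candidates whose lower confidence bound satisfies the constraint.
    safe = lower >= SAFETY_THRESHOLD
    if not safe.any():
        break  # never leave the provably safe region

    # Among the safe candidates, evaluate the most promising one (highest upper bound).
    idx = np.flatnonzero(safe)[np.argmax(upper[safe])]
    x_next = float(candidates[idx, 0])
    X = np.vstack([X, [[x_next]]])
    y = np.append(y, objective(x_next))

print("best safe parameter:", X[np.argmax(y), 0], "quality:", y.max())
```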
To emphasize the application relevance and feasibility of the presented concepts, real-world lighthouse projects are realized in the course of this work, intended to reduce skepticism and thus initiate the breakthrough of self-optimizing machines.
Abstract knowledge is deeply embedded in many computer-based applications. An important research area of Artificial Intelligence (AI) deals with the automatic derivation of knowledge from data; machine learning provides the corresponding algorithms. One line of research focuses on the development of biologically inspired learning algorithms. The respective machine learning methods are based on neurological concepts so that they can systematically derive knowledge from data and store it. One class of machine learning algorithms that can be categorized as "deep learning" models is Deep Neural Networks (DNNs). DNNs consist of multiple artificial neurons arranged in layers and are trained using the backpropagation algorithm. These deep learning methods exhibit remarkable capabilities for inferring and storing complex knowledge from high-dimensional data.
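To make the layered structure and the role of backpropagation concrete, the following minimal sketch trains a small feed-forward network on the XOR toy problem: the forward pass evaluates the layers in sequence, and the backward pass propagates the error gradient back through them. Architecture, learning rate and iteration count are illustrative assumptions only.

```python
# Minimal sketch of a feed-forward network trained with backpropagation on the
# XOR toy problem; architecture and hyper-parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two weight layers: 2 inputs -> 8 hidden units -> 1 output.
W1, b1 = rng.normal(0.0, 1.0, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0.0, 1.0, (8, 1)), np.zeros(1)

lr = 1.0
for _ in range(5000):
    # Forward pass through the layered network.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the error gradient layer by layer.
    dy = (y - t) * y * (1 - y)
    dW2, db2 = h.T @ dy, dy.sum(axis=0)
    dh = dy @ W2.T * h * (1 - h)
    dW1, db1 = X.T @ dh, dh.sum(axis=0)

    # Gradient-descent update of all weights.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(np.round(y, 2).ravel())  # should approach [0, 1, 1, 0]
```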
However, DNNs are affected by a problem that prevents new knowledge from being added to an existing base. The ability to continuously accumulate knowledge is an important factor that contributed to evolution and is therefore a prerequisite for the development of strong AI. The so-called "catastrophic forgetting" (CF) effect causes DNNs to lose already acquired knowledge after only a few training iterations on a new data distribution. Only an energetically expensive retraining on the joint data distribution of past and new data enables the abstraction of the entire body of knowledge. To counteract this effect, various techniques have been and are still being developed with the goal of mitigating or even solving the CF problem. The published CF avoidance studies usually suggest that their approaches are effective for a variety of continual learning tasks.
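The effect can be illustrated with a deliberately simple experiment: a classifier is trained on one task and then trained further on a second, shifted data distribution without any rehearsal, after which its accuracy on the first task drops toward chance level. The data, the linear model and the task split in the sketch below are illustrative assumptions, not the evaluation protocol of this dissertation.

```python
# Minimal sketch illustrating catastrophic forgetting under sequential training;
# the data, the linear model and the task split are illustrative assumptions.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_task(center_a, center_b, n=500):
    # Two Gaussian blobs per task, labelled 0 and 1.
    X = np.vstack([rng.normal(center_a, 0.5, (n, 2)),
                   rng.normal(center_b, 0.5, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return X, y

X1, y1 = make_task([-3, 0], [-1, 0])   # task 1
X2, y2 = make_task([1, 0], [3, 0])     # task 2: shifted data distribution

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(X1, y1, classes=[0, 1])
for _ in range(5):
    clf.partial_fit(X1, y1)
print("task 1 accuracy after task 1:", clf.score(X1, y1))

# Continue training on the new distribution only, without rehearsal of task 1.
for _ in range(20):
    clf.partial_fit(X2, y2)
print("task 1 accuracy after task 2:", clf.score(X1, y1))  # drops toward chance
```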
This dissertation is set in the context of continual machine learning with deep learning methods. The first part deals with the development of an application-oriented, real-world evaluation protocol which can be used to investigate different machine learning models with regard to the suppression of the CF effect. In the second part, a comprehensive study indicates that, under the application-oriented requirements, none of the investigated models achieves satisfactory continual learning results. In the third part, a novel deep learning model is presented, referred to as Deep Convolutional Gaussian Mixture Models (DCGMMs). DCGMMs build upon the unsupervised approach of Gaussian Mixture Models (GMMs). GMMs cannot be considered a deep learning method, and they have to be initialized in a data-driven manner before training; these aspects limit the use of GMMs in continual learning scenarios.
The training procedure proposed in this work enables GMMs to be trained with Stochastic Gradient Descent (SGD), as applied to DNNs. The integrated annealing scheme removes the need for the data-driven initialization that has so far been a prerequisite for GMM training. It is shown experimentally that the novel training method achieves results equivalent to conventional methods without inheriting their disadvantages. Another innovation is the arrangement of GMMs in the form of layers, similar to DNNs. The transformation of GMMs into layers enables the combination with existing layer types and thus the construction of deep architectures, which can derive more complex knowledge with fewer resources.
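The following sketch conveys the basic idea of fitting a Gaussian mixture by gradient descent on the negative log-likelihood instead of classical EM with a data-driven initialization; PyTorch autograd with plain SGD serves as a stand-in. The annealing scheme and the layered DCGMM architecture proposed in this work are not reproduced, and component count, data and learning rate are illustrative assumptions.

```python
# Minimal sketch: fit a diagonal GMM by gradient descent on the negative
# log-likelihood, using PyTorch autograd with plain SGD as a stand-in.
# The annealing scheme and the layered DCGMM architecture are not reproduced.
import math
import torch

K, D = 5, 2
centers = torch.tensor([[-2., 0.], [0., 2.], [2., 0.], [0., -2.], [0., 0.]])
data = torch.randn(1000, D) * 0.3 + torch.repeat_interleave(centers, 200, dim=0)

# Trainable parameters: mixture logits, means and log-variances.
logits = torch.zeros(K, requires_grad=True)
mu = torch.randn(K, D, requires_grad=True)      # random init, no k-means required
log_var = torch.zeros(K, D, requires_grad=True)

opt = torch.optim.SGD([logits, mu, log_var], lr=0.05)
log2pi = math.log(2 * math.pi)

for step in range(2000):
    opt.zero_grad()
    var = log_var.exp()
    diff = data.unsqueeze(1) - mu               # shape (N, K, D)
    # log N(x | mu_k, diag(var_k)) for every sample and component
    log_prob = -0.5 * ((diff ** 2 / var).sum(-1) + log_var.sum(-1) + D * log2pi)
    log_mix = torch.log_softmax(logits, dim=0)
    nll = -torch.logsumexp(log_mix + log_prob, dim=1).mean()
    nll.backward()
    opt.step()

print("final NLL:", float(nll))
print("learned means:\n", mu.detach())
```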
In the final part of this work, the DCGMM model is examined with regard to its continual learning capabilities. In this context, a replay approach referred to as Gaussian Mixture Replay (GMR) is introduced. GMR describes the generation and replay of data samples by utilizing the DCGMM functionalities. Comparisons with existing CF avoidance models show that similar continual learning results can be achieved with GMR under application-oriented conditions. Overall, the presented work implies that the identified application-oriented requirements remain an open issue for "applied" continual learning research. In addition, the novel deep learning model provides an interesting starting point for many other research areas.
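As a simplified illustration of the replay principle, the following sketch fits one Gaussian mixture per class of an earlier task and uses it as a generator for labelled pseudo-samples, which are mixed into the training data of a new task so that the old classes are not forgotten. The DCGMM model itself is not reproduced here; classes, component counts and data sets are illustrative assumptions.

```python
# Minimal sketch of the replay principle behind Gaussian Mixture Replay: one
# Gaussian mixture per class of the old task acts as a generator for labelled
# pseudo-samples, which are mixed into the training data of the new task.
# The DCGMM model itself is not reproduced; all choices are illustrative.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Task 1: classes 0 and 1, task 2: classes 2 and 3 (toy Gaussian blobs).
X1 = np.vstack([rng.normal([-2, -2], 0.5, (300, 2)), rng.normal([-2, 2], 0.5, (300, 2))])
y1 = np.array([0] * 300 + [1] * 300)
X2 = np.vstack([rng.normal([2, -2], 0.5, (300, 2)), rng.normal([2, 2], 0.5, (300, 2))])
y2 = np.array([2] * 300 + [3] * 300)

# Fit one generator per class of the old task.
generators = {c: GaussianMixture(n_components=2, random_state=0).fit(X1[y1 == c])
              for c in np.unique(y1)}

# Generate labelled replay samples instead of storing the original task-1 data.
X_replay = np.vstack([g.sample(300)[0] for g in generators.values()])
y_replay = np.concatenate([[c] * 300 for c in generators])

# Train on the new task mixed with the replayed pseudo-samples.
X_mix = np.vstack([X2, X_replay])
y_mix = np.concatenate([y2, y_replay])
clf = SGDClassifier(loss="log_loss", random_state=0).fit(X_mix, y_mix)
print("accuracy on original task-1 data:", clf.score(X1, y1))
```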