Date of Award
5-1-2020
Degree Name
Doctor of Philosophy
Department
Electrical and Computer Engineering
First Advisor
Gupta, Lalit
Abstract
This dissertation focuses on the development of three classes of brain-inspired machine learning classification models. The models attempt to emulate (a) multisensory integration, (b) context integration, and (c) visual information processing in the brain.

The multisensory integration models aim to enhance object classification through the integration of semantically congruent unimodal stimuli. Two multimodal classification models are introduced: the feature-integrating (FI) model and the decision-integrating (DI) model. The FI model, inspired by multisensory integration in the subcortical superior colliculus, combines unimodal features, which are subsequently classified by a multimodal classifier. The DI model, inspired by integration in primary cortical areas, classifies unimodal stimuli independently using unimodal classifiers and classifies the combined decisions using a multimodal classifier; a sketch contrasting the two schemes appears below. The multimodal classifier models are implemented using multilayer perceptrons and multivariate statistical classifiers. Experiments involving the classification of noisy and attenuated auditory and visual representations of the ten digits are designed to demonstrate the properties of the multimodal classifiers and to compare the performances of multimodal and unimodal classifiers. The experimental results show that the multimodal classification systems exhibit an important aspect of the "inverse effectiveness principle" by yielding significantly higher classification accuracies than the unimodal classifiers. Furthermore, the flexibility offered by the generalized models enables the simulation and evaluation of various combinations of multimodal stimuli and classifiers under varying uncertainty conditions.

The context-integrating model emulates the brain's ability to use contextual information to uniquely resolve the interpretation of ambiguous stimuli. A deep learning classification model that emulates this ability by integrating weighted bidirectional context into the classification process is introduced. The model, referred to as the CINET, is implemented using a convolutional neural network (CNN), which is shown to be well suited for combining target and context stimuli and for extracting coupled target-context features. The CINET parameters can be manipulated to simulate congruent and incongruent context environments and to vary target-context stimulus relationships. The formulation of the CINET is quite general; consequently, it is restricted neither to stimuli in any particular sensory modality nor to stimuli of any particular dimensionality. A broad range of experiments is designed to demonstrate the effectiveness of the CINET in resolving ambiguous visual stimuli and in improving the classification of non-ambiguous visual stimuli in various contextual environments. The fact that performance improves through the inclusion of context can be exploited to design robust brain-inspired machine learning algorithms. It is interesting to note that the CINET is inspired by a combination of the brain's ability to integrate contextual information and the CNN, which is itself inspired by the hierarchical processing of visual information in the visual cortex.
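As a rough illustration of the two fusion schemes described above (not the dissertation's implementation), the following sketch contrasts feature integration and decision integration using multilayer perceptrons; the feature dimensions, data, and layer sizes are assumptions.

```python
# Minimal sketch of the FI and DI multimodal fusion schemes using
# multilayer perceptrons. Shapes, names, and data are illustrative
# assumptions, not the dissertation's implementation.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 500
X_audio = rng.normal(size=(n, 64))   # unimodal auditory features (assumed)
X_visual = rng.normal(size=(n, 64))  # unimodal visual features (assumed)
y = rng.integers(0, 10, size=n)      # ten digit classes

# FI model: concatenate the unimodal feature vectors, then train a
# single multimodal classifier on the fused representation.
fi_clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)
fi_clf.fit(np.hstack([X_audio, X_visual]), y)

# DI model: classify each modality independently, then train a
# multimodal classifier on the combined unimodal decisions
# (here, the concatenated class-probability vectors).
audio_clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(X_audio, y)
visual_clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(X_visual, y)
decisions = np.hstack([audio_clf.predict_proba(X_audio),
                       visual_clf.predict_proba(X_visual)])
di_clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300).fit(decisions, y)
```

The same pattern accommodates the multivariate statistical classifiers the abstract mentions: only the per-modality and fusion-stage estimators change, not the FI/DI structure.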
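The abstract does not give the CINET's exact architecture; as a hedged sketch of the general idea, weighted left and right context stimuli can be concatenated with the target stimulus before a small convolutional classifier, so that the convolutional filters extract coupled target-context features. All weights, shapes, and layer sizes below are illustrative assumptions.

```python
# Hedged sketch of a CINET-style classifier: weighted bidirectional
# context is concatenated with the target stimulus and classified by
# a small CNN. All sizes and weights are illustrative assumptions.
import torch
import torch.nn as nn

class CINetSketch(nn.Module):
    def __init__(self, num_classes=10, w_left=0.5, w_right=0.5):
        super().__init__()
        self.w_left, self.w_right = w_left, w_right  # context weights
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, left_ctx, target, right_ctx):
        # Weight the context stimuli, then concatenate along the signal
        # axis so filters can learn coupled target-context features.
        x = torch.cat([self.w_left * left_ctx, target,
                       self.w_right * right_ctx], dim=-1)
        return self.net(x)

model = CINetSketch()
left = torch.randn(8, 1, 32)    # batch of left-context stimuli
target = torch.randn(8, 1, 32)  # ambiguous target stimuli
right = torch.randn(8, 1, 32)   # right-context stimuli
logits = model(left, target, right)  # shape: (8, 10)
```

Setting a context weight to zero removes that context entirely, and swapping in incongruent context batches is one way such a model could simulate the congruent and incongruent environments described above.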
A convolutional neural network model, inspired by the hierarchical processing of visual information in the brain, is introduced to fuse information from an ensemble of multi-axial sensors in order to classify strikes, such as boxing punches and taekwondo kicks, in combat sports. Although CNNs are not an obvious choice for non-array data or for signals with nonlinear variations, it is shown that CNN models can effectively classify multi-axial, multi-sensor signals. Experiments involving the classification of three-axis accelerometer and three-axis gyroscope signals measuring boxing punches and taekwondo kicks showed that the performance of the fusion classifiers was significantly superior to that of the uni-axial classifiers. Notably, the classification accuracies of the CNN fusion classifiers were significantly higher than those of the dynamic time warping (DTW) fusion classifiers. Through training with representative signals and through their local feature extraction property, the CNNs tend to be invariant to latency shifts and nonlinear variations. Moreover, by increasing the number of network layers and the size of the training set, the CNN classifiers offer the potential for even better performance as well as the ability to handle a larger number of classes. Finally, due to their generalized formulations, the classifier models can be easily adapted to classify multi-dimensional signals from multiple sensors in various other applications.
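As a hedged sketch (not the dissertation's network), a fusion classifier of this kind can treat the three accelerometer axes and three gyroscope axes as six input channels of a one-dimensional CNN; the channel count is taken from the sensors described above, while the window length, layer sizes, and number of strike classes are assumptions.

```python
# Hedged sketch of a multi-axial fusion CNN: the three accelerometer
# and three gyroscope axes enter as six channels of a 1-D CNN.
# Window length, layer sizes, and class count are assumptions.
import torch
import torch.nn as nn

fusion_cnn = nn.Sequential(
    nn.Conv1d(6, 32, kernel_size=7, padding=3), nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, 4),  # four strike classes (assumed)
)

batch = torch.randn(16, 6, 200)  # 16 strikes, 6 axes, 200 samples each
scores = fusion_cnn(batch)       # shape: (16, 4)
```

A uni-axial classifier would see only a single channel; because the fusion network's convolutional filters respond to local signal shapes wherever they occur in the window, it also exhibits the tolerance to latency shifts attributed above to local feature extraction.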
Access
This dissertation is only available for download to the SIUC community. Current SIUC affiliates may also access this dissertation off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of their local library or ProQuest's Dissertation Express service.