Date of Award
Doctor of Philosophy
Electrical and Computer Engineering
Neural network based deep learning methods have achieved significant success in several real-world tasks, from machine translation to web recommendation, and they are also greatly improving computer vision and natural language processing. Compared with conventional machine learning techniques, neural network based deep learning does not require careful engineering or considerable domain expertise to design a feature extractor that transforms the raw data into a suitable internal representation. Its efficacy at learning multiple levels of representation and features enables this type of approach to process high-dimensional data. It integrates feature representation, learning, and recognition into a systematic framework, which allows learning to start at one level (i.e., beginning with the raw input) and end at a higher, slightly more abstract level. By simply stacking enough such transformations, very complex functions can be obtained. In general, high-level feature representations facilitate the discrimination of patterns and can additionally reduce the impact of irrelevant variations. However, previous studies indicate that deep composition of the networks causes the training error signals to vanish. To overcome this weakness, several techniques have been developed, for instance, dropout, stochastic gradient descent, and residual network structures. In this study, we incorporate latent information into different network structures (e.g., restricted Boltzmann machine, recursive neural networks, and long short-term memory). The conditional latent information reflects the high-dimensional correlations that exist in the data structure, and a typical network structure may not learn this kind of feature because of limitations in its initial design (i.e., the network size and the parameters).
Similar to residual nets, the conditional neural networks jointly learn global and local features, and the specifically designed network structure helps to incorporate the modulation derived from the probability distribution. The proposed models have been tested on a range of datasets: for instance, the conditional RBM has been applied to detect speech components, and a language model based gated RBM has been used to recognize speech-related EEG patterns. The conditional RNN has been tested on both general natural language modeling and medical notes prediction tasks. The results indicate that by introducing conditional branches into conventional network structures, the latent features can be learned both globally and locally.
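As an illustration of what a conditional branch can look like, the following is a minimal sketch of the hidden-unit activation of a conditional RBM in its common textbook form, where a conditioning vector shifts the hidden biases through an extra weight matrix. The abstract does not give the dissertation's exact formulation, so the function name `hidden_probabilities`, the matrix `U`, and all parameter values below are assumptions chosen for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(v, c, W, U, b):
    """Return p(h_j = 1 | v, c) for every hidden unit j.

    v: visible vector (length n_visible)
    c: conditioning vector (length n_cond); stands in for latent information
    W: visible-to-hidden weights, indexed W[i][j]
    U: condition-to-hidden weights, indexed U[k][j] (the conditional branch)
    b: hidden biases (length n_hidden)
    """
    probs = []
    for j in range(len(b)):
        # Contribution of the ordinary RBM path.
        main = sum(v[i] * W[i][j] for i in range(len(v)))
        # Contribution of the conditional branch, which shifts the bias.
        branch = sum(c[k] * U[k][j] for k in range(len(c)))
        probs.append(sigmoid(b[j] + main + branch))
    return probs


# Toy, made-up parameters (hypothetical, not taken from the dissertation).
W = [[0.5, -0.3], [0.2, 0.8], [-0.6, 0.1]]
U = [[1.0, 0.0], [0.0, -1.0]]
b = [0.0, 0.0]
v = [1, 0, 1]

no_condition = hidden_probabilities(v, [0, 0], W, U, b)
with_condition = hidden_probabilities(v, [1, 0], W, U, b)
```

In this toy setup, switching on the first conditioning unit raises the activation probability of the first hidden unit and leaves the second unchanged, because `U[0][1]` is zero. With an all-zero conditioning vector the model reduces to an ordinary RBM, which is one way to read the claim that the branch adds modulation on top of a conventional structure.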
This dissertation is only available for download to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of their local library or contact ProQuest's Dissertation Express service.