Doctor of Philosophy
This dissertation examines several facets of the analysis of complex networks. In the first part, we study the resilience of networks by examining various attempts to quantify it. Among the measures studied are vertex attack tolerance (VAT), integrity, tenacity, toughness, and scattering number. We show empirically that, although these measures are NP-hard to compute, they can be approximated reasonably well by a novel heuristic called Greedy-BC, which relies on the graph-theoretic measure betweenness centrality. After verifying the accuracy of Greedy-BC, we test it on several well-known classes of networks: Barabasi-Albert networks, HOTNets, and PLODs. The experiments indicate that random-degree PLOD networks have the highest resilience, perhaps because of their random nature. The second part concerns clustering. We use the resilience measures and the Greedy-BC heuristic from the first part to partition graphs. We conduct extensive experiments with a generalized algorithm, NBR-Clust, using all of the resilience measures discussed, on a wide variety of real-life and synthetically generated networks. A parametrized resilience measure, beta-VAT, is used to detect noise, or outliers, in noisy data. The results are extended to another facet of network analysis: cluster overlap. Attack sets of NBR-Clust are found to contain overlaps with high probability, and an algorithm is developed to identify them. One remaining problem with NBR-Clust is time complexity. The usefulness of the method is limited by the running time of Greedy-BC, and in particular by the cost of computing betweenness centrality. In an extensive series of experiments, we test several methods for approximating and speeding up betweenness centrality calculations, and reduce the time to cluster a 10,000-node graph from approximately 2 days with the original method of calculation to a few minutes.
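The greedy, betweenness-driven idea described above can be illustrated with a minimal sketch. This is not the dissertation's Greedy-BC implementation, whose details the abstract does not give. It assumes a simple scheme: repeatedly remove the vertex with the highest betweenness centrality, score each growing attack set S with the VAT ratio |S| / (|V - S - C_max(V - S)| + 1) as that measure is usually stated in the literature, and keep the best-scoring set. The function names `betweenness`, `largest_component_size`, and `greedy_attack` are hypothetical, and the graph is assumed to be undirected, unweighted, and given as a dict of neighbor sets with sortable node labels.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted graph (adj: node -> set of neighbors).

    Each undirected pair is counted twice; only the ranking is used below.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency accumulation
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        queue, size = deque([s]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def greedy_attack(adj):
    """Assumed greedy scheme: remove the top-betweenness vertex each round and
    return the attack set with the lowest VAT-style ratio, plus that ratio."""
    n = len(adj)
    work = {v: set(nb) for v, nb in adj.items()}   # copy; the input is not modified
    removed, best_set, best_score = [], [], float("inf")
    for _ in range(n - 1):
        bc = betweenness(work)
        target = max(sorted(work), key=lambda v: bc[v])  # ties: smallest label
        for nb in work.pop(target):
            work[nb].discard(target)
        removed.append(target)
        c_max = largest_component_size(work)
        score = len(removed) / (n - len(removed) - c_max + 1)
        if score < best_score:
            best_score, best_set = score, list(removed)
    return best_set, best_score


# Two triangles joined through vertex 3, which is the natural point of attack.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4},
         4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(greedy_attack(graph))  # ([3], 0.25)
```

Recomputing betweenness centrality after every removal dominates the running time of this sketch, which is consistent with the bottleneck the abstract identifies. The two components left after removing the attack set also suggest how such a set could serve as a partition boundary for clustering.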
In another exploration of the algorithmic aspects of resilience, we attempt to generalize some of these results to hypergraphs. We find that resilience measures such as VAT and algorithms such as Greedy-BC transfer well to a hypergraph representation. The final part of the dissertation reviews applications of the new clustering method. First, NBR-Clust is used to cluster data on autism spectrum disorders. Because classifications of these disorders are vague and the data are noisy, the clustering properties of NBR-Clust are useful. We hope the results will lead to a better understanding of how to classify autism spectrum disorders. Second, we use NBR-Clust to examine gene assay data in the hope of identifying genes that confer resistance to powdery mildew disease in certain species of grapevines.