Date of Award
Master of Science
In August 2010, Facebook entered the self-reported positioning world by offering a check-in service to its users. This service lets users share their physical location using the GPS receiver in a mobile device such as a smartphone, tablet, or smartwatch. As social networks have grown in popularity, large datasets of recorded check-ins have accumulated. Analyzing these datasets reveals valuable information and patterns in users’ check-in behavior as well as in the check-in histories of places. The results can be applied in several areas, including business planning and financial decisions, for instance to provide location-based deals. In this thesis, we leverage a novel data mining methodology to learn from big check-in data and predict the next check-in place based only on the places’ histories, with no reference to individual users. To this end, we study a large Facebook check-in dataset. The location coordinates in this dataset are highly noisy because the check-ins were collected from multiple sources, namely users’ mobile devices. The research question is how a noise impact reduction technique can be leveraged to enhance the performance of the prediction model. We design our own noise handling mechanism to deal with feature noise. The predictive model is generated by the Random Forest classification algorithm in a shared-memory parallel environment. We show how the performance of the predictors improves when the impact of noise is minimized. The solution is a preprocessing approach for cleansing feature noise that is implemented in R and runs fast on big check-in datasets.
This thesis is only available for download to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of their local library or ProQuest's Dissertation Express service.