Date of Award
5-2015
Culminating Project Type
Thesis
Degree Name
Applied Statistics: M.S.
Department
Department of Mathematics and Statistics
College
College of Science and Engineering
First Advisor
Hui Xu
Second Advisor
David Robinson
Third Advisor
Richard Sundheim
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Abstract
Model selection is a challenging problem in high-dimensional statistical analysis, and many approaches have been proposed in recent years. In this thesis, we compare the performance of three penalized logistic regression approaches (Ridge, Lasso, and Elastic Net) and three information criteria (AIC, BIC, and EBIC) for a binary response variable in high-dimensional settings through an extensive simulation study. The models are built and selected on the training datasets, and their performance is evaluated by the area under the ROC curve (AUC) on the validation datasets. We also present comparison results on two real datasets (Arcene Data and University Retention Data). The performance differences among these approaches are discussed at the end.
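As a rough illustration of the quantities named in the abstract, the following minimal Python sketch computes AIC, BIC, and EBIC from a fitted model's log-likelihood, and AUC from validation labels and predicted scores. It is not the thesis's code. The function names, the argument names, and the EBIC form (BIC plus 2 * gamma * log C(p, k), with gamma a tuning constant chosen by the user) are assumptions made for this example; the abstract does not state which EBIC variant or gamma value was used.

```python
import math


def aic(log_lik, k):
    """AIC = 2k - 2 log L, where k is the number of estimated parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """BIC = k log(n) - 2 log L, where n is the number of observations."""
    return k * math.log(n) - 2 * log_lik


def ebic(log_lik, k, n, p, gamma):
    """EBIC = BIC + 2 * gamma * log C(p, k), with p candidate predictors.

    gamma is an assumed tuning constant, usually taken in [0, 1];
    gamma = 0 reduces EBIC to BIC.
    """
    # log of the binomial coefficient C(p, k), computed via log-gamma
    log_binom = (math.lgamma(p + 1) - math.lgamma(k + 1)
                 - math.lgamma(p - k + 1))
    return bic(log_lik, k, n) + 2 * gamma * log_binom


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))
```

Under these assumptions, candidate models would be fitted on the training data, the one with the smallest criterion value selected, and `auc` then applied to that model's predicted probabilities on the validation data. The pairwise AUC computation is quadratic in the sample size, which is fine for a sketch but slow for large validation sets.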
Recommended Citation
Li, Zhengyi, "High Dimensional Model Selection and Validation: A Comparison Study" (2015). Culminating Projects in Applied Statistics. 1.
https://repository.stcloudstate.edu/stat_etds/1