Culminating Project Title
Date of Award
Culminating Project Type
Applied Statistics: M.S.
Department of Mathematics and Statistics
College of Science and Engineering
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Model selection is a challenging problem in high-dimensional statistical analysis, and many approaches have been proposed in recent years. In this thesis, we compare the performance of three penalized logistic regression approaches (Ridge, Lasso, and Elastic Net) and three information criteria (AIC, BIC, and EBIC) for a binary response variable in the high-dimensional setting through an extensive simulation study. The models are built and selected on the training datasets, and their performance is evaluated by AUC on the validation datasets. We also present the comparison results on two real datasets (Arcene Data and University Retention Data). The performance differences among these approaches are discussed at the end.
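For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of the three information criteria and of AUC, not the code used in the thesis. The function names are hypothetical. The EBIC form shown (BIC plus 2γ log C(p, k)) is the commonly cited Chen and Chen definition and is assumed here; the default gamma=0.5 is an arbitrary illustrative choice. AUC is computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
import math


def aic(log_lik, k):
    """AIC = -2 log L + 2k, where k is the number of fitted parameters."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """BIC = -2 log L + k log n, where n is the sample size."""
    return -2.0 * log_lik + k * math.log(n)


def ebic(log_lik, k, n, p, gamma=0.5):
    """EBIC = BIC + 2 * gamma * log C(p, k), with p candidate predictors.

    Assumed form and default gamma; gamma = 0 reduces EBIC to BIC.
    """
    return bic(log_lik, k, n) + 2.0 * gamma * math.log(math.comb(p, k))


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count winning pairs; a tie contributes one half.
    wins = sum(
        1.0 if sp > sn else 0.5 if sp == sn else 0.0
        for sp in pos
        for sn in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from the thesis.
    print(aic(-10.0, 3))
    print(bic(-10.0, 3, 100))
    print(ebic(-10.0, 3, 100, 1000))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under each criterion a lower value indicates a preferred model, whereas a higher AUC on the validation data indicates better discrimination. The extra EBIC term grows with the number of candidate predictors p, so it penalizes model size more heavily than BIC when p is large relative to n.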
Li, Zhengyi, "High Dimensional Model Selection and Validation: A Comparison Study" (2015). Culminating Projects in Applied Statistics. 1.