TY - JOUR
T1 - Integrating Data Transformation in Principal Components Analysis
AU - Maadooliat, Mehdi
AU - Huang, Jianhua Z.
AU - Hu, Jianhua
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): KUS-CI-016-04
Acknowledgements: We thank an associate editor and two anonymous referees for their constructive and thoughtful comments that helped us tremendously in revising the manuscript. Maadooliat and Hu were partially supported by the National Science Foundation (grant DMS-0706818), the National Institutes of Health (grants R01GM080503-01A1, R21CA129671), and the National Cancer Institute (grant CA97007). Huang was partially supported by the National Science Foundation (grants DMS-0606580, DMS-0907170). Huang and Maadooliat were partially supported by King Abdullah University of Science and Technology (grant KUS-CI-016-04).
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
PY - 2015/3/31
Y1 - 2015/3/31
N2 - Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples.
UR - http://hdl.handle.net/10754/598638
UR - http://www.tandfonline.com/doi/full/10.1080/10618600.2014.891461
UR - http://www.scopus.com/inward/record.url?scp=84926214200&partnerID=8YFLogxK
U2 - 10.1080/10618600.2014.891461
DO - 10.1080/10618600.2014.891461
M3 - Article
C2 - 25914514
SN - 1061-8600
VL - 24
SP - 84
EP - 103
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 1
ER -