Using Proper Transformations to Improve Precision
In statistical inference, the efficiency of an inferential procedure depends on the assumptions made about the distributions from which the data are generated. To achieve efficiency, transformations are often applied to the original data so that the transformed data follow distributions that are easier to work with. This dissertation comprises three research projects on using proper semi-parametric and parametric transformations to improve statistical inference, such as hypothesis testing and parameter estimation.

First, we proposed a marginal rank-based inverse normal transformation approach that normalizes the marginal distributions of the data before multivariate test procedures such as Hotelling's T^2 test and Bonferroni's correction are applied. Extensive simulations were conducted to demonstrate that the proposed approach adequately controls the type I error rate and increases the power of the test, particularly for data from asymmetric or heavy-tailed distributions.

Second, we considered parametric power transformations (such as the Box-Cox transformation) to improve the accuracy and efficiency of estimating the Receiver Operating Characteristic (ROC) curve and the area under the curve (AUC), under the assumption that the data from the case and control groups can be transformed to normality only via heterogeneous transformations. We showed that existing methods based on a common Box-Cox transformation are invalid, in that they possess considerable asymptotic biases when the transformations of the two groups actually differ. We then proposed a maximum likelihood estimator of the underlying ROC curve and its AUC and investigated its asymptotic performance relative to the nonparametric estimator that ignores the heterogeneous nature of the transformation. Along the way, we derived the asymptotic bias and variance of each AUC estimator under consideration.

Third, the Box-Cox transformation was applied to the notorious inference problem concerning the distribution of the ratio of two correlated variables, with an extension to moment estimation. We derived the exact density and moment functions for the ratio of two variables that can be transformed to correlated normal variables via Box-Cox transformations with homogeneous or heterogeneous parameters, and graphically explored how the shape of the density is affected by changes in the parameters. Furthermore, maximum likelihood estimators (MLEs) of the ratio mean under homogeneous and heterogeneous Box-Cox transformations were proposed and compared with the commonly used simple sample mean in terms of asymptotic bias and efficiency. Our results show that the bias of the estimator based on the homogeneous transformation assumption is unavoidable when the transformation parameters of the numerator and denominator of the ratio actually differ considerably, and that the simple sample mean loses efficiency relative to our proposed MLE when the transformation parameters are close to 0.

The methods are illustrated with data from a dietary intervention clinical trial for type I diabetic children and a case-control study of maternal folate metabolism biomarkers and infants born with neural tube defects.
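
As a rough illustration of the marginal rank-based inverse normal transformation used in the first project, the sketch below converts each variable separately to normal scores. The abstract does not state which rank offset the dissertation uses, so the Blom offset c = 3/8 is an assumption here, and the function names (`average_ranks`, `rank_inverse_normal`, `marginal_int`) and the toy data are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_inverse_normal(values, c=3 / 8):
    """Map one variable to normal scores via Phi^{-1}((rank - c) / (n - 2c + 1)).

    The offset c = 3/8 (Blom) is an assumed choice, not taken from the source.
    """
    n = len(values)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in average_ranks(values)]


def marginal_int(rows, c=3 / 8):
    """Apply the transformation to each column (margin) of a list of rows."""
    columns = list(zip(*rows))
    transformed = [rank_inverse_normal(list(col), c) for col in columns]
    return [list(row) for row in zip(*transformed)]


# Made-up toy data: two right-skewed outcomes measured on five subjects.
data = [[1.2, 40.0], [0.4, 3.5], [9.8, 7.1], [2.5, 120.0], [0.9, 5.0]]
scores = marginal_int(data)
```

Each column of `scores` keeps the ordering of the original variable but has approximately standard normal margins. The transformed columns could then be passed to a multivariate procedure such as Hotelling's T^2 test, which is not implemented in this sketch.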