Randomized response (RR) methods have long been suggested for protecting respondents' privacy in statistical surveys. However, how to set and achieve privacy protection goals has received little attention. We give a full development and analysis of the view that a privacy mechanism should ensure...
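For readers unfamiliar with RR, the following is a minimal sketch of one classic design, in which each respondent reports the true status with a known probability and its negation otherwise. The abstract does not say which RR design it analyzes, so the design choice, the function names, and the simulated parameters (`p`, `true_pi`, `n`) are all assumptions made for illustration.

```python
import random


def warner_response(truth: bool, p: float, rng: random.Random) -> bool:
    """Report the true status with probability p, otherwise its negation."""
    return truth if rng.random() < p else not truth


def warner_estimate(responses, p: float) -> float:
    """Estimate the population proportion from randomized responses.

    Uses P(yes) = p * pi + (1 - p) * (1 - pi), solved for pi.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the proportion unidentifiable")
    lam = sum(responses) / len(responses)  # observed share of "yes"
    return (lam - (1 - p)) / (2 * p - 1)


# Hypothetical simulation: 30% of respondents hold the sensitive trait.
rng = random.Random(0)
true_pi, p, n = 0.3, 0.75, 20000
truths = [rng.random() < true_pi for _ in range(n)]
responses = [warner_response(t, p, rng) for t in truths]
estimate = warner_estimate(responses, p)
```

Values of `p` closer to 0.5 hide each individual answer better but inflate the variance of the estimate, which is the kind of trade-off that setting a privacy protection goal has to address.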
We propose an automatic shape-constrained non-parametric estimation methodology in least squares and quantile regression, where the regression function and its shape are simultaneously estimated and identified. We build the estimation based on the quadratic B-spline expansion with penalization...
When constructing a statistical model, nonlinearity detection has always been an interesting topic and a difficult problem. To balance precision of parametric modeling and robustness of nonparametric modeling, the semi-parametric modeling method has shown very good performance. The specific...
In statistical inference, efficiency of an inferential procedure depends on the underlying assumptions made on the distributions from which the data are generated. To achieve efficiency, transformations are often applied to the original data so that the transformed data would follow certain...
Covariate balance across treatment groups is desired for comparative studies because it reduces bias and improves the precision of measuring the treatment effect. However, covariate imbalance often exists in traditional randomized experiments. Recently, there are more and more studies that...
We propose novel screening methods targeting genotype interactions associated with disease risks. The proposed method extends the maximum entropy conditional probability models (Miller et al., 2009) to survival outcomes over time, which are the most common form of disease occurrences in...
The National Science Foundation (NSF) Survey of Doctorate Recipients (SDR) collects information on a sample of individuals in the United States with PhD degrees. A significant portion of the sampled individuals appear in multiple survey years and can be linked across time. Survey weights in each...
Because of non-negativity or detection limits, data with fixed censored responses are common in econometrics and biometrics studies. When the response variable is fixed censored, to explore the relationship between the response variable and predictor covariates, several estimation methods have been...
The intraclass correlation coefficient (ICC) has a lengthy history of application in several different fields of research. It is a widely recognized index of reliability among measurements. Two-stage group sequential designs are proposed to lead to savings in sample size, time and cost when...
The receiver operating characteristic (ROC) curve is a very useful tool for describing and comparing the diagnostic accuracy of biomarkers when the binary-scale gold standard is available. There are, however, many examples of diagnostic tests whose gold standards are continuous. Hence, several...
The thesis is concerned with the nonparametric estimation of the conditional distribution function with longitudinal data. Nonparametric estimation and inferences of conditional distribution functions with longitudinal data have important applications in biomedical studies, such as...
We represent the d-dimensional random vectors as vertices of a complete weighted graph and propose depth functions that are applicable to distributions in d-dimensional spaces and data on graphs. We explore the proximity graphs, examine their connection to existing depth functions, define a...
In the areas of missing data and causal inference, there is great interest in doubly robust (DR) estimators that involve both an outcome regression (OR) model and a propensity score (PS) model. These DR estimators are consistent and asymptotically normal if either model is correctly specified....
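The double robustness property can be illustrated with a small sketch of the augmented inverse-probability-weighted (AIPW) estimator of an average treatment effect, one common DR estimator. The abstract does not say which DR estimators it studies, so the choice of AIPW, the function name `aipw_ate`, and the simulated data are assumptions. For simplicity, the "correct" models below are the true functions rather than fitted ones.

```python
import math
import random


def aipw_ate(y, t, e_hat, m1_hat, m0_hat):
    """AIPW estimate of the average treatment effect.

    y: outcomes; t: 0/1 treatment indicators;
    e_hat: propensity scores P(T=1 | X);
    m1_hat, m0_hat: outcome predictions under treatment and control.
    """
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, m1_hat, m0_hat):
        mu1 = m1 + ti * (yi - m1) / ei              # OR prediction + weighted residual
        mu0 = m0 + (1 - ti) * (yi - m0) / (1 - ei)
        total += mu1 - mu0
    return total / len(y)


# Hypothetical simulation with a true treatment effect of 2.
rng = random.Random(1)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
e_true = [1 / (1 + math.exp(-xi)) for xi in x]
t = [1 if rng.random() < ei else 0 for ei in e_true]
y = [2 * ti + xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

# Case A: PS model correct, OR model wrong (predicts 0 everywhere).
est_ps_only = aipw_ate(y, t, e_true, [0.0] * n, [0.0] * n)

# Case B: OR model correct, PS model wrong (constant 0.5).
est_or_only = aipw_ate(y, t, [0.5] * n, [2 + xi for xi in x], x)
```

In this simulation both estimates should land close to the true effect of 2, even though each case misspecifies one of the two models.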
A triangle statistic is proposed for testing the equality of two multivariate continuous distribution functions (DFs) in high-dimensional settings based on sample interpoint distances. Given two independent d-dimensional random samples, a triangle can be formed by randomly selecting one...
Statistical methods are widely used in equal employment cases to analyze data submitted as evidence of discrimination. In this dissertation, statistical procedures are developed to resolve issues arising in actual discrimination cases. The first problem arose in the Alexander v. Milwaukee case,...