Balancing A Large Number of Covariates via Covariate-Adaptive Randomization
Covariate balance across treatment groups is desirable in comparative studies because it reduces bias and improves the precision of the estimated treatment effect. However, covariate imbalance often arises in traditional randomized experiments. A growing number of studies involve a large number of covariates, which makes the problem more severe. We therefore propose three new covariate-adaptive randomizations that balance a large number of covariates in different scenarios. Our designs have substantial advantages over traditional randomizations in covariate balance, computational efficiency, and accuracy of the treatment effect estimate, especially when the sample size or the number of covariates is large.

In the literature, Qin, Li and Hu proposed a covariate-adaptive randomization (CAM) for balancing covariates. This design provides good balance for a large number of covariates and is more computationally efficient than traditional methods. However, it balances all covariates equally. When covariates vary in a priori importance, the more important covariates should be balanced more tightly. Rerandomization to balance tiers of covariates (RRT), proposed by Morgan and Rubin, balances covariates unequally according to their importance. In the spirit of these two designs, we propose a new covariate-adaptive randomization that allows unequal balance among covariates. As the number of covariates increases, it is more computationally efficient than CAM and RRT. In addition, the treatment effect estimate under the proposed method attains its minimum variance asymptotically. A simulation study provides further evidence of the advantages of the proposed design.

In addition, we extend CAM and the proposed design to multiple treatment groups. The extended randomizations share similar properties with the original two-group designs: covariate imbalance between each pair of treatment groups decreases as the sample size increases. By improving covariate balance, the treatment effect estimate under both extended designs is more precise than under traditional designs and attains its optimal precision asymptotically. We also conduct a simulation study to compare these two designs.

In practice, covariates often contain missing values. We therefore propose four imputation methods for different scenarios, depending on the missing-data mechanism and on whether covariate information is available before randomization. These methods are compared comprehensively through simulation studies.

We also apply the three proposed covariate-adaptive randomizations to two real-world problems. These designs substantially improve covariate balance and the precision of the estimate relative to complete randomization and RRT.
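
The abstract does not spell out the allocation rules, so the following is only a minimal illustrative sketch of one generic covariate-adaptive scheme. It is not CAM, RRT, or any of the proposed designs. It assumes units are assigned in consecutive pairs (one unit per arm), that imbalance is measured by the squared Euclidean norm of the treated-minus-control covariate sum (a simplified stand-in for a Mahalanobis-type distance, reasonable only for roughly standardized covariates), and that the lower-imbalance split is chosen with a hypothetical biased-coin probability q. All function and parameter names are invented for illustration.

```python
import random


def imbalance(diff):
    """Squared Euclidean norm of a treated-minus-control covariate sum."""
    return sum(d * d for d in diff)


def pairwise_adaptive_assign(covariates, q=0.75, rng=None):
    """Assign units in consecutive pairs, one unit of each pair per arm.

    covariates: list of equal-length lists, assumed roughly standardized.
    q: hypothetical probability of choosing the lower-imbalance split.
    Returns a list of labels (1 = treatment, 0 = control).
    """
    if len(covariates) % 2 != 0:
        raise ValueError("this sketch assumes an even number of units")
    rng = rng or random.Random()
    p = len(covariates[0])
    diff = [0.0] * p  # running sum(treated) - sum(control)
    labels = []
    for i in range(0, len(covariates), 2):
        a, b = covariates[i], covariates[i + 1]
        # Option 1: a treated, b control. Option 2: the reverse.
        d1 = [diff[k] + a[k] - b[k] for k in range(p)]
        d2 = [diff[k] - a[k] + b[k] for k in range(p)]
        m1, m2 = imbalance(d1), imbalance(d2)
        if m1 == m2:
            pick_first = rng.random() < 0.5
        else:
            prefer_first = m1 < m2
            # Biased coin: follow the preferred split with probability q.
            pick_first = prefer_first if rng.random() < q else not prefer_first
        if pick_first:
            diff = d1
            labels.extend([1, 0])
        else:
            diff = d2
            labels.extend([0, 1])
    return labels


def final_imbalance(covariates, labels):
    """Imbalance of a completed allocation under the same measure."""
    p = len(covariates[0])
    diff = [0.0] * p
    for x, t in zip(covariates, labels):
        sign = 1.0 if t == 1 else -1.0
        for k in range(p):
            diff[k] += sign * x[k]
    return imbalance(diff)
```

Under these assumptions, comparing final_imbalance for this allocation with that of independent fair-coin assignment on the same simulated covariates gives a rough sense of why adaptive designs can keep imbalance small as the sample size grows. The biased coin keeps each assignment random, which is the usual reason for not always taking the lower-imbalance split.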