Constrained Maximum Entropy Models for Selecting Genotype Interactions Associated with Interval-Censored Failure Times and Methods for Power Calculation in a Three-arm Four-step Clinical Bioequivalence Study Open Access
We propose a novel screening method targeting genotype interactions associated with disease risk. The proposed method extends the maximum entropy conditional probability models (Miller et al., 2009) to survival outcomes over time, which are the most common form of disease occurrence in electronic health records and clinical trials. It estimates the conditional distribution of the failure time given an individual's genotype by maximizing its entropy subject to constraints linking genotype interactions to phenotype intervals. An EM algorithm is employed to handle various types of censoring when the exact interval of disease occurrence is not observed. A stepwise greedy search is proposed to screen a large number of candidate constraints, and the constraints in the model with the smallest Minimum Description Length (MDL) are selected. Extensive simulations show that about five quantile-dependent intervals are sufficient to categorize event occurrences into different risk groups. When the sample size is too small or the genotype interactions are very complicated, our method may not detect the exact form of the interactions but can still provide useful hints about possibly related SNPs. Finally, a genome-wide association study (GWAS) in type 1 diabetes patients is used to illustrate our method, and novel SNP interactions associated with neuropathy are identified.

A clinical endpoint bioequivalence (BE) study aims to establish BE between a generic drug (TEST) and an innovator drug (REF). A placebo (PLA) arm is usually included to demonstrate the sensitivity of the trial. BE is established through two superiority tests (TEST vs. PLA, REF vs. PLA) and an equivalence test (TEST vs. REF). The latter is an interval test, which is equivalent to two one-sided tests (TOST). An overall BE test therefore involves three arms (TEST, REF, PLA) and four test steps (TEST vs. PLA, REF vs. PLA, and the two one-sided tests of TEST vs. REF). Chang et al. (2014) calculated power for a three-arm, three-step (TEST vs. PLA and TEST vs. REF) clinical BE study as the expectation of a multivariate normal distribution conditional upon a chi-square distribution, which we call the Z-ChiSquare method. In this dissertation, we extended the Z-ChiSquare method to a three-arm, four-step (TEST vs. PLA, REF vs. PLA, and TEST vs. REF) clinical BE study. We also proposed an exact calculation of power using a multivariate noncentral t distribution, which we call Exact-t. Simulations show that the Exact-t method is more accurate than the Z-ChiSquare method when the sample size is small, and that it is computationally more efficient. We also applied the Exact-t method to find the optimal allocation ratio of the active drugs (TEST and REF) to PLA that maximizes power for a given total sample size. This will help pharmaceutical sponsors cut costs and maximize the utility of enrolled subjects.
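
To make the structure of the three-arm, four-step test concrete, the following is a minimal Monte Carlo sketch of its power. It is not the Z-ChiSquare or Exact-t calculation described above. It assumes a normally distributed continuous endpoint with a common standard deviation, an equivalence margin on the difference scale, one-sided superiority tests at level alpha/2, and normal rather than t critical values (a large-sample approximation). The function names `simulate_power` and `best_allocation` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_power(n_test, n_ref, n_pla, mu_test, mu_ref, mu_pla,
                   sigma, margin, alpha=0.05, n_sim=20000, seed=0):
    """Estimate power of a three-arm, four-step BE test by simulation.

    Assumptions (illustrative only): normal endpoint, common sigma,
    difference-scale margin, normal critical values instead of t.
    """
    rng = random.Random(seed)
    z_sup = NormalDist().inv_cdf(1 - alpha / 2)  # one-sided superiority at alpha/2
    z_eq = NormalDist().inv_cdf(1 - alpha)       # each one-sided test of the TOST
    df = n_test + n_ref + n_pla - 3              # pooled-variance degrees of freedom
    hits = 0
    for _ in range(n_sim):
        # Arm means and pooled SD are independent under normality,
        # so they can be drawn directly instead of simulating subjects.
        m_t = rng.gauss(mu_test, sigma / n_test ** 0.5)
        m_r = rng.gauss(mu_ref, sigma / n_ref ** 0.5)
        m_p = rng.gauss(mu_pla, sigma / n_pla ** 0.5)
        # chi-square(df) drawn as Gamma(shape=df/2, scale=2)
        s = sigma * (rng.gammavariate(df / 2, 2) / df) ** 0.5
        se_tp = s * (1 / n_test + 1 / n_pla) ** 0.5
        se_rp = s * (1 / n_ref + 1 / n_pla) ** 0.5
        se_tr = s * (1 / n_test + 1 / n_ref) ** 0.5
        sup_test = (m_t - m_p) / se_tp > z_sup           # step 1: TEST vs. PLA
        sup_ref = (m_r - m_p) / se_rp > z_sup            # step 2: REF vs. PLA
        lower = (m_t - m_r + margin) / se_tr > z_eq      # step 3: TOST lower bound
        upper = (m_t - m_r - margin) / se_tr < -z_eq     # step 4: TOST upper bound
        if sup_test and sup_ref and lower and upper:
            hits += 1
    return hits / n_sim


def best_allocation(total, ratios, **kwargs):
    """Scan k:k:1 (TEST:REF:PLA) allocations for a fixed total sample size."""
    results = {}
    for k in ratios:
        n_pla = max(2, round(total / (2 * k + 1)))
        n_act = (total - n_pla) // 2
        results[k] = simulate_power(n_act, n_act, n_pla, **kwargs)
    return max(results, key=results.get), results
```

A call such as `best_allocation(300, [1, 2, 3, 4], mu_test=1.0, mu_ref=1.0, mu_pla=0.0, sigma=1.0, margin=0.5)` would scan a few allocation ratios under these made-up settings. The estimates carry Monte Carlo error, so differences between nearby ratios should be read with care.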