Unification of Randomized Response Designs and Certain Aspects of Post-Randomization for Statistical Disclosure Control
We discuss two closely related topics, randomized response (RR) surveys and post-randomization, both of which are concerned with protecting survey respondents' privacy. We present a common framework for various RR surveys of dichotomous populations with polychotomous responses. The unified approach addresses both respondents' privacy and statistical efficiency, and it allows a fair comparison of various procedures. We discuss unbiased estimation, based on RR survey data, of the proportion (π) of a population that belongs to a sensitive group. We develop an approach for comparing RR procedures that takes both respondents' protection and statistical efficiency into account. For any given RR design with three or more response categories, we can find RR procedures with a binary response variable that provide the same protection to respondents and at least as much statistical information. This result suggests that RR surveys of dichotomous populations should use only binary response variables. We also present some results for RR surveys of polychotomous populations. The second topic is post-randomization (PRAM), a statistical disclosure control technique for categorical variables. PRAM stochastically transforms each record in a data set using pre-selected probabilities. We focus on a special case of PRAM, known as invariant PRAM, and introduce the notion of a strongly invariant PRAM. Invariant PRAM is attractive in that, in the strongly invariant case, the PRAMed data can be analyzed without any adjustment for post-randomization. We review methods for constructing invariant PRAM matrices, clarify certain misconceptions about invariant PRAMs, and discuss estimation from an invariantly PRAMed data set. Finally, we examine the effectiveness of PRAM for limiting statistical disclosure.
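
To make the invariance idea concrete, the following minimal Python sketch builds one simple transition matrix that leaves the expected category frequencies unchanged. The construction used here, P = θI + (1 − θ)·1tᵀ (keep the original category with probability θ, otherwise redraw it from the category proportions t), is an illustrative assumption and is not necessarily one of the methods reviewed in this work. The function names, the proportions `t`, and the value of `theta` are hypothetical.

```python
def invariant_pram_matrix(t, theta):
    """Return a row-stochastic matrix P with t^T P = t^T.

    Assumed construction (illustrative only): with probability theta a
    record keeps its category; otherwise a new category is drawn from t.
    t must be a list of category proportions summing to 1.
    """
    k = len(t)
    return [
        [theta * (1.0 if i == j else 0.0) + (1.0 - theta) * t[j] for j in range(k)]
        for i in range(k)
    ]


def expected_frequencies(t, P):
    """Expected category proportions after applying P to data with proportions t."""
    k = len(t)
    return [sum(t[i] * P[i][j] for i in range(k)) for j in range(k)]


# Hypothetical category proportions and retention probability.
t = [0.5, 0.3, 0.2]
P = invariant_pram_matrix(t, theta=0.8)
after = expected_frequencies(t, P)

# Each row of P is a probability distribution, and the expected
# proportions after randomization match the original ones.
rows_ok = all(abs(sum(row) - 1.0) < 1e-12 for row in P)
invariant_ok = all(abs(a - b) < 1e-12 for a, b in zip(after, t))
```

Under this assumed construction, the category proportions of the PRAMed data are unbiased for the original proportions without any correction; individual records may still change category, which is the source of the disclosure protection.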