Creating Useful Integrated Data Sets to Inform Public Policy
The costs of traditional primary data collection have risen dramatically over the past decade. For example, the cost of the decennial census of population and housing, conducted by the U.S. Census Bureau, has risen from $6 billion in 2000 to an estimated $14.5 billion in 2010. The costs of other government surveys and censuses have also risen. Yet some of the same data are collected by other federal agencies and held in administrative records such as Medicare and tax records. Sharing administrative record data between federal agencies has the potential to increase the information available to policy makers while saving money. However, significant policy issues related to safeguarding privacy and confidentiality, as well as questions about data quality, have created barriers that slow down or stop record sharing. This research used two exploratory case studies to examine the creation of integrated data sets among three government agencies: the Internal Revenue Service (IRS), the U.S. Census Bureau (Census), and the Centers for Medicare and Medicaid Services (CMS). It identified the policy issues raised by the creation of such data pools and examined how these issues are approached in a decentralized governmental statistical system, such as that found in the United States. The creation of new, combined data sets and the related policy issues were examined along five dimensions: legal, technical, organizational, perceptual, and human. The study found that each agency involved in sharing administrative records is governed by a different set of statutes and regulations, and that these sets only partially overlap. This patchwork of laws and regulations greatly slows the initiation of record sharing projects. Each agency has its own distinct internal processes for approving and tracking record sharing projects, but there are no mature government-wide processes or criteria for reviewing or approving projects that involve multiple agencies.
The current processes are slow and burdensome and discourage initiation of new projects. Action and research are needed to take advantage of the benefits of integrated data sets in furthering the goals of public policy.