Anonymization
Statistical offices around the world are under pressure to publish microdata obtained from sample surveys, censuses, and administrative data collection systems. Decision-makers and researchers need such data for purposes ranging from purely academic research to impact evaluation of policies and programs. These data must be released in such a way that the confidentiality of the information provided by respondents is preserved.
This section describes the principles of microdata anonymization and presents techniques for measuring disclosure risk, anonymizing data, and assessing the resulting information loss. Links to available tools and guidelines are also provided.
Anonymization is typically required for the production of public use files and, to a lesser extent, for generating licensed files. However, anonymization is only one of many ways to minimize the risk of disclosure when distributing microdata; legal and organizational measures contribute to this endeavour as well. For datasets that are provided to selected bona fide users, the legal agreement may provide a higher level of security than anonymization alone (see the section on formulating a data dissemination policy).
