
Anonymization practices

No international standard defines the methods for anonymizing data, acceptable levels of risk, or recommended measures of information loss. The amount and type of protection required are specific to each dataset, depending on the sensitivity and “commercial value” of its content, and to each legal and cultural environment. It is therefore useful to document existing practices. This is not an easy task, however, because agencies that anonymize their datasets rarely disclose the methods they implement or the levels of risk in the data they disseminate.

This limited access to knowledge, combined with a lack of experience in using the tools and methods, makes it difficult for many agencies to implement “optimal” solutions. By optimal we mean solutions that meet both their obligation to protect privacy and their obligation to release data useful for policy monitoring and evaluation. To bridge this gap in practical guidelines, the World Bank completed a project funded by the Knowledge for Change Program II, which sought to build a knowledge base through experimentation on a diverse set of microdata. This knowledge was then translated into a practice guide for public release. The practice guide fills a critical gap by (i) documenting research conducted at the World Bank through a large-scale evaluation of anonymization techniques, and (ii) translating these results into practical guidelines. The practice guide was released in 2015. It is updated regularly to add new methods and to keep up with new features available in the open source R software package sdcMicro, which the guide uses for its practical examples. A current version of the guide can be found here.