Analysis
The objective of conducting household surveys is to analyze the collected data and produce statistics. Analytical techniques used range from computing descriptive statistics to multivariate modeling. The analysis of household survey data differs from standard statistical analysis in two important ways: the use of sampling weights and the computation of standard errors that are adjusted for sampling design. Sampling weights, the inverse of each household's probability of selection, are used to ensure that the statistics produced describe the population rather than simply the households surveyed. Because household survey data are typically based on stratified multistage samples, the standard errors of statistics must also be adjusted for sample design effects. Without this adjustment, the precision of statistics produced from household survey data would typically be overstated.
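The two adjustments described above can be illustrated with a minimal Python sketch. The toy sample, the stratum and PSU labels, and the function names are all hypothetical. The standard error uses Taylor linearization with a with-replacement approximation for primary sampling units (PSUs) within strata and no finite population correction, which is one common choice rather than the only valid one.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy sample: (stratum, PSU, sampling weight, observed value).
# Households in the same PSU are deliberately similar to each other.
sample = [
    ("urban", "u1", 10.0, 2.0), ("urban", "u1", 10.0, 2.0),
    ("urban", "u2", 10.0, 8.0), ("urban", "u2", 10.0, 8.0),
    ("rural", "r1", 50.0, 1.0), ("rural", "r1", 50.0, 1.0),
    ("rural", "r2", 50.0, 7.0), ("rural", "r2", 50.0, 7.0),
]


def weighted_mean(rows):
    """Population mean estimated with inverse-probability sampling weights."""
    total_weight = sum(w for _, _, w, _ in rows)
    return sum(w * y for _, _, w, y in rows) / total_weight


def design_se(rows):
    """Linearized standard error of the weighted mean for a stratified,
    clustered sample (PSUs treated as sampled with replacement)."""
    mean = weighted_mean(rows)
    total_weight = sum(w for _, _, w, _ in rows)
    # Sum the linearized scores w * (y - mean) / W within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in rows:
        psu_totals[stratum][psu] += w * (y - mean) / total_weight
    variance = 0.0
    for totals in psu_totals.values():
        n_psu = len(totals)
        if n_psu < 2:
            raise ValueError("each stratum needs at least two PSUs")
        centre = sum(totals.values()) / n_psu
        squared_dev = sum((z - centre) ** 2 for z in totals.values())
        variance += n_psu / (n_psu - 1) * squared_dev
    return sqrt(variance)


def naive_se(rows):
    """Standard error that ignores weights, strata and clusters
    (simple random sampling formula for the unweighted mean)."""
    values = [y for _, _, _, y in rows]
    n = len(values)
    mean = sum(values) / n
    sample_variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return sqrt(sample_variance / n)


print("weighted mean:", weighted_mean(sample))
print("design-based SE:", design_se(sample))
print("naive SE:", naive_se(sample))
```

In this toy sample the design-based standard error is larger than the naive one, because households within a PSU resemble each other and the highly weighted rural PSUs dominate the estimate. In practice, dedicated survey routines in statistical packages should be preferred over hand-written code.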
The Replication Standard
Good information on how statistics are produced facilitates understanding, replication and quality. It is important to document the microdata files on which the statistics are based as well as the process and analytical methods used to generate them. See the microdata documentation page for more tools and guidelines to facilitate compilation of this metadata. Good practice for documenting household survey analysis is to comply with the replication standard, which can be stated succinctly: sufficient information exists with which to understand, evaluate, and build upon household survey analysis if a third party could replicate the results without any additional information from the producer.
For example, consider the production of poverty estimates from an initial survey microdata file that contains the necessary data on household composition, consumption and prices. Applying the replication standard requires compiling metadata on every step undertaken to transform the raw microdata file into an estimate of the percentage of people with consumption expenditures below the poverty line. This would include, but is not limited to, any data cleaning, imputation or other change to the number and quality of records in the initial microdata file, as well as the definition of every variable constructed from the variables in the initial microdata file (such as adult equivalent household size, the poverty line, the price deflators and baskets, and the household consumption aggregate). This metadata could come in the form of published reports, code written in statistical software, and explanatory notes. In this particular example, one could also provide both the initial microdata set and the final replication data set that includes constructed variables at the household level. The microdata management toolkit enables full documentation of the microdata files and integration of additional metadata such as reports and statistical software code.
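As an illustration of the kind of code that could accompany such a replication data set, the following Python sketch computes a weighted poverty headcount ratio. The household records, field names and poverty line are hypothetical. As a simplifying assumption, it uses per capita consumption rather than an adult equivalence scale, and it treats consumption as already price-deflated.

```python
from dataclasses import dataclass


@dataclass
class Household:
    weight: float       # sampling weight: households represented by this record
    size: int           # number of household members
    consumption: float  # household consumption aggregate (assumed deflated)


def headcount_ratio(households, poverty_line):
    """Estimated share of people living in households whose per capita
    consumption falls below the poverty line."""
    poor_persons = 0.0
    all_persons = 0.0
    for hh in households:
        persons = hh.weight * hh.size  # people represented by this record
        all_persons += persons
        if hh.consumption / hh.size < poverty_line:
            poor_persons += persons
    return poor_persons / all_persons


# Hypothetical replication data set with constructed household-level variables.
households = [
    Household(weight=100.0, size=4, consumption=200.0),
    Household(weight=100.0, size=2, consumption=300.0),
    Household(weight=50.0, size=6, consumption=240.0),
    Household(weight=150.0, size=1, consumption=90.0),
]

print("headcount ratio:", headcount_ratio(households, poverty_line=60.0))
```

Each record is weighted by the sampling weight multiplied by household size, so the result is a share of people rather than a share of households. A full replication package would also document how the consumption aggregate, deflators and poverty line were constructed.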
The IHSN is working towards compiling a set of good practice guidelines and tools for household survey analysis on specific statistics and topics such as education, health, sanitation, remittances and poverty. This material will be integrated with the Question Bank application and will include links to specialized analytical tools and recommended readings whenever available. Household survey analysts who are interested in contributing and sharing experience and tools via the IHSN website are encouraged to contact the IHSN secretariat.
King, Gary. "Replication, Replication," PS: Political Science and Politics, with comments from nineteen authors and a response, "A Revised Proposal, Proposal," Vol. XXVIII, No. 3 (September 1995): pp. 443-499, copy at http://gking.harvard.edu/files/abs/replication-abs.shtml
Guidelines and Resources for Household Survey Analysis
There are many resources available for household survey analysts in the form of books, reports and on-line statistical software user forums. Below we provide a non-exhaustive list of general guidelines and reference material for household survey analysis.
The Analysis of Household Surveys. A Microeconometric Approach to Development Policy
Angus S. Deaton (1997)
Baltimore MD, Johns Hopkins University Press for The World Bank
See in particular:
- Chapter 1. The Design and Content of Household Surveys
- Chapter 2. Econometric Issues for Survey Data
Household Sample Surveys in Developing and Transition Countries
United Nations Department of Economic and Social Affairs, Statistics Division (2005)
Studies in Methods, Series F, No. 96, United Nations, New York.
See in particular:
- Chapter XVI. Presenting Simple Descriptive Statistics from Household Survey Data
- Chapter XVIII. Multivariate Methods for Index Construction
- Chapter XIX. Statistical Analysis of Survey Data
- Chapter XX. More Advanced Approaches to the Analysis of Survey Data
- Chapter XXI. Sampling Error Estimation for Survey Data
Commercial Software Packages for Survey Analysis
Stata is an integrated and programmable statistical package for microdata analysis. The software is widely used and supported through online courses, user forums, manuals and a few thousand published books. Typing “stata survey analysis” in an internet search engine will provide a wealth of information. Below is a list of a few selected resources.
The Boston College Department of Economics - Statistical Software Components supports a website that contains useful and free material for Stata users as well as a large collection of Stata programs (ado files) produced for many different purposes by the community of users. These programs are usually provided with the corresponding help file and a search engine allows easy location of programs. This initiative enables Stata users to save considerable time and improve the quality of programming code.
Stata for Surveys (by Leidi et al., May 2005) is a free introductory guide designed to support the use of Stata for the analysis of survey data. The original impetus for this training manual was based on a request from the Kenya National Bureau of Statistics (KNBS).
The UCLA Academic Technology Services website contains a collection of resources on survey data analysis with Stata, including free on-line tutorials and videos which cover a range of introductory to advanced topics.
The Stata Journal is a quarterly publication containing articles about statistics, data analysis, teaching methods, and effective use of Stata's language. The Journal publishes reviewed papers together with shorter notes and comments, regular columns, book reviews, and other material of interest to researchers applying statistics in a variety of disciplines.
Alternative software packages for statistical analysis include the commercial packages SPSS and SAS, as well as R, which is free and open source.
Free Software for Household Survey Analysis
ADePT is a software platform for Automated Economic Analysis. It is developed by a team at the Research Department of the World Bank to provide survey analysts with an easy-to-use interface (powered by an integrated set of Stata programs) for producing the statistics, tables and graphics typically used in reports based on household survey data.
