

This phase comprises six sub-processes:

  • 2.1. Design outputs – This sub-process contains the detailed design of statistical outputs to be produced, including the related development work and preparation of the systems and tools used in phase 7 (Disseminate). Wherever possible, outputs should be designed to follow existing standards. Inputs to this process may include metadata from similar or previous collections, international standards, and information about practices in other statistical organizations from sub-process 1.1 (Determine need for information).
  • 2.2. Design variable descriptions – This sub-process defines the statistical variables to be collected via the data collection instrument, as well as any other variables derived from them in sub-process 5.5 (Derive new variables and statistical units) and any classifications that will be used. It is expected that existing national and international standards will be followed wherever possible. This sub-process may need to run in parallel with sub-process 2.3 (Design data collection methodology), as the definition of the variables to be collected and the choice of data collection instrument may be interdependent to some degree. Preparation of metadata descriptions of collected and derived variables and classifications is a necessary precondition for subsequent phases.
  • 2.3. Design data collection methodology – This sub-process determines the most appropriate data collection method(s) and instrument(s). Actual activities in this sub-process will vary according to the types of collection instrument required, and can include computer-assisted interviewing, paper questionnaires, administrative data interfaces, and data integration techniques. This sub-process includes design of question-and-response templates in conjunction with the variables and classifications designed in sub-process 2.2 (Design variable descriptions). It also includes design of any formal agreements relating to data supply, such as memoranda of understanding, and confirmation of the legal basis for the data collection. This sub-process is enabled by tools such as question libraries to facilitate the reuse of questions and related attributes; questionnaire tools to enable quick and easy compilation of questions into formats suitable for cognitive testing; and agreement templates to standardize terms and conditions. This sub-process also includes the design of process-specific provider management systems.
  • 2.4. Design frame and sample methodology – This sub-process identifies and specifies populations of interest, defines a sampling frame (and, where necessary, the register from which it is derived), and determines the most appropriate sampling criteria and methodology, which may include complete enumeration. Common sources are administrative and statistical registers, censuses, and sample surveys. This sub-process describes how these sources can be combined if needed. An analysis of whether the frame covers the target population should be performed, and a sampling plan should be made. The actual sample is created in sub-process 4.1 (Select sample), using the methodology specified in this sub-process.

    The purpose of the handbook is to bring together in one publication the main sample survey design issues, for convenient reference by practicing national statisticians, researchers, and analysts involved in sample survey work. It presents methodologically sound techniques grounded in statistical theory, which implies the use of probability sampling at each stage of the sample selection process.
  • 2.5. Design statistical processing methodology – This sub-process designs the statistical processing methodology to be applied during phase 5 (Process) and phase 6 (Analyze). This can include specification of routines for coding, editing, imputing, estimating, integrating, validating, and finalizing data sets.
  • 2.6. Design production systems and workflow – This sub-process determines the workflow from data collection to archiving, providing an overview of all processes required within the entire statistical production process and ensuring that they fit together efficiently without gaps or redundancies. Various systems and databases are needed throughout the process. Since the general principle is to reuse processes and technology across many statistical business processes, existing systems and databases should be examined first to determine whether they are fit for this specific process; then, if any gaps are identified, new solutions should be designed. This sub-process also considers how staff will interact with systems, and who will be responsible for what and when.
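
As an illustration of the frame analysis and sampling plan described under sub-process 2.4, the following minimal Python sketch compares a frame with a target population and then draws a probability sample from the frame. It is only a sketch under stated assumptions: the unit identifiers, the function names `coverage` and `simple_random_sample`, and the choice of simple random sampling without replacement are all hypothetical, since the text does not prescribe a particular sampling method.

```python
import random


def coverage(frame_ids, target_ids):
    """Return (undercoverage, overcoverage) as two sets of unit identifiers.

    undercoverage: target units missing from the frame
    overcoverage:  frame units that are not in the target population
    """
    frame, target = set(frame_ids), set(target_ids)
    return target - frame, frame - target


def simple_random_sample(frame_ids, n, seed=None):
    """Draw n distinct units from the frame with equal probability.

    Setting n to the frame size amounts to complete enumeration.
    """
    units = sorted(set(frame_ids))
    if not 0 <= n <= len(units):
        raise ValueError("sample size must be between 0 and the frame size")
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    return rng.sample(units, n)


# Hypothetical identifiers, for illustration only.
frame = ["B001", "B002", "B003", "B004", "B006"]
target = ["B001", "B002", "B003", "B004", "B005"]

missing, extra = coverage(frame, target)
sample = simple_random_sample(frame, n=3, seed=42)

print("Undercoverage:", sorted(missing))
print("Overcoverage:", sorted(extra))
print("Sample:", sample)
```

In this sketch the design step fixes the method and its parameters (sample size, seed handling), while the actual draw would take place later, in sub-process 4.1 (Select sample).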