Key issues in survey research and questionnaire design are highlighted in the following sections. Modes of data collection are described together with their advantages and disadvantages. Descriptions of commonly used sampling designs are provided and the primary sources of survey error are identified. Terms relating to the topics discussed here are defined in the Research Glossary.
Survey research is a commonly used method of collecting information about a population of interest. The population may be composed of a group of individuals (e.g., children under age five, kindergarteners, parents of young children) or organizations (e.g., early care and education programs, K-12 public and private schools).
There are many different types of surveys, several ways to administer them, and different methods for selecting the sample of individuals or organizations that will be invited to participate. Some surveys collect information on all members of a population and others collect data on a subset of a population. Examples of the former are the National Center for Education Statistics' Common Core of Data and the Administration for Children and Families' Survey of Early Head Start Programs (PDF).
A survey may be administered to a sample of individuals (or to the entire population) at a single point in time (cross-sectional survey), or the same survey may be administered to different samples from the population at different time points (repeat cross-sectional). Other surveys may be administered to the same sample of individuals at different time points (longitudinal survey). The Survey of Early Head Start Programs is an example of a cross-sectional survey and the National Household Education Survey Program is an example of a repeat cross-sectional survey. Examples of longitudinal surveys include the Head Start Family and Child Experiences Survey and the Early Childhood Longitudinal Study, Birth and Kindergarten Cohorts.
Regardless of the type of survey, there are two key features of survey research:
The American Association for Public Opinion Research (AAPOR) offers recommendations on how to produce the best survey possible: Best Practices for Survey Research.
AAPOR also provides guidelines on how to assess the quality of a survey: Evaluating Survey Quality in Today's Complex Environment.
The two most common types of survey questions are closed-ended questions and open-ended questions.
A well-designed questionnaire is more than a collection of questions on one or more topics. When designing a questionnaire, researchers must consider a number of factors that can affect participation and the responses given by survey participants. Some of the things researchers must consider to help ensure high rates of participation and accurate survey responses include:
Questionnaires and the procedures that will be used to administer them should be pretested (or field tested) before they are used in a main study. The goal of the pretest is to identify any problems with how questions are asked, whether they are understood by individuals similar to those who will participate in the main study, and whether response options in closed-ended questions are adequate. For example, a parent questionnaire that will be used in a large study of preschool-age children may be administered first to a small (often non-random) sample of parents in order to check how the questions are asked and understood and whether the response options offered to parents are adequate.
Based on the findings of the pretest, additions or modifications to questionnaire items and administration procedures are made prior to their use in the main study.
See the following for more information about questionnaire design:
Surveys can be administered in four ways: through the mail, by telephone, in person or online. When deciding which of these approaches to use, researchers consider the cost of contacting the study participant and of data collection, the literacy level of participants, response rate requirements, respondent burden and convenience, the complexity of the information that is being sought and the mix of open-ended and closed-ended questions.
Some of the main advantages and disadvantages of the different modes of administration are summarized below.
Increasingly, researchers are using a mix of these methods of administration. Mixed-mode or multi-mode surveys use two or more data collection modes in order to increase survey response. Participants are given the option of choosing the mode that they prefer, rather than this being dictated by the research team. For example, the Head Start Family and Child Experiences Survey (2014-2015) offers teachers the option of completing the study's teacher survey online or using a paper questionnaire. Parents can complete the parent survey online or by phone.
See the following for additional information about survey administration:
In child care and early education research, as in research in other areas, it is often not feasible to survey all members of the population of interest. Therefore, a sample of the members of the population is selected to represent the total population.
A primary strength of sampling is that estimates of a population's characteristics can be obtained by surveying a small proportion of the population. For example, it would not be feasible to interview all parents of preschool-age children in the U.S. in order to obtain information about their choices of child care and the reasons why they chose certain types of care as opposed to others. Thus, a sample of preschoolers' parents would be selected and interviewed, and the data they provide would be used to estimate the types of child care parents as a whole choose and their reasons for choosing these programs. There are two broad types of sampling:
Nonprobability sampling: The selection of participants from a population is not determined by chance, and members of the population do not have a known chance of being selected into the sample. Because of this, findings from nonprobability (nonrandom) samples cannot be reliably generalized to the population of interest, and inferences about the population are problematic. Common nonprobability sampling techniques include convenience sampling, snowball sampling, quota sampling and purposive sampling.
Probability sampling: The selection of participants from the population is determined by chance, with each individual having a known, non-zero probability of selection. Because the selection probabilities are known, a well-executed probability sample supports accurate descriptions of the population and therefore good generalizability. In survey research, it is the preferred sampling method.
Three forms of probability sampling are described here:
Simple Random Sampling
This is the most basic form of probability sampling. Every member of the population has an equal chance of being selected. The process is similar to a lottery: any member of the population of interest could be selected for the survey, but only a few are chosen at random. For example, researchers may use random-digit dialing to perform simple random sampling for telephone surveys. In this procedure, telephone numbers are generated by a computer at random and called to identify individuals to participate in the survey.
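The lottery analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 1,000 household IDs and the function name `simple_random_sample` are hypothetical, and a real telephone survey would draw from generated phone numbers rather than a fixed list.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(list(population), n)

# Hypothetical sampling frame: ID numbers for 1,000 households.
frame = range(1, 1001)
sample = simple_random_sample(frame, 50, seed=42)
```

Sampling is done without replacement, so no household can be selected twice.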
Stratified sampling is used when researchers want to ensure representation across groups, or strata, in the population. The researchers will first divide the population into groups based on characteristics such as race/ethnicity, and then draw a random sample from each group. The groups must be mutually exclusive and cover the population. Stratified sampling usually provides greater precision than a simple random sample of the same size, particularly when members within each stratum are similar to one another on the characteristics being measured.
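A minimal sketch of the two steps (divide into strata, then sample at random within each) might look like the following. The frame of child care programs, the "center"/"home" strata, and the function name `stratified_sample` are assumptions made for illustration; the sketch also assumes the same sampling fraction in every stratum, which real designs often vary.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same sampling fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # each unit lands in exactly one stratum
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical frame: (program_id, setting) pairs.
programs = [(i, "center") for i in range(600)] + [(i, "home") for i in range(600, 1000)]
chosen = stratified_sample(programs, stratum_of=lambda p: p[1], fraction=0.1, seed=1)
```

Because every stratum contributes to the sample, smaller groups cannot be missed by chance, as they could be in a simple random sample.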
Cluster sampling is generally used to control costs and when it is geographically impractical to undertake a simple random sample. For example, in a household survey with face-to-face interviews, it is difficult and expensive to survey households across the nation using a simple random sample design. Instead, researchers will randomly select geographic areas (for example, counties), then randomly select households within these areas. This creates a cluster sample, in which respondents are clustered together geographically.
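The county-then-household procedure can be sketched as two random draws. The county names, household IDs, and the function name `cluster_sample` below are hypothetical, and the sketch assumes every county is equally likely to be chosen; real designs often select areas with probability proportional to size.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units within each one."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in picked:
        households = clusters[name]
        k = min(n_per_cluster, len(households))   # small clusters are taken whole
        sample[name] = rng.sample(households, k)
    return sample

# Hypothetical frame: 20 counties with 100 household IDs each.
counties = {
    f"county_{c:02d}": [f"{c:02d}-{h:03d}" for h in range(100)]
    for c in range(20)
}
selected = cluster_sample(counties, n_clusters=4, n_per_cluster=10, seed=7)
```

Interviewers would then travel to only 4 counties instead of 20, which is the cost saving the design is meant to achieve.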
Survey research studies often use a combination of these probability methods to select their samples. Multistage sampling is a probability sampling technique where sampling is carried out in several stages. It is often used to select samples when a single frame is not available to select members for a study sample. For example, there is no single list of all children enrolled in public school kindergartens across the U.S. Therefore, researchers who need a sample of kindergarten children will first select a sample of schools with kindergarten programs from a school frame (e.g., National Center for Education Statistics' Common Core of Data) (Stage 1). Lists of all kindergarten classrooms in selected schools are developed and a sample of classrooms selected in each of the sampled schools (Stage 2). Finally, lists of children in the sampled classrooms are compiled and a sample of children is selected from each of the classroom lists (Stage 3). Many of the national surveys of child care and early education (e.g., the Head Start Family and Child Experiences Survey and the Early Childhood Longitudinal Study, Kindergarten Cohort) use a multistage approach.
Multistage, cluster and stratified sampling require that certain adjustments be made during the statistical analysis. Sampling or analysis weights are often used to account for differences in the probability of selection into the sample as well as for other factors (e.g., sampling frame undercoverage and nonresponse). Standard errors are calculated using methodologies that are different from those used for a simple random sample. Information on these adjustments is provided by the National Center for Education Statistics through its Distance Learning Dataset Training System.
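As a rough illustration of how weights account for unequal selection probabilities, the sketch below assumes a base weight equal to the inverse of each respondent's overall selection probability, which in a multistage design is the product of the stage-level probabilities. The respondents, their probabilities, and the function names are hypothetical. Adjustments for undercoverage and nonresponse, and the design-based standard errors mentioned above, are not shown.

```python
def selection_probability(stage_probabilities):
    """Overall chance of selection: the product of the chances at each stage."""
    p = 1.0
    for stage_p in stage_probabilities:
        p *= stage_p
    return p

def weighted_mean(values, weights):
    """Mean in which each respondent counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical respondents: (hours of care per week, stage-wise selection chances)
respondents = [
    (30, [0.10, 0.50, 0.20]),   # school, classroom, child
    (20, [0.10, 0.50, 0.20]),
    (40, [0.05, 0.50, 0.20]),   # sampled at half the rate, so weighted twice as much
]
hours = [h for h, _ in respondents]
weights = [1 / selection_probability(p) for _, p in respondents]
estimate = weighted_mean(hours, weights)
```

With these made-up numbers the weighted estimate is about 32.5 hours, compared with an unweighted mean of 30, because the third respondent stands in for twice as many children as each of the other two.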
See the following for additional information about the different types of sampling approaches and their use:
Estimates of the characteristics of a population using survey data are subject to two basic sources of error: sampling error and nonsampling error. The extent to which estimates of the population mean, proportion and other population values differ from the true values of these is affected by these errors.
Sampling error is the error that occurs because not all members of the population are sampled and measured. The value of a statistic (e.g., mean or percentage) that is calculated from different samples that are drawn from the same population will not always be the same. For example, if several different samples of 5,000 people are drawn at random from the U.S. population, the average income of the 5,000 people in those samples will differ. (In one sample, Bill Gates may have been selected at random from the population, which would lead to a very high mean income for that sample.)
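A small simulation can make this concrete. The population below is entirely synthetic (100,000 incomes drawn from a skewed distribution chosen only for illustration), and the function name `sample_means` is hypothetical; the point is simply that repeated samples of 5,000 from the same population give different means.

```python
import random
import statistics

def sample_means(population, n, draws, seed=None):
    """Mean of each of several independent random samples of size n."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]

# Synthetic population of 100,000 incomes with a long right tail.
pop_rng = random.Random(0)
population = [pop_rng.lognormvariate(10.5, 0.8) for _ in range(100_000)]

means = sample_means(population, n=5000, draws=5, seed=1)
spread = max(means) - min(means)   # how far apart the five sample means fall
```

None of the five means will exactly equal the population mean, and the gap between each sample mean and the true value is the sampling error for that sample.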
Researchers use a statistic called the standard error to measure the extent to which estimated statistics (percentages, means, and coefficients) vary from what would be found in other samples. The smaller the standard error, the more precise are the estimates from the sample. Generally, standard errors and sample size are negatively related, that is, larger samples have smaller standard errors.
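For a mean estimated from a simple random sample, the standard error is commonly estimated as the sample standard deviation divided by the square root of the sample size. The sketch below uses that textbook formula with made-up data; it does not apply to the multistage, cluster and stratified designs discussed earlier, which need different methods.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimated standard error of the mean for a simple random sample."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Made-up scores; the larger sample repeats the same values four times.
small_sample = [2, 4, 4, 4, 5, 5, 7, 9]
large_sample = small_sample * 4

se_small = standard_error_of_mean(small_sample)
se_large = standard_error_of_mean(large_sample)
```

Both samples have the same mean and a similar spread, but the sample that is four times larger has a standard error roughly half as large, which illustrates the negative relationship between sample size and standard error.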
Nonsampling error includes all errors that can affect the accuracy of research findings other than errors associated with selecting the sample (sampling error). Nonsampling errors can occur in any phase of a research study (planning and design, data collection, or data processing). They include coverage error (when units in the target population are missing from the sampling frame), nonresponse error (when sampled units do not respond to the survey), measurement errors due to interviewer or respondent behavior, errors introduced by how survey questions were worded or by how data were collected (e.g., in-person interview, online survey), and processing error (e.g., errors made during data entry or when coding open-ended survey responses). While sampling error is limited to sample surveys, nonsampling error can occur in all surveys.
Measurement error is the difference between the value measured in a survey or on a test and the true value in the population. Some factors that contribute to measurement error include the environment in which a survey or test is administered (e.g., administering a math test in a noisy classroom could lead children to do poorly even though they understand the material), poor measurement tools (e.g., using a tape measure that is only marked in feet to measure children's height would lead to inaccurate measurement), and rater or interviewer effects (e.g., survey staff who deviate from the research protocol).
Measurement error falls into two broad categories: systematic error and random error. Systematic error is the more serious of the two.
Systematic error occurs when the survey responses are systematically different from the target population responses. It is caused by factors that systematically affect the measurement of a variable across the sample.
For example, if a researcher only surveyed individuals who answered their phone between 9 and 5, Monday through Friday, the survey results would be biased toward individuals who are available to answer the phone during those hours (e.g., individuals who are not in the labor force or who work outside of the traditional Monday through Friday, 9 am to 5 pm schedule).
Random error is an expected part of survey research, and statistical techniques are designed to account for this sort of measurement error. It is caused by factors that randomly affect measurement of the variable across the sample.
Random error occurs because of natural and uncontrollable variations in the survey process, e.g., the mood of the respondent, lack of precision in the measures used, and inaccuracy in particular instruments (such as the scales used to measure children's weight).
For example, a researcher may administer a survey about marital happiness. However, some respondents may have had a fight with their spouse the evening prior to the survey, while other respondents' spouses may have cooked the respondent's favorite meal. The survey responses will be affected by the random day on which the respondents were chosen to participate in the study. With random error, the positive and negative influences on the survey measures are expected to balance out.
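The difference between the two kinds of measurement error can be illustrated with a toy simulation. Everything here is assumed for the sake of the example: 2,000 children who each truly weigh 18 kg, a scale with random noise of 0.5 kg, and a second scale that always reads 0.4 kg too heavy. The function name `measure` is hypothetical.

```python
import random
import statistics

def measure(true_values, bias=0.0, noise_sd=0.0, seed=None):
    """Return measurements with a constant bias and/or random noise added."""
    rng = random.Random(seed)
    return [v + bias + rng.gauss(0, noise_sd) for v in true_values]

true_weights = [18.0] * 2000                            # hypothetical true values (kg)
random_only = measure(true_weights, noise_sd=0.5, seed=3)
systematic = measure(true_weights, bias=0.4, seed=3)    # scale reads 0.4 kg heavy

mean_random = statistics.mean(random_only)   # noise largely cancels out
mean_biased = statistics.mean(systematic)    # the bias does not cancel out
```

Individual readings from the noisy scale are off in both directions, so their mean stays close to 18 kg. The biased scale is off in the same direction every time, so its mean is shifted by the full 0.4 kg no matter how many children are weighed.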
See the following for additional information about the different types and sources of errors:
Administrative data are an important source of information for social science research. For example, school records have been used to track trends in student academic performance. Administrative data generally refers to data collected as part of the management and operations of a publicly funded program or service. Today, use of administrative data is becoming increasingly common in research about child care and early education. These data often are a relatively cost-effective way to learn more about the individuals and families using a particular service or participating in a particular program, but they do have some important limitations.
The advantages and disadvantages of using administrative data are described here. Issues pertaining to the access to such data are discussed. Terms relating to administrative data and its use in research studies are defined in the Research Glossary.
Administrative data are collected to manage services and comply with government reporting regulations. Because the original purpose of the data is not research, this presents several challenges.
Researchers interested in using administrative data for the purpose of research should expect to invest considerable time learning about the details of the administrative data system, the specific data elements being used, the data entry process and standards, and changes in the data system and data definitions over time. It also takes time to transform administrative data into research datasets that can be used in statistical analyses.
Researchers who decide to use administrative data records in their research typically face several important issues. Among the most important of these issues are:
Sampling is seldom done when administrative data are used for research purposes, since information is available on the entire population of recipients. However, to protect subject confidentiality, a subsample from the full population may be selected. Studies that combine the use of survey research and administrative data records may also select only a sample of the population in order to minimize data collection costs.
The Joint Center for Poverty Research offers many recommendations on using administrative data (PDF). It recognizes the following centers as having successfully used administrative data in their research efforts:
More resources on administrative data integration, analyses, management, confidentiality, and security can be found here: Working with Administrative Data. Also, see Profiles in Success of Statistical Uses of Administrative Data (PDF) for more information on the use of administrative data.
Field research is a qualitative method of research concerned with understanding and interpreting the social interactions of groups of people, communities, and society by observing and interacting with people in their natural settings. The methods of field research include: direct observation, participant observation, and qualitative interviews. Each of these methods is described here. Terms related to these and other topics in field research are defined in the Research Glossary.
Direct observation is a method of research where the researcher watches and records the activities of individuals or groups engaged in their daily activities. The observations may be unstructured or structured. Unstructured observations involve the researcher observing people and events and recording his/her observations as field notes. Observations are recorded holistically and without the aid of a predetermined guide or protocol. Structured observation, on the other hand, is a technique where a researcher observes people and events using a guide or set protocol that has been developed ahead of time.
Other features of direct observation include:
Participant observation is a field research method whereby the researcher develops an understanding of a group or setting by taking part in the everyday routines and rituals alongside its members. It was originally developed in the early 20th century by anthropologists researching native societies in developing countries. It is now the principal research method used by ethnographers -- specialists within the fields of anthropology and sociology who focus on recording the details of social life occurring in a setting, community, group, or society. The ethnographer, who often lives among the members for months or years, attempts to build trusting relationships so that he or she becomes part of the social setting. As the ethnographer gains the confidence and trust of the members, many will speak and behave in a natural manner in the presence of the ethnographer.
Data from participant observation studies can take several forms:
There are a number of advantages and disadvantages to direct and participant observation studies. Here is a list of some of both. While the advantages and disadvantages apply to both types of studies, their impact and importance may not be the same across the two. For example, researchers engaged in both types of observation will develop a rich, deep understanding of the members of the group and the setting in which social interactions occur, but researchers engaged in participant observation research may gain an even deeper understanding. Participant observers also have a greater chance of witnessing a wider range of behaviors and events than those engaged in direct observation.
Advantages of observation studies (observational research):
Disadvantages of observation studies:
Qualitative interviews are a type of field research method that elicits information and data by directly asking questions of individuals. There are three primary types of qualitative interviews: informal (conversational), semi-structured, and standardized, open-ended. Each is described briefly below along with advantages and disadvantages.
Advantages of informal interviewing:
Disadvantages of informal interviewing:
Advantages of semi-structured interviewing:
Disadvantages of semi-structured interviewing:
Advantages of standardized interviewing:
Disadvantages of standardized interviewing:
Both standardized and semi-structured interviews involve formally recruiting participants and are typically tape-recorded. The researcher should obtain informed consent from the interviewee before starting the interview. Additionally, the researcher may write a separate field note to describe the interviewee's reactions to the interview, or events that occurred before or after the interview.
See the following for additional information about field research and qualitative research methods.