Quasi-Experimental Design | Definition, Types & Examples

Published on July 31, 2020 by Lauren Thomas. Revised on January 22, 2024.

Like a true experiment, a quasi-experimental design aims to establish a cause-and-effect relationship between an independent and dependent variable.

However, unlike a true experiment, a quasi-experiment does not rely on random assignment. Instead, subjects are assigned to groups based on non-random criteria.

Quasi-experimental design is a useful tool in situations where true experiments cannot be used for ethical or practical reasons.

There are several common differences between true and quasi-experimental designs.

Feature | True experimental design | Quasi-experimental design
Assignment to treatment | The researcher randomly assigns subjects to control and treatment groups. | Some other, non-random method is used to assign subjects to groups.
Control over treatment | The researcher usually designs the treatment. | The researcher often does not have control over the treatment, but instead studies pre-existing groups that received different treatments after the fact.
Use of control groups | Requires the use of control and treatment groups. | Control groups are not required (although they are commonly used).

Example of a true experiment vs a quasi-experiment

However, for ethical reasons, the directors of the mental health clinic may not give you permission to randomly assign their patients to treatments. In this case, you cannot run a true experiment.

Instead, you can use a quasi-experimental design.

You can use these pre-existing groups to study the symptom progression of the patients treated with the new therapy versus those receiving the standard course of treatment.

Many types of quasi-experimental designs exist. Here we explain three of the most common types: nonequivalent groups design, regression discontinuity, and natural experiments.

Nonequivalent groups design

In a nonequivalent groups design, the researcher chooses existing groups that appear similar, but where only one of the groups experiences the treatment.

In a true experiment with random assignment, the control and treatment groups are considered equivalent in every way other than the treatment. But in a quasi-experiment where the groups are not random, they may differ in other ways: they are nonequivalent groups.

When using this kind of design, researchers try to account for any confounding variables by controlling for them in their analysis or by choosing groups that are as similar as possible.

This is the most common type of quasi-experimental design.

Regression discontinuity

Many potential treatments that researchers wish to study are designed around an essentially arbitrary cutoff, where those above the threshold receive the treatment and those below it do not.

Near this threshold, the differences between the two groups are often so minimal as to be nearly nonexistent. Therefore, researchers can use individuals just below the threshold as a control group and those just above as a treatment group.

However, since the exact cutoff score is arbitrary, the students near the threshold (those who just barely pass the exam and those who fail by a very small margin) tend to be very similar, with the small differences in their scores mostly due to random chance. You can therefore attribute outcome differences between them mainly to the school they attended.
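The logic of comparing units just above and just below a cutoff can be sketched with simulated data. Everything in the block below is an assumption made for illustration: the cutoff of 50, the built-in effect of 10, the slope of 0.2, and the bandwidth of 2 points are invented, and a real regression discontinuity analysis would usually fit a regression on each side of the threshold rather than compare raw means.

```python
import random
from statistics import mean

CUTOFF = 50.0       # hypothetical threshold on the assignment score
TRUE_EFFECT = 10.0  # effect built into the simulated data (an assumption)

rng = random.Random(0)
records = []
for _ in range(20000):
    score = rng.uniform(0, 100)
    treated = score >= CUTOFF
    # The outcome rises gently with the score, plus a jump for treated units.
    outcome = 0.2 * score + (TRUE_EFFECT if treated else 0.0) + rng.gauss(0, 2)
    records.append((score, treated, outcome))

def difference_in_means(rows):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)

bandwidth = 2.0  # how close to the cutoff a unit must be to count as "near"
near_cutoff = [r for r in records if abs(r[0] - CUTOFF) < bandwidth]

naive_estimate = difference_in_means(records)      # compares very different units
local_estimate = difference_in_means(near_cutoff)  # compares similar units

print(f"all units: {naive_estimate:.2f}")
print(f"within {bandwidth} points of the cutoff: {local_estimate:.2f}")
```

In this simulation the comparison across all units mixes the treatment effect with the underlying trend in the score, while the comparison near the cutoff stays much closer to the effect that was built in.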

Natural experiments

In both laboratory and field experiments, researchers normally control which group the subjects are assigned to. In a natural experiment, an external event or situation (“nature”) results in the random or random-like assignment of subjects to the treatment group.

Even though some use random assignment, natural experiments are not considered to be true experiments because they are observational in nature.

Although the researchers have no control over the independent variable, they can exploit this event after the fact to study the effect of the treatment.

However, as they could not afford to cover everyone who they deemed eligible for the program, they instead allocated spots in the program based on a random lottery.

Although true experiments have higher internal validity, you might choose to use a quasi-experimental design for ethical or practical reasons.

Sometimes it would be unethical to provide or withhold a treatment on a random basis, so a true experiment is not feasible. In this case, a quasi-experiment can allow you to study the same causal relationship without the ethical issues.

The Oregon Health Study is a good example. It would be unethical to randomly provide some people with health insurance but purposely prevent others from receiving it solely for the purposes of research.

However, since the Oregon government faced financial constraints and decided to provide health insurance via lottery, studying this event after the fact is a much more ethical approach to studying the same problem.

True experimental design may be infeasible to implement or simply too expensive, particularly for researchers without access to large funding streams.

At other times, too much work is involved in recruiting and properly designing an experimental intervention for an adequate number of subjects to justify a true experiment.

In either case, quasi-experimental designs allow you to study the question by taking advantage of data that has previously been paid for or collected by others (often the government).

Quasi-experimental designs have various pros and cons compared to other types of studies.

Advantages:

  • Higher external validity than most true experiments, because they often involve real-world interventions instead of artificial laboratory settings.
  • Higher internal validity than other non-experimental types of research, because they allow you to better control for confounding variables than other types of studies do.

Disadvantages:

  • Lower internal validity than true experiments: without randomization, it can be difficult to verify that all confounding variables have been accounted for.
  • Retrospective data that has already been collected for other purposes can be inaccurate, incomplete or difficult to access.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.
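As a rough illustration of random assignment, the sketch below shuffles a sample and splits it in half, so that every participant has the same chance of landing in either group. The function name `randomly_assign` and the ten participant IDs are hypothetical, chosen only for this example.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into control and treatment groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}

# Hypothetical sample of 10 participant IDs.
groups = randomly_assign(range(1, 11), seed=42)
print(groups)
```

A quasi-experiment is exactly the situation in which a step like this is not available, and group membership is decided by something other than the researcher's random draw.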

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment.

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity as they can use real-world interventions instead of artificial laboratory settings.

Thomas, L. (2024, January 22). Quasi-Experimental Design | Definition, Types & Examples. Scribbr.

Learning objectives.

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design, then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.

Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean. This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher on average even if the training program has no effect.

A closely related concept, and an extremely important one in psychological research, is spontaneous remission. This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001). Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
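Regression to the mean is easy to see in a small simulation. The sketch below is only an illustration under invented assumptions: each simulated student has a stable ability (mean 70, standard deviation 10), each test score adds independent random noise, and no treatment of any kind is applied between the two tests.

```python
import random
from statistics import mean

rng = random.Random(1)
students = []
for _ in range(5000):
    ability = rng.gauss(70, 10)         # stable skill of the student
    test1 = ability + rng.gauss(0, 10)  # first test = skill + luck
    test2 = ability + rng.gauss(0, 10)  # retest = same skill + fresh luck
    students.append((test1, test2))

# Select the 10% of students with the lowest scores on the first test.
students.sort(key=lambda s: s[0])
lowest = students[: len(students) // 10]

first_mean = mean(s[0] for s in lowest)
retest_mean = mean(s[1] for s in lowest)
print(f"lowest scorers, first test: {first_mean:.1f}")
print(f"lowest scorers, retest:     {retest_mean:.1f}")
```

The selected group scores noticeably higher on the retest even though nothing was done to them, which is the pattern a pretest-posttest study could mistake for a treatment effect.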

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952). But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate without receiving psychotherapy. This suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323).

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980). They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.
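To put the 80% figure in perspective: if outcomes in both groups are assumed to be roughly normal with equal spread (an assumption made here for illustration, not a claim taken from the book), then saying that about 80% of treated participants do better than the average control participant is equivalent to a standardized mean difference of a little over 0.8 standard deviations. A minimal Python sketch of that conversion:

```python
from statistics import NormalDist

# Share of treated participants who outperform the average control participant.
share_above_control_mean = 0.80

# Under the normality assumption, the matching standardized mean difference
# is the z-score at that percentile of the standard normal distribution.
effect_size_d = NormalDist().inv_cdf(share_above_control_mean)
print(round(effect_size_d, 2))
```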

Figure: Hans Eysenck. In a classic 1952 article, researcher Hans Eysenck pointed out the shortcomings of the simple pretest-posttest design for evaluating the effectiveness of psychotherapy. (Image: Wikimedia Commons, CC BY-SA 3.0.)

Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979). Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.5 “A Hypothetical Interrupted Time-Series Design” shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

Figure 7.5 A Hypothetical Interrupted Time-Series Design

The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
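The contrast between the two panels can be reproduced with a few lines of Python. The weekly absence counts below are invented for illustration and are not the values plotted in the figure; the treatment is assumed to start after Week 7, as in the text.

```python
from statistics import mean

TREATMENT_AFTER_WEEK = 7  # attendance-taking begins after Week 7

def summarize(absences):
    """Compare a single pre/post pair with the full before/after averages."""
    pre = absences[:TREATMENT_AFTER_WEEK]
    post = absences[TREATMENT_AFTER_WEEK:]
    return {
        "single_pair_change": post[0] - pre[-1],  # Week 8 minus Week 7 only
        "mean_change": mean(post) - mean(pre),    # all weeks after minus all before
    }

# Hypothetical weekly absences for Weeks 1 to 14.
treatment_worked = [4, 8, 5, 7, 4, 6, 8, 2, 3, 1, 2, 3, 2, 1]
treatment_failed = [4, 8, 5, 7, 4, 6, 8, 3, 7, 5, 8, 4, 6, 9]

effect = summarize(treatment_worked)
no_effect = summarize(treatment_failed)
print(effect)
print(no_effect)
```

In the second series, the Week 7 to Week 8 comparison alone shows a drop of five absences, while the averages before and after the treatment are identical. That is the situation in which a simple pretest-posttest design would be misleading and the longer series is not.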

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.
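The comparison this design calls for is often summarized as a difference in differences: the change in the treatment group minus the change in the control group. The sketch below uses invented mean attitude scores (higher meaning more negative toward drugs); none of the numbers come from a real study.

```python
# Hypothetical mean attitude scores (higher = more negative toward drugs).
treatment_school = {"pretest": 5.0, "posttest": 6.5}
control_school = {"pretest": 5.25, "posttest": 5.75}

def change(group):
    """Posttest mean minus pretest mean for one group."""
    return group["posttest"] - group["pretest"]

treatment_change = change(treatment_school)  # treatment + history + maturation
control_change = change(control_school)      # history + maturation only

# Whatever both schools share (news events, growing up) cancels out here.
difference_in_differences = treatment_change - control_change
print(treatment_change, control_change, difference_in_differences)
```

The subtraction removes only influences that affect both schools equally. An event at one school but not the other would still be confounded with the treatment.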

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, it is the kind of experiment that Eysenck called for, and that has now been conducted many times, to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest designs, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.

Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

Discussion: Imagine that a group of obese children is recruited for a study in which their weight is measured, then they participate for 3 months in a program that encourages them to be more active, and finally their weight is measured again. Explain how each of the following might affect the results:

  • regression to the mean
  • spontaneous remission

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.

Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.

Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.

Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Research Methods in Psychology

Quasi-Experimental Design: Definition, Types, Examples

Appinio Research · 19.12.2023

Ever wondered how researchers uncover cause-and-effect relationships in the real world, where controlled experiments are often elusive? Quasi-experimental design holds the key. In this guide, we'll unravel the intricacies of quasi-experimental design, shedding light on its definition, purpose, and applications across various domains. Whether you're a student, a professional, or simply curious about the methods behind meaningful research, join us as we delve into the world of quasi-experimental design, making complex concepts sound simple and embarking on a journey of knowledge and discovery.

What is Quasi-Experimental Design?

Quasi-experimental design is a research methodology used to study the effects of independent variables on dependent variables when full experimental control is not possible or ethical. It falls between controlled experiments, where variables are tightly controlled, and purely observational studies, where researchers have little control over variables. Quasi-experimental design mimics some aspects of experimental research but lacks randomization.

The primary purpose of quasi-experimental design is to investigate cause-and-effect relationships between variables in real-world settings. Researchers use this approach to answer research questions, test hypotheses, and explore the impact of interventions or treatments when they cannot employ traditional experimental methods. Quasi-experimental studies aim to maximize internal validity and make meaningful inferences while acknowledging practical constraints and ethical considerations.

Quasi-Experimental vs. Experimental Design

It's essential to understand the distinctions between quasi-experimental and experimental design to appreciate the unique characteristics of each approach:

  • Randomization: In experimental design, random assignment of participants to groups is a defining feature. Quasi-experimental design, on the other hand, lacks randomization due to practical constraints or ethical considerations.
  • Control Groups: Experimental design typically includes control groups that are subjected to no treatment or a placebo. Quasi-experimental design may have comparison groups but lacks the same level of control.
  • Manipulation of IV: Experimental design involves the intentional manipulation of the independent variable. Quasi-experimental design often deals with naturally occurring independent variables.
  • Causal Inference: Experimental design allows for stronger causal inferences due to randomization and control. Quasi-experimental design permits causal inferences but with some limitations.

When to Use Quasi-Experimental Design?

A quasi-experimental design is particularly valuable in several situations:

  • Ethical Constraints: When manipulating the independent variable is ethically unacceptable or impractical, quasi-experimental design offers an alternative: studying naturally occurring variables.
  • Real-World Settings: When researchers want to study phenomena in real-world contexts, quasi-experimental design allows them to do so without artificial laboratory settings.
  • Limited Resources: In cases where resources are limited and conducting a controlled experiment is cost-prohibitive, quasi-experimental design can provide valuable insights.
  • Policy and Program Evaluation: Quasi-experimental design is commonly used in evaluating the effectiveness of policies, interventions, or programs that cannot be randomly assigned to participants.

Importance of Quasi-Experimental Design in Research

Quasi-experimental design plays a vital role in research for several reasons:

  • Addressing Real-World Complexities: It allows researchers to tackle complex real-world issues where controlled experiments are not feasible. This bridges the gap between controlled experiments and purely observational studies.
  • Ethical Research: It provides an ethical approach when manipulating variables or assigning treatments could harm participants or violate ethical standards.
  • Policy and Practice Implications: Quasi-experimental studies generate findings with direct applications in policy-making and practical solutions in fields such as education, healthcare, and social sciences.
  • Enhanced External Validity: Findings from quasi-experimental research often have high external validity, making them more applicable to broader populations and contexts.

By embracing the challenges and opportunities of quasi-experimental design, researchers can contribute valuable insights to their respective fields and drive positive changes in the real world.

Key Concepts in Quasi-Experimental Design

In quasi-experimental design, it's essential to grasp the fundamental concepts underpinning this research methodology. Let's explore these key concepts in detail.

Independent Variable

The independent variable (IV) is the factor you aim to study or manipulate in your research. Unlike controlled experiments, where you can directly manipulate the IV, quasi-experimental design often deals with naturally occurring variables. For example, if you're investigating the impact of a new teaching method on student performance, the teaching method is your independent variable.

Dependent Variable

The dependent variable (DV) is the outcome or response you measure to assess the effects of changes in the independent variable. Continuing with the teaching method example, the dependent variable would be the students' academic performance, typically measured using test scores, grades, or other relevant metrics.

Control Groups vs. Comparison Groups

While quasi-experimental design lacks the luxury of randomly assigning participants to control and experimental groups, you can still establish comparison groups to make meaningful inferences. Control groups consist of individuals who do not receive the treatment, while comparison groups are exposed to different levels or variations of the treatment. These groups help researchers gauge the effect of the independent variable.

Pre-Test and Post-Test Measures

In quasi-experimental design, it's common practice to collect data both before and after implementing the independent variable. The initial data (pre-test) serves as a baseline, allowing you to measure changes over time (post-test). This approach helps assess the impact of the independent variable more accurately. For instance, if you're studying the effectiveness of a new drug, you'd measure patients' health before administering the drug (pre-test) and afterward (post-test).

Threats to Internal Validity

Internal validity is crucial for establishing a cause-and-effect relationship between the independent and dependent variables. However, in a quasi-experimental design, several threats can compromise internal validity. These threats include:

  • Selection Bias: When non-randomized groups differ systematically in ways that affect the study's outcome.
  • History Effects: External events or changes over time that influence the results.
  • Maturation Effects: Natural changes or developments that occur within participants during the study.
  • Regression to the Mean: The tendency for extreme scores on a variable to move closer to the mean upon retesting.
  • Attrition and Mortality: The loss of participants over time, potentially skewing the results.
  • Testing Effects: The mere act of testing or assessing participants can impact their subsequent performance.

Understanding these threats is essential for designing and conducting quasi-experimental studies that yield valid and reliable results.

Randomization and Non-Randomization

In traditional experimental designs, randomization is a powerful tool for ensuring that groups are equivalent at the outset of a study. However, quasi-experimental design often involves non-randomization due to the nature of the research. This means that participants are not randomly assigned to treatment and control groups. Instead, researchers must employ various techniques to minimize biases and ensure that the groups are as similar as possible.

For example, if you are conducting a study on the effects of a new teaching method in a real classroom setting, you cannot randomly assign students to the treatment and control groups. Instead, you might use statistical methods to match students based on relevant characteristics such as prior academic performance or socioeconomic status. This matching process helps control for potential confounding variables, increasing the validity of your study.

Types of Quasi-Experimental Designs

In quasi-experimental design, researchers employ various approaches to investigate causal relationships and study the effects of independent variables when complete experimental control is challenging. Let's explore these types of quasi-experimental designs.

One-Group Posttest-Only Design

The One-Group Posttest-Only Design is one of the simplest forms of quasi-experimental design. In this design, a single group is exposed to the independent variable, and data is collected only after the intervention has taken place. Unlike controlled experiments, there is no comparison group. This design is useful when researchers cannot administer a pre-test or when it is logistically difficult to do so.

Example: Suppose you want to assess the effectiveness of a new time management seminar. You offer the seminar to a group of employees and measure their productivity levels immediately afterward to determine if there's an observable impact.

One-Group Pretest-Posttest Design

Similar to the One-Group Posttest-Only Design, this approach includes a pre-test measure in addition to the post-test. Researchers collect data both before and after the intervention. By comparing the pre-test and post-test results within the same group, you can gain a better understanding of the changes that occur due to the independent variable.

Example: If you're studying the impact of a stress management program on participants' stress levels, you would measure their stress levels before the program (pre-test) and after completing the program (post-test) to assess any changes.
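To make the pre-test/post-test comparison concrete, here is a minimal sketch of how the change within one group could be summarized. The stress scores and the function name `paired_change` are hypothetical, chosen only for illustration. The sketch computes each participant's change, the mean change, and a paired t statistic.

```python
import math
import statistics

def paired_change(pre, post):
    """Return the mean change, the SD of the changes, and the paired t statistic."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have one score per participant")
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the changes
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, sd_diff, t_stat

# Hypothetical stress scores (higher = more stress) for six participants.
pre_scores = [30, 28, 35, 32, 27, 31]
post_scores = [26, 27, 30, 29, 25, 28]

mean_diff, sd_diff, t_stat = paired_change(pre_scores, post_scores)
print(f"mean change = {mean_diff:.2f}, t = {t_stat:.2f}, df = {len(pre_scores) - 1}")
```

A large t statistic only shows that scores changed. Without a comparison group, the change could still reflect history, maturation, or testing effects rather than the program itself.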

Non-Equivalent Groups Design

The Non-Equivalent Groups Design involves multiple groups, but they are not randomly assigned. Instead, researchers must carefully match or control for relevant variables to minimize biases. This design is particularly useful when random assignment is not possible or ethical.

Example: Imagine you're examining the effectiveness of two teaching methods in two different schools. You can't randomly assign students to the schools, but you can match them on factors like age, prior academic performance, and socioeconomic status to make the groups more comparable. Matching narrows the gap between the groups, but it cannot guarantee that they are equivalent on unmeasured characteristics.

Time Series Design

Time Series Design is an approach where data is collected at multiple time points before and after the intervention. This design allows researchers to analyze trends and patterns over time, providing valuable insights into the sustained effects of the independent variable.

Example: If you're studying the impact of a new marketing campaign on product sales, you would collect sales data at regular intervals (e.g., monthly) before and after the campaign's launch to observe any long-term trends.

Regression Discontinuity Design

Regression Discontinuity Design is employed when participants are assigned to different groups based on a specific cutoff score or threshold. This design is often used in educational and policy research to assess the effects of interventions near a cutoff point.

Example: Suppose you're evaluating the impact of a scholarship program on students' academic performance. Students whose GPA is at or above a certain threshold receive the scholarship, while students just below it do not. Because students on either side of the cutoff are likely to be very similar, comparing their later performance helps assess the program's effect at the cutoff point.
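The comparison at the cutoff can be sketched in a few lines. The following is an illustrative simplification, not a full regression discontinuity analysis: it fits a separate straight line on each side of the cutoff within a chosen bandwidth and reports the gap between the two lines at the cutoff. The data, the bandwidth, and the names `fit_line` and `rd_estimate` are assumptions made for this example.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rd_estimate(running, outcome, cutoff, bandwidth):
    """Estimate the jump in the outcome at the cutoff from two local linear fits."""
    below = [(x, y) for x, y in zip(running, outcome) if cutoff - bandwidth <= x < cutoff]
    above = [(x, y) for x, y in zip(running, outcome) if cutoff <= x <= cutoff + bandwidth]
    if len(below) < 2 or len(above) < 2:
        raise ValueError("need at least two observations on each side of the cutoff")
    b0_below, b1_below = fit_line([x for x, _ in below], [y for _, y in below])
    b0_above, b1_above = fit_line([x for x, _ in above], [y for _, y in above])
    # Evaluate both fitted lines at the cutoff and take the difference.
    return (b0_above + b1_above * cutoff) - (b0_below + b1_below * cutoff)

# Hypothetical data: GPA is the running variable, and the later exam score
# rises with GPA and jumps by 5 points for students at or above a 3.0 cutoff.
gpas = [i / 10 for i in range(25, 36)]
scores = [50 + 10 * g + (5 if g >= 3.0 else 0) for g in gpas]

jump = rd_estimate(gpas, scores, cutoff=3.0, bandwidth=0.5)
print(f"estimated jump at the cutoff: {jump:.2f}")
```

In practice the estimate is sensitive to the bandwidth, so it is common to report results for several bandwidths.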

Propensity Score Matching

Propensity Score Matching is a technique used to create comparable treatment and control groups in non-randomized studies. Researchers calculate propensity scores based on participants' characteristics and match individuals in the treatment group to those in the control group with similar scores.

Example: If you're studying the effects of a new medication on patient outcomes, you would use propensity scores to match patients who received the medication with those who did not but have similar health profiles.
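The matching step can be illustrated with a short sketch. It assumes the propensity scores have already been estimated (for instance with a logistic regression, which is not shown here), and it uses greedy one-to-one nearest-neighbor matching with a caliper, which is only one of several possible matching rules. The patient IDs, scores, and the function name `match_on_propensity` are hypothetical.

```python
def match_on_propensity(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a unit id to its estimated propensity score.
    Returns a list of (treated_id, control_id) pairs; treated units with no
    control within the caliper stay unmatched.
    """
    available = dict(controls)
    pairs = []
    # Process treated units in order of increasing propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs

# Hypothetical propensity scores for patients who did (T) and did not (C)
# receive the medication.
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
controls = {"C1": 0.60, "C2": 0.33, "C3": 0.41, "C4": 0.10}

pairs = match_on_propensity(treated, controls)
print(pairs)
```

Here T3 stays unmatched because no control patient has a similar score. Dropping such cases improves comparability but narrows the population the results apply to.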

Interrupted Time Series Design

The Interrupted Time Series Design involves collecting data at multiple time points before and after the introduction of an intervention. However, in this design, the intervention occurs at a specific point in time, allowing researchers to assess its immediate impact.

Example: Let's say you're analyzing the effects of a new traffic management system on traffic accidents. You collect accident data before and after the system's implementation to observe any abrupt changes right after its introduction.
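One simple way to look for an abrupt change is to project the pre-intervention trend forward and compare it with what was actually observed. The sketch below does this with made-up monthly accident counts. It is a simplified stand-in for a full segmented regression, and the function names are illustrative.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def its_gap(pre_series, post_series):
    """Average gap between observed post values and the projected pre-trend."""
    pre_times = list(range(len(pre_series)))
    intercept, slope = fit_line(pre_times, pre_series)
    post_times = range(len(pre_series), len(pre_series) + len(post_series))
    projected = [intercept + slope * t for t in post_times]
    gaps = [observed - expected for observed, expected in zip(post_series, projected)]
    return statistics.mean(gaps), projected

# Hypothetical monthly accident counts before and after the new system.
accidents_before = [50, 49, 48, 47, 46, 45]
accidents_after = [38, 37, 36, 35]

avg_gap, projected = its_gap(accidents_before, accidents_after)
print(f"average gap versus projected trend: {avg_gap:.1f} accidents per month")
```

Accounting for the pre-existing downward trend matters here: a plain before/after comparison of averages would overstate the drop.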

Each of these quasi-experimental designs offers unique advantages and is best suited to specific research questions and scenarios. Choosing the right design is crucial for conducting robust and informative studies.

Advantages and Disadvantages of Quasi-Experimental Design

Quasi-experimental design offers a valuable research approach, but like any methodology, it comes with its own set of advantages and disadvantages. Let's explore these in detail.

Quasi-Experimental Design Advantages

Quasi-experimental design presents several advantages that make it a valuable tool in research:

  • Real-World Applicability:  Quasi-experimental studies often take place in real-world settings, making the findings more applicable to practical situations. Researchers can examine the effects of interventions or variables in the context where they naturally occur.
  • Ethical Considerations:  In situations where manipulating the independent variable in a controlled experiment would be unethical, quasi-experimental design provides an ethical alternative. For example, it would be unethical to assign participants to smoke for a study on the health effects of smoking, but you can study naturally occurring groups of smokers and non-smokers.
  • Cost-Efficiency:  Conducting Quasi-Experimental research is often more cost-effective than conducting controlled experiments. The absence of controlled environments and extensive manipulations can save both time and resources.

These advantages make quasi-experimental design an attractive choice for researchers facing practical or ethical constraints in their studies.

Quasi-Experimental Design Disadvantages

However, quasi-experimental design also comes with its share of challenges and disadvantages:

  • Limited Control:  Unlike controlled experiments, where researchers have full control over variables, quasi-experimental design lacks the same level of control. This limited control can result in confounding variables that make it difficult to establish causality.
  • Threats to Internal Validity:  Various threats to internal validity, such as selection bias, history effects, and maturation effects, can compromise the accuracy of causal inferences. Researchers must carefully address these threats to ensure the validity of their findings.
  • Causality Inference Challenges:  Establishing causality can be challenging in quasi-experimental design due to the absence of randomization and control. While you can make strong arguments for causality, it may not be as conclusive as in controlled experiments.
  • Potential Confounding Variables:  In a quasi-experimental design, it's often challenging to control for all possible confounding variables that may affect the dependent variable. This can lead to uncertainty in attributing changes solely to the independent variable.

Despite these disadvantages, quasi-experimental design remains a valuable research tool when used judiciously and with a keen awareness of its limitations. Researchers should carefully consider their research questions and the practical constraints they face before choosing this approach.

How to Conduct a Quasi-Experimental Study?

Conducting a Quasi-Experimental study requires careful planning and execution to ensure the validity of your research. Let's dive into the essential steps you need to follow when conducting such a study.

1. Define Research Questions and Objectives

The first step in any research endeavor is clearly defining your research questions and objectives. This involves identifying the independent variable (IV) and the dependent variable (DV) you want to study. What is the specific relationship you want to explore, and what do you aim to achieve with your research?

  • Specify Your Research Questions:  Start by formulating precise research questions that your study aims to answer. These questions should be clear, focused, and relevant to your field of study.
  • Identify the Independent Variable:  Define the variable you intend to manipulate or study in your research. Understand its significance in your study's context.
  • Determine the Dependent Variable:  Identify the outcome or response variable that will be affected by changes in the independent variable.
  • Establish Hypotheses (If Applicable):  If you have specific hypotheses about the relationship between the IV and DV, state them clearly. Hypotheses provide a framework for testing your research questions.

2. Select the Appropriate Quasi-Experimental Design

Choosing the right quasi-experimental design is crucial for achieving your research objectives. Select a design that aligns with your research questions and the available data. Consider factors such as the feasibility of implementing the design and the ethical considerations involved.

  • Evaluate Your Research Goals:  Assess your research questions and objectives to determine which type of quasi-experimental design is most suitable. Each design has its strengths and limitations, so choose one that aligns with your goals.
  • Consider Ethical Constraints:  Take into account any ethical concerns related to your research. Depending on your study's context, some designs may be more ethically sound than others.
  • Assess Data Availability:  Ensure you have access to the necessary data for your chosen design. Some designs may require extensive historical data, while others may rely on data collected during the study.

3. Identify and Recruit Participants

Selecting the right participants is a critical aspect of Quasi-Experimental research. The participants should represent the population you want to make inferences about, and you must address ethical considerations, including informed consent.

  • Define Your Target Population:  Determine the population that your study aims to generalize to. Your sample should be representative of this population.
  • Recruitment Process:  Develop a plan for recruiting participants. Depending on your design, you may need to reach out to specific groups or institutions.
  • Informed Consent:  Ensure that you obtain informed consent from participants. Clearly explain the nature of the study, potential risks, and their rights as participants.

4. Collect Data

Data collection is a crucial step in Quasi-Experimental research. You must adhere to a consistent and systematic process to gather relevant information before and after the intervention or treatment.

  • Pre-Test Measures:  If applicable, collect data before introducing the independent variable. Ensure that the pre-test measures are standardized and reliable.
  • Post-Test Measures:  After the intervention, collect post-test data using the same measures as the pre-test. This allows you to assess changes over time.
  • Maintain Data Consistency:  Ensure that data collection procedures are consistent across all participants and time points to minimize biases.

5. Analyze Data

Once you've collected your data, it's time to analyze it using appropriate statistical techniques. The choice of analysis depends on your research questions and the type of data you've gathered.

  • Statistical Analysis:  Use statistical software to analyze your data. Common techniques include t-tests, analysis of variance (ANOVA), regression analysis, and more, depending on the design and variables.
  • Control for Confounding Variables:  Be aware of potential confounding variables and include them in your analysis as covariates to ensure accurate results.

6. Interpret Results

With the analysis complete, you can interpret the results to draw meaningful conclusions about the relationship between the independent and dependent variables.

  • Examine Effect Sizes:  Assess the magnitude of the observed effects to determine their practical significance.
  • Consider Significance Levels:  Determine whether the observed results are statistically significant. Understand the p-values and their implications.
  • Compare Findings to Hypotheses:  Evaluate whether your findings support or fail to support your hypotheses, and what they imply for your research questions.
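As an illustration of examining effect sizes, the sketch below computes Cohen's d, one common standardized measure of the difference between two group means. The scores are hypothetical, and Cohen's d is only one of several effect size measures you might report.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical post-test scores for a treatment group and a comparison group.
treatment_scores = [78, 82, 85, 80, 75]
comparison_scores = [72, 76, 79, 74, 69]

d = cohens_d(treatment_scores, comparison_scores)
print(f"Cohen's d = {d:.2f}")
```

By a widely used rule of thumb, values around 0.2, 0.5, and 0.8 are read as small, medium, and large effects, though what counts as practically significant depends on the field.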

7. Draw Conclusions

Based on your analysis and interpretation of the results, draw conclusions about the research questions and objectives you set out to address.

  • Causal Inferences:  Discuss the extent to which your study allows for causal inferences. Be transparent about the limitations and potential alternative explanations for your findings.
  • Implications and Applications:  Consider the practical implications of your research. How do your findings contribute to existing knowledge, and how can they be applied in real-world contexts?
  • Future Research:  Identify areas for future research and potential improvements in study design. Highlight any limitations or constraints that may have affected your study's outcomes.

By following these steps meticulously, you can conduct a rigorous and informative Quasi-Experimental study that advances knowledge in your field of research.

Quasi-Experimental Design Examples

Quasi-experimental design finds applications in a wide range of research domains, including business-related and market research scenarios. Below, we delve into some detailed examples of how this research methodology is employed in practice:

Example 1: Assessing the Impact of a New Marketing Strategy

Suppose a company wants to evaluate the effectiveness of a new marketing strategy aimed at boosting sales. Conducting a controlled experiment may not be feasible due to the company's existing customer base and the challenge of randomly assigning customers to different marketing approaches. In this scenario, a quasi-experimental design can be employed.

  • Independent Variable:  The new marketing strategy.
  • Dependent Variable:  Sales revenue.
  • Design:  The company could implement the new strategy for one group of customers while maintaining the existing strategy for another group. Both groups are selected based on similar demographics and purchase history, reducing selection bias. Pre-implementation data (sales records) can serve as the baseline, and post-implementation data can be collected to assess the strategy's impact.

Example 2: Evaluating the Effectiveness of Employee Training Programs

In the context of human resources and employee development, organizations often seek to evaluate the impact of training programs. A randomized controlled trial (RCT) with random assignment may not be practical or ethical, as some employees may need specific training more than others. Instead, a quasi-experimental design can be employed.

  • Independent Variable:  Employee training programs.
  • Dependent Variable:  Employee performance metrics, such as productivity or quality of work.
  • Design:  The organization can offer training programs to employees who express interest or demonstrate specific needs, creating a self-selected treatment group. A comparable control group can consist of employees with similar job roles and qualifications who did not receive the training. Pre-training performance metrics can serve as the baseline, and post-training data can be collected to assess the impact of the training programs.

Example 3: Analyzing the Effects of a Tax Policy Change

In economics and public policy, researchers often examine the effects of tax policy changes on economic behavior. Conducting a controlled experiment in such cases is practically impossible. Therefore, a quasi-experimental design is commonly employed.

  • Independent Variable:  Tax policy changes (e.g., tax rate adjustments).
  • Dependent Variable:  Economic indicators, such as consumer spending or business investments.
  • Design:  Researchers can analyze data from different regions or jurisdictions where tax policy changes have been implemented. One region could represent the treatment group (with tax policy changes), while a similar region with no tax policy changes serves as the control group. By comparing economic data before and after the policy change in both groups, researchers can assess the impact of the tax policy changes.
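The before/after comparison across two regions described in this design is often analyzed as a difference-in-differences. A minimal sketch follows, using hypothetical quarterly spending indices. The calculation assumes that, without the policy change, both regions would have followed parallel trends.

```python
import statistics

def difference_in_differences(treat_pre, treat_post, control_pre, control_post):
    """Change in the treatment group minus change in the control group."""
    treat_change = statistics.mean(treat_post) - statistics.mean(treat_pre)
    control_change = statistics.mean(control_post) - statistics.mean(control_pre)
    return treat_change - control_change

# Hypothetical quarterly consumer spending indices.
region_a_before = [100, 102, 101, 103]  # region with the tax policy change
region_a_after = [108, 110, 109, 111]
region_b_before = [98, 99, 100, 101]    # similar region without the change
region_b_after = [101, 102, 103, 104]

did = difference_in_differences(
    region_a_before, region_a_after, region_b_before, region_b_after
)
print(f"difference-in-differences estimate: {did:.1f} index points")
```

Subtracting the control region's change removes influences that affected both regions equally, such as a nationwide economic upturn.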

These examples illustrate how quasi-experimental design can be applied in various research contexts, providing valuable insights into the effects of independent variables in real-world scenarios where controlled experiments are not feasible or ethical. By carefully selecting comparison groups and controlling for potential biases, researchers can draw meaningful conclusions and inform decision-making processes.

How to Publish Quasi-Experimental Research?

Publishing your Quasi-Experimental research findings is a crucial step in contributing to the academic community's knowledge. We'll explore the essential aspects of reporting and publishing your Quasi-Experimental research effectively.

Structuring Your Research Paper

When preparing your research paper, it's essential to adhere to a well-structured format to ensure clarity and comprehensibility. Here are key elements to include:

Title and Abstract

  • Title:  Craft a concise and informative title that reflects the essence of your study. It should capture the main research question or hypothesis.
  • Abstract:  Summarize your research in a structured abstract, including the purpose, methods, results, and conclusions. Ensure it provides a clear overview of your study.


Introduction

  • Background and Rationale:  Provide context for your study by discussing the research gap or problem your study addresses. Explain why your research is relevant and essential.
  • Research Questions or Hypotheses:  Clearly state your research questions or hypotheses and their significance.

Literature Review

  • Review of Related Work:  Discuss relevant literature that supports your research. Highlight studies with similar methodologies or findings and explain how your research fits within this context.

Methods

  • Participants:  Describe your study's participants, including their characteristics and how you recruited them.
  • Quasi-Experimental Design:  Explain your chosen design in detail, including the independent and dependent variables, procedures, and any control measures taken.
  • Data Collection:  Detail the data collection methods, instruments used, and any pre-test or post-test measures.
  • Data Analysis:  Describe the statistical techniques employed, including any control for confounding variables.

Results

  • Presentation of Findings:  Present your results clearly, using tables, graphs, and descriptive statistics where appropriate. Include p-values and effect sizes, if applicable.
  • Interpretation of Results:  Discuss the implications of your findings and how they relate to your research questions or hypotheses.

Discussion

  • Interpretation and Implications:  Analyze your results in the context of existing literature and theories. Discuss the practical implications of your findings.
  • Limitations:  Address the limitations of your study, including potential biases or threats to internal validity.
  • Future Research:  Suggest areas for future research and how your study contributes to the field.

Ethical Considerations in Reporting

Ethical reporting is paramount in Quasi-Experimental research. Ensure that you adhere to ethical standards, including:

  • Informed Consent:  Clearly state that informed consent was obtained from all participants, and describe the informed consent process.
  • Protection of Participants:  Explain how you protected the rights and well-being of your participants throughout the study.
  • Confidentiality:  Detail how you maintained privacy and anonymity, especially when presenting individual data.
  • Disclosure of Conflicts of Interest:  Declare any potential conflicts of interest that could influence the interpretation of your findings.

Common Pitfalls to Avoid

When reporting your Quasi-Experimental research, watch out for common pitfalls that can diminish the quality and impact of your work:

  • Overgeneralization:  Be cautious not to overgeneralize your findings. Clearly state the limits of your study and the populations to which your results can be applied.
  • Misinterpretation of Causality:  Clearly articulate the limitations in inferring causality in Quasi-Experimental research. Avoid making strong causal claims unless supported by solid evidence.
  • Ignoring Ethical Concerns:  Ethical considerations are paramount. Failing to report on informed consent, ethical oversight, and participant protection can undermine the credibility of your study.

Guidelines for Transparent Reporting

To enhance the transparency and reproducibility of your Quasi-Experimental research, consider adhering to established reporting guidelines, such as:

  • CONSORT Statement:  The CONSORT guidelines cover transparent reporting of randomized controlled trials. They apply only if part of your study uses random assignment; for non-randomized designs, the TREND guidelines are usually the closer fit.
  • STROBE Statement:  For observational studies, the STROBE statement provides guidance on reporting essential elements.
  • PRISMA Statement:  If your research involves systematic reviews or meta-analyses, adhere to the PRISMA guidelines.
  • Transparent Reporting of Evaluations with Non-Randomized Designs (TREND):  TREND guidelines offer specific recommendations for transparently reporting non-randomized designs, including Quasi-Experimental research.

By following these reporting guidelines and maintaining the highest ethical standards, you can contribute to the advancement of knowledge in your field and ensure the credibility and impact of your Quasi-Experimental research findings.

Quasi-Experimental Design Challenges

Conducting a Quasi-Experimental study can be fraught with challenges that may impact the validity and reliability of your findings. We'll take a look at some common challenges and provide strategies on how you can address them effectively.

Selection Bias

Challenge:  Selection bias occurs when non-randomized groups differ systematically in ways that affect the study's outcome. This bias can undermine the validity of your research, as it implies that the groups are not equivalent at the outset of the study.

Addressing Selection Bias:

  • Matching:  Employ matching techniques to create comparable treatment and control groups. Match participants based on relevant characteristics, such as age, gender, or prior performance, to balance the groups.
  • Statistical Controls:  Use statistical controls to account for differences between groups. Include covariates in your analysis to adjust for potential biases.
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess how vulnerable your results are to selection bias. Explore different scenarios to understand the impact of potential bias on your conclusions.

History Effects

Challenge:  History effects refer to external events or changes over time that influence the study's results. These external factors can confound your research by introducing variables you did not account for.

Addressing History Effects:

  • Collect Historical Data:  Gather extensive historical data to understand trends and patterns that might affect your study. By having a comprehensive historical context, you can better identify and account for historical effects.
  • Control Groups:  Include control groups whenever possible. By comparing the treatment group's results to those of a control group, you can account for external influences that affect both groups equally.
  • Time Series Analysis:  If applicable, use time series analysis to detect and account for temporal trends. This method helps differentiate between the effects of the independent variable and external events.

Maturation Effects

Challenge:  Maturation effects occur when participants naturally change or develop throughout the study, independent of the intervention. These changes can confound your results, making it challenging to attribute observed effects solely to the independent variable.

Addressing Maturation Effects:

  • Randomization:  If possible, use randomization to distribute maturation effects evenly across treatment and control groups. Random assignment minimizes the impact of maturation as a confounding variable.
  • Matched Pairs:  If randomization is not feasible, employ matched pairs or statistical controls to ensure that both groups experience similar maturation effects.
  • Shorter Time Frames:  Limit the duration of your study to reduce the likelihood of significant maturation effects. Shorter studies are less susceptible to long-term maturation.

Regression to the Mean

Challenge:  Regression to the mean is the tendency for extreme scores on a variable to move closer to the mean upon retesting. This can create the illusion of an intervention's effectiveness when, in reality, it's a natural statistical phenomenon.

Addressing Regression to the Mean:

  • Use Control Groups:  Include control groups in your study to provide a baseline for comparison. This helps differentiate genuine intervention effects from regression to the mean.
  • Multiple Data Points:  Collect numerous data points to identify patterns and trends. If extreme scores regress to the mean in subsequent measurements, it may be indicative of regression to the mean rather than a true intervention effect.
  • Statistical Analysis:  Employ statistical techniques that account for regression to the mean when analyzing your data. Techniques like analysis of covariance (ANCOVA) can help control for baseline differences.
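A small simulation can show why regression to the mean is easy to mistake for an intervention effect. In the sketch below, every simulated participant is tested twice with no intervention at all, yet the group selected for extreme first scores looks different on the retest. All parameters (score mean, standard deviations, selection fraction, seed) are arbitrary values chosen for illustration.

```python
import random
import statistics

def simulate_regression_to_mean(n=10_000, true_sd=10.0, noise_sd=10.0,
                                top_fraction=0.1, seed=0):
    """Select the top scorers on test 1 and compare their means on both tests."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, true_sd) for _ in range(n)]
    # Each observed score is the stable true score plus independent noise.
    test1 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    n_selected = int(n * top_fraction)
    selected = sorted(range(n), key=lambda i: test1[i], reverse=True)[:n_selected]
    mean1 = statistics.mean(test1[i] for i in selected)
    mean2 = statistics.mean(test2[i] for i in selected)
    return mean1, mean2

mean_first, mean_retest = simulate_regression_to_mean()
print(f"selected group, first test: {mean_first:.1f}")
print(f"selected group, retest:     {mean_retest:.1f}")
```

The retest mean of the selected group moves back toward the overall mean of 50 even though nothing was done to these participants. A control group selected in the same way would show the same shift, which is why it helps separate this artifact from a real effect.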

Attrition and Mortality

Challenge:  Attrition is the loss of participants over the course of your study; in the research methods literature it is also called experimental mortality. High attrition rates can introduce bias and reduce the representativeness of your sample, especially when dropout rates differ between groups.

Addressing Attrition and Mortality:

  • Careful Participant Selection:  Select participants who are likely to remain engaged throughout the study. Consider factors that may lead to attrition, such as participant motivation and commitment.
  • Incentives:  Provide incentives or compensation to participants to encourage their continued participation.
  • Follow-Up Strategies:  Implement effective follow-up strategies to reduce attrition. Regular communication and reminders can help keep participants engaged.
  • Sensitivity Analysis:  Conduct sensitivity analyses to assess the impact of attrition and mortality on your results. Compare the characteristics of participants who dropped out with those who completed the study.

Testing Effects

Challenge:  Testing effects occur when the mere act of testing or assessing participants affects their subsequent performance. This phenomenon can lead to changes in the dependent variable that are unrelated to the independent variable.

Addressing Testing Effects:

  • Counterbalance Testing:  If possible, counterbalance the order of tests or assessments between treatment and control groups. This helps distribute the testing effects evenly across groups.
  • Control Groups:  Include control groups subjected to the same testing or assessment procedures as the treatment group. By comparing the two groups, you can determine whether testing effects have influenced the results.
  • Minimize Testing Frequency:  Limit the frequency of testing or assessments to reduce the likelihood of testing effects. Conducting fewer assessments can mitigate the impact of repeated testing on participants.

By proactively addressing these common challenges, you can enhance the validity and reliability of your Quasi-Experimental study, making your findings more robust and trustworthy.

Conclusion for Quasi-Experimental Design

Quasi-experimental design is a powerful tool that helps researchers investigate cause-and-effect relationships in real-world situations where strict control is not always possible. By understanding the key concepts, types of designs, and how to address challenges, you can conduct robust research and contribute valuable insights to your field. Remember, quasi-experimental design bridges the gap between controlled experiments and purely observational studies, making it an essential approach in various fields, from business and market research to public policy and beyond. So, whether you're a researcher, student, or decision-maker, the knowledge of quasi-experimental design empowers you to make informed choices and drive positive changes in the world.

A Modern Guide to Understanding and Conducting Research in Psychology

Chapter 7: Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix quasi means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook et al., 1979). Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned, which makes it likely that there are other differences between conditions, quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here, focusing first on nonequivalent groups, pretest-posttest, interrupted time series, and combination designs before turning to single subject designs (including reversal and multiple-baseline designs).

7.1 Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A nonequivalent groups design , then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.

7.2 Pretest-Posttest Design

In a pretest-posttest design, the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of a STEM education program on elementary school students’ attitudes toward science, technology, engineering, and math. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the STEM program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of history. Other things might have happened between the pretest and the posttest. Perhaps a science program aired on television and many of the students watched it, or perhaps a major scientific discovery occurred and many of the students heard about it. Another category of alternative explanations goes under the name of maturation. Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become more exposed to STEM subjects in class or become better reasoners, and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is regression to the mean . This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study because of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is spontaneous remission . This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all ( Posternak & Miller, 2001 ) . 
Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
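Regression to the mean can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: each observed score is modeled as a stable ability plus independent random luck, no treatment is applied at all, and every number (1,000 students, a mean of 50, standard deviations of 10, the lowest 10% selected) is hypothetical rather than taken from any study.

```python
import random
import statistics


def simulate_regression_to_mean(n_students=1000, cutoff_fraction=0.10, seed=0):
    """Return (pretest mean, posttest mean) for the lowest pretest scorers.

    Assumed model (hypothetical): observed score = stable ability + luck.
    No treatment effect is included anywhere in the simulation.
    """
    rng = random.Random(seed)
    abilities = [rng.gauss(50, 10) for _ in range(n_students)]
    pretest = [a + rng.gauss(0, 10) for a in abilities]
    posttest = [a + rng.gauss(0, 10) for a in abilities]

    # Select students only because of their extremely low pretest scores.
    n_selected = int(n_students * cutoff_fraction)
    lowest = sorted(range(n_students), key=lambda i: pretest[i])[:n_selected]

    pre_mean = statistics.mean(pretest[i] for i in lowest)
    post_mean = statistics.mean(posttest[i] for i in lowest)
    return pre_mean, post_mean


pre_mean, post_mean = simulate_regression_to_mean()
print(f"Selected group, pretest mean:  {pre_mean:.1f}")
print(f"Selected group, posttest mean: {post_mean:.1f}")
```

Because the selected students were chosen partly for their bad luck on the pretest, their average rises on the retest even though nothing was done to them. This is the pattern a researcher could mistake for a treatment effect.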

Finally, it is possible that the act of taking a pretest can sensitize participants to the measurement process or heighten their awareness of the variable under investigation. This heightened sensitivity, called a testing effect , can subsequently lead to changes in their posttest responses, even in the absence of any external intervention effect.

7.3 Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this is “interrupted” by a treatment. In a recent COVID-19 study, the intervention involved the implementation of state-issued mask mandates and restrictions on on-premises restaurant dining. The researchers examined the impact of these measures on COVID-19 cases and deaths (Guy Jr et al., 2021). Since there was a rapid reduction in daily case and death growth rates following the implementation of mask mandates, and this effect persisted for an extended period, the researchers concluded that the implementation of mask mandates was the cause of the decrease in COVID-19 transmission. This study employed an interrupted time-series design, similar to a pretest-posttest design, as it involved measuring the outcomes before and after the intervention. However, unlike the pretest-posttest design, it incorporated multiple measurements before and after the intervention, providing a more comprehensive analysis of the policy impacts.

Figure 7.1 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of Figure 7.1 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of Figure 7.1 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.

Two line graphs. The x-axes on both are labeled Week and range from 0 to 14. The y-axes on both are labeled Absences and range from 0 to 8. Between weeks 7 and 8 a vertical dotted line indicates when a treatment was introduced. Both graphs show generally high levels of absences from weeks 1 through 7 (before the treatment) and only 2 absences in week 8 (the first observation after the treatment). The top graph shows the absence level staying low from weeks 9 to 14. The bottom graph shows the absence level for weeks 9 to 14 bouncing around at the same high levels as before the treatment.

Figure 7.1: Hypothetical interrupted time-series design. The top panel shows data that suggest that the treatment caused a reduction in absences. The bottom panel shows data that suggest that it did not.
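The advantage of multiple measurements can be shown with a short calculation. The sketch below uses made-up weekly absence counts that only loosely follow the pattern described for Figure 7.1; the specific values and the helper name `summarize` are assumptions for illustration. It compares a two-point pretest-posttest contrast (Week 7 versus Week 8) with the change in the average level across all weeks.

```python
from statistics import mean

# Hypothetical absences for weeks 1-14; the treatment starts after week 7.
treatment_worked = [6, 7, 5, 6, 7, 6, 7, 2, 1, 2, 1, 2, 2, 1]
treatment_failed = [6, 7, 5, 6, 7, 6, 7, 2, 7, 6, 7, 6, 8, 7]
LAST_BASELINE_WEEK = 7


def summarize(series, last_baseline_week=LAST_BASELINE_WEEK):
    """Contrast a single before/after pair with the full before/after averages."""
    before = series[:last_baseline_week]
    after = series[last_baseline_week:]
    return {
        # What a simple pretest-posttest design would see (week 7 vs. week 8).
        "two_point_change": after[0] - before[-1],
        # What the interrupted time series reveals across all observations.
        "mean_change": mean(after) - mean(before),
    }


for name, series in [("worked", treatment_worked), ("failed", treatment_failed)]:
    result = summarize(series)
    print(name, result["two_point_change"], round(result["mean_change"], 2))
```

In both hypothetical series the two-point contrast shows the same drop, so a simple pretest-posttest design could not tell them apart. Only the averages over the full series separate a sustained reduction from ordinary week-to-week variation.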

7.4 Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does not receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve more than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their current level of engagement in pro-environmental behaviors (e.g., recycling, eating less red meat, abstaining from single-use plastics), then are exposed to a pro-environmental program in which they learn about the effects of human-caused climate change on the planet, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to a pro-environmental program, and finally are given a posttest. Again, if students in the treatment condition become more involved in pro-environmental behaviors, this could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should engage in more pro-environmental behaviors than students in the control condition. But if it is a matter of history (e.g., news of a forest fire or drought) or maturation (e.g., improved reasoning or sense of responsibility), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a local heat wave with record high temperatures), so students at the first school would be affected by it while students at the other school would not.
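The logic of comparing improvement across the two groups can be written as a simple difference in gains. The sketch below is a minimal illustration with invented scores (counts of pro-environmental behaviors per week for five students in each school); the function names `gain` and `difference_in_gains` are hypothetical and chosen only for this example.

```python
from statistics import mean


def gain(pretest, posttest):
    """Average change from pretest to posttest within one group."""
    return mean(posttest) - mean(pretest)


def difference_in_gains(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """How much more the treatment group changed than the control group."""
    return gain(treat_pre, treat_post) - gain(ctrl_pre, ctrl_post)


# Hypothetical scores for five students per school.
treat_pre, treat_post = [3, 4, 2, 5, 3], [6, 7, 5, 7, 6]
ctrl_pre, ctrl_post = [3, 5, 2, 4, 3], [4, 6, 3, 5, 4]

print(f"Treatment gain: {gain(treat_pre, treat_post):.1f}")
print(f"Control gain:   {gain(ctrl_pre, ctrl_post):.1f}")
print(f"Difference:     {difference_in_gains(treat_pre, treat_post, ctrl_pre, ctrl_post):.1f}")
```

In this made-up example both groups improve, which is what history or maturation alone could produce. The quantity of interest is how much larger the treatment group's gain is than the control group's gain, although that difference can still reflect an event that affected only one school.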

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi-experiment. In fact, this kind of design has now been conducted many times to demonstrate the effectiveness of psychotherapy.


  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two college professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

7.5 Single-Subject Research

  • Explain what single-subject research is, including how it differs from other types of psychological research and who uses single-subject research and why.
  • Design simple single-subject studies using reversal and multiple-baseline designs.
  • Explain how single-subject research designs address the issue of internal validity.
  • Interpret the results of simple single-subject studies based on the visual inspection of graphed data.
  • Explain some of the points of disagreement between advocates of single-subject research and advocates of group research.

Researcher Vance Hall and his colleagues were faced with the challenge of increasing the extent to which six disruptive elementary school students stayed focused on their schoolwork ( Hall et al., 1968 ) . For each of several days, the researchers carefully recorded whether or not each student was doing schoolwork every 10 seconds during a 30-minute period. Once they had established this baseline, they introduced a treatment. The treatment was that when the student was doing schoolwork, the teacher gave him or her positive attention in the form of a comment like “good work” or a pat on the shoulder. The result was that all of the students dramatically increased their time spent on schoolwork and decreased their disruptive behavior during this treatment phase. For example, a student named Robbie originally spent 25% of his time on schoolwork and the other 75% “snapping rubber bands, playing with toys from his pocket, and talking and laughing with peers” (p. 3). During the treatment phase, however, he spent 71% of his time on schoolwork and only 29% on other activities. Finally, when the researchers had the teacher stop giving positive attention, the students all decreased their studying and increased their disruptive behavior. This was consistent with the claim that it was, in fact, the positive attention that was responsible for the increase in studying. This was one of the first studies to show that attending to positive behavior—and ignoring negative behavior—could be a quick and effective way to deal with problem behavior in an applied setting.

Figure 7.2: Single-subject research has shown that positive attention from a teacher for studying can increase studying and decrease disruptive behavior. Photo by Jerry Wang on Unsplash.

Most of this book is about what can be called group research, which typically involves studying a large number of participants and combining their data to draw general conclusions about human behavior. The study by Hall and his colleagues, in contrast, is an example of single-subject research, which typically involves studying a small number of participants and focusing closely on each individual. In this section, we consider this alternative approach. We begin with an overview of single-subject research, including some assumptions on which it is based, who conducts it, and why they do. We then look at some basic single-subject research designs and how the data from those designs are analyzed. Finally, we consider some of the strengths and weaknesses of single-subject research as compared with group research and see how these two approaches can complement each other.

Overview of Single-Subject Research

What Is Single-Subject Research?

Single-subject research is a type of quantitative, quasi-experimental research that involves studying in detail the behavior of each of a small number of participants. Note that the term single-subject does not mean that only one participant is studied; it is more typical for there to be somewhere between two and 10 participants. (This is why single-subject research designs are sometimes called small-n designs, where n is the statistical symbol for the sample size.) Single-subject research can be contrasted with group research , which typically involves studying large numbers of participants and examining their behavior primarily in terms of group means, standard deviations, and so on. The majority of this book is devoted to understanding group research, which is the most common approach in psychology. But single-subject research is an important alternative, and it is the primary approach in some areas of psychology.

Before continuing, it is important to distinguish single-subject research from two other approaches, both of which involve studying in detail a small number of participants. One is qualitative research, which focuses on understanding people’s subjective experience by collecting relatively unstructured data (e.g., detailed interviews) and analyzing those data using narrative rather than quantitative techniques. Single-subject research, in contrast, focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.

It is also important to distinguish single-subject research from case studies. A case study is a detailed description of an individual, which can include both qualitative and quantitative analyses. (Case studies that include only qualitative analyses can be considered a type of qualitative research.) The history of psychology is filled with influential case studies, such as Sigmund Freud’s description of “Anna O.” (see box “The Case of ‘Anna O.’”) and John Watson and Rosalie Rayner’s description of Little Albert (Watson & Rayner, 1920), who learned to fear a white rat, along with other furry objects, when the researchers made a loud noise while he was playing with the rat. Case studies can be useful for suggesting new research questions and for illustrating general principles. They can also help researchers understand rare phenomena, such as the effects of damage to a specific part of the human brain. As a general rule, however, case studies cannot substitute for carefully designed group or single-subject research studies. One reason is that case studies usually do not allow researchers to determine whether specific events are causally related, or even related at all. For example, if a patient is described in a case study as having been sexually abused as a child and then as having developed an eating disorder as a teenager, there is no way to determine whether these two events had anything to do with each other. A second reason is that an individual case can always be unusual in some way and therefore be unrepresentative of people more generally. Thus case studies have serious problems with both internal and external validity.

The Case of “Anna O.”

Sigmund Freud used the case of a young woman he called “Anna O.” to illustrate many principles of his theory of psychoanalysis ( Freud, 1957 ) . (Her real name was Bertha Pappenheim, and she was an early feminist who went on to make important contributions to the field of social work.) Anna had come to Freud’s colleague Josef Breuer around 1880 with a variety of odd physical and psychological symptoms. One of them was that for several weeks she was unable to drink any fluids. According to Freud,

She would take up the glass of water that she longed for, but as soon as it touched her lips she would push it away like someone suffering from hydrophobia.…She lived only on fruit, such as melons, etc., so as to lessen her tormenting thirst (p. 9).

But according to Freud, a breakthrough came one day while Anna was under hypnosis.

[S]he grumbled about her English “lady-companion,” whom she did not care for, and went on to describe, with every sign of disgust, how she had once gone into this lady’s room and how her little dog—horrid creature!—had drunk out of a glass there. The patient had said nothing, as she had wanted to be polite. After giving further energetic expression to the anger she had held back, she asked for something to drink, drank a large quantity of water without any difficulty, and awoke from her hypnosis with the glass at her lips; and thereupon the disturbance vanished, never to return.

Freud’s interpretation was that Anna had repressed the memory of this incident along with the emotion that it triggered and that this was what had caused her inability to drink. Furthermore, her recollection of the incident, along with her expression of the emotion she had repressed, caused the symptom to go away.

As an illustration of Freud’s theory, the case study of Anna O. is quite effective. As evidence for the theory, however, it is essentially worthless. The description provides no way of knowing whether Anna had really repressed the memory of the dog drinking from the glass, whether this repression had caused her inability to drink, or whether recalling this “trauma” relieved the symptom. It is also unclear from this case study how typical or atypical Anna’s experience was.

Figure 7.3: “Anna O.” was the subject of a famous case study used by Freud to illustrate the principles of psychoanalysis. Source: Wikimedia Commons

Assumptions of Single-Subject Research

Again, single-subject research involves studying a small number of participants and focusing intensively on the behavior of each one. But why take this approach instead of the group approach? There are two important assumptions underlying single-subject research, and it will help to consider them now.

First and foremost is the assumption that it is important to focus intensively on the behavior of individual participants. One reason for this is that group research can hide individual differences and generate results that do not represent the behavior of any individual. For example, a treatment that has a positive effect for half the people exposed to it but a negative effect for the other half would, on average, appear to have no effect at all. Single-subject research, however, would likely reveal these individual differences. A second reason to focus intensively on individuals is that sometimes it is the behavior of a particular individual that is primarily of interest. A school psychologist, for example, might be interested in changing the behavior of a particular disruptive student. Although previous published research (both single-subject and group research) is likely to provide some guidance on how to do this, conducting a study on this student would be more direct and probably more effective.
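The point about group averages hiding individual differences takes only a few lines of arithmetic to see. The change scores below are invented for illustration: half of ten hypothetical participants improve by five points and half decline by five points.

```python
from statistics import mean

# Hypothetical change scores (posttest minus pretest) for ten participants.
changes = [5, 5, 5, 5, 5, -5, -5, -5, -5, -5]

group_mean_change = mean(changes)
helped = sum(1 for c in changes if c > 0)
harmed = sum(1 for c in changes if c < 0)

print(f"Group mean change: {group_mean_change}")
print(f"Participants who improved: {helped}, who got worse: {harmed}")
```

The group mean suggests no effect, yet every individual changed. Examining each participant's data separately, as single-subject research does, would reveal the two opposite responses.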

Another assumption of single-subject research is that it is important to study strong and consistent effects that have biological or social importance. Applied researchers, in particular, are interested in treatments that have substantial effects on important behaviors and that can be implemented reliably in the real-world contexts in which they occur. This is sometimes referred to as social validity ( Wolf, 1978 ) . The study by Hall and his colleagues, for example, had good social validity because it showed strong and consistent effects of positive teacher attention on a behavior that is of obvious importance to teachers, parents, and students. Furthermore, the teachers found the treatment easy to implement, even in their often chaotic elementary school classrooms.

Who Uses Single-Subject Research?

Single-subject research has been around as long as the field of psychology itself. In the late 1800s, one of psychology’s founders, Wilhelm Wundt, studied sensation and consciousness by focusing intensively on each of a small number of research participants. Hermann Ebbinghaus’s research on memory and Ivan Pavlov’s research on classical conditioning are other early examples, both of which are still described in almost every introductory psychology textbook.

In the middle of the 20th century, B. F. Skinner clarified many of the assumptions underlying single-subject research and refined many of its techniques (Skinner, 1938). He and other researchers then used it to describe how rewards, punishments, and other external factors affect behavior over time. This work was carried out primarily using nonhuman subjects, mostly rats and pigeons. This approach, which Skinner called the experimental analysis of behavior, remains an important subfield of psychology and continues to rely almost exclusively on single-subject research. For examples of this work, look at any issue of the Journal of the Experimental Analysis of Behavior. By the 1960s, many researchers were interested in using this approach to conduct applied research primarily with humans, a subfield now called applied behavior analysis (Baer et al., 1968). Applied behavior analysis plays a significant role in contemporary research on developmental disabilities, education, organizational behavior, and health, among many other areas. Examples of this work (including the study by Hall and his colleagues) can be found in the Journal of Applied Behavior Analysis. The single-subject approach can also be used by clinicians who take any theoretical perspective (behavioral, cognitive, psychodynamic, or humanistic) to study processes of therapeutic change with individual clients and to document their clients’ improvement (Kazdin, 2019).

Single-Subject Research Designs

General Features of Single-Subject Designs

Before looking at any specific single-subject research designs, it will be helpful to consider some features that are common to most of them. Many of these features are illustrated in Figure 7.4 , which shows the results of a generic single-subject study. First, the dependent variable (represented on the y-axis of the graph) is measured repeatedly over time (represented by the x-axis) at regular intervals. Second, the study is divided into distinct phases, and the participant is tested under one condition per phase. The conditions are often designated by capital letters: A, B, C, and so on. Thus Figure 7.4 represents a design in which the participant was tested first in one condition (A), then tested in another condition (B), and finally retested in the original condition (A). (This is called a reversal design and will be discussed in more detail shortly.)

Figure 7.4: Results of a generic single-subject study illustrating several principles of single-subject research.

Another important aspect of single-subject research is that the change from one condition to the next does not usually occur after a fixed amount of time or number of observations. Instead, it depends on the participant’s behavior. Specifically, the researcher waits until the participant’s behavior in one condition becomes fairly consistent from observation to observation before changing conditions. This is sometimes referred to as the steady state strategy ( Sidman, 1960 ) . The idea is that when the dependent variable has reached a steady state, then any change across conditions will be relatively easy to detect. Recall that we encountered this same principle when discussing experimental research more generally. The effect of an independent variable is easier to detect when the “noise” in the data is minimized.
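The steady state strategy is usually applied by visually inspecting the graphed data, and the text does not give a numerical rule. Purely as an illustration, the sketch below assumes one possible rule: responding counts as steady when each of the last few observations falls within a fixed percentage of their average. The function name `is_steady`, the window of five observations, and the 10% tolerance are all hypothetical choices, not an established criterion.

```python
from statistics import mean


def is_steady(observations, window=5, tolerance=0.10):
    """Assumed rule: the last `window` observations all lie within
    `tolerance` (as a proportion) of their own mean."""
    if len(observations) < window:
        return False  # not enough data yet to judge stability
    recent = observations[-window:]
    center = mean(recent)
    if center == 0:
        return all(x == 0 for x in recent)
    return all(abs(x - center) <= tolerance * abs(center) for x in recent)


# Hypothetical percentages of time on task during a baseline phase.
baseline = [20, 35, 28, 25, 26, 24, 25, 25]
print(is_steady(baseline[:4]))  # too few observations to decide
print(is_steady(baseline))      # recent observations cluster around 25
```

Under a rule like this, the researcher would keep collecting observations in the current phase until the check passes and only then change conditions. The number of observations per phase is therefore determined by the participant's behavior rather than fixed in advance.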

Reversal Designs

The most basic single-subject research design is the reversal design , also called the ABA design . During the first phase, A, a baseline is established for the dependent variable. This is the level of responding before any treatment is introduced, and therefore the baseline phase is a kind of control condition. When steady state responding is reached, phase B begins as the researcher introduces the treatment. Again, the researcher waits until that dependent variable reaches a steady state so that it is clear whether and how much it has changed. Finally, the researcher removes the treatment and again waits until the dependent variable reaches a steady state. This basic reversal design can also be extended with the reintroduction of the treatment (ABAB), another return to baseline (ABABA), and so on. The study by Hall and his colleagues was an ABAB reversal design (Figure 7.5 ).

Figure 7.5: An approximation of the results for Hall and colleagues’ participant Robbie in their ABAB reversal design. The percentage of time he spent studying (the dependent variable) was low during the first baseline phase, increased during the first treatment phase until it leveled off, decreased during the second baseline phase, and again increased during the second treatment phase.

Why is the reversal—the removal of the treatment—considered to be necessary in this type of design? If the dependent variable changes after the treatment is introduced, it is not always clear that the treatment was responsible for the change. It is possible that something else changed at around the same time and that this extraneous variable is responsible for the change in the dependent variable. But if the dependent variable changes with the introduction of the treatment and then changes back with the removal of the treatment, it is much clearer that the treatment (and removal of the treatment) is the cause. In other words, the reversal greatly increases the internal validity of the study.
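The reasoning behind the reversal can be expressed as a check on phase averages: the dependent variable should rise each time the treatment is introduced and fall each time it is removed. The sketch below uses invented session data loosely patterned on an ABAB design; the values, and the assumption that the treatment is expected to increase the behavior, are hypothetical.

```python
from statistics import mean

# Hypothetical percent of time spent on schoolwork per session, by phase.
# "A" = baseline (no treatment), "B" = treatment.
abab_phases = [
    ("A", [24, 26, 25, 25]),
    ("B", [60, 70, 72, 71]),
    ("A", [45, 40, 38, 39]),
    ("B", [68, 72, 73, 72]),
]


def tracks_treatment(phases):
    """True if the phase mean rises on entering each B phase
    and falls on returning to each A phase."""
    means = [mean(values) for _, values in phases]
    labels = [label for label, _ in phases]
    for i in range(1, len(phases)):
        if labels[i] == "B" and not means[i] > means[i - 1]:
            return False
        if labels[i] == "A" and not means[i] < means[i - 1]:
            return False
    return True


for label, values in abab_phases:
    print(label, mean(values))
print("Behavior tracks the treatment:", tracks_treatment(abab_phases))
```

If the behavior had stayed high after the treatment was withdrawn, the check would fail. In that case the data could not distinguish a lasting treatment effect from some extraneous variable that happened to change at the same time.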

Multiple-Baseline Designs

There are two potential problems with the reversal design—both of which have to do with the removal of the treatment. One is that if a treatment is working, it may be unethical to remove it. For example, if a treatment seemed to reduce the incidence of self-injury in a developmentally disabled child, it would be unethical to remove that treatment just to show that the incidence of self-injury increases. The second problem is that the dependent variable may not return to baseline when the treatment is removed. For example, when positive attention for studying is removed, a student might continue to study at an increased rate. This could mean that the positive attention had a lasting effect on the student’s studying, which of course would be good, but it could also mean that the positive attention was not really the cause of the increased studying in the first place.

One solution to these problems is to use a multiple-baseline design, which is represented in Figure 7.6. In one version of the design, a baseline is established for each of several participants, and the treatment is then introduced for each one. In essence, each participant is tested in an AB design. The key to this design is that the treatment is introduced at a different time for each participant. The idea is that if the dependent variable changes when the treatment is introduced for one participant, it might be a coincidence. But if the dependent variable changes when the treatment is introduced for multiple participants, especially when the treatment is introduced at different times for the different participants, then it is less likely to be a coincidence.

Figure 7.6: Results of a generic multiple-baseline study. The multiple baselines can be for different participants, dependent variables, or settings. The treatment is introduced at a different time on each baseline.

As an example, consider a study by Scott Ross and Robert Horner (Ross et al., 2009). They were interested in how a school-wide bullying prevention program affected the bullying behavior of particular problem students. At each of three different schools, the researchers studied two students who had regularly engaged in bullying. During the baseline phase, they observed the students for 10-minute periods each day during lunch recess and counted the number of aggressive behaviors they exhibited toward their peers. (The researchers used handheld computers to help record the data.) After 2 weeks, they implemented the program at one school. After 2 more weeks, they implemented it at the second school. And after 2 more weeks, they implemented it at the third school. They found that the number of aggressive behaviors exhibited by each student dropped shortly after the program was implemented at his or her school. Notice that if the researchers had only studied one school or if they had introduced the treatment at the same time at all three schools, then it would be unclear whether the reduction in aggressive behaviors was due to the bullying program or something else that happened at about the same time it was introduced (e.g., a holiday, a television program, a change in the weather). But with their multiple-baseline design, this kind of coincidence would have to happen three separate times, an unlikely occurrence, to explain their results.
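The staggered structure of a multiple-baseline design can be sketched as follows. The counts and start positions are invented for illustration and are not the researchers' data; `baselines` and `change_at_start` are hypothetical names. Only the idea that the treatment starts at a different time on each baseline comes from the text.

```python
from statistics import mean

# Hypothetical counts of aggressive behaviors for one student per school.
# "start" is the index of the first observation made under the treatment.
baselines = {
    "School 1": {"start": 3, "counts": [6, 7, 6, 2, 1, 1, 2, 1, 1]},
    "School 2": {"start": 5, "counts": [5, 6, 6, 5, 6, 2, 1, 1, 1]},
    "School 3": {"start": 7, "counts": [7, 6, 7, 6, 7, 6, 7, 2, 1]},
}

def change_at_start(counts, start):
    """Mean under treatment minus mean during baseline for one series."""
    return mean(counts[start:]) - mean(counts[:start])

changes = {
    school: change_at_start(info["counts"], info["start"])
    for school, info in baselines.items()
}

for school, change in changes.items():
    print(f"{school}: change in mean count = {change:.2f}")

# The design's logic: a drop that follows each staggered start is harder
# to attribute to a single coincidental event.
print("Drop on every baseline:", all(change < 0 for change in changes.values()))
```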

Data Analysis in Single-Subject Research

In addition to its focus on individual participants, single-subject research differs from group research in the way the data are typically analyzed. As we have seen throughout the book, group research involves combining data across participants. Inferential statistics are used to help decide whether the result for the sample is likely to generalize to the population. Single-subject research, by contrast, relies heavily on a very different approach called visual inspection. This means plotting individual participants’ data as shown throughout this chapter, looking carefully at those data, and making judgments about whether and to what extent the independent variable had an effect on the dependent variable. Inferential statistics are typically not used.

In visually inspecting their data, single-subject researchers take several factors into account. One of them is changes in the level of the dependent variable from condition to condition. If the dependent variable is much higher or much lower in one condition than another, this suggests that the treatment had an effect. A second factor is trend, which refers to gradual increases or decreases in the dependent variable across observations. If the dependent variable begins increasing or decreasing with a change in conditions, then again this suggests that the treatment had an effect. It can be especially telling when a trend changes direction, for example, when an unwanted behavior is increasing during baseline but then begins to decrease with the introduction of the treatment. A third factor is latency, which is the time it takes for the dependent variable to begin changing after a change in conditions. In general, if a change in the dependent variable begins shortly after a change in conditions, this suggests that the treatment was responsible.
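Although level, trend, and latency are judged by eye, a rough numerical analogue can make the three ideas concrete. The sketch below rests on assumptions: level is taken as the condition mean, trend as the least-squares slope across observations, and latency as the number of treatment observations that pass before one falls outside the baseline range. These operational definitions, the function names, and the data are illustrative choices, not definitions from the text.

```python
from statistics import mean

def level(values):
    """Level: the mean of the dependent variable within one condition."""
    return mean(values)

def trend(values):
    """Trend: least-squares slope of the values against observation index."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    numerator = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values))
    denominator = sum((i - x_bar) ** 2 for i in range(n))
    return numerator / denominator

def latency(baseline, treatment):
    """Latency (assumed definition): observations elapsed under treatment
    before the first value outside the baseline range; None if none is."""
    low, high = min(baseline), max(baseline)
    for i, y in enumerate(treatment):
        if y < low or y > high:
            return i
    return None

# Invented data for illustration.
baseline = [10, 12, 11, 10, 12]
treatment = [18, 22, 25, 26, 26]

print("Levels:", level(baseline), level(treatment))
print("Trends:", trend(baseline), trend(treatment))
print("Latency:", latency(baseline, treatment))
```

A large difference in level, a steeper trend under treatment, and a latency of zero would together correspond to the kind of clear effect described above.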

In the top panel of Figure 7.7, there are fairly obvious changes in the level and trend of the dependent variable from condition to condition. Furthermore, the latencies of these changes are short; the change happens immediately. This pattern of results strongly suggests that the treatment was responsible for the changes in the dependent variable. In the bottom panel of Figure 7.7, however, the changes in level are fairly small. And although there appears to be an increasing trend in the treatment condition, it looks as though it might be a continuation of a trend that had already begun during baseline. This pattern of results strongly suggests that the treatment was not responsible for any changes in the dependent variable, at least not to the extent that single-subject researchers typically hope to see.

Figure 7.7: Visual inspection of the data suggests an effective treatment in the top panel but an ineffective treatment in the bottom panel.

The results of single-subject research can also be analyzed using statistical procedures, and this is becoming more common. There are many different approaches, and single-subject researchers continue to debate which are the most useful. One approach parallels what is typically done in group research. The mean and standard deviation of each participant’s responses under each condition are computed and compared, and inferential statistical tests such as the t test or analysis of variance are applied (Fisch, 2001). (Note that averaging across participants is less common.) Another approach is to compute the percentage of nonoverlapping data (PND) for each participant (Scruggs & Mastropieri, 2021). This is the percentage of responses in the treatment condition that are more extreme than the most extreme response in a relevant control condition. In the study of Hall and his colleagues, for example, all measures of Robbie’s study time in the first treatment condition were greater than the highest measure in the first baseline, for a PND of 100%. The greater the percentage of nonoverlapping data, the stronger the treatment effect. Still, formal statistical approaches to data analysis in single-subject research are generally considered a supplement to visual inspection, not a replacement for it.
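The PND calculation described above is simple enough to sketch directly. The function name `pnd`, the `expect_increase` flag, and the example numbers are assumptions made for illustration; the data are not taken from Hall and colleagues' study.

```python
def pnd(baseline, treatment, expect_increase=True):
    """Percentage of treatment observations that are more extreme than the
    most extreme baseline observation, in the expected direction."""
    if expect_increase:
        threshold = max(baseline)
        nonoverlapping = sum(1 for y in treatment if y > threshold)
    else:
        threshold = min(baseline)
        nonoverlapping = sum(1 for y in treatment if y < threshold)
    return 100 * nonoverlapping / len(treatment)

# Invented data: every treatment value exceeds the highest baseline value.
clear_effect = pnd([25, 30, 20, 25], [55, 70, 75, 75])

# Invented data: only two of four treatment values exceed the baseline maximum of 9.
partial_overlap = pnd([5, 6, 9, 5], [8, 10, 11, 7])

print(clear_effect, partial_overlap)
```

When the treatment is expected to reduce a behavior, passing `expect_increase=False` compares each treatment value with the lowest baseline value instead.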

The Single-Subject Versus Group “Debate”

Single-subject research is similar to group research—especially experimental group research—in many ways. They are both quantitative approaches that try to establish causal relationships by manipulating an independent variable, measuring a dependent variable, and controlling extraneous variables. As we will see, single-subject research and group research are probably best conceptualized as complementary approaches.

Data Analysis

One set of disagreements revolves around the issue of data analysis. Some advocates of group research worry that visual inspection is inadequate for deciding whether and to what extent a treatment has affected a dependent variable. One specific concern is that visual inspection is not sensitive enough to detect weak effects. A second is that visual inspection can be unreliable, with different researchers reaching different conclusions about the same set of data (Danov & Symons, 2008). A third is that the results of visual inspection, an overall judgment of whether or not a treatment was effective, cannot be clearly and efficiently summarized or compared across studies (unlike the measures of relationship strength typically used in group research).

In general, single-subject researchers share these concerns. However, they also argue that their use of the steady state strategy, combined with their focus on strong and consistent effects, minimizes most of them. If the effect of a treatment is difficult to detect by visual inspection because the effect is weak or the data are noisy, then single-subject researchers look for ways to increase the strength of the effect or reduce the noise in the data by controlling extraneous variables (e.g., by administering the treatment more consistently). If the effect is still difficult to detect, then they are likely to consider it neither strong enough nor consistent enough to be of further interest. Many single-subject researchers also point out that statistical analysis is becoming increasingly common and that many of them are using it as a supplement to visual inspection, especially for the purpose of comparing results across studies (Scruggs & Mastropieri, 2021).

Turning the tables, some advocates of single-subject research worry about the way that group researchers analyze their data. Specifically, they point out that focusing on group means can be highly misleading. Again, imagine that a treatment has a strong positive effect on half the people exposed to it and an equally strong negative effect on the other half. In a traditional between-subjects experiment, the positive effect on half the participants in the treatment condition would be statistically cancelled out by the negative effect on the other half. The mean for the treatment group would then be the same as the mean for the control group, making it seem as though the treatment had no effect when in fact it had a strong effect on every single participant!
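The cancelling-out argument can be checked with a small piece of arithmetic. The scores and effect sizes below are invented purely to illustrate the point, and the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical scores: every control participant scores 50.
control_scores = [50] * 10

# Assumed individual treatment effects: +10 for half of the treated
# participants and -10 for the other half (invented values).
individual_effects = [10] * 5 + [-10] * 5
treatment_scores = [50 + effect for effect in individual_effects]

mean_difference = mean(treatment_scores) - mean(control_scores)
affected = sum(1 for effect in individual_effects if effect != 0)

print("Difference between group means:", mean_difference)
print("Participants affected:", affected, "of", len(individual_effects))
```

The group means are identical even though every treated participant changed by 10 points, which is exactly the situation that comparing means alone would miss.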

But again, group researchers share this concern. Although they do focus on group statistics, they also emphasize the importance of examining distributions of individual scores. For example, if some participants were positively affected by a treatment and others negatively affected by it, this would produce a bimodal distribution of scores and could be detected by looking at a histogram of the data. The use of within-subjects designs is another strategy that allows group researchers to observe effects at the individual level and even to specify what percentage of individuals exhibit strong, medium, weak, and even negative effects.

External Validity

The second issue about which single-subject and group researchers sometimes disagree has to do with external validity—the ability to generalize the results of a study beyond the people and situation actually studied. In particular, advocates of group research point out the difficulty in knowing whether results for just a few participants are likely to generalize to others in the population. Imagine, for example, that in a single-subject study, a treatment has been shown to reduce self-injury for each of two developmentally disabled children. Even if the effect is strong for these two children, how can one know whether this treatment is likely to work for other developmentally disabled children?

Again, single-subject researchers share this concern. In response, they note that the strong and consistent effects they are typically interested in—even when observed in small samples—are likely to generalize to others in the population. Single-subject researchers also note that they place a strong emphasis on replicating their research results. When they observe an effect with a small sample of participants, they typically try to replicate it with another small sample—perhaps with a slightly different type of participant or under slightly different conditions. Each time they observe similar results, they rightfully become more confident in the generality of those results. Single-subject researchers can also point to the fact that the principles of classical and operant conditioning—most of which were discovered using the single-subject approach—have been successfully generalized across an incredibly wide range of species and situations.

And again turning the tables, single-subject researchers have concerns of their own about the external validity of group research. One extremely important point they make is that studying large groups of participants does not entirely solve the problem of generalizing to other individuals. Imagine, for example, a treatment that has been shown to have a small positive effect on average in a large group study. It is likely that although many participants exhibited a small positive effect, others exhibited a large positive effect, and still others exhibited a small negative effect. When it comes to applying this treatment to another large group, we can be fairly sure that it will have a small effect on average. But when it comes to applying this treatment to another individual, we cannot be sure whether it will have a small, a large, or even a negative effect. Another point that single-subject researchers make is that group researchers also face a similar problem when they study a single situation and then generalize their results to other situations. For example, researchers who conduct a study on the effect of cell phone use on drivers on a closed oval track probably want to apply their results to drivers in many other real-world driving situations. But notice that this requires generalizing from a single situation to a population of situations. Thus the ability to generalize is based on much more than just the sheer number of participants one has studied. It requires a careful consideration of the similarity of the participants and situations studied to the population of participants and situations that one wants to generalize to (Shadish et al., 2002).

Single-Subject and Group Research as Complementary Methods

As with quantitative and qualitative research, it is probably best to conceptualize single-subject research and group research as complementary methods that have different strengths and weaknesses and that are appropriate for answering different kinds of research questions (Kazdin, 2019). Single-subject research is particularly good for testing the effectiveness of treatments on individuals when the focus is on strong, consistent, and biologically or socially important effects. It is especially useful when the behavior of particular individuals is of interest. Clinicians who work with only one individual at a time may find that it is their only option for doing systematic quantitative research.

Group research, on the other hand, is good for testing the effectiveness of treatments at the group level. Among the advantages of this approach is that it allows researchers to detect weak effects, which can be of interest for many reasons. For example, finding a weak treatment effect might lead to refinements of the treatment that eventually produce a larger and more meaningful effect. Group research is also good for studying interactions between treatments and participant characteristics. For example, if a treatment is effective for those who are high in motivation to change and ineffective for those who are low in motivation to change, then a group design can detect this much more efficiently than a single-subject design. Group research is also necessary to answer questions that cannot be addressed using the single-subject approach, including questions about independent variables that cannot be manipulated (e.g., number of siblings, extroversion, culture).

  • Single-subject research—which involves testing a small number of participants and focusing intensively on the behavior of each individual—is an important alternative to group research in psychology.
  • Single-subject studies must be distinguished from case studies, in which an individual case is described in detail. Case studies can be useful for generating new research questions, for studying rare phenomena, and for illustrating general principles. However, they cannot substitute for carefully controlled experimental or correlational studies because they are low in internal and external validity.
  • Single-subject research designs typically involve measuring the dependent variable repeatedly over time and changing conditions (e.g., from baseline to treatment) when the dependent variable has reached a steady state. This approach allows the researcher to see whether changes in the independent variable are causing changes in the dependent variable.
  • Single-subject researchers typically analyze their data by graphing them and making judgments about whether the independent variable is affecting the dependent variable based on level, trend, and latency.
  • Differences between single-subject research and group research sometimes lead to disagreements between single-subject and group researchers. These disagreements center on the issues of data analysis and external validity (especially generalization to other people). Single-subject research and group research are probably best seen as complementary methods, with different strengths and weaknesses, that are appropriate for answering different kinds of research questions.
  • Does positive attention from a parent increase a child’s toothbrushing behavior?
  • Does self-testing while studying improve a student’s performance on weekly spelling tests?
  • Does regular exercise help relieve depression?
  • Practice: Create a graph that displays the hypothetical results for the study you designed in Exercise 1. Write a paragraph in which you describe what the results show. Be sure to comment on level, trend, and latency.
  • Discussion: Imagine you have conducted a single-subject study showing a positive effect of a treatment on the behavior of a man with social anxiety disorder. Your research has been criticized on the grounds that it cannot be generalized to others. How could you respond to this criticism?
  • Discussion: Imagine you have conducted a group study showing a positive effect of a treatment on the behavior of a group of people with social anxiety disorder, but your research has been criticized on the grounds that “average” effects cannot be generalized to individuals. How could you respond to this criticism?

7.6 Glossary

ABA design

The simplest reversal design, in which there is a baseline condition (A), followed by a treatment condition (B), followed by a return to baseline (A).

applied behavior analysis

A subfield of psychology that uses single-subject research and applies the principles of behavior analysis to real-world problems in areas that include education, developmental disabilities, organizational behavior, and health behavior.

baseline

A condition in a single-subject research design in which the dependent variable is measured repeatedly in the absence of any treatment. Most designs begin with a baseline condition, and many return to the baseline condition at least once.

case study

A detailed description of an individual case.

experimental analysis of behavior

A subfield of psychology founded by B. F. Skinner that uses single-subject research—often with nonhuman animals—to study relationships primarily between environmental conditions and objectively observable behaviors.

group research

A type of quantitative research that involves studying a large number of participants and examining their behavior in terms of means, standard deviations, and other group-level statistics.

interrupted time-series design

A research design in which a series of measurements of the dependent variable are taken both before and after a treatment.

item-order effect

The effect of responding to one survey item on responses to a later survey item.

maturation

Refers collectively to extraneous developmental changes in participants that can occur between a pretest and posttest or between the first and last measurements in a time series. It can provide an alternative explanation for an observed change in the dependent variable.

multiple-baseline design

A single-subject research design in which multiple baselines are established for different participants, different dependent variables, or different contexts and the treatment is introduced at a different time for each baseline.

naturalistic observation

An approach to data collection in which the behavior of interest is observed in the environment in which it typically occurs.

nonequivalent groups design

A between-subjects research design in which participants are not randomly assigned to conditions, usually because participants are in preexisting groups (e.g., students at different schools).

nonexperimental research

Research that lacks the manipulation of an independent variable or the random assignment of participants to conditions or orders of conditions.

open-ended item

A questionnaire item that asks a question and allows respondents to respond in whatever way they want.

percentage of nonoverlapping data

A statistic sometimes used in single-subject research. The percentage of observations in a treatment condition that are more extreme than the most extreme observation in a relevant baseline condition.

pretest-posttest design

A research design in which the dependent variable is measured (the pretest), a treatment is given, and the dependent variable is measured again (the posttest) to see if there is a change in the dependent variable from pretest to posttest.

quasi-experimental research

Research that involves the manipulation of an independent variable but lacks the random assignment of participants to conditions or orders of conditions. It is generally used in field settings to test the effectiveness of a treatment.

rating scale

An ordered set of response options to a closed-ended questionnaire item.

regression to the mean

The statistical fact that an individual who scores extremely on one occasion will tend to score less extremely on the next occasion.

respondent

A term often used to refer to a participant in survey research.

reversal design

A single-subject research design that begins with a baseline condition with no treatment, followed by the introduction of a treatment, and after that a return to the baseline condition. It can include additional treatment conditions and returns to baseline.

single-subject research

A type of quantitative research that involves examining in detail the behavior of each of a small number of participants.

single-variable research

Research that focuses on a single variable rather than on a statistical relationship between variables.

social validity

The extent to which a single-subject study focuses on an intervention that has a substantial effect on an important behavior and can be implemented reliably in the real-world contexts (e.g., by teachers in a classroom) in which that behavior occurs.

spontaneous remission

Improvement in a psychological or medical problem over time without any treatment.

steady state strategy

In single-subject research, allowing behavior to become fairly consistent from one observation to the next before changing conditions. This makes any effect of the treatment easier to detect.

survey research

A quantitative research approach that uses self-report measures and large, carefully selected samples.

testing effect

A bias in participants’ responses in which scores on the posttest are influenced by simple exposure to the pretest.

visual inspection

The primary approach to data analysis in single-subject research, which involves graphing the data and making a judgment as to whether and to what extent the independent variable affected the dependent variable.

Quasi-experimental Research: What It Is, Types & Examples

Quasi-experimental research is research that appears to be experimental but is not.

Like a true experiment, quasi-experimental research tries to demonstrate a cause-and-effect link between an independent and a dependent variable. Unlike a true experiment, however, a quasi-experiment does not depend on random assignment. Instead, subjects are sorted into groups based on non-random criteria.

What is Quasi-Experimental Research?

The prefix “quasi” means “resembling.” In quasi-experimental research, the independent variable is manipulated, but individuals are not randomly assigned to conditions or orders of conditions. As a result, quasi-experimental research is research that appears to be experimental but is not.

The directionality problem is avoided in quasi-experimental research because the independent variable is manipulated before the dependent variable is measured. However, because individuals are not randomly assigned, there are likely to be other differences across conditions, so the problem of confounding variables remains.

As a result, in terms of internal validity, quasi-experiments fall somewhere between correlational research and true experiments.

The key component of a true experiment is randomly allocated groups. This means that each person has an equal chance of being assigned to the experimental group or the control group.
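Random allocation of this kind can be sketched in a few lines. The function name `randomly_assign`, the participant labels, and the fixed seed are assumptions for illustration; a real study would follow its own randomization protocol.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups,
    so that every participant has the same chance of landing in either one."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs; the seed only makes the example repeatable.
patients = [f"patient_{i}" for i in range(1, 11)]
experimental_group, control_group = randomly_assign(patients, seed=42)

print("Experimental group:", experimental_group)
print("Control group:", control_group)
```

A quasi-experiment skips this step: group membership is decided by something other than chance, such as which therapist a patient already sees.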

Simply put, a quasi-experiment is not a true experiment because it does not feature randomly allocated groups. Why is random allocation so crucial, given that it is the only distinction between quasi-experimental and true experimental research?

Let’s use an example to illustrate the point. Suppose we want to discover how a new psychological therapy affects depressed patients. In a true experiment, you would randomly split the patients on the psych ward into two groups, with half receiving the new therapy and the other half receiving standard depression treatment.

The physicians would then compare the outcomes of the new therapy with the results of the standard treatment to see whether the new therapy is more effective. The doctors, however, are unlikely to agree to this true experiment, since they may consider it unethical to give one group the new therapy while withholding it from the other.

A quasi-experimental study is useful in this case. Instead of allocating patients at random, you look for pre-existing groups of psychotherapists in the hospital. Most likely, some counselors will be eager to try the new therapy, while others will prefer to stick to the old ways.

These pre-existing groups can be used to compare the symptom development of individuals who received the novel therapy with those who received the normal course of treatment, even though the groups weren’t chosen at random.

If you can adequately account for any substantial pre-existing differences between the groups, you can be fairly confident that differences in outcomes are attributable to the treatment and not to other extraneous variables.

As mentioned above, quasi-experimental research entails manipulating an independent variable without randomly assigning people to conditions or sequences of conditions. Non-equivalent groups designs, pretest-posttest designs, and regression discontinuity designs are a few of the essential types.

What are quasi-experimental research designs?

Quasi-experimental research designs are a type of research design that is similar to experimental designs but doesn’t give full control over the independent variable(s) like true experimental designs do.

In a quasi-experimental design, the researcher changes or watches an independent variable, but the participants are not put into groups at random. Instead, people are put into groups based on things they already have in common, like their age, gender, or how many times they have seen a certain stimulus.

Because the assignments are not random, it is harder to draw conclusions about cause and effect than in a real experiment. However, quasi-experimental designs are still useful when randomization is not possible or ethical.

The true experimental design may be impossible to accomplish or just too expensive, especially for researchers with few resources. Quasi-experimental designs enable you to investigate an issue by utilizing data that has already been paid for or gathered by others (often the government). 

Quasi-experimental designs tend to have higher external validity than most true experiments. Because they allow better control for confounding variables than other forms of studies, they also have higher internal validity than other non-experimental research, although lower than true experiments.

Is quasi-experimental research quantitative or qualitative?

Quasi-experimental research is a quantitative research method. It involves numerical data collection and statistical analysis. Quasi-experimental research compares groups with different circumstances or treatments to find cause-and-effect links. 

It draws statistical conclusions from quantitative data. Qualitative data can enhance quasi-experimental research by revealing participants’ experiences and opinions, but quantitative data is the method’s foundation.

Quasi-experimental research types

There are many different sorts of quasi-experimental designs. Three of the most popular varieties are described below: non-equivalent groups design, regression discontinuity, and natural experiments.

Non-equivalent groups design

Regression discontinuity

Natural experiments

However, because they couldn’t afford to pay everyone who qualified for the program, they had to use a random lottery to distribute slots.

Experts were able to investigate the program’s impact by using enrolled people as a treatment group and those who were qualified but did not win the lottery as a control group.

14 - Quasi-Experimental Research

from Part III - Data Collection

Published online by Cambridge University Press:  25 May 2023

In this chapter, we discuss the logic and practice of quasi-experimentation. Specifically, we describe four quasi-experimental designs – one-group pretest–posttest designs, non-equivalent group designs, regression discontinuity designs, and interrupted time-series designs – and their statistical analyses in detail. Both simple quasi-experimental designs and embellishments of these simple designs are presented. Potential threats to internal validity are illustrated along with means of addressing their potentially biasing effects so that these effects can be minimized. In contrast to quasi-experiments, randomized experiments are often thought to be the gold standard when estimating the effects of treatment interventions. However, circumstances frequently arise where quasi-experiments can usefully supplement randomized experiments or when quasi-experiments can fruitfully be used in place of randomized experiments. Researchers need to appreciate the relative strengths and weaknesses of the various quasi-experiments so they can choose among pre-specified designs or craft their own unique quasi-experiments.

Quasi-experimental design is a unique research methodology because it is characterized by what it lacks. For example, Abraham & MacDonald (2011) state:

"Quasi-experimental research is similar to experimental research in that there is manipulation of an independent variable. It differs from experimental research because either there is no control group, no random selection, no random assignment, and/or no active manipulation."

This type of research is often performed in cases where a control group cannot be created or random selection cannot be performed. This is often the case in certain medical and psychological studies. 

For more information on quasi-experimental design, review the resources below: 

Where to Start

Below are listed a few tools and online guides that can help you start your Quasi-experimental research. These include free online resources and resources available only through ISU Library.

  Quasi-Experimental Research Designs by Bruce A. Thyer This pocket guide describes the logic, design, and conduct of the range of quasi-experimental designs, encompassing pre-experiments, quasi-experiments making use of a control or comparison group, and time-series designs.
  Experimental and Quasi-Experimental Designs for Research by Donald T. Campbell; Julian C. Stanley. Call Number: Q175 C152e Written 1967 but still used heavily today, this book examines research designs for experimental and quasi-experimental research, with examples and judgments about each design's validity.

Online Resources

  • Quasi-Experimental Design From the Web Center for Social Research Methods, this is a very good overview of quasi-experimental design.
  • Experimental and Quasi-Experimental Research From Colorado State University.
  • Quasi-experimental design--Wikipedia, the free encyclopedia Wikipedia can be a useful place to start your research- check the citations at the bottom of the article for more information.
Child Care and Early Education Research Connections

Experiments and quasi-experiments.

This page includes an explanation of the types, key components, validity, ethics, and advantages and disadvantages of experimental design.

An experiment is a study in which the researcher manipulates the level of some independent variable and then measures the outcome. Experiments are powerful techniques for evaluating cause-and-effect relationships. Many researchers consider experiments the "gold standard" against which all other research designs should be judged. Experiments are conducted both in the laboratory and in real life situations.

Types of Experimental Design

There are two basic types of research design:

  • True experiments
  • Quasi-experiments

The purpose of both is to examine the cause of certain phenomena.

True experiments, in which all the important factors that might affect the phenomena of interest are completely controlled, are the preferred design. Often, however, it is not possible or practical to control all the key factors, so it becomes necessary to implement a quasi-experimental research design.

Similarities between true and quasi-experiments:

  • Study participants are subjected to some type of treatment or condition
  • Some outcome of interest is measured
  • The researchers test whether differences in this outcome are related to the treatment

Differences between true experiments and quasi-experiments:

  • In a true experiment, participants are randomly assigned to either the treatment or the control group, whereas they are not assigned randomly in a quasi-experiment
  • In a quasi-experiment, the control and treatment groups differ not only in terms of the experimental treatment they receive, but also in other, often unknown or unknowable, ways. Thus, the researcher must try to statistically control for as many of these differences as possible
  • Because control is lacking in quasi-experiments, there may be several "rival hypotheses" competing with the experimental manipulation as explanations for observed results

Key Components of Experimental Research Design

The manipulation of predictor variables.

In an experiment, the researcher manipulates the factor that is hypothesized to affect the outcome of interest. The factor that is being manipulated is typically referred to as the treatment or intervention. The researcher may manipulate whether research subjects receive a treatment (e.g., antidepressant medicine: yes or no) and the level of treatment (e.g., 50 mg, 75 mg, 100 mg, and 125 mg).

Suppose, for example, a group of researchers was interested in the causes of maternal employment. They might hypothesize that the provision of government-subsidized child care would promote such employment. They could then design an experiment in which some subjects would be provided the option of government-funded child care subsidies and others would not. The researchers might also manipulate the value of the child care subsidies in order to determine if higher subsidy values might result in different levels of maternal employment.

Random Assignment

  • Study participants are randomly assigned to different treatment groups
  • All participants have the same chance of being in a given condition
  • Participants are assigned to either the group that receives the treatment, known as the "experimental group" or "treatment group," or to the group which does not receive the treatment, referred to as the "control group"
  • Random assignment neutralizes factors other than the independent and dependent variables, making it possible to directly infer cause and effect
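
The random assignment described in the points above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `randomly_assign`, the participant IDs, and the fixed seed are all assumptions made for the example.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seed only makes the example reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)          # every ordering is equally likely
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# Hypothetical participant IDs 1..20
treatment_group, control_group = randomly_assign(range(1, 21), seed=42)
print("treatment:", sorted(treatment_group))
print("control:  ", sorted(control_group))
```

Because the split depends only on the shuffle, no characteristic of a participant can influence which group they end up in.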

Random Sampling

Traditionally, experimental researchers have used convenience sampling to select study participants. However, as research methods have become more rigorous, and the problems with generalizing from a convenience sample to the larger population have become more apparent, experimental researchers are increasingly turning to random sampling. In experimental policy research studies, participants are often randomly selected from program administrative databases and randomly assigned to the control or treatment groups.

Validity of Results

The two types of validity of experiments are internal and external. It is often difficult to achieve both in social science research experiments.

Internal Validity

  • When an experiment is internally valid, we are certain that the independent variable (e.g., child care subsidies) caused the outcome of the study (e.g., maternal employment)
  • When subjects are randomly assigned to treatment or control groups, we can assume that the independent variable caused the observed outcomes because the two groups should not have differed from one another at the start of the experiment
  • For example, take the child care subsidy example above. Since research subjects were randomly assigned to the treatment (child care subsidies available) and control (no child care subsidies available) groups, the two groups should not have differed at the outset of the study. If, after the intervention, mothers in the treatment group were more likely to be working, we can assume that the availability of child care subsidies promoted maternal employment

One potential threat to internal validity in experiments occurs when participants either drop out of the study or refuse to participate in the study. If particular types of individuals drop out or refuse to participate more often than individuals with other characteristics, this is called differential attrition. For example, suppose an experiment was conducted to assess the effects of a new reading curriculum. If the new curriculum was so tough that many of the slowest readers dropped out of school, the school with the new curriculum would experience an increase in the average reading scores. The reason they experienced an increase in reading scores, however, is because the worst readers left the school, not because the new curriculum improved students' reading skills.
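
The reading-curriculum example can be made concrete with a small simulation. The sketch below assumes hypothetical scores drawn from a normal distribution and assumes that exactly the 40 weakest readers leave; the curriculum itself has no effect on anyone's score.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical reading scores for 200 students; the curriculum changes nothing.
scores = [rng.gauss(70, 10) for _ in range(200)]

# Assumption for illustration: the 40 weakest readers drop out of the school.
ranked = sorted(scores)
remaining = ranked[40:]

mean_before = statistics.mean(scores)
mean_after = statistics.mean(remaining)
print(f"mean before attrition: {mean_before:.1f}")
print(f"mean after attrition:  {mean_after:.1f}")
```

The average rises even though no individual score changed, which is exactly the pattern that differential attrition can produce.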

External Validity

  • External validity is also of particular concern in social science experiments
  • It can be very difficult to generalize experimental results to groups that were not included in the study
  • Studies that randomly select participants from the most diverse and representative populations are more likely to have external validity
  • The use of random sampling techniques makes it easier to generalize the results of studies to other groups

For example, a research study shows that a new curriculum improved reading comprehension of third-grade children in Iowa. To assess the study's external validity, you would ask whether this new curriculum would also be effective with third graders in New York or with children in other elementary grades.

Glossary terms related to validity:

  • internal validity
  • external validity
  • differential attrition

It is particularly important in experimental research to follow ethical guidelines. Protecting the health and safety of research subjects is imperative. To ensure subject safety, all researchers should have their projects reviewed by an Institutional Review Board (IRB). The National Institutes of Health supplies strict guidelines for project approval. Many of these guidelines are based on the Belmont Report.

The basic ethical principles:

  • Respect for persons  -- requires that research subjects are not coerced into participating in a study and requires the protection of research subjects who have diminished autonomy
  • Beneficence  -- requires that experiments do not harm research subjects, and that researchers minimize the risks for subjects while maximizing the benefits for them
  • Justice  -- requires that all forms of differential treatment among research subjects be justified

Advantages and Disadvantages of Experimental Design

Advantages: The environment in which the research takes place can often be carefully controlled. Consequently, it is easier to estimate the true effect of the variable of interest on the outcome of interest.

Disadvantages: It is often difficult to ensure the external validity of the experiment, due to the frequently nonrandom selection processes and the artificial nature of the experimental context.

Experimental vs Quasi-Experimental Design: Which to Choose?

Here’s a table that summarizes the similarities and differences between an experimental and a quasi-experimental study design:

Aspect | Experimental study (a.k.a. randomized controlled trial) | Quasi-experimental study
Objective | Evaluate the effect of an intervention or a treatment | Evaluate the effect of an intervention or a treatment
How are participants assigned to groups? | Random assignment | Non-random assignment (participants are assigned according to their own choosing or that of the researcher)
Is there a control group? | Yes | Not always (although, if present, a control group will provide better evidence for the study results)
Is there any room for confounding? | No (although post-randomization confounding can still occur in randomized controlled trials) | Yes (however, statistical techniques can be used to study causal relationships in quasi-experiments)
Level of evidence | A randomized trial is at the highest level in the hierarchy of evidence | A quasi-experiment is one level below the experimental study in the hierarchy of evidence
Advantages | Minimizes bias and confounding | Can be used in situations where an experiment is not ethically or practically feasible; can work with smaller sample sizes than randomized trials
Limitations | High cost (as it generally requires a large sample size); ethical limitations; generalizability issues; sometimes practically infeasible | Lower ranking in the hierarchy of evidence, as losing the power of randomization makes the study more susceptible to bias and confounding

What is a quasi-experimental design?

A quasi-experimental design is a non-randomized study design used to evaluate the effect of an intervention. The intervention can be a training program, a policy change or a medical treatment.

Unlike a true experiment, in a quasi-experimental study the choice of who gets the intervention and who doesn’t is not randomized. Instead, the intervention can be assigned to participants according to their choosing or that of the researcher, or by using any method other than randomness.

Having a control group is not required, but if present, it provides a higher level of evidence for the relationship between the intervention and the outcome.

Examples of quasi-experimental designs include:

  • One-Group Posttest Only Design
  • Static-Group Comparison Design
  • One-Group Pretest-Posttest Design
  • Separate-Sample Pretest-Posttest Design

What is an experimental design?

An experimental design is a randomized study design used to evaluate the effect of an intervention. In its simplest form, the participants will be randomly divided into 2 groups:

  • A treatment group: where participants receive the new intervention whose effect we want to study.
  • A control or comparison group: where participants do not receive any intervention at all (or receive some standard intervention).

Randomization ensures that each participant has the same chance of receiving the intervention. Its objective is to equalize the 2 groups, and therefore, any observed difference in the study outcome afterwards will only be attributed to the intervention – i.e. it removes confounding.

Examples of experimental designs include:

  • Posttest-Only Control Group Design
  • Pretest-Posttest Control Group Design
  • Solomon Four-Group Design
  • Matched Pairs Design
  • Randomized Block Design

When to choose an experimental design over a quasi-experimental design?

Although many statistical techniques can be used to deal with confounding in a quasi-experimental study, in practice, randomization is still the best tool we have to study causal relationships.

Another problem with quasi-experiments is the natural progression of the disease or condition under study. When studying the effect of an intervention over time, one should consider natural changes, because these can be mistaken for changes in outcome caused by the intervention. Having a well-chosen control group helps deal with this issue.

So, if losing the element of randomness seems like an unwise step down in the hierarchy of evidence, why would we ever want to do it?

This is what we’re going to discuss next.

When to choose a quasi-experimental design over a true experiment?

The issue with randomization is that it is not always achievable.

So here are some cases where using a quasi-experimental design makes more sense than using an experimental one:

  • If being in one group is believed to be harmful for the participants, either because the intervention is harmful (e.g., randomizing people to smoking), or because the intervention has questionable efficacy, or, on the contrary, because it is believed to be so beneficial that it would be unethical to put people in the control group (e.g., randomizing people to receive an operation).
  • In cases where interventions act on a group of people in a given location, it becomes difficult to adequately randomize subjects (e.g., an intervention that reduces pollution in a given area).
  • When working with small sample sizes, as randomized controlled trials require a large sample size to account for heterogeneity among subjects (i.e., to evenly distribute confounding variables between the intervention and control groups).

Quasi-Experimental Design | Definition, Types & Examples

Published on 11 April 2022 by Lauren Thomas . Revised on 22 January 2024.

Like a true experiment , a quasi-experimental design aims to establish a cause-and-effect relationship between an independent and dependent variable .

However, unlike a true experiment, a quasi-experiment does not rely on random assignment . Instead, subjects are assigned to groups based on non-random criteria.

Quasi-experimental design is a useful tool in situations where true experiments cannot be used for ethical or practical reasons.

Table of contents

Differences between quasi-experiments and true experiments, types of quasi-experimental designs, when to use quasi-experimental design, advantages and disadvantages, frequently asked questions about quasi-experimental design.

There are several common differences between true and quasi-experimental designs.

Aspect | True experimental design | Quasi-experimental design
Assignment to treatment | The researcher randomly assigns subjects to control and treatment groups. | Some other, non-random method is used to assign subjects to groups.
Control over treatment | The researcher usually controls treatment delivery. | The researcher often does not control treatment, but instead studies pre-existing groups that received different treatments after the fact.
Use of control groups | Requires the use of control groups. | Control groups are not required (although they are commonly used).

Example of a true experiment vs a quasi-experiment

However, for ethical reasons, the directors of the mental health clinic may not give you permission to randomly assign their patients to treatments. In this case, you cannot run a true experiment.

Instead, you can use a quasi-experimental design.

You can use these pre-existing groups to study the symptom progression of the patients treated with the new therapy versus those receiving the standard course of treatment.

Many types of quasi-experimental designs exist. Here we explain three of the most common types: nonequivalent groups design, regression discontinuity, and natural experiments.

Nonequivalent groups design

In a nonequivalent groups design, the researcher chooses existing groups that appear similar, but where only one of the groups experiences the treatment.

In a true experiment with random assignment , the control and treatment groups are considered equivalent in every way other than the treatment. But in a quasi-experiment where the groups are not random, they may differ in other ways – they are nonequivalent groups .

When using this kind of design, researchers try to account for any confounding variables by controlling for them in their analysis or by choosing groups that are as similar as possible.

This is the most common type of quasi-experimental design.

Regression discontinuity

Many potential treatments that researchers wish to study are designed around an essentially arbitrary cutoff, where those above the threshold receive the treatment and those below it do not.

Near this threshold, the differences between the two groups are often so minimal as to be nearly nonexistent. Therefore, researchers can use individuals just below the threshold as a control group and those just above as a treatment group.

However, since the exact cutoff score is arbitrary, the students near the threshold – those who just barely pass the exam and those who fail by a very small margin – tend to be very similar, with the small differences in their scores mostly due to random chance. You can therefore conclude that any outcome differences must come from the school they attended.
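
One common way to put this idea into practice is to fit a separate straight line to the observations just below and just above the cutoff and compare the two predictions at the cutoff itself. The sketch below does this on simulated data. Everything in it is an assumption made for illustration: the cutoff of 60, the bandwidth of 10 points, the linear relationship between exam score and outcome, and the built-in jump of 5 points.

```python
import random
import statistics

rng = random.Random(7)   # fixed seed so the sketch is reproducible
CUTOFF = 60.0            # hypothetical passing score
BANDWIDTH = 10.0         # how far from the cutoff we still call "near"
TRUE_EFFECT = 5.0        # jump in the outcome built into the simulation

# Simulated students: exam score, then a later outcome that depends on the
# score and on whether the student scored at or above the cutoff.
data = []
for _ in range(4000):
    score = rng.uniform(0, 100)
    treated = score >= CUTOFF
    outcome = 20 + 0.3 * score + TRUE_EFFECT * treated + rng.gauss(0, 2)
    data.append((score, outcome))

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predicted_at_cutoff(points):
    slope, intercept = fit_line(points)
    return intercept + slope * CUTOFF

below = [(x, y) for x, y in data if CUTOFF - BANDWIDTH <= x < CUTOFF]
above = [(x, y) for x, y in data if CUTOFF <= x <= CUTOFF + BANDWIDTH]

rd_estimate = predicted_at_cutoff(above) - predicted_at_cutoff(below)
print(f"estimated jump at the cutoff: {rd_estimate:.2f}")
```

Fitting lines on each side, rather than simply comparing group means, keeps the underlying trend in scores from being mistaken for a treatment effect. The choice of bandwidth is a judgment call: a narrower window makes the two groups more alike but leaves fewer observations.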

Natural experiments

In both laboratory and field experiments, researchers normally control which group the subjects are assigned to. In a natural experiment, an external event or situation (‘nature’) results in the random or random-like assignment of subjects to the treatment group.

Even though some use random assignments, natural experiments are not considered to be true experiments because they are observational in nature.

Although the researchers have no control over the independent variable, they can exploit this event after the fact to study the effect of the treatment.

However, as they could not afford to cover everyone who they deemed eligible for the program, they instead allocated spots in the program based on a random lottery.

Although true experiments have higher internal validity , you might choose to use a quasi-experimental design for ethical or practical reasons.

Sometimes it would be unethical to provide or withhold a treatment on a random basis, so a true experiment is not feasible. In this case, a quasi-experiment can allow you to study the same causal relationship without the ethical issues.

The Oregon Health Study is a good example. It would be unethical to randomly provide some people with health insurance but purposely prevent others from receiving it solely for the purposes of research.

However, since the Oregon government faced financial constraints and decided to provide health insurance via lottery, studying this event after the fact is a much more ethical approach to studying the same problem.

True experimental design may be infeasible to implement or simply too expensive, particularly for researchers without access to large funding streams.

At other times, too much work is involved in recruiting and properly designing an experimental intervention for an adequate number of subjects to justify a true experiment.

In either case, quasi-experimental designs allow you to study the question by taking advantage of data that has previously been paid for or collected by others (often the government).

Quasi-experimental designs have various pros and cons compared to other types of studies.

  • Higher external validity than most true experiments, because they often involve real-world interventions instead of artificial laboratory settings.
  • Higher internal validity than other non-experimental types of research, because they allow you to better control for confounding variables than other types of studies do.
  • Lower internal validity than true experiments – without randomisation, it can be difficult to verify that all confounding variables have been accounted for.
  • The use of retrospective data that has already been collected for other purposes can be inaccurate, incomplete or difficult to access.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference between this and a true experiment is that the groups are not randomly assigned.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomisation. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

Chapter 7: Nonexperimental Research

Quasi-Experimental Research

Learning Objectives

  • Explain what quasi-experimental research is and distinguish it clearly from both experimental and correlational research.
  • Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one.

The prefix  quasi  means “resembling.” Thus quasi-experimental research is research that resembles experimental research but is not true experimental research. Although the independent variable is manipulated, participants are not randomly assigned to conditions or orders of conditions (Cook & Campbell, 1979). [1] Because the independent variable is manipulated before the dependent variable is measured, quasi-experimental research eliminates the directionality problem. But because participants are not randomly assigned—making it likely that there are other differences between conditions—quasi-experimental research does not eliminate the problem of confounding variables. In terms of internal validity, therefore, quasi-experiments are generally somewhere between correlational studies and true experiments.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. There are many different kinds of quasi-experiments, but we will discuss just a few of the most common ones here.

Nonequivalent Groups Design

Recall that when participants in a between-subjects experiment are randomly assigned to conditions, the resulting groups are likely to be quite similar. In fact, researchers consider them to be equivalent. When participants are not randomly assigned to conditions, however, the resulting groups are likely to be dissimilar in some ways. For this reason, researchers consider them to be nonequivalent. A  nonequivalent groups design , then, is a between-subjects design in which participants have not been randomly assigned to conditions.

Imagine, for example, a researcher who wants to evaluate a new method of teaching fractions to third graders. One way would be to conduct a study with a treatment group consisting of one class of third-grade students and a control group consisting of another class of third-grade students. This design would be a nonequivalent groups design because the students are not randomly assigned to classes by the researcher, which means there could be important differences between them. For example, the parents of higher achieving or more motivated students might have been more likely to request that their children be assigned to Ms. Williams’s class. Or the principal might have assigned the “troublemakers” to Mr. Jones’s class because he is a stronger disciplinarian. Of course, the teachers’ styles, and even the classroom environments, might be very different and might cause different levels of achievement or motivation among the students. If at the end of the study there was a difference in the two classes’ knowledge of fractions, it might have been caused by the difference between the teaching methods—but it might have been caused by any of these confounding variables.

Of course, researchers using a nonequivalent groups design can take steps to ensure that their groups are as similar as possible. In the present example, the researcher could try to select two classes at the same school, where the students in the two classes have similar scores on a standardized math test and the teachers are the same sex, are close in age, and have similar teaching styles. Taking such steps would increase the internal validity of the study because it would eliminate some of the most important confounding variables. But without true random assignment of the students to conditions, there remains the possibility of other important confounding variables that the researcher was not able to control.
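
A short simulation can show how much an uncontrolled confounding variable may distort the comparison. The sketch below is hypothetical throughout: it assumes a single unobserved trait ("motivation") that raises scores, a true treatment effect of 2 points, and a self-selection rule in which only above-average motivated students end up in the treatment class.

```python
import random
import statistics

def mean_difference(outcomes, treated):
    """Naive estimate: mean score of treated minus mean score of untreated."""
    treated_scores = [y for y, t in zip(outcomes, treated) if t]
    control_scores = [y for y, t in zip(outcomes, treated) if not t]
    return statistics.mean(treated_scores) - statistics.mean(control_scores)

def simulate(assignment, n=5000, true_effect=2.0, seed=1):
    """Simulate scores under 'random' or 'self_selected' assignment."""
    rng = random.Random(seed)
    outcomes, treated = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)       # unobserved confounding variable
        if assignment == "random":
            t = rng.random() < 0.5         # coin flip, unrelated to motivation
        else:
            t = motivation > 0             # motivated students get the treatment
        score = 50 + 5 * motivation + true_effect * t + rng.gauss(0, 5)
        outcomes.append(score)
        treated.append(t)
    return mean_difference(outcomes, treated)

random_estimate = simulate("random")
selected_estimate = simulate("self_selected")
print(f"random assignment estimate: {random_estimate:.2f}")
print(f"self-selected estimate:     {selected_estimate:.2f}")
```

Under random assignment the naive difference lands close to the true effect, while under self-selection it is inflated because the groups also differ in motivation.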

Pretest-Posttest Design

In a  pretest-posttest design , the dependent variable is measured once before the treatment is implemented and once after it is implemented. Imagine, for example, a researcher who is interested in the effectiveness of an antidrug education program on elementary school students’ attitudes toward illegal drugs. The researcher could measure the attitudes of students at a particular elementary school during one week, implement the antidrug program during the next week, and finally, measure their attitudes again the following week. The pretest-posttest design is much like a within-subjects experiment in which each participant is tested first under the control condition and then under the treatment condition. It is unlike a within-subjects experiment, however, in that the order of conditions is not counterbalanced because it typically is not possible for a participant to be tested in the treatment condition first and then in an “untreated” control condition.

If the average posttest score is better than the average pretest score, then it makes sense to conclude that the treatment might be responsible for the improvement. Unfortunately, one often cannot conclude this with a high degree of certainty because there may be other explanations for why the posttest scores are better. One category of alternative explanations goes under the name of  history . Other things might have happened between the pretest and the posttest. Perhaps an antidrug program aired on television and many of the students watched it, or perhaps a celebrity died of a drug overdose and many of the students heard about it. Another category of alternative explanations goes under the name of  maturation . Participants might have changed between the pretest and the posttest in ways that they were going to anyway because they are growing and learning. If it were a yearlong program, participants might become less impulsive or better reasoners and this might be responsible for the change.

Another alternative explanation for a change in the dependent variable in a pretest-posttest design is  regression to the mean . This refers to the statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion. For example, a bowler with a long-term average of 150 who suddenly bowls a 220 will almost certainly score lower in the next game. Her score will “regress” toward her mean score of 150. Regression to the mean can be a problem when participants are selected for further study  because  of their extreme scores. Imagine, for example, that only students who scored especially low on a test of fractions are given a special training program and then retested. Regression to the mean all but guarantees that their scores will be higher even if the training program has no effect. A closely related concept—and an extremely important one in psychological research—is  spontaneous remission . This is the tendency for many medical and psychological problems to improve over time without any form of treatment. The common cold is a good example. If one were to measure symptom severity in 100 common cold sufferers today, give them a bowl of chicken soup every day, and then measure their symptom severity again in a week, they would probably be much improved. This does not mean that the chicken soup was responsible for the improvement, however, because they would have been much improved without any treatment at all. The same is true of many psychological problems. A group of severely depressed people today is likely to be less depressed on average in 6 months. In reviewing the results of several studies of treatments for depression, researchers Michael Posternak and Ivan Miller found that participants in waitlist control conditions improved an average of 10 to 15% before they received any treatment at all (Posternak & Miller, 2001) [2] . 
Thus one must generally be very cautious about inferring causality from pretest-posttest designs.
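
Regression to the mean is easy to reproduce in a simulation. The sketch below assumes each hypothetical student has a stable ability and that every test adds independent random noise; the students with the lowest first scores are retested with no training at all.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is reproducible

# Stable ability plus independent measurement noise on each occasion.
abilities = [rng.gauss(50, 10) for _ in range(2000)]
test1 = [a + rng.gauss(0, 10) for a in abilities]
test2 = [a + rng.gauss(0, 10) for a in abilities]   # no treatment in between

# Select the 200 students with the lowest scores on the first test.
lowest = sorted(range(len(test1)), key=lambda i: test1[i])[:200]

first_mean = statistics.mean([test1[i] for i in lowest])
retest_mean = statistics.mean([test2[i] for i in lowest])
print(f"selected group, first test: {first_mean:.1f}")
print(f"selected group, retest:     {retest_mean:.1f}")
```

The selected group scores noticeably higher on the retest even though nothing was done for them, because part of what made their first scores extreme was bad luck that does not repeat.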

Does Psychotherapy Work?

Early studies on the effectiveness of psychotherapy tended to use pretest-posttest designs. In a classic 1952 article, researcher Hans Eysenck summarized the results of 24 such studies showing that about two thirds of patients improved between the pretest and the posttest (Eysenck, 1952) [3] . But Eysenck also compared these results with archival data from state hospital and insurance company records showing that similar patients recovered at about the same rate  without  receiving psychotherapy. This parallel suggested to Eysenck that the improvement that patients showed in the pretest-posttest studies might be no more than spontaneous remission. Note that Eysenck did not conclude that psychotherapy was ineffective. He merely concluded that there was no evidence that it was, and he wrote of “the necessity of properly planned and executed experimental studies into this important field” (p. 323). You can read the entire article here: Classics in the History of Psychology .

Fortunately, many other researchers took up Eysenck’s challenge, and by 1980 hundreds of experiments had been conducted in which participants were randomly assigned to treatment and control conditions, and the results were summarized in a classic book by Mary Lee Smith, Gene Glass, and Thomas Miller (Smith, Glass, & Miller, 1980) [4] . They found that overall psychotherapy was quite effective, with about 80% of treatment participants improving more than the average control participant. Subsequent research has focused more on the conditions under which different types of psychotherapy are more or less effective.

Interrupted Time Series Design

A variant of the pretest-posttest design is the interrupted time-series design. A time series is a set of measurements taken at intervals over a period of time. For example, a manufacturing company might measure its workers’ productivity each week for a year. In an interrupted time-series design, a time series like this one is “interrupted” by a treatment. In one classic example, the treatment was the reduction of the work shifts in a factory from 10 hours to 8 hours (Cook & Campbell, 1979) [5]. Because productivity increased rather quickly after the shortening of the work shifts, and because it remained elevated for many months afterward, the researcher concluded that the shortening of the shifts caused the increase in productivity. Notice that the interrupted time-series design is like a pretest-posttest design in that it includes measurements of the dependent variable both before and after the treatment. It is unlike the pretest-posttest design, however, in that it includes multiple pretest and posttest measurements.

Figure 7.3 shows data from a hypothetical interrupted time-series study. The dependent variable is the number of student absences per week in a research methods course. The treatment is that the instructor begins publicly taking attendance each day so that students know that the instructor is aware of who is present and who is absent. The top panel of  Figure 7.3 shows how the data might look if this treatment worked. There is a consistently high number of absences before the treatment, and there is an immediate and sustained drop in absences after the treatment. The bottom panel of  Figure 7.3 shows how the data might look if this treatment did not work. On average, the number of absences after the treatment is about the same as the number before. This figure also illustrates an advantage of the interrupted time-series design over a simpler pretest-posttest design. If there had been only one measurement of absences before the treatment at Week 7 and one afterward at Week 8, then it would have looked as though the treatment were responsible for the reduction. The multiple measurements both before and after the treatment suggest that the reduction between Weeks 7 and 8 is nothing more than normal week-to-week variation.
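The advantage of multiple measurements can be made concrete with a short sketch. The weekly absence counts below are invented for illustration and are not the data plotted in the figure; the function name and the choice of 14 weeks are likewise assumptions. The code compares the change seen in a single before/after pair (Week 7 versus Week 8) with the change in the average across all pretest and posttest weeks.

```python
import statistics


def summarize_interruption(series, first_treated_index):
    """Return (single-pair change, change in means) around a treatment.

    series: measurements in time order.
    first_treated_index: position of the first post-treatment measurement.
    """
    pre = series[:first_treated_index]
    post = series[first_treated_index:]
    single_pair_change = post[0] - pre[-1]  # e.g. Week 8 minus Week 7
    mean_change = statistics.mean(post) - statistics.mean(pre)
    return single_pair_change, mean_change


# Hypothetical absences for Weeks 1-14; attendance-taking starts in Week 8.
treatment_worked = [8, 7, 8, 6, 7, 8, 7, 3, 2, 3, 2, 3, 2, 3]
treatment_did_not_work = [5, 7, 4, 6, 5, 4, 8, 4, 6, 5, 7, 4, 6, 5]

for label, series in [("worked", treatment_worked),
                      ("did not work", treatment_did_not_work)]:
    single, mean_change = summarize_interruption(series, first_treated_index=7)
    print(f"{label}: Week 7 to 8 change = {single}, "
          f"change in weekly mean = {mean_change:.2f}")
```

In both made-up series the Week 7 to Week 8 change is a drop of 4 absences, so a simple pretest-posttest comparison could not tell them apart. Only the averages over many weeks separate a sustained drop from ordinary week-to-week variation.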

Combination Designs

A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. But at the same time there is a control group that is given a pretest, does  not  receive the treatment, and then is given a posttest. The question, then, is not simply whether participants who receive the treatment improve but whether they improve  more  than participants who do not receive the treatment.

Imagine, for example, that students in one school are given a pretest on their attitudes toward drugs, then are exposed to an antidrug program, and finally are given a posttest. Students in a similar school are given the pretest, not exposed to an antidrug program, and finally are given a posttest. Again, if students in the treatment condition become more negative toward drugs, this change in attitude could be an effect of the treatment, but it could also be a matter of history or maturation. If it really is an effect of the treatment, then students in the treatment condition should become more negative than students in the control condition. But if it is a matter of history (e.g., news of a celebrity drug overdose) or maturation (e.g., improved reasoning), then students in the two conditions would be likely to show similar amounts of change. This type of design does not completely eliminate the possibility of confounding variables, however. Something could occur at one of the schools but not the other (e.g., a student drug overdose), so students at the first school would be affected by it while students at the other school would not.
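The comparison described here, whether the treatment group changes more than the control group, is often called a difference-in-differences comparison. The following is a minimal sketch with invented attitude scores (lower scores meaning more negative attitudes toward drugs); the function name, the scale, and the five students per school are assumptions for illustration only.

```python
import statistics


def change_beyond_control(treat_pre, treat_post, control_pre, control_post):
    """Return (treatment change, control change, difference between them)."""
    treat_change = statistics.mean(treat_post) - statistics.mean(treat_pre)
    control_change = statistics.mean(control_post) - statistics.mean(control_pre)
    return treat_change, control_change, treat_change - control_change


# Hypothetical attitude scores for five students at each school.
treat_pre, treat_post = [5, 6, 5, 4, 5], [3, 4, 3, 2, 3]
control_pre, control_post = [5, 5, 6, 4, 5], [4, 4, 5, 3, 4]

treat_change, control_change, extra = change_beyond_control(
    treat_pre, treat_post, control_pre, control_post
)
print(f"treatment school change: {treat_change}")
print(f"control school change:   {control_change}")
print(f"change beyond control:   {extra}")
```

In this made-up example both schools become more negative toward drugs, which is what history or maturation would produce, but the treatment school changes by one additional point. That additional change is the part a researcher might tentatively attribute to the program, subject to the confounds discussed above.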

Finally, if participants in this kind of design are randomly assigned to conditions, it becomes a true experiment rather than a quasi experiment. In fact, it is the kind of experiment that Eysenck called for—and that has now been conducted many times—to demonstrate the effectiveness of psychotherapy.

Key Takeaways

  • Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or orders of conditions. Among the important types are nonequivalent groups designs, pretest-posttest designs, and interrupted time-series designs.
  • Quasi-experimental research eliminates the directionality problem because it involves the manipulation of the independent variable. It does not eliminate the problem of confounding variables, however, because it does not involve random assignment to conditions. For these reasons, quasi-experimental research is generally higher in internal validity than correlational studies but lower than true experiments.
  • Practice: Imagine that two professors decide to test the effect of giving daily quizzes on student performance in a statistics course. They decide that Professor A will give quizzes but Professor B will not. They will then compare the performance of students in their two sections on a common final exam. List five other variables that might differ between the two sections that could affect the results.

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues in field settings. Boston, MA: Houghton Mifflin.
  • Posternak, M. A., & Miller, I. (2001). Untreated short-term course of major depression: A meta-analysis of outcomes from studies using wait-list control groups. Journal of Affective Disorders, 66, 139–146.
  • Eysenck, H. J. (1952). The effects of psychotherapy: An evaluation. Journal of Consulting Psychology, 16, 319–324.
  • Smith, M. L., Glass, G. V., & Miller, T. I. (1980). The benefits of psychotherapy. Baltimore, MD: Johns Hopkins University Press.

Nonequivalent groups design: A between-subjects design in which participants have not been randomly assigned to conditions.

Pretest-posttest design: A design in which the dependent variable is measured once before the treatment is implemented and once after it is implemented.

History: A category of alternative explanations for differences between scores, such as events that happened between the pretest and posttest, unrelated to the study.

Maturation: An alternative explanation that refers to how the participants might have changed between the pretest and posttest in ways that they were going to anyway because they are growing and learning.

Regression to the mean: The statistical fact that an individual who scores extremely on a variable on one occasion will tend to score less extremely on the next occasion.

Spontaneous remission: The tendency for many medical and psychological problems to improve over time without any form of treatment.

Interrupted time-series design: A set of measurements taken at intervals over a period of time that are interrupted by a treatment.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Chapter 5: Experimental and Quasi-Experimental Designs

Case Study: The Impact of Teen Court

Research Study

An Experimental Evaluation of Teen Courts 1

Research Question

Is teen court more effective at reducing recidivism and improving attitudes than traditional juvenile justice processing?


Researchers randomly assigned 168 juvenile offenders ages 11 to 17 from four different counties in Maryland to either teen court as experimental group members or to traditional juvenile justice processing as control group members. (Note: Discussion on the technical aspects of experimental designs, including random assignment, is found in detail later in this chapter.) Of the 168 offenders, 83 were assigned to teen court and 85 were assigned to regular juvenile justice processing through random assignment. Of the 83 offenders assigned to the teen court experimental group, only 56 (67%) agreed to participate in the study. Of the 85 youth randomly assigned to normal juvenile justice processing, only 51 (60%) agreed to participate in the study.

Upon assignment to teen court or regular juvenile justice processing, all offenders entered their respective sanction. Approximately four months later, offenders in both the experimental group (teen court) and the control group (regular juvenile justice processing) were asked to complete a post-test survey inquiring about a variety of behaviors (frequency of drug use, delinquent behavior, variety of drug use) and attitudinal measures (social skills, rebelliousness, neighborhood attachment, belief in conventional rules, and positive self-concept). The study researchers also collected official re-arrest data for 18 months starting at the time of offender referral to juvenile justice authorities.

Teen court participants self-reported higher levels of delinquency than those processed through regular juvenile justice processing. According to official re-arrest data, teen court youth were re-arrested at a higher rate and incurred a higher average number of total arrests than the control group. Teen court offenders also reported significantly lower scores on survey items designed to measure their “belief in conventional rules” compared to offenders processed through regular juvenile justice avenues. Other attitudinal and opinion measures did not differ significantly between the experimental and control group members based on their post-test responses. In sum, youth randomly assigned to teen court fared worse than control group members, who were randomly assigned to regular juvenile justice processing instead.

Limitations with the Study Procedure

Limitations are inherent in any research study and those research efforts that utilize experimental designs are no exception. It is important to consider the potential impact that a limitation of the study procedure could have on the results of the study.

In the current study, one potential limitation is that teen courts from four different counties in Maryland were utilized. Because of the diversity in teen court sites, it is possible that there were differences in procedure between the four teen courts and such differences could have impacted the outcomes of this study. For example, perhaps staff members at one teen court were more punishment-oriented than staff members at the other county teen courts. This philosophical difference may have affected treatment delivery and hence experimental group members’ belief in conventional rules and their recidivism. Although the researchers monitored each teen court to help ensure treatment consistency between study sites, it is possible that differences existed in the day-to-day operation of the teen courts that may have affected participant outcomes. This same limitation might also apply to control group members who were sanctioned with regular juvenile justice processing in four different counties.

A researcher must also consider the potential for differences between the experimental and control group members. Although the offenders were randomly assigned to the experimental or control group, and the assumption is that the groups were equivalent to each other prior to program participation, the researchers in this study were only able to compare the experimental and control groups on four variables: age, school grade, gender, and race. It is possible that the experimental and control group members differed by chance on one or more factors not measured or available to the researchers. For example, perhaps a large number of teen court members experienced problems at home that can explain their more dismal post-test results compared to control group members without such problems. A larger sample of juvenile offenders would likely have helped to minimize any differences between the experimental and control group members. The collection of additional information from study participants would have also allowed researchers to be more confident that the experimental and control group members were equivalent on key pieces of information that could have influenced recidivism and participant attitudes.

Finally, while 168 juvenile offenders were randomly assigned to either the experimental or control group, not all offenders agreed to participate in the evaluation. Remember that of the 83 offenders assigned to the teen court experimental group, only 56 (67%) agreed to participate in the study. Of the 85 youth randomly assigned to normal juvenile justice processing, only 51 (60%) agreed to participate in the study. While this limitation is unavoidable, it still could have influenced the study. Perhaps those 27 offenders who declined to participate in the teen court group differed significantly from the 56 who agreed to participate. If so, it is possible that the differences among those two groups could have impacted the results of the study. For example, perhaps the 27 youths who were randomly assigned to teen court but did not agree to be a part of the study were some of the least risky of potential teen court participants, with less serious histories, better attitudes to begin with, and so on. In this case, perhaps the most risky teen court participants agreed to be a part of the study, and their higher risk led to more dismal delinquency outcomes compared to the control group at the end of each respective program. Because parental consent was required for the study authors to be able to compare those who declined to participate in the study to those who agreed, it is unknown if the participants and nonparticipants differed significantly on any variables among either the experimental or control group. Moreover, of the resulting 107 offenders who took part in the study, only 75 offenders accurately completed the post-test survey measuring offending and attitudinal outcomes.
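The participation figures reported for this study can be checked with simple arithmetic. The sketch below uses only the counts given in the text; the variable and function names are illustrative.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


# Counts reported in the text.
assigned = {"teen court": 83, "regular processing": 85}
agreed = {"teen court": 56, "regular processing": 51}
completed_survey = 75

for group in assigned:
    declined = assigned[group] - agreed[group]
    print(f"{group}: {agreed[group]} of {assigned[group]} agreed "
          f"({percent(agreed[group], assigned[group])}%), {declined} declined")

total_assigned = sum(assigned.values())
total_agreed = sum(agreed.values())
print(f"agreed overall: {total_agreed} of {total_assigned} "
      f"({percent(total_agreed, total_assigned)}%)")
print(f"completed post-test survey: {completed_survey} of {total_assigned} "
      f"randomly assigned ({percent(completed_survey, total_assigned)}%)")
```

Laying the counts out this way shows how far the analyzed sample shrinks relative to the 168 youth who were randomly assigned: 107 agreed to participate, and the 75 who accurately completed the post-test survey are fewer than half of those originally assigned.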

Again, despite the experimental nature of this study, such limitations could have impacted the study results and must be considered.

Impact on Criminal Justice

Teen courts are generally designed to deal with nonserious first time offenders before they escalate to more serious and chronic delinquency. Innovative programs such as “Scared Straight” and juvenile boot camps have inspired an increase in teen court programs across the country, although there is little evidence regarding their effectiveness compared to traditional sanctions for youthful offenders. This study provides more specific evidence as to the effectiveness of teen courts relative to normal juvenile justice processing. Researchers learned that teen court participants fared worse than those in the control group. The potential labeling effects of teen court, including stigma among peers, especially where the offense may have been very minor, may be more harmful than doing less or nothing. The real impact of this study lies in the recognition that teen courts and similar sanctions for minor offenders may do more harm than good.

One important impact of this study is that it utilized an experimental design to evaluate the effectiveness of a teen court compared to traditional juvenile justice processing. Despite the study’s limitations, by using an experimental design it improved upon previous teen court evaluations by attempting to ensure any results were in fact due to the treatment, not some difference between the experimental and control group. This study also utilized both official and self-report measures of delinquency, in addition to self-report measures on such factors as self-concept and belief in conventional rules, which have been generally absent from teen court evaluations. The study authors also attempted to gauge the comparability of the experimental and control groups on factors such as age, gender, and race to help make sure study outcomes were attributable to the program, not the participants.

In This Chapter You Will Learn

The four components of experimental and quasi-experimental research designs and their function in answering a research question

The differences between experimental and quasi-experimental designs

The importance of randomization in an experimental design

The types of questions that can be answered with an experimental or quasi-experimental research design

About the three factors required for a causal relationship

That a relationship between two or more variables may appear causal, but may in fact be spurious, or explained by another factor

That experimental designs are relatively rare in criminal justice and why

About common threats to internal validity or alternative explanations to what may appear to be a causal relationship between variables

Why experimental designs are superior to quasi-experimental designs for eliminating or reducing the potential of alternative explanations


The teen court evaluation that began this chapter is an example of an experimental design. The researchers of the study wanted to determine whether teen court was more effective at reducing recidivism and improving attitudes compared to regular juvenile justice case processing. In short, the researchers were interested in the relationship between variables: the relationship of teen court to future delinquency and other outcomes. When researchers are interested in whether a program, policy, practice, treatment, or other intervention impacts some outcome, they often utilize a specific type of research method/design called experimental design. Although there are many types of experimental designs, the foundation for all of them is the classic experimental design. This research design, and some typical variations of this experimental design, are the focus of this chapter.

Although the classic experiment may be appropriate to answer a particular research question, there are barriers that may prevent researchers from using this or another type of experimental design. In these situations, researchers may turn to quasi-experimental designs. Quasi-experiments include a group of research designs that are missing a key element found in the classic experiment and other experimental designs (hence the term “quasi” experiment). Despite this missing part, quasi-experiments are similar in structure to experimental designs and are used to answer similar types of research questions. This chapter will also focus on quasi-experiments and how they are similar to and different from experimental designs.

Uncovering the relationship between variables, such as the impact of teen court on future delinquency, is important in criminal justice and criminology, just as it is in other scientific disciplines such as education, biology, and medicine. Indeed, whereas criminal justice researchers may be interested in whether a teen court reduces recidivism or improves attitudes, medical field researchers may be concerned with whether a new drug reduces cholesterol, or an education researcher may be focused on whether a new teaching style leads to greater academic gains. Across these disciplines and topics of interest, the experimental design is appropriate. In fact, experimental designs are used in all scientific disciplines; the only thing that changes is the topic. Specific to criminal justice, below is a brief sampling of the types of questions that can be addressed using an experimental design:

Does participation in a correctional boot camp reduce recidivism?

What is the impact of an in-cell integration policy on inmate-on-inmate assaults in prisons?

Does police officer presence in schools reduce bullying?

Do inmates who participate in faith-based programming while in prison have a lower recidivism rate upon their release from prison?

Do police sobriety checkpoints reduce drunken driving fatalities?

What is the impact of a no-smoking policy in prisons on inmate-on-inmate assaults?

Does participation in a domestic violence intervention program reduce repeat domestic violence arrests?

A focus on the classic experimental design will demonstrate the usefulness of this research design for addressing criminal justice questions about cause-and-effect relationships. Particular attention is paid to the classic experimental design because it serves as the foundation for all other experimental and quasi-experimental designs, some of which are covered in this chapter. As a result, a clear understanding of the components, organization, and logic of the classic experimental design will facilitate an understanding of other experimental and quasi-experimental designs examined in this chapter. It will also allow the reader to better understand the results produced from those various designs, and importantly, what those results mean. It is a truism that the results of a research study are only as “good” as the design or method used to produce them. Therefore, understanding the various experimental and quasi-experimental designs is the key to becoming an informed consumer of research.

The Challenge of Establishing Cause and Effect

Researchers interested in explaining the relationship between variables, such as whether a treatment program impacts recidivism, are interested in causation or causal relationships. In a simple example, a causal relationship exists when X (independent variable) causes Y (dependent variable), and there are no other factors (Z) that can explain that relationship. For example, offenders who participated in a domestic violence intervention program (X: domestic violence intervention program) experienced fewer re-arrests (Y: re-arrests) than those who did not participate in the domestic violence program, and no factor other than participation in the domestic violence program can explain these results. The classic experimental design is superior to other research designs in uncovering a causal relationship, if one exists. Before a causal relationship can be established, however, there are three conditions that must be met (see Figure 5.1). 2

FIGURE 5.1 | The Cause and Effect Relationship


Timing

The first condition for a causal relationship is timing. For a causal relationship to exist, it must be shown that the independent variable or cause (X) preceded the dependent variable or outcome (Y) in time. A decrease in domestic violence re-arrests (Y) cannot occur before participation in a domestic violence reduction program (X), if the domestic violence program is proposed to be the cause of fewer re-arrests. Ensuring that cause comes before effect is not sufficient to establish that a causal relationship exists, but it is one requirement that must be met for a causal relationship.

Association

In addition to timing, there must also be an observable association between X and Y, the second necessary condition for a causal relationship. Association is also commonly referred to as covariance or correlation. When an association or correlation exists, this means there is some pattern of relationship between X and Y: as X changes by increasing or decreasing, Y also changes by increasing or decreasing. Here, the notion of X and Y increasing or decreasing can mean an actual increase/decrease in the quantity of some factor, such as an increase/decrease in the number of prison terms or days in a program or re-arrests. It can also refer to an increase/decrease in a particular category, for example, from nonparticipation in a program to participation in a program. For instance, subjects who participated in a domestic violence reduction program (X) incurred fewer domestic violence re-arrests (Y) than those who did not participate in the program. In this example, X and Y are associated: as X changes or increases from nonparticipation to participation in the domestic violence program, Y or the number of re-arrests for domestic violence decreases.

Associations between X and Y can occur in two different directions: positive or negative. A positive association means that as X increases, Y increases, or, as X decreases, Y decreases. A negative association means that as X increases, Y decreases, or, as X decreases, Y increases. In the example above, the association is negative: participation in the domestic violence program was associated with a reduction in re-arrests. This is also sometimes called an inverse relationship.
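The direction of an association can be read from the sign of a correlation coefficient. The sketch below computes a Pearson correlation by hand for invented data in which X is coded 0 for nonparticipation and 1 for participation in a program, and Y is a count of re-arrests. The eight cases and their values are hypothetical, chosen only to mirror the domestic violence example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def direction(r):
    """Label an association as positive, negative, or absent by its sign."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical data: 0 = did not participate, 1 = participated.
participated = [0, 0, 0, 0, 1, 1, 1, 1]
re_arrests = [3, 2, 4, 3, 1, 0, 1, 2]

r = pearson_r(participated, re_arrests)
print(f"r = {r:.2f} ({direction(r)} association)")
```

For these made-up numbers the coefficient is negative, matching the inverse relationship described above. The sign describes only the pattern in the data; by itself it says nothing about whether participation caused the lower re-arrest counts.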

Elimination of Alternative Explanations

Although participation in a domestic violence program may be associated with a reduction in re-arrests, this does not mean for certain that participation in the program was the cause of reduced re-arrests. Just as timing by itself does not imply a causal relationship, association by itself does not imply a causal relationship. For example, instead of the program being the cause of a reduction in re-arrests, perhaps several of the program participants died shortly after completion of the domestic violence program and thus were not able to engage in domestic violence (and their deaths were unknown to the researcher tracking re-arrests). Perhaps a number of the program participants moved out of state and domestic violence re-arrests occurred but were not able to be uncovered by the researcher. Perhaps those in the domestic violence program experienced some other event, such as the trauma of a natural disaster, and that experience led to a reduction in domestic violence, an event not connected to the domestic violence program. If any of these situations occurred, it might appear that the domestic violence program led to fewer re-arrests. However, the observed reduction in re-arrests can actually be attributed to a factor unrelated to the domestic violence program.

The previous discussion leads to the third and final necessary consideration in determining a causal relationship: elimination of alternative explanations. This means that the researcher must rule out any other potential explanation of the results, except for the experimental condition such as a program, policy, or practice. Accounting for or ruling out alternative explanations is much more difficult than ensuring timing and association. Ruling out all alternative explanations is difficult because there are so many potential other explanations that can wholly or partly explain the findings of a research study. This is especially true in the social sciences, where researchers are often interested in relationships explaining human behavior. Because of this difficulty, associations by themselves are sometimes mistaken as causal relationships when in fact they are spurious. A spurious relationship is one where it appears that X and Y are causally related, but the relationship is actually explained by something other than the independent variable, or X.

One only needs to go so far as the daily newspaper to find headlines and stories of mere associations being mistaken, assumed, or represented as causal relationships. For example, a newspaper headline recently proclaimed “Churchgoers live longer.” 3 An uninformed consumer may interpret this headline as evidence of a causal relationship (that going to church by itself will lead to a longer life), but the astute consumer would note possible alternative explanations. For example, people who go to church may live longer because they tend to live healthier lifestyles and tend to avoid risky situations. These are two probable alternative explanations to the relationship independent of simply going to church. In another example, researchers David Kalist and Daniel Lee explored the relationship between first names and delinquent behavior in their manuscript titled “First Names and Crime: Does Unpopularity Spell Trouble?” 4 Kalist and Lee (2009) found that unpopular names are associated with juvenile delinquency. In other words, those individuals with the most unpopular names were more likely to be delinquent than those with more popular names. According to the authors, it is not necessarily someone’s name that leads to delinquent behavior, but rather, the most unpopular names also tend to be correlated with individuals who come from disadvantaged home environments and experience a low socio-economic status of living. Rightly noted by the authors, these alternative explanations help to explain the link between someone’s name and delinquent behavior, a link that is not causal.

A frequently cited example provides more insight into the claim that an association by itself is not sufficient to prove causality. In certain cities in the United States, for example, as ice cream sales increase on a particular day or in a particular month so does the incidence of certain forms of crime. If this association were represented as a causal statement, it would be that ice cream or ice cream sales cause crime. There is an association, no doubt, and let us assume that ice cream sales rose before the increase in crime (timing). Surely, however, this relationship between ice cream sales and crime is spurious. The alternative explanation is that ice cream sales and crime are associated in certain parts of the country because of the weather. Ice cream sales tend to increase in warmer temperatures, and it just so happens that certain forms of crime tend to increase in warmer temperatures as well. This coincidence or association does not mean a causal relationship exists. Additionally, this does not mean that warm temperatures cause crime either. There are plenty of other alternative explanations for the increase in certain forms of crime and warmer temperatures. 6 For another example of a study subject to alternative explanations, read the June 2011 news article titled “Less Crime in U.S. Thanks to Videogames.” 7 Based on your reading, what are some other potential explanations for the crime drop other than videogames?
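A spurious relationship of this kind can be sketched in a simulation. In the deliberately simplified model below, a third factor Z (labeled temperature) is made to drive both ice cream sales and crime, and the two outcomes have no effect on each other. All numbers are invented, and the assumption that temperature itself drives crime is built in purely for illustration; as noted above, real data would not justify that claim either.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, zs):
    """Part of ys left over after a straight-line fit on zs."""
    n = len(zs)
    mean_z, mean_y = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [y - (mean_y + slope * (z - mean_z)) for y, z in zip(ys, zs)]


def simulate(n_days=5000, seed=1):
    """Return (raw correlation, correlation after removing Z's influence)."""
    rng = random.Random(seed)
    # Hypothetical model: Z drives both outcomes; neither affects the other.
    temperature = [rng.uniform(0, 35) for _ in range(n_days)]
    ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
    crime = [1.5 * t + rng.gauss(0, 10) for t in temperature]
    raw = pearson_r(ice_cream, crime)
    adjusted = pearson_r(residuals(ice_cream, temperature),
                         residuals(crime, temperature))
    return raw, adjusted


raw_r, adjusted_r = simulate()
print(f"ice cream vs. crime, raw:                  r = {raw_r:.2f}")
print(f"ice cream vs. crime, temperature removed:  r = {adjusted_r:.2f}")
```

Under these assumptions the raw correlation between ice cream sales and crime is strongly positive, yet it shrinks to roughly zero once the shared influence of Z is removed from both variables. The association is real, but it is produced entirely by the third factor.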

The preceding examples demonstrate how timing and association can be present, but the final needed condition for a causal relationship is that all alternative explanations are ruled out. While this task is difficult, the classic experimental design helps to ensure these additional explanatory factors are minimized. When other designs are used, such as quasi-experimental designs, the chance that alternative explanations emerge is greater. This potential should become clearer as we explore the organization and logic of the classic experimental design.


Minneapolis Domestic Violence Experiment

The Minneapolis Domestic Violence Experiment (MDVE) 5

Which police action (arrest, separation, or mediation) is most effective at deterring future misdemeanor domestic violence?

The experiment began on March 17, 1981, and continued until August 1, 1982. The experiment was conducted in two of Minneapolis’s four police precincts: the two with the highest number of domestic violence reports and arrests. A total of 314 reports of misdemeanor domestic violence were handled by the police during this time frame.

This study utilized an experimental design with the random assignment of police actions. Each police officer involved in the study was given a pad of report forms. Upon a misdemeanor domestic violence call, the officer’s action (arrest, separation, or mediation) was predetermined by the order and color of report forms in the officer’s notebook. Colored report forms were randomly ordered in the officer’s notebook and the color on the form determined the officer response once at the scene. For example, after receiving a call for domestic violence, an officer would turn to his or her report pad to determine the action. If the top form was pink, the action was arrest. If on the next call the top form was a different color, an action other than arrest would occur. All colored report forms were randomly ordered through a lottery assignment method. The result is that all police officer actions to misdemeanor domestic violence calls were randomly assigned. To ensure the lottery procedure was properly carried out, research staff participated in ride-alongs with officers to ensure that officers did not skip the order of randomly ordered forms. Research staff also made sure the reports were received in the order they were randomly assigned in the pad of report forms.
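The logic of a randomly ordered report pad can be illustrated with a short sketch. The details of the actual lottery are not described here, so the code below is only an assumed stand-in: it builds a pad with an equal number of forms per action (an assumption) and shuffles it, so that the action for each call is fixed in advance by the form's position rather than by the officer's judgment. The function name and pad size are hypothetical.

```python
import random
from collections import Counter

ACTIONS = ("arrest", "separation", "mediation")


def build_report_pad(forms_per_action, seed=None):
    """Return a randomly ordered list of actions, one per report form.

    Assumes equal numbers of forms per action; the study's actual mix
    of forms is not stated in the text.
    """
    rng = random.Random(seed)
    pad = [action for action in ACTIONS for _ in range(forms_per_action)]
    rng.shuffle(pad)
    return pad


pad = build_report_pad(forms_per_action=5, seed=42)
print("forms per action:", dict(Counter(pad)))

# The officer takes forms strictly from the top, one per call.
for call_number, action in enumerate(pad[:3], start=1):
    print(f"call {call_number}: assigned action = {action}")
```

Because the order is set before any call is received, nothing about a particular incident can influence which action it is assigned, provided the forms are used in order. That proviso is why the research staff checked that officers did not skip forms.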

To examine the relationship of different officer responses to future domestic violence, the researchers examined official arrests of the suspects in a 6-month follow-up period. For example, the researchers examined those initially arrested for misdemeanor domestic violence and how many were subsequently arrested for domestic violence within a 6-month time frame. They did the same procedure for the police actions of separation and mediation. The researchers also interviewed the victim(s) of each incident and asked if a repeat domestic violence incident occurred with the same suspect in the 6-month follow-up period. This allowed researchers to examine domestic violence offenses that may have occurred but did not come to the official attention of police. The researchers then compared official arrests for domestic violence to self-reported domestic violence after the experiment.

Suspects arrested for misdemeanor domestic violence, as opposed to situations where separation or mediation was used, were significantly less likely to engage in repeat domestic violence as measured by official arrest records and victim interviews during the 6-month follow-up period. According to official police records, 10% of those initially arrested engaged in repeat domestic violence in the follow-up period, 19% of those who initially received mediation engaged in repeat domestic violence, and 24% of those who randomly received separation engaged in repeat domestic violence. According to victim interviews, 19% of those initially arrested engaged in repeat domestic violence, compared to 37% for separation and 33% for mediation. The general conclusion of the experiment was that arrest was preferable to separation or mediation in deterring repeat domestic violence across both official police records and victim interviews.

A few issues that affected the random assignment procedure occurred throughout the study. First, some officers did not follow the randomly assigned action (arrest, separation, or mediation) as a result of other circumstances that occurred at the scene. For example, if the randomly assigned action was separation, but the suspect assaulted the police officer during the call, the officer might arrest the suspect. Second, some officers simply ignored the assigned action if they felt a particular call for domestic violence required another action. For example, if the action was mediation as indicated by the randomly assigned report form, but the officer felt the suspect should be arrested, he or she may have simply ignored the randomly assigned response and substituted his or her own. Third, some officers forgot their report pads and did not know the randomly assigned course of action to take upon a call of domestic violence. Fourth and finally, the police chief also allowed officers to deviate from the randomly assigned action in certain circumstances. In all of these situations, the random assignment procedures broke down.

The results of the MDVE had a rapid and widespread impact on law enforcement practice throughout the United States. Just two years after the release of the study, a 1986 telephone survey of 176 urban police departments serving cities with populations of 100,000 or more found that 46 percent of the departments preferred to make arrests in cases of minor domestic violence, largely due to the effectiveness of this practice in the Minneapolis Domestic Violence Experiment. 8

In an attempt to replicate the findings of the Minneapolis Domestic Violence Experiment, the National Institute of Justice sponsored the Spouse Assault Replication Program. Replication studies were conducted in Omaha, Charlotte, Milwaukee, Miami, and Colorado Springs from 1986 to 1991. In three of the five replications, offenders randomly assigned to the arrest group had higher levels of continued domestic violence in comparison to other police actions during domestic violence situations. 9 Therefore, rather than providing results that were consistent with the Minneapolis Domestic Violence Experiment, the results from the five replication experiments produced inconsistent findings about whether arrest deters domestic violence. 10

Despite the findings of the replications, the push to arrest domestic violence offenders has continued in law enforcement. Today many police departments require officers to make arrests in domestic violence situations. In agencies that do not mandate arrest, department policy typically states a strong preference toward arrest. State legislatures have also enacted laws impacting police actions regarding domestic violence. Twenty-one states have mandatory arrest laws while eight have pro-arrest statutes for domestic violence. 11

The Classic Experimental Design

Table 5.1 provides an illustration of the classic experimental design. 12 It is important to become familiar with the specific notation and organization of the classic experiment before a full discussion of its components and their purpose.

Major Components of the Classic Experimental Design

The classic experimental design has four major components:

1. Treatment

2. Experimental Group and Control Group

3. Pre-Test and Post-Test

4. Random Assignment

Treatment The first component of the classic experimental design is the treatment, and it is denoted by X in the classic experimental design. The treatment can be a number of things: a program, a new drug, or the implementation of a new policy. In a classic experimental design, the primary goal is to determine what effect, if any, a particular treatment had on some outcome. In this way, the treatment can also be considered the independent variable.

TABLE 5.1 | The Classic Experimental Design

Experimental Group | R | O 1 | X | O 2
Control Group | R | O 1 | | O 2








Experimental Group = Group that receives the treatment

Control Group = Group that does not receive the treatment

R = Random assignment

O 1 = Observation before the treatment, or the pre-test

X = Treatment or the independent variable

O 2 = Observation after the treatment, or the post-test

Experimental and Control Groups The second component of the classic experiment is an experimental group and a control group. The experimental group receives the treatment, and the control group does not receive the treatment. There will always be at least one group that receives the treatment in experimental and quasi-experimental designs. In some cases, experiments may have multiple experimental groups receiving multiple treatments.

Pre-Test and Post-Test The third component of the classic experiment is a pre-test and a post-test. A pre-test is a measure of the dependent variable or outcome before the treatment. The post-test is a measure of the dependent variable after the treatment is administered. It is important to note that the post-test is defined based on the stated goals of the program. For example, if the stated goal of a particular program is to reduce re-arrests, the post-test will be a measure of re-arrests after the program. The dependent variable also defines the pre-test. For example, if a researcher wanted to examine the impact of a domestic violence reduction program (treatment or X) on the goal of reducing re-arrests (dependent variable or Y), the pre-test would be the number of domestic violence arrests incurred before the program. Program goals may be numerous and all can constitute a post-test, and hence, the pre-test. For example, perhaps the goal of the domestic violence program is also that participants learn of different pro-social ways to handle domestic conflicts other than resorting to violence. If researchers wanted to examine this goal, the post-test might be subjects' level of knowledge about pro-social ways to handle domestic conflicts other than violence. The pre-test would then be subjects' level of knowledge about these pro-social alternatives to violence before they received the treatment program.

Although all designs have a post-test, it is not always the case that designs have a pre-test. This is because researchers may not have access or be able to collect information constituting the pre-test. For example, researchers may not be able to determine subjects' level of knowledge about alternatives to domestic violence before the intervention program if the subjects are already enrolled in the domestic violence intervention program. In other cases, there may be financial barriers to collecting pre-test information. In the teen court evaluation that started this chapter, for example, researchers were not able to collect pre-test information on study participants due to the financial strain it would have placed on the agencies involved in the study. 13 There are a number of potential reasons why a pre-test might not be available in a research study. The defining feature, however, is that the pre-test is determined by the post-test.

Random Assignment The fourth component of the classic experiment is random assignment. Random assignment refers to a process whereby members of the experimental group and control group are assigned to the two groups through a random and unbiased process. Random assignment should not be mistaken for random selection as discussed in Chapter 3. Random selection refers to selecting a smaller but representative sample from a larger population. For example, a researcher may randomly select a sample from a larger city population for the purposes of sending sample members a mail survey to determine their attitudes on crime. The goal of random selection in this example is to make sure the sample, although smaller in size than the population, accurately represents the larger population.

Random assignment, on the other hand, refers to the process of assigning subjects to either the experimental or control group with the goal that the groups are similar or equivalent to each other in every way (see Figure 5.2). The exception to this rule is that one group gets the treatment and the other does not (see discussion below on why equivalence is so important). Although the concept of randomness is similar in each, the goals of random selection and random assignment are different. 14 Experimental designs all feature random assignment, but this is not true of other research designs, in particular quasi-experimental designs.
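
The difference between the two procedures can be sketched in a few lines of Python. This is an illustration only: the population of 10,000 resident IDs, the sample size of 200, the seed, and the function names are hypothetical and are not taken from any study discussed in this chapter.

```python
import random

def random_selection(population, sample_size, rng):
    """Random selection: draw a smaller sample from a larger population."""
    return rng.sample(population, sample_size)

def random_assignment(subjects, rng):
    """Random assignment: split the subjects into two groups by chance alone."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(2024)            # fixed seed so the sketch is repeatable
population = list(range(10_000))     # hypothetical city residents, by ID

sample = random_selection(population, 200, rng)         # who is studied
experimental, control = random_assignment(sample, rng)  # who gets the treatment

print(len(sample), len(experimental), len(control))     # 200 100 100
```

Selection decides who enters the study; assignment decides which of those subjects receive the treatment. A study can use either step without the other.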

FIGURE 5.2 | Random Assignment


The classic experimental design is the foundation for all other experimental and quasi-experimental designs because it retains all of the major components discussed above. As mentioned, sometimes designs do not have a pre-test, a control group, or random assignment. Because the pre-test, control group, and random assignment are so critical to the goal of uncovering a causal relationship, if one exists, we explore them further below.

The Logic of the Classic Experimental Design

Consider a research study using the classic experimental design where the goal is to determine if a domestic violence treatment program has any effect on re-arrests for domestic violence. The randomly assigned experimental and control groups are composed of persons who had previously been arrested for domestic violence. The pre-test is a measure of the number of domestic violence arrests before the program. This is because the goal of the study is to determine whether the treatment affects re-arrests. The post-test is the number of re-arrests following the treatment program.

Once randomly assigned, the experimental group members receive the domestic violence program, and the control group members do not. After the program, the researcher will compare the pre-test arrests for domestic violence of the experimental group to post-test arrests for domestic violence to determine if arrests increased, decreased, or remained constant since the start of the program. The researcher will also compare the post-test re-arrests for domestic violence between the experimental and control groups. With this example, we explore the usefulness of the classic experimental design, and the contribution of the pre-test, random assignment, and the control group to the goal of determining whether a domestic violence program reduces re-arrests.
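
The comparisons described above can be written out as simple arithmetic. The sketch below assumes each group is summarized by its mean number of domestic violence arrests; the figures (3.0, 1.5, 2.5) and the function name are invented for illustration and do not come from a real evaluation.

```python
def classic_design_comparisons(pre_exp, post_exp, pre_ctrl, post_ctrl):
    """Return the basic comparisons a classic experimental design allows."""
    return {
        # Pre-test to post-test change within each group
        "change_experimental": post_exp - pre_exp,
        "change_control": post_ctrl - pre_ctrl,
        # Were the groups similar before the treatment?
        "pre_test_gap": pre_exp - pre_ctrl,
        # Do the groups differ after the treatment?
        "post_test_gap": post_exp - post_ctrl,
    }

# Hypothetical mean arrests per person (made-up numbers)
result = classic_design_comparisons(pre_exp=3.0, post_exp=1.5,
                                    pre_ctrl=3.0, post_ctrl=2.5)
for name, value in result.items():
    print(f"{name}: {value:+.1f}")
```

With these made-up numbers the groups start out identical (a pre-test gap of 0.0), arrests fall by 1.5 in the experimental group and by 0.5 in the control group, and the post-test gap is -1.0.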

The Pre-Test As a component of the classic experiment, the pre-test allows an examination of change in the dependent variable from before the domestic violence program to after the domestic violence program. In short, a pre-test allows the researcher to determine if re-arrests increased, decreased, or remained the same following the domestic violence program. Without a pre-test, researchers would not be able to determine the extent of change, if any, from before to after the program for either the experimental or control group.

Although the pre-test is a measure of the dependent variable before the treatment, it can also be thought of as a measure whereby the researcher can compare the experimental group to the control group before the treatment is administered. For example, the pre-test helps researchers to make sure both groups are similar or equivalent on previous arrests for domestic violence. The importance of equivalence between the experimental and control groups on previous arrests is discussed below with random assignment.

Random Assignment Random assignment helps to ensure that the experimental and control groups are equivalent before the introduction of the treatment. This is perhaps one of the most critical aspects of the classic experiment and all experimental designs. Although the experimental and control groups will be made up of different people with different characteristics, assigning them to groups via a random assignment process helps to ensure that any differences or bias between the groups are eliminated or minimized. By minimizing bias, we mean that the groups will balance each other out on all factors except the treatment. If they are balanced out on all factors prior to the administration of the treatment, any differences between the groups at the post-test must be due to the treatment, the only factor that differs between the experimental group and the control group. According to Shadish, Cook, and Campbell: "If implemented correctly, random assignment creates two or more groups of units that are probabilistically similar to each other on the average. Hence, any outcome differences that are observed between those groups at the end of a study are likely to be due to treatment, not to differences between the groups that already existed at the start of the study." 15 Considered in another way, if the experimental and control group differed significantly on any relevant factor other than the treatment, the researcher would not know if the results observed at the post-test are attributable to the treatment or to the differences between the groups.

Consider an example where 500 domestic abusers were randomly assigned to the experimental group and 500 were randomly assigned to the control group. Because they were randomly assigned, we would likely find both more frequent and less frequent domestic violence arrestees in both groups, older and younger arrestees in both groups, and so on. If random assignment was implemented correctly, it would be highly unlikely that all of the experimental group members were the most serious or frequent arrestees and all of the control group members were less serious and/or less frequent arrestees. While there are no guarantees, we know the chance of this happening is extremely small with random assignment because it is based on known probability theory. Thus, except for a chance occurrence, random assignment will result in equivalence between the experimental and control group in much the same way that flipping a coin multiple times will result in heads approximately 50% of the time and tails approximately 50% of the time. Over 1,000 tosses of a coin, for example, should result in roughly 500 heads and 500 tails. While there is a chance that flipping a coin 1,000 times will result in heads 1,000 times, or some other major imbalance between heads and tails, this potential is small and would only occur by chance.
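
The coin-toss claim is easy to check with a short simulation. The sketch below is a minimal illustration that assumes a fair coin; the seed is arbitrary and the function name is hypothetical. The exact head count depends on the seed, so no specific count is quoted here.

```python
import random

def count_heads(n_tosses, rng):
    """Toss a fair coin n_tosses times and count the heads."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))

rng = random.Random(7)               # arbitrary seed for repeatability
heads = count_heads(1000, rng)
print(f"heads: {heads}, tails: {1000 - heads}")

# Probability that all 1,000 tosses come up heads: one half, multiplied 1,000 times
p_all_heads = 0.5 ** 1000
print(f"chance of 1,000 heads in a row: {p_all_heads:.2e}")
```

A run of this kind should land near 500 heads, while the chance of 1,000 straight heads is not zero but is far too small to matter in practice. That is the sense in which a major imbalance "would only occur by chance."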

The same logic from above also applies with randomly assigning people to groups, and this can even be done by flipping a coin. By assigning people to groups through a random and unbiased process, like flipping a coin, only by chance (or researcher error) will one group have more of one characteristic than another, on average. If there are no major (also called statistically significant) differences between the experimental and control group before the treatment, the most plausible explanation for the results at the post-test is the treatment.

As mentioned, it is possible by some chance occurrence that the experimental and control group members are significantly different on some characteristic prior to administration of the treatment. To confirm that the groups are in fact similar after they have been randomly assigned, the researcher can examine the pre-test if one is present. If the researcher has additional information on subjects before the treatment is administered, such as age, or any other factor that might influence post-test results at the end of the study, he or she can also compare the experimental and control group on those measures to confirm that the groups are equivalent. Thus, a researcher can confirm that the experimental and control groups are equivalent on information known to the researcher.
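
A check of this kind can be sketched as follows. The summary used here, the difference in group means expressed in standard deviation units, is one common choice made for illustration; it is not a procedure named in the text. The subjects, their ages, and their prior arrest counts are randomly generated, hypothetical data.

```python
import random
import statistics

def standardized_difference(exp_values, ctrl_values):
    """Difference in group means, in pooled standard deviation units."""
    pooled_sd = ((statistics.variance(exp_values)
                  + statistics.variance(ctrl_values)) / 2) ** 0.5
    return (statistics.mean(exp_values) - statistics.mean(ctrl_values)) / pooled_sd

rng = random.Random(11)              # arbitrary seed for repeatability
# Hypothetical subjects with two measures known before the treatment
subjects = [{"age": rng.randint(18, 60), "prior_arrests": rng.randint(1, 10)}
            for _ in range(1000)]

rng.shuffle(subjects)                # random assignment
experimental, control = subjects[:500], subjects[500:]

for measure in ("age", "prior_arrests"):
    d = standardized_difference([s[measure] for s in experimental],
                                [s[measure] for s in control])
    print(f"{measure}: standardized difference = {d:+.3f}")
```

Values close to zero suggest the groups are balanced on that measure. A check like this covers only the measures the researcher has on hand.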

Being able to compare the groups on known measures is an important way to ensure the random assignment process "worked." However, perhaps most important is that randomization also helps to ensure similarity across unknown variables between the experimental and control group. Because random assignment is based on known probability theory, there is a much higher probability that all potential differences between the groups that could impact the post-test, known or unknown, will balance out with random assignment. Without random assignment, it is likely that the experimental and control group would differ on important but unknown factors and such differences could emerge as alternative explanations for the results. For example, if a researcher did not utilize random assignment and instead took the first 500 domestic abusers from an ordered list and assigned them to the experimental group and assigned the last 500 domestic abusers to the control group, one of the groups could be "lopsided" or imbalanced on some important characteristic that could impact the outcome of the study. With random assignment, there is a much higher likelihood that these important characteristics among the experimental and control groups will balance out because no individual has a different chance of being placed into one group versus the other. The probability of one or more characteristics being concentrated into one group and not the other is extremely small with random assignment.

To further illustrate the importance of random assignment to group equivalence, suppose the first 500 domestic violence abusers who were assigned to the experimental group from the ordered list had significantly fewer domestic violence arrests before the program than the last 500 domestic violence abusers on the list. Perhaps this is because the ordered list was organized from least to most chronic domestic abusers. In this instance, the control group would be lopsided concerning number of pre-program domestic violence arrests; they would be more chronic than the experimental group. The arrest imbalance then could potentially explain the post-test results following the domestic violence program. For example, the "less risky" offenders in the experimental group might be less likely to be re-arrested regardless of their participation in the domestic violence program, especially compared to the more chronic domestic abusers in the control group. Because of imbalances between the experimental and control group on arrests before the program was implemented, it would not be known for certain whether an observed reduction in re-arrests after the program for the experimental group was due to the program or the natural result of having less risky offenders in the experimental group. In this instance, the results might be taken to suggest that the program significantly reduces re-arrests. This conclusion might be spurious, however, for the association may simply be due to the fact that the offenders in the experimental group were much different (less frequent offenders) than the control group. Here, the program may have had no effect; the experimental group members may have performed the same regardless of the treatment because they were low-level offenders.

The example above suggests that differences between the experimental and control groups based on previous arrest records could have a major impact on the results of a study. Such differences can arise with the lack of random assignment. If subjects were randomly assigned to the experimental and control group, however, there would be a much higher probability that less frequent and more frequent domestic violence arrestees would have been found in both the experimental and control groups and the differences would have balanced out between the groups, leaving any differences between the groups at the post-test attributable to the treatment only.
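
The ordered-list problem can be illustrated with simulated data. In the sketch below, the prior arrest counts are randomly generated, hypothetical values, sorted from least to most chronic as in the example; the seed and helper name are likewise arbitrary.

```python
import random
import statistics

rng = random.Random(5)               # arbitrary seed for repeatability
# Hypothetical prior-arrest counts for 1,000 offenders, least to most chronic
prior_arrests = sorted(rng.randint(1, 10) for _ in range(1000))

# Non-random assignment: first 500 on the ordered list vs. last 500
ordered_exp, ordered_ctrl = prior_arrests[:500], prior_arrests[500:]

# Random assignment: shuffle a copy of the list, then split it
shuffled = prior_arrests[:]
rng.shuffle(shuffled)
random_exp, random_ctrl = shuffled[:500], shuffled[500:]

def pre_program_gap(experimental, control):
    """Control group mean minus experimental group mean, before any treatment."""
    return statistics.mean(control) - statistics.mean(experimental)

print(f"ordered list gap:      {pre_program_gap(ordered_exp, ordered_ctrl):.2f}")
print(f"random assignment gap: {pre_program_gap(random_exp, random_ctrl):.2f}")
```

Splitting the ordered list builds a large arrest-history gap into the groups before the program starts, while the shuffled split should leave the two means close together.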

In summary, random assignment helps to ensure that the experimental and control group members are balanced or equivalent on all factors that could impact the dependent variable or post-test, known or unknown. The only factor they are not balanced or equal on is the treatment. As such, random assignment helps to isolate the impact of the treatment, if any, on the post-test because it increases confidence that the only difference between the groups should be that one group gets the treatment and the other does not. If that is the only difference between the groups, any change in the dependent variable between the experimental and control group must be attributed to the treatment and not an alternative explanation, such as significant arrest history imbalance between the groups (refer to Figure 5.2). This logic also suggests that if the experimental group and control group are imbalanced on any factor that may be relevant to the outcome, that factor then becomes a potential alternative explanation for the results, an explanation that reduces the researcher's ability to isolate the real impact of the treatment.


Scared Straight

The 1978 documentary Scared Straight introduced to the public the "Lifer's Program" at Rahway State Prison in New Jersey. This program sought to decrease juvenile delinquency by bringing at-risk and delinquent juveniles into the prison where they would be "scared straight" by inmates serving life sentences. Participants in the program were talked to and yelled at by the inmates in an effort to scare them. It was believed that the fear felt by the participants would lead to a discontinuation of their problematic behavior so that they would not end up in prison themselves. Although originally touted as a success based on anecdotal evidence, subsequent evaluations of the program and others like it proved otherwise.

Using a classic experimental design, Finckenauer evaluated the original "Lifer's Program" at Rahway State Prison. 16 Participating juveniles were randomly assigned to the experimental group or the control group. Results of the evaluation were not positive. Post-test measures revealed that juveniles who were assigned to the experimental group and participated in the program were actually more seriously delinquent afterwards than those who did not participate in the program. Also using an experimental design with random assignment, Yarborough evaluated the "Juvenile Offenders Learn Truth" (JOLT) program at the State Prison of Southern Michigan at Jackson. 17 This program was similar to that of the "Lifer's Program" only with fewer obscenities used by the inmates. Post-test measurements were taken at two intervals, 3 and 6 months after program completion. Again, results were not positive. Findings revealed no significant differences between those juveniles who attended the program and those who did not.

Other experiments conducted on Scared Straight-like programs further revealed their inability to deter juveniles from future criminality. 18 Despite the intuitive popularity of these programs, these evaluations proved that such programs were not successful. In fact, it is postulated that these programs may have actually done more harm than good.

The Control Group The presence of an equivalent control group (created through random assignment) also gives the researcher more confidence that the findings at the post-test are due to the treatment and not some other alternative explanation. This logic is perhaps best demonstrated by considering how interpretation of results is affected without a control group. Absent an equivalent control group, it cannot be known whether the results of the study are due to the program or some other factor. This is because the control group provides a baseline of comparison or a "control." For example, without a control group, the researcher may find that domestic violence arrests declined from pre-test to post-test. But the researcher would not be able to definitively attribute that finding to the program without a control group. Perhaps the single experimental group incurred fewer arrests because they matured over their time in the program, regardless of participation in the domestic violence program. Having a randomly assigned control group would allow this consideration to be eliminated, because the equivalent control group would also have naturally matured if that was the case.

Because the control group is meant to be similar to the experimental group on all factors with the exception that the experimental group receives the treatment, the logic is that any differences between the experimental and control group after the treatment must then be attributable only to the treatment itself, because everything else occurs equally in both the experimental and control groups and thus cannot be the cause of results. The bottom line is that a control group allows the researcher more confidence to attribute any change in the dependent variable from pre- to post-test and between the experimental and control groups to the treatment, and not to another alternative explanation. Absent a control group, the researcher would have much less confidence in the results.
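
The maturation example can be made concrete with a small worked calculation. The sketch below assumes, purely for illustration, that everyone's arrests fall by one over the study period because of maturation and that the program itself has no effect at all. The numbers and the function name are hypothetical.

```python
def post_test_arrests(pre_test, maturation_drop, program_effect, treated):
    """Post-test arrests = pre-test arrests, less maturation, less any program effect."""
    effect = program_effect if treated else 0.0
    return pre_test - maturation_drop - effect

pre_test = 4.0          # hypothetical mean arrests before the program
maturation_drop = 1.0   # assumed decline that happens to everyone over time
program_effect = 0.0    # assume the program does nothing

exp_post = post_test_arrests(pre_test, maturation_drop, program_effect, treated=True)
ctrl_post = post_test_arrests(pre_test, maturation_drop, program_effect, treated=False)

# Without a control group, only this pre-test to post-test change is visible:
print("experimental group change:", exp_post - pre_test)       # -1.0
# With a control group, the shared maturation drop cancels out:
print("experimental minus control:", exp_post - ctrl_post)     # 0.0
```

Looking at the experimental group alone, the one-arrest decline could be mistaken for a program effect. The control group shows the same decline, so the difference between the groups correctly comes out to zero.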

Knowledge about the major components of the classic experimental design and how they contribute to an understanding of cause and effect serves as an important foundation for studying different types of experimental and quasi-experimental designs and their organization. A useful way to become familiar with the components of the experimental design and their important role is to consider the impact on the interpretation of results when one or more components are lacking. For example, what if a design lacked a pre-test? How could this impact the interpretation of post-test results and knowledge about the comparability of the experimental and control group? What if a design lacked random assignment? What are some potential problems that could occur and how could those potential problems impact interpretation of results? What if a design lacked a control group? How does the absence of an equivalent control group affect a researcher's ability to determine the unique effects of the treatment on the outcomes being measured? The ability to discuss the contribution of a pre-test, random assignment, and a control group, and the impact when one or more of those components is absent from a research design, is the key to understanding both experimental and quasi-experimental designs that will be discussed in the remainder of this chapter. As designs lose these important parts and transform from a classic experiment to another experimental design or to a quasi-experiment, they become less useful in isolating the impact that a treatment has on the dependent variable and allow more room for alternative explanations of the results.

One more important point must be made before further delving into experimental and quasi-experimental designs. This point is that rarely, if ever, will the average consumer of research be exposed to the symbols or specific language of the classic experiment, or other experimental and quasi-experimental designs examined in this chapter. In fact, it is unlikely that the average consumer will ever be exposed to the terms pre-test, post-test, experimental group, or random assignment in the popular media, among other terms related to experimental and quasi-experimental designs. Yet, consumers are exposed to research results produced from these and other research designs every day. For example, if a national news organization or your regional newspaper reported a story about the effectiveness of a new drug to reduce cholesterol or the effects of different diets on weight loss, it is doubtful that the results would be reported as produced through a classic experimental design that used a control group and random assignment. Rather, these media outlets would use generally nonscientific terminology such as "results of an experiment showed" or "results of a scientific experiment indicated" or "results showed that subjects who received the new drug had greater cholesterol reductions than those who did not receive the new drug." Even students who regularly search and read academic articles for use in course papers and other projects will rarely come across such design notation in the research studies they utilize. Depiction of the classic experimental design, including a discussion of its components and their function, simply illustrates the organization and notation of the classic experimental design. Unfortunately, the average consumer has to read between the lines to determine what type of design was used to produce the reported results. Understanding the key components of the classic experimental design allows educated consumers of research to read between those lines.


"Swearing Makes Pain More Tolerable" 19

In 2009, Richard Stephens, John Atkins, and Andrew Kingston of the School of Psychology at Keele University conducted a study with 67 undergraduate students to determine if swearing affects an individual's response to pain. Researchers asked participants to immerse their hand in a container filled with ice-cold water and repeat a preferred swear word. The researchers then asked the same participants to immerse their hand in ice-cold water while repeating a word used to describe a table (a non-swear word). The results showed that swearing increased pain tolerance compared to the non-swearing condition. Participants who used a swear word were able to hold their hand in ice-cold water longer than when they did not swear. Swearing also decreased participants' perception of pain.

1. This study is an example of a repeated measures design. In this form of experimental design, study participants are exposed to an experimental condition (swearing with hand in ice-cold water) and a control condition (non-swearing with hand in ice-cold water) while repeated outcome measures are taken with each condition, for example, the length of time a participant was able to keep his or her hand submerged in ice-cold water. Conduct an Internet search for "repeated measures design" and explore the various ways such a study could be conducted, including the potential benefits and drawbacks to this design.

2. After researching repeated measures designs, devise a hypothetical repeated measures study of your own.

3. Retrieve and read the full research study "Swearing as a Response to Pain" by Stephens, Atkins, and Kingston while paying attention to the design and methods (full citation information for this study is listed below). Has your opinion of the study results changed after reading the full study? Why or why not?

Full Study Source: Stephens, R., Atkins, J., and Kingston, A. (2009). “Swearing as a response to pain.” NeuroReport 20, 1056–1060.

Variations on the Experimental Design

The classic experimental design is the foundation upon which all experimental and quasi-experimental designs are based. As such, it can be modified in numerous ways to fit the goals (or constraints) of a particular research study. Below are two variations of the experimental design. Again, knowledge about the major components of the classic experiment, how they contribute to an explanation of results, and what the impact is when one or more components are missing provides an understanding of all other experimental designs.

Post-Test Only Experimental Design

The post-test only experimental design could be used to examine the impact of a treatment program on school disciplinary infractions as measured or operationalized by referrals to the principal’s office (see Table 5.2). In this design, the researcher randomly assigns a group of discipline problem students to the experimental group and control group by flipping a coin: heads to the experimental group and tails to the control group. The experimental group then enters the 3-month treatment program. After the program, the researcher compares the number of referrals to the principal’s office between the experimental and control groups over some period of time, for example, discipline referrals at 6 months after the program. The researcher finds that the experimental group has a much lower number of referrals to the principal’s office in the 6-month follow-up period than the control group.

TABLE 5.2 | Post-Test Only Experimental Design






Several issues arise in this example study. The researcher would not know if discipline problems decreased, increased, or stayed the same from before to after the treatment program because the researcher did not have a count of disciplinary referrals prior to the treatment program (e.g., a pre-test). Although the groups were randomly assigned and are presumed equivalent, the absence of a pre-test means the researcher cannot confirm that the experimental and control groups were equivalent before the treatment was administered, particularly on the number of referrals to the principal’s office. The groups could have differed by a chance occurrence even with random assignment, and any such differences between the groups could potentially explain the post-test difference in the number of referrals to the principal’s office. For example, if the control group included much more serious or frequent discipline problem students than the experimental group by chance, this difference might explain the lower number of referrals for the experimental group, not that the treatment produced this result.
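The point that groups can differ by chance even with random assignment can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the 40 students, their prior referral counts, the one-referral gap used to flag an "unbalanced" assignment, and every function name are hypothetical and do not come from the example study.

```python
import random
from statistics import mean


def coin_flip_assignment(student_ids, rng):
    """Assign each student by a simulated coin flip: heads -> experimental, tails -> control."""
    experimental, control = [], []
    for student_id in student_ids:
        if rng.random() < 0.5:  # "heads"
            experimental.append(student_id)
        else:                   # "tails"
            control.append(student_id)
    return experimental, control


def share_of_unbalanced_assignments(prior_referrals, gap, trials, seed):
    """Repeat the coin-flip assignment many times and return the share of
    assignments in which the groups' mean prior referrals differ by at least `gap`."""
    rng = random.Random(seed)
    student_ids = list(prior_referrals)
    usable = 0
    unbalanced = 0
    for _ in range(trials):
        experimental, control = coin_flip_assignment(student_ids, rng)
        if not experimental or not control:
            continue  # skip the (very unlikely) case of an empty group
        usable += 1
        difference = abs(
            mean(prior_referrals[s] for s in experimental)
            - mean(prior_referrals[s] for s in control)
        )
        if difference >= gap:
            unbalanced += 1
    if usable == 0:
        raise ValueError("no usable assignments were produced")
    return unbalanced / usable


# Hypothetical pre-test data: prior referral counts (0-10) for 40 students.
data_rng = random.Random(1)
prior_referrals = {f"student_{i:02d}": data_rng.randint(0, 10) for i in range(40)}

share = share_of_unbalanced_assignments(prior_referrals, gap=1.0, trials=2000, seed=7)
print(f"Share of assignments with a baseline gap of 1+ referrals: {share:.2f}")
```

Only a researcher who collected the prior referral counts (a pre-test) could run this kind of check on the actual assignment, which is exactly what the post-test only design does not permit.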

Experimental Design with Two Treatments and a Control Group

This design could be used to determine the impact of boot camp versus juvenile detention on post-release recidivism (see Table 5.3). Recidivism in this study is operationalized as re-arrest for delinquent behavior. First, a population of known juvenile delinquents is randomly assigned to either boot camp, juvenile detention, or a control condition where they receive no sanction. To accomplish random assignment to groups, the researcher places the names of all youth into a hat and assigns the groups in order. For example, the first name pulled goes into experimental group 1, the next into experimental group 2, and the next into the control group, and so on. Once randomly assigned, the experimental group youth receive either boot camp or juvenile detention for a period of 3 months, whereas members of the control group are released on their own recognizance to their parents. At the end of the experiment, the researcher compares the re-arrest activity of boot camp participants to detention delinquents to control group members during a 6-month follow-up period.

TABLE 5.3 | Experimental Design with Two Treatments and a Control Group












This design has several advantages. First, it includes all major components of the classic experimental design, and simply adds an additional treatment for comparison purposes. Random assignment was utilized and this means that the groups have a higher probability of being equivalent on all factors that could impact the post-test. Thus, random assignment in this example helps to ensure the only differences between the groups are the treatment conditions. Without random assignment, there is a greater chance that one group of youth was somehow different, and this difference could impact the post-test. For example, if the boot camp youth were much less serious and frequent delinquents than the juvenile detention youth or control group youth, the results might erroneously show that the boot camp reduced recidivism when in fact the youth in boot camp may have been the “best risks,” unlikely to get re-arrested with or without boot camp. The pre-test in the example above allows the researcher to determine change in re-arrests from pre-test to post-test. Thus, the researcher can determine if delinquent behavior, as measured by re-arrest, increased, decreased, or remained constant from pre- to post-test. The pre-test also allows the researcher to confirm that the random assignment process resulted in equivalent groups based on the pre-test. Finally, the presence of a control group allows the researcher to have more confidence that any differences in the post-test are due to the treatment. For example, if the control group had more re-arrests than the boot camp or juvenile detention experimental groups 6 months after their release from those programs, the researcher would have more confidence that the programs produced fewer re-arrests because the control group members were the same as the experimental groups; the only difference was that they did not receive a treatment.
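The hat-drawing procedure described for this design (names drawn in random order and dealt in turn to experimental group 1, experimental group 2, and the control group) might be sketched as follows. This is an illustrative sketch only: the 30 youth identifiers, the group labels, and the function name are hypothetical, and the fixed seed is there solely to make the example repeatable.

```python
import random


def assign_from_hat(names, groups, seed=None):
    """Shuffle the names (the "hat"), then deal them to the groups in rotating order."""
    rng = random.Random(seed)
    hat = list(names)
    rng.shuffle(hat)  # the order of draws from the hat is random
    assignment = {group: [] for group in groups}
    for draw_number, name in enumerate(hat):
        # 1st draw -> first group, 2nd -> second, 3rd -> third, 4th -> first again, ...
        group = groups[draw_number % len(groups)]
        assignment[group].append(name)
    return assignment


# Hypothetical population of 30 youth and the three conditions from the example.
youth = [f"youth_{i:02d}" for i in range(1, 31)]
groups = ["boot_camp", "juvenile_detention", "control"]

assignment = assign_from_hat(youth, groups, seed=42)
for group, members in assignment.items():
    print(group, len(members))
```

Because the draw order is random, dealing names to groups in a fixed rotation still gives every youth the same chance of landing in each condition, while keeping the group sizes equal.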

The one key feature of experimental designs is that they all retain random assignment. This is why they are considered “experimental” designs. Sometimes, however, experimental designs lack a pre-test. Knowledge of the usefulness of a pre-test demonstrates the potential problems with those designs where it is missing. For example, in the post-test only experimental design, a researcher would not be able to make a determination of change in the dependent variable from pre- to post-test. Perhaps most importantly, the researcher would not be able to confirm that the experimental and control groups were in fact equivalent on a pre-test measure before the introduction of the treatment. Even though both groups were randomly assigned, and probability theory suggests they should be equivalent, without a pre-test measure the researcher could not confirm similarity because differences could occur by chance even with random assignment. If there were any differences at the post-test between the experimental group and control group, the results might be due to some explanation other than the treatment, namely that the groups differed prior to the administration of the treatment. The same limitation could apply in any form of experimental design that does not utilize a pre-test for confirmatory purposes.

Understanding the contribution of a pre-test to an experimental design shows that it is a critical component. It provides a measure of change and also gives the researcher more confidence that the observed results are due to the treatment, and not some difference between the experimental and control groups. Despite the usefulness of a pre-test, however, perhaps the most critical ingredient of any experimental design is random assignment. It is important to note that all experimental designs retain random assignment.

Experimental Designs Are Rare in Criminal Justice and Criminology

The classic experiment is the foundation for other types of experimental and quasi-experimental designs. The unfortunate reality, however, is that the classic experiment, or other experimental designs, are few and far between in criminal justice. 20 Recall that one of the major components of an experimental design is random assignment. Achieving random assignment is often a barrier to experimental research in criminal justice. Achieving random assignment might, for example, require the approval of the chief (or city council or both) of a major metropolitan police agency to allow researchers to randomly assign patrol officers to certain areas of a city and/or randomly assign police officer actions. Recall the MDVE. This experiment required the full cooperation of the chief of police and other decision-makers to allow researchers to randomly assign police actions. In another example, achieving random assignment might require a judge to randomly assign a group of youthful offenders to a certain juvenile court sanction (experimental group), and another group of similar youthful offenders to no sanction or an alternative sanction as a control group. 21 In sum, random assignment typically requires the cooperation of a number of individuals and sometimes that cooperation is difficult to obtain.

Even when random assignment can be accomplished, sometimes it is not implemented correctly and the random assignment procedure breaks down. This is another barrier to conducting experimental research. For example, in the MDVE, researchers randomly assigned officer responses, but the officers did not always follow the assigned course of action. Moreover, some believe that the random assignment of criminal justice programs, sentences, or randomly assigning officer responses may be unethical in certain circumstances, and even a violation of the rights of citizens. For example, some believe it is unfair when random assignment results in some delinquents being sentenced to boot camp while others get assigned to a control group without any sanction at all or a less restrictive sanction than boot camp. In the MDVE, some believe it is unfair that some suspects were arrested and received an official record whereas others were not arrested for the same type of behavior. In other cases, subjects in the experimental group may receive some benefit from the treatment that is essentially denied to the control group for a period of time and this can become an issue as well.

There are other important reasons why random assignment is difficult to accomplish. Random assignment may, for example, involve a disruption of the normal procedures of agencies and their officers. In the MDVE, officers had to adjust their normal and established routine, and this was a barrier at times in that study. Shadish, Cook, and Campbell also note that random assignment may not always be feasible or desirable when quick answers are needed. 22 This is because experimental designs sometimes take a long time to produce results. In addition to the time required in planning and organizing the experiment, and treatment delivery, researchers may need several months if not years to collect and analyze the data before they have answers. This is particularly important because time is often of the essence in criminal justice research, especially in research efforts testing the effect of some policy or program where it is not feasible to wait years for answers. Waiting for the results of an experimental design means that many policy-makers may make decisions without the results.

Quasi-Experimental Designs

In general terms, quasi-experiments include a group of designs that lack random assignment. Quasi-experiments may also lack other parts, such as a pre-test or a control group, just like some experimental designs. The absence of random assignment, however, is the ingredient that transforms an otherwise experimental design into a quasi-experiment. Lacking random assignment is a major disadvantage because it increases the chances that the experimental and control groups differ on relevant factors, both known and unknown, before the treatment. These differences may then emerge as alternative explanations of the outcomes.

Just like experimental designs, quasi-experimental designs can be organized in many different ways. This section will discuss three types of quasi-experiments: nonequivalent group design, one-group longitudinal design, and two-group longitudinal design.

Nonequivalent Group Design

The nonequivalent group design is perhaps the most common type of quasi-experiment. 23 Notice that it is very similar to the classic experimental design with the exception that it lacks random assignment (see Table 5.4). Additionally, what was labeled the experimental group in an experimental design is sometimes called the treatment group in the nonequivalent group design. What was labeled the control group in the experimental design is sometimes called the comparison group in the nonequivalent group design. This terminological distinction is an indicator that the groups were not created through random assignment.

TABLE 5.4 | Nonequivalent Group Design








NR = Not Randomly assigned

One of the main problems with the nonequivalent group design is that it lacks random assignment, and without random assignment, there is a greater chance that the treatment and comparison groups may be different in some way that can impact study results. Take, for example, a nonequivalent group design where a researcher is interested in whether an aggression-reduction treatment program can reduce inmate-on-inmate assaults in a prison setting. Assume that the researcher asked for inmates who had previously been involved in assaultive activity to volunteer for the aggression-reduction program. Suppose the researcher placed the first 50 volunteers into the treatment group and the next 50 volunteers into the comparison group. Note that this method of assignment is not random but rather first come, first served.

Because the study utilized volunteers and there was no random assignment, it is possible that the first 50 volunteers placed into the treatment group differed significantly from the last 50 volunteers who were placed in the comparison group. This can lead to alternative explanations for the results. For example, if the treatment group was much younger than the comparison group, the researcher may find at the end of the program that the treatment group still maintained a higher rate of infractions than the comparison group, even after the aggression-reduction program! The conclusion might be that the aggression program actually increased the level of violence among the treatment group. This conclusion would likely be spurious and may be due to the age differential between the treatment and comparison groups. Indeed, research has revealed that younger inmates are significantly more likely to engage in prison assaults than older inmates. The fact that the treatment group incurred more assaults than the comparison group after the aggression-reduction program may only relate to the age differential between the groups, not that the program had no effect or that it somehow may have increased aggression. The previous example highlights the importance of random assignment and the potential problems that can occur in its absence.

Although researchers who utilize a quasi-experimental design are not able to randomly assign their subjects to groups, they can employ other techniques in an attempt to make the groups as equivalent as possible on known or measured factors before the treatment is given. In the example above, it is likely that the researcher would have known the age of inmates, their prior assault record, and various other pieces of information (e.g., previous prison stays). Through a technique called matching, the researcher could make sure the treatment and comparison groups were “matched” on these important factors before administering the aggression reduction program to the treatment group. This type of matching can be done individual to individual (e.g., subject #1 in treatment group is matched to a selected subject #1 in comparison group on age, previous arrests, gender), or aggregately, such that the comparison group is similar to the treatment group overall (e.g., average ages between groups are similar, equal proportions of males and females). Knowledge of these and other important variables, for example, would allow the researcher to make sure that the treatment group did not have heavy concentrations of younger or more frequent or serious offenders than the comparison group, factors that are related to assaultive activity independent of the treatment program. In short, matching allows the researcher some control over who goes into the treatment and comparison groups so as to balance these groups on important factors absent random assignment. If unbalanced on one or more factors, these factors could emerge as alternative explanations of the results. Figure 5.3 demonstrates the logic of matching both at the individual and aggregate level in a quasi-experimental design.
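Individual and aggregate matching can be sketched in a few lines of code. What follows is a minimal illustration, not a procedure taken from the text: it uses a simple greedy nearest-match rule, and the inmate records, the distance weights, and all names are hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Inmate:
    inmate_id: str
    age: int
    prior_assaults: int


def distance(a, b):
    """Hypothetical dissimilarity score: age gap plus a heavier penalty for
    differences in prior assaults (the weight of 5 is an arbitrary assumption)."""
    return abs(a.age - b.age) + 5 * abs(a.prior_assaults - b.prior_assaults)


def match_individuals(treatment, pool):
    """Greedy individual matching: each treatment subject takes the most similar
    remaining candidate, and each candidate can be used only once."""
    if len(pool) < len(treatment):
        raise ValueError("the comparison pool is smaller than the treatment group")
    available = list(pool)
    pairs = []
    for subject in treatment:
        best = min(available, key=lambda candidate: distance(subject, candidate))
        pairs.append((subject, best))
        available.remove(best)
    return pairs


# Hypothetical records known to the researcher before the program begins.
treatment_group = [Inmate("T1", 22, 3), Inmate("T2", 35, 1), Inmate("T3", 48, 2)]
candidate_pool = [
    Inmate("C1", 50, 2), Inmate("C2", 23, 3), Inmate("C3", 34, 1),
    Inmate("C4", 60, 0), Inmate("C5", 21, 0),
]

pairs = match_individuals(treatment_group, candidate_pool)
comparison_group = [match for _, match in pairs]

# Aggregate check: are the groups similar overall on the matched factors?
print("mean age:", mean(i.age for i in treatment_group), mean(i.age for i in comparison_group))
```

Note that the sketch can balance the groups only on age and prior assaults because those are the only factors recorded; anything unmeasured remains unmatched.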

Matching is an important part of the nonequivalent group design. By matching, the researcher can approximate equivalence between the groups on important variables that may influence the post-test. However, it is important to note that a researcher can only match subjects on factors that they have information about; a researcher cannot match the treatment and comparison group members on factors that are unmeasured or otherwise unknown but which may still impact outcomes. For example, if the researcher has no knowledge about the number of previous incarcerations, the researcher cannot match the treatment and comparison groups on this factor. Matching also requires that the information used for matching is valid and reliable, which is not always the case. Agency records, for example, are notorious for inconsistencies, errors, omissions, and for being dated, but are often utilized for matching purposes. Asking survey questions to generate information for matching (for example, how many times have you been incarcerated?) can also be problematic because some respondents may lie, forget, or exaggerate their behavior or experiences.

In addition to the above considerations, the more factors a researcher wishes to match the group members on, the more difficult it becomes to find appropriate matches. Matching on prior arrests or age is less complex than matching on several additional pieces of information. Finally, matching is never considered superior to random assignment when the goal is to construct equivalent groups. This is because random assignment offers a much higher likelihood of equivalence on all factors, both those that are measured and those that are unknown to the researcher. Thus, the results produced from a nonequivalent group design, even with matching, are at a greater risk of alternative explanations than an experimental design that features random assignment.

FIGURE 5.3 | (a) Individual Matching (b) Aggregate Matching


The previous discussion is not to suggest that the nonequivalent group design cannot be useful in answering important research questions. Rather, it is to suggest that the nonequivalent group design, and hence any quasi-experiment, is more susceptible to alternative explanations than the classic experimental design because of the absence of random assignment. As a result, a researcher must be prepared to rule out potential alternative explanations. Quasi-experimental designs that lack a pre-test or a comparison group are even less desirable than the nonequivalent group design and are subject to additional alternative explanations because of these missing parts. Although the quasi-experiment may be all that is available and still can serve as an important design in evaluating the impact of a particular treatment, it is not preferable to the classic experiment. Researchers (and consumers) must be attuned to the potential issues of this design so as to make informed conclusions about the results produced from such research studies.

The Effects of Red Light Camera (RLC) Enforcement

On March 15, 2009, an article appeared in the Santa Cruz Sentinel entitled “Ticket’s in the Mail: Red-Light Cameras Questioned.” The article stated “while studies show fewer T-bone crashes at lights with cameras and fewer drivers running red lights, the number of rear-end crashes increases.” 24 The study mentioned in the newspaper, which showed fewer drivers running red lights with cameras, was conducted by Richard Retting, Susan Ferguson, and Charles Farmer of the Insurance Institute for Highway Safety (IIHS). 25 They completed a quasi-experimental study in Philadelphia to determine the impact of red light cameras (RLC) on red light violations. In the study, the researchers selected nine intersections: six experimental sites that utilized RLCs and three comparison sites that did not utilize RLCs. The six experimental sites were located in Philadelphia, Pennsylvania, and the three comparison sites were located in Atlantic County, New Jersey. The researchers chose the comparison sites based on the proximity to Philadelphia, the ability to collect data using the same methods as at experimental intersections (e.g., the use of cameras for viewing red light traffic), and the fact that police officials in Atlantic County had offered assistance selecting and monitoring the intersections.

The authors collected three phases of information in the RLC study at the experimental and comparison sites:

Phase 1 Data Collection: Baseline (pre-test) data collection at the experimental and comparison sites consisting of the number of vehicles passing through each intersection, the number of red light violations, and the rate of red light violations per 10,000 vehicles.

Phase 2 Data Collection: Number of vehicles traveling through experimental and comparison intersections, number of red light violations after a 1-second yellow light increase at the experimental sites (treatment 1), number of red light violations at comparison sites without a 1-second yellow light increase, and red light violations per 10,000 vehicles at both experimental and comparison sites.

Phase 3 Data Collection: Red light violations after a 1-second yellow light increase and RLC enforcement at the experimental sites (treatment 2), red light violations at comparison sites without a 1-second yellow increase or RLC enforcement, number of vehicles passing through the experimental and comparison intersections, and the rate of red light violations per 10,000 vehicles.
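The outcome measure used in all three phases, the rate of red light violations per 10,000 vehicles, is a simple ratio. The sketch below shows the arithmetic and a phase-to-phase percent change; the counts are invented purely for illustration and are not figures from the RLC study, and the function names are hypothetical.

```python
def violations_per_10k(violations, vehicles):
    """Rate of red light violations per 10,000 vehicles passing through a site."""
    if vehicles <= 0:
        raise ValueError("vehicles must be a positive count")
    return violations * 10_000 / vehicles


def percent_change(before, after):
    """Percent change in a rate from one phase to the next (negative = reduction)."""
    return (after - before) / before * 100


# Hypothetical counts for a single intersection across the three phases.
phase_counts = {
    "phase_1": {"violations": 50, "vehicles": 100_000},  # baseline (pre-test)
    "phase_2": {"violations": 30, "vehicles": 120_000},  # after yellow light increase
    "phase_3": {"violations": 5, "vehicles": 100_000},   # after RLC enforcement
}

rates = {
    phase: violations_per_10k(counts["violations"], counts["vehicles"])
    for phase, counts in phase_counts.items()
}

for phase, rate in rates.items():
    print(f"{phase}: {rate:.1f} violations per 10,000 vehicles")
print(f"Phase 1 to 2 change: {percent_change(rates['phase_1'], rates['phase_2']):.0f}%")
```

Expressing violations as a rate rather than a raw count matters because traffic volume differs across sites and phases; a busier intersection would otherwise appear worse simply because more vehicles pass through it.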

The researchers operationalized “red light violations” as instances in which a vehicle entered the intersection one-half of a second or more after the onset of the red signal, provided the vehicle’s rear tires were positioned behind the crosswalk or stop line prior to entering on red. Vehicles already in the intersection at the onset of the red light, or those making a right turn on red with or without stopping, were not considered red light violations.

The researchers collected video data at each of the experimental and comparison sites during Phases 1–3. This allowed the researchers to examine red light violations before, during, and after the implementation of red light enforcement and yellow light time increases. Based on an analysis of data, the researchers revealed that the implementation of a 1-second yellow light increase led to reductions in the rate of red light violations from Phase 1 to Phase 2 in all of the experimental sites. In 2 out of 3 comparison sites, the rate of red light violations also decreased, despite no yellow light increase. From Phase 2 to Phase 3 (the enforcement of red light camera violations in addition to a 1-second yellow light increase at experimental sites), the authors noted decreases in the rate of red light violations in all experimental sites, and decreases among 2 of 3 comparison sites without red light enforcement in effect.

Concluding their study, the researchers noted that the study “found large and highly significant incremental reductions in red light running associated with increased yellow signal timing followed by the introduction of red light cameras.” Despite these findings, the researchers noted a number of potential factors to consider in light of the findings: the follow-up time periods utilized when counting red light violations before and after the treatment conditions were instituted; publicity about red light camera enforcement; and the size of fines associated with red light camera enforcement (the fine in Philadelphia was $100, higher than in many other cities), among others.

After reading about the study used in the newspaper article, has your impression of the newspaper headline and quote changed?

For more information and research on the effect of RLCs, visit the Insurance Institute for Highway Safety at http://www.iihs.org/research/topics/rlr.html.

One-Group Longitudinal Design

Like all experimental designs, the quasi-experimental design can come in a variety of forms. The second quasi-experimental design is the one-group longitudinal design (also called a simple interrupted time series design). 26 An examination of this design shows that it lacks both random assignment and a comparison group (see Table 5.5). A major difference between this design and others we have covered is that it includes multiple pre-test and post-test observations.

TABLE 5.5 | One-Group Longitudinal Design











The one-group longitudinal design is useful when researchers are interested in exploring longer-term patterns. Indeed, the term longitudinal generally means “over time,” that is, repeated measurements of the pre-test and post-test over time. This is different from cross-sectional designs, which examine the pre-test and post-test at only one point in time (e.g., at a single point before the application of the treatment and at a single point after the treatment). For example, the nonequivalent group design and the classic experimental design previously examined are both cross-sectional because pre-tests and post-tests are measured at one point in time (e.g., at a point 6 months after the treatment). Yet, these designs could easily be considered longitudinal if researchers took repeated measures of the pre-test and post-test.

The organization of the one-group longitudinal design is to examine a baseline of several pre-test observations, introduce a treatment or intervention, and then examine the post-test at several different time intervals. As organized, this design is useful for gauging the impact that a particular program, policy, or law has, if any, and how long the treatment impact lasts. Consider an example whereby a researcher is interested in gauging the impact of a tobacco ban on inmate-on-inmate assaults in a prison setting. This is an important question, for recent years have witnessed correctional systems banning all tobacco products from prison facilities. Correctional administrators predicted that there would be a major increase of inmate-on-inmate violence once the bans took effect. The one-group longitudinal design would be one appropriate design to examine the impact of banning tobacco on inmate assaults.

To construct this study using the one-group longitudinal design, the researcher would first examine the rate of inmate-on-inmate assaults in the prison system (or at an individual prison, a particular cellblock, or whatever the unit of analysis) prior to the removal of tobacco. This is the pre-test, or a baseline of assault activity before the ban goes into effect. In the design presented in Table 5.5, perhaps the researcher would measure the level of assaults in the four months prior to the tobacco ban. When establishing a pre-test baseline, the general rule is that, in a longitudinal design, the more time utilized, both in overall time and number of intervals, the better. For example, the rate of assaults in the preceding month is not as useful as an entire year of data on inmate assaults prior to the tobacco ban. Next, once the tobacco ban is implemented, the researcher would then measure the rate of inmate assaults in the coming months to determine what impact the ban had on inmate-on-inmate assaults. This is shown in Table 5.5 as the multiple post-test measures of assaults. Assaults may increase, decrease, or remain constant from the pre-test baseline over the term of the post-test.
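The comparison of a pre-test baseline with a series of post-test measures can be sketched as follows. This is a simple before-and-after summary rather than a formal time series analysis, and the monthly assault figures, the four-month windows, and the function name are hypothetical assumptions for illustration only.

```python
from statistics import mean

# Hypothetical monthly assault rates for one prison.
pre_ban = [12, 14, 13, 13]   # four monthly pre-test observations before the ban
post_ban = [15, 17, 16, 16]  # four monthly post-test observations after the ban


def summarize_interruption(pre, post):
    """Compare each post-test observation, and the post-test average,
    against the average of the pre-test baseline."""
    baseline = mean(pre)
    follow_up = mean(post)
    return {
        "baseline_mean": baseline,
        "post_mean": follow_up,
        "change": follow_up - baseline,
        # How far each post-test month sits above (+) or below (-) the baseline.
        "monthly_change": [month - baseline for month in post],
    }


summary = summarize_interruption(pre_ban, post_ban)
print(summary)
```

A longer baseline would make `baseline_mean` less sensitive to one unusual month, which is the reasoning behind the rule that more pre-test time and more intervals are better. The summary alone cannot show that the ban caused the change.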

If assaults increased at the same time as the ban went into effect, the researcher might conclude that the increase was due only to the tobacco ban. But, could there be alternative explanations? The answer to this question is yes, there may be other plausible explanations for the increase even with several months of pre-test data. Unfortunately, without a comparison group there is no way for the researcher to be certain if the increase in assaults was due to the tobacco ban, or some other factor that may have spurred the increase in assaults and happened at the same time as the tobacco ban. What if assaults decreased after the tobacco ban went into effect? In this scenario, because there is no comparison group, the researcher would still not know if the results would have happened anyway without the tobacco ban. In these instances, the lack of a comparison group prevents the researcher from confidently attributing the results to the tobacco ban, and interpretation is subject to numerous alternative explanations.

Two-Group Longitudinal Design

A remedy for the previous situation would be to introduce a comparison group (see Table 5.6). Prior to the full tobacco ban, suppose prison administrators conducted a pilot program at one prison to provide insight as to what would happen once the tobacco ban went into effect systemwide. To conduct this pilot, the researcher identified one prison. At this prison, the researcher identified two different cellblocks, C-Block and D-Block. C-Block constitutes the treatment group, or the cellblock of inmates who will have their tobacco taken away. D-Block is the comparison group: inmates in this cellblock will retain their tobacco privileges during the course of the study and during a determined follow-up period to measure post-test assaults (e.g., 12 months). This is a two-group longitudinal design (also sometimes called a multiple interrupted time series design), and adding a comparison group makes this design superior to the one-group longitudinal design.

TABLE 5.6 | Two-Group Longitudinal Design




















Adding a comparison group to the study means that the researcher can have more confidence that the results at the post-test are due to the tobacco ban and not some alternative explanation. This is because any difference in assaults at the post-test between the treatment and comparison group should be attributed to the only difference between them, the tobacco ban. For this interpretation to hold, however, the researcher must be sure that C-Block and D-Block are similar or equivalent on all factors that might influence the post-test. There are many potential factors that should be considered. For example, the researcher will want to make sure that the same types of inmates are housed in both cellblocks. If a chronic group of assaultive inmates constitutes members of C-Block, but not D-Block, this differential could explain the results, not the treatment.

The researcher might also want to make sure comparable numbers of tobacco and non-tobacco users are found in each cellblock. If very few inmates in C-Block are smokers, the real effect of removing tobacco may be hidden. The researcher might also examine other areas where potential differences might arise, for example, that both cellblocks are staffed with equal numbers of officers, that officers in each cellblock tend to resolve inmate disputes similarly, and other potential issues that could influence the post-test measure of assaults. Equivalence could also be ensured by comparing the groups on additional evidence before the ban takes effect: number of prior prison sentences, time served in prison, age, seriousness of conviction crime, and other factors that might relate to assaultive behavior, regardless of the tobacco ban. Moreover, the researcher should ensure that inmates in C-Block do not know that their D-Block counterparts are still allowed tobacco during the pilot study, and vice versa. If either group knows about the pilot program being an experiment, they might act differently than normal, and this could become an explanation of results. Additionally, the researchers might also try to make sure that C-Block inmates are completely tobacco free after the ban goes into effect, meaning that they do not hoard, smuggle, or receive tobacco from officers or other inmates during the tobacco ban in or outside of the cellblock. If these and other important differences are accounted for at the individual and cellblock level, the researcher will have more confidence that any differences in assaults at the post-test between the treatment and comparison groups are related to the tobacco ban, and not some other difference between the two groups or the two cellblocks.

The addition of a comparison group helps the researcher isolate the true impact of a tobacco ban on inmate-on-inmate assaults. All factors that influence the treatment group should also influence the comparison group because the groups are made up of equivalent individuals in equivalent circumstances, with the exception of the tobacco ban. If this is the only difference, the results can be attributed to the ban. Although the addition of the comparison group in the two-group longitudinal design provides more confidence that the findings are attributable to the tobacco ban, the fact that this design lacks randomization means that alternative explanations cannot be completely ruled out, but they can be minimized. This example also suggests that the quasi-experiment in this instance may actually be preferable to an experimental design, given the realities of prison administration. For example, prison inmates are not typically randomly assigned to different cellblocks by prison officers. Moreover, it is highly unlikely that a prison would have two open cellblocks waiting for a researcher to randomly assign incoming inmates to the prison for a tobacco ban study. Therefore, it is likely there would be differences among the groups in the quasi-experiment.

Fortunately, if differences between the groups are present, the researcher can attempt to determine their potential impact before interpretation of results. The researcher can also use statistical models after the ban takes effect to determine the impact of any differences between the groups on the post-test. The two-group longitudinal quasi-experiment just discussed could take the form of an experimental design if random assignment could somehow be accomplished. The discussion nonetheless illustrates a situation in which an experimental design might be appropriate and desired for a particular research question but would not be realistic given the many barriers.

The Threat of Alternative Explanations

Alternative explanations are those factors that could explain the post-test results, other than the treatment. Throughout this chapter, we have noted the potential for alternative explanations and have given several examples of explanations other than the treatment. It is important to know that potential alternative explanations can arise in any research design discussed in this chapter. However, alternative explanations often arise because some design part is missing, for example, random assignment, a pre-test, or a control or comparison group. This is especially true in criminal justice where researchers often conduct field studies and have less control over their study conditions than do researchers who conduct experiments under highly controlled laboratory conditions. A prime example of this is the tobacco ban study, where it would be difficult for researchers to ensure that C-Block inmates, the treatment group, were completely tobacco free during the course of the study.

Alternative explanations are typically referred to as threats to internal validity. In this context, if an experiment is internally valid, it means that alternative explanations have been ruled out and the treatment is the only factor that produced the results. If a study is not internally valid, this means that alternative explanations for the results exist or potentially exist. In this section, we focus on some common alternative explanations that may arise in experimental and quasi-experimental designs. 27

Selection Bias

One of the more common alternative explanations that may occur is selection bias. Selection bias generally indicates that the treatment group (or experimental group) is somehow different from the comparison group (or control group) on a factor that could influence the post-test results. Selection bias is more often a threat in quasi-experimental designs than experimental designs due to the lack of random assignment. Suppose in our study of the prison tobacco ban, members of C-Block were substantially younger than members of D-Block, the comparison group. Such an imbalance between the groups would mean the researcher would not know if the differences in assaults are real (meaning the result of the tobacco ban) or a result of the age differential. Recall that research shows that younger inmates are more assaultive than older inmates and so we would expect more assaults among the younger offenders independent of the tobacco ban.

In a quasi-experiment, selection bias is perhaps the most prevalent type of alternative explanation and can seriously compromise results. Indeed, many of the examples above have referred to potential situations where the groups are imbalanced or not equivalent on some important factor. Although selection bias is a common threat in quasi-experimental designs because of lack of random assignment, and can be a threat in experimental designs because the groups could differ by chance alone or the practice of randomization was not maintained throughout the study (see Classics in CJ Research-MDVE above), a researcher may be able to detect such differentials. For example, the researcher could detect such differences by comparing the groups on the pre-test or other types of information before the start of the study. If differences were found, the researcher could take measures to correct them. The researcher could also use a statistical model that could account or control for differences between the groups and isolate the impact of the treatment, if any. This discussion is beyond the scope of this text but would be a potential way to deal with selection bias and estimate the impact of this bias on study results. The researcher could also, if possible, attempt to re-match the groups in a quasi-experiment or randomly assign the groups a second time in an experimental design to ensure equivalence. At the least, the researcher could recognize the group differences and discuss their potential impact on the results. Without a pre-test or other pre-study information on study participants, however, such differences might not be able to be detected and, therefore, it would be more difficult to determine how the differences, as a result of selection bias, influenced the results.
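One simple way to account for a known group difference is to compare the groups within levels of the factor on which they differ. The sketch below applies this to the age imbalance example. It is a stand-in for the unspecified statistical model mentioned above, not a description of it, and all counts are invented for illustration.

```python
# Hypothetical post-test counts (invented): (number of inmates, number of assaults).
# C-Block is mostly young inmates; D-Block is mostly older inmates.
counts = {
    "young": {"C-Block": (60, 30), "D-Block": (20, 10)},
    "older": {"C-Block": (40, 8), "D-Block": (80, 16)},
}

def rate(inmates, assaults):
    """Assaults per inmate."""
    return assaults / inmates

def raw_difference(counts):
    """C-Block minus D-Block assault rate, ignoring age."""
    totals = {}
    for by_block in counts.values():
        for block, (inmates, assaults) in by_block.items():
            n, a = totals.get(block, (0, 0))
            totals[block] = (n + inmates, a + assaults)
    return rate(*totals["C-Block"]) - rate(*totals["D-Block"])

def stratum_differences(counts):
    """C-Block minus D-Block assault rate within each age group."""
    return {
        stratum: rate(*by_block["C-Block"]) - rate(*by_block["D-Block"])
        for stratum, by_block in counts.items()
    }

print("Raw difference:", round(raw_difference(counts), 2))
print("Within-age differences:", stratum_differences(counts))
```

In this constructed case the overall assault rate is higher in C-Block, yet within each age group the two cellblocks have identical rates, so the raw gap reflects the age differential and not the treatment.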

History

Another potential alternative explanation is history. History refers to any event experienced differently by the treatment and comparison groups in the time between the pre-test and the post-test that could impact results. Suppose during the course of the tobacco ban study several riots occurred on D-Block, the comparison group. Because of the riots, prison officers “locked down” this cellblock numerous times. Because D-Block inmates were locked down at various times, this could have affected their ability to otherwise engage in inmate assaults. At the end of the study, the assaults in D-Block might have decreased from their pre-test levels because of the lockdowns, whereas in C-Block assaults may have occurred at their normal pace because there was not a lockdown, or perhaps even increased from the pre-test because tobacco was also taken away. Even if the tobacco ban had no effect and assaults remained constant in C-Block from pre- to post-test, the lockdown in D-Block might make it appear that the tobacco ban led to increased assaults in C-Block. Thus, the researcher would not know if the post-test results for the C-Block treatment group were attributable to the tobacco ban or the simple fact that D-Block inmates were locked down and their assault activity was artificially reduced. In this instance, the comparison group becomes much less useful because the lockdown created a historical factor that imbalanced the groups during the treatment phase and nullified the comparison.
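The arithmetic behind the lockdown scenario can be made concrete with a small sketch. It compares the pre-to-post change in the treatment group with the change in the comparison group, a calculation often called difference-in-differences. That label and all assault counts below are assumptions introduced for illustration and do not come from the study described.

```python
def change(pre, post):
    """Change in assaults from pre-test to post-test."""
    return post - pre

def difference_in_differences(treat_pre, treat_post, comp_pre, comp_post):
    """Treatment-group change minus comparison-group change."""
    return change(treat_pre, treat_post) - change(comp_pre, comp_post)

# Hypothetical assault counts (invented). The ban has no effect in C-Block.
c_block = {"pre": 20, "post": 20}

# Comparison group under normal conditions versus repeated lockdowns.
d_block_normal = {"pre": 20, "post": 20}
d_block_lockdown = {"pre": 20, "post": 12}

estimate_normal = difference_in_differences(
    c_block["pre"], c_block["post"], d_block_normal["pre"], d_block_normal["post"]
)
estimate_lockdown = difference_in_differences(
    c_block["pre"], c_block["post"], d_block_lockdown["pre"], d_block_lockdown["post"]
)

print("Apparent effect, no lockdown:", estimate_normal)
print("Apparent effect, with lockdown:", estimate_lockdown)
```

Assaults in C-Block never change in this example, yet the lockdown in D-Block makes the ban appear to have added eight assaults relative to the comparison group.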

Maturation

Another potential alternative explanation is maturation. Maturation refers to the natural biological, psychological, or emotional processes we all experience as time passes, such as aging, becoming more or less intelligent, becoming bored, and so on. For example, if a researcher was interested in the effect of a boot camp on recidivism for juvenile offenders, it is possible that over the course of the boot camp program the delinquents naturally matured as they aged and that this, rather than the boot camp itself, produced the reduction in recidivism. This threat is particularly applicable in situations that deal with populations that rapidly change over a relatively short period of time or when a treatment lasts a considerable period of time. However, this threat could be eliminated with a comparison group that is similar to the treatment group. This is because the maturation effects would occur in both groups and the effect of the boot camp, if any, could be isolated. This assumes, however, that the groups are matched and equivalent on factors subject to the maturation process, such as age. If not, such differentials could be an alternative explanation of results. For example, if the treatment and comparison groups differ by age, on average, this could mean that one group changes or matures at a different rate than the other group. This differential rate of change or maturation as a result of the age differential could explain the results, not the treatment. This example demonstrates how selection bias and maturation can interact at the same time as alternative explanations. This example also suggests the importance of an equivalent control or comparison group to eliminate or minimize the impact of maturation as an alternative explanation.

Attrition or Subject Mortality

Attrition or subject mortality is another typical alternative explanation. Attrition refers to differential loss in the number or type of subjects between the treatment and comparison groups and can occur in both experimental and quasi-experimental designs. Suppose we wanted to conduct a study to determine who is the better research methods professor among the authors of this textbook. Let’s assume that we have an experimental design where students were randomly assigned to professor 1, professor 2, or professor 3. By randomly assigning students to each respective professor, there is greater probability that the groups are equivalent and thus there are no differences among the three groups with one exception: the professor they receive and his or her particular teaching and delivery style. This is the treatment. Let’s also assume that the professors will be administering the same tests and using the same textbook. After the group members are randomly assigned, a pre-treatment evaluation shows the groups are in fact equivalent on all important known factors that could influence post-test scores, such as grade point average, age, time in school, and exposure to research methods concepts. Additionally, all groups scored comparably on a pre-test of knowledge about research methods, thus there is more confidence that the groups are in fact equivalent.

At the conclusion of the study, we find that professor 2’s group has the lowest final test scores of the three. However, because professor 2 is such an outstanding professor, the results appear odd. At first glance, the researcher thinks the results could have been influenced by students dropping out of the class. For example, perhaps several of professor 2’s students dropped the course but none did from the classes of professor 1 or 3. It is revealed, however, that an equal number of students dropped out of all three courses before the post-test and, therefore, this could not be the reason for the low scores in professor 2’s course. Upon further investigation, however, the researcher finds that although an equal number of students dropped out of each class, the dropouts in professor 2’s class were some of his best students. In contrast, those who dropped out of professor 1’s and professor 3’s courses were some of their poorest students. In this example, professor 2 appears to be the least effective teacher. However, this result appears to be due to the fact that his best students dropped out, and this highly influenced the final test average for his group. Although there was not a differential loss of subjects in terms of numbers (which can also be an attrition issue), there was differential loss in the types of students. This differential loss, not the teaching style, is an alternative explanation of the results.
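The effect of losing different types of students can be shown with a few lines of arithmetic. In this sketch the scores are invented, and it assumes, purely for illustration, that both professors teach equally well, so every student would earn the same final score under either one.

```python
from statistics import mean

def mean_after_dropouts(scores, dropouts):
    """Average final score among students who stayed in the class."""
    remaining = list(scores)
    for score in dropouts:
        remaining.remove(score)
    return mean(remaining)

# Hypothetical final scores the students would earn (invented values).
# Both classes start out identical, reflecting equivalent groups.
professor_1 = [55, 60, 70, 75, 80, 85, 90, 95]
professor_2 = [55, 60, 70, 75, 80, 85, 90, 95]

# Two students leave each class, but they are different types of students.
professor_1_final = mean_after_dropouts(professor_1, [55, 60])  # weakest leave
professor_2_final = mean_after_dropouts(professor_2, [90, 95])  # strongest leave

print("Average with no dropouts:", mean(professor_1))
print("Professor 1 after dropouts:", professor_1_final)
print("Professor 2 after dropouts:", round(professor_2_final, 2))
```

The same number of students drops from each class, yet professor 2's average falls below professor 1's even though teaching quality is identical by construction.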

Testing or Testing Bias

Another potential alternative explanation is testing or testing bias. Suppose that after the pre-test of research methods knowledge, professor 1 and professor 3 reviewed the test with their students and gave them the correct answers. Professor 2 did not. The fact that professor 1’s and professor 3’s groups did better on the post-test final exam may be explained by the finding that students in those groups remembered the answers to the pre-test, were thus biased at the pre-test, and this artificially inflated their post-test scores. Testing bias can explain the results because students in groups 1 and 3 may have simply remembered the answers from the pre-test review. In fact, the students in professor 1’s and 3’s courses may have scored high on the post-test without ever having been exposed to the treatment because they were biased at the pre-test.


Instrumentation

Another alternative explanation that can arise is instrumentation. Instrumentation refers to changes in the measuring instrument from pre- to post-test. Using the previous example, suppose professors 1 and 3 did not give the same final exam as professor 2. For example, professors 1 and 3 changed the final exam and professor 2 kept the final exam the same as the pre-test. Because professors 1 and 3 changed the exam, and perhaps made it easier or somehow different from the pre-test exam, results that showed lower scores for professor 2’s students may be related only to instrumentation changes from pre- to post-test. Obviously, to limit the influence of instrumentation, researchers should make sure that instruments remain consistent from pre- to post-test.

Reactivity

A final alternative explanation is reactivity. Reactivity occurs when members of the treatment or experimental group change their behavior simply as a result of being part of a study. This is akin to the finding that people tend to change their behavior when they are being watched or are aware they are being studied. If members of the experimental group know they are part of an experiment and are being studied and watched, it is possible that their behavior will change independent of the treatment. If this occurs, the researcher will not know if the behavior change is the result of the treatment, or simply a result of being part of a study. For example, suppose a researcher wants to determine if a boot camp program impacts the recidivism of delinquent offenders. Members of the experimental group are sentenced to boot camp and members of the control group are released on their own recognizance to their parents. Because members of the experimental group know they are part of the experiment, and hence being watched closely after they exit boot camp, they may artificially change their behavior and avoid trouble. Their change of behavior may be totally unrelated to boot camp and instead reflect their knowledge of being part of an experiment.

Other Potential Alternative Explanations

The above discussion provided some typical alternative explanations that may arise with the designs discussed in this chapter. There are, however, other potential alternative explanations that may arise. These alternative explanations arise only when a control or comparison group is present.

One such alternative explanation is diffusion of treatment. Diffusion of treatment occurs when the control or comparison group learns about the treatment its members are being denied and attempts to mimic the behavior of the treatment group. If the control group is successful in mimicking the experimental group, for example, the results at the end of the study may show similarity in outcomes between groups and cause the researcher to conclude that the program had no effect. In fact, however, the finding of no effect can be explained by the comparison group mimicking the treatment group. 28 In reality, there may be no effect of the treatment, but the researcher would not know this for sure because the control group effectively transformed into another experimental group, leaving no baseline of comparison. Consider a study where a researcher wants to determine the impact of a training program on class behavior and participation. In this study, the experimental group is exposed to several sessions of training on how to act appropriately in class and how to engage in class participation. The control group does not receive such training, but they are aware that they are part of an experiment. Suppose after a few class sessions the control group starts to mimic the behavior of the experimental group, acting the same way and participating in class the same way. At the conclusion of the study, the researcher might determine that the program had no impact because the comparison group, which did not receive the new program, showed similar progress.

In a related explanation, sometimes the comparison or control group learns about the experiment and attempts to compete with the experimental or treatment group. This alternative explanation is called compensatory rivalry. For example, suppose a police chief wants to determine if a new training program will increase the endurance of SWAT team officers. The chief randomly assigns SWAT members to either an experimental or control group. The experimental group will receive the new endurance training program and the control group will receive the normal program that has been used for years. During the course of the study, suppose the control group learns that the treatment group is receiving the new endurance program and starts to compete with the experimental group. Perhaps the control group runs five more miles per day and works out an extra hour in the weight room, in addition to their normal endurance program. At the end of the study, and due to the control group’s extra and competing effort, the results might show no effect of the new endurance program, and at worst, experimental group members may show a decline in endurance compared to the control group. The rivalry or competing behavior actually explains the results, not that the new endurance program has no effect or a damaging effect. Although the new endurance program may in reality have no effect, this cannot be known because of the actions of the control group, who learned about the treatment and competed with the experimental group.

Closely related to compensatory rivalry is the alternative explanation of comparison or control group demoralization. 29 In this instance, instead of competing with the experimental or treatment group, the control or comparison group simply gives up and changes their normal behavior. Using the SWAT example, perhaps the control group simply quits their normal endurance program when they learn about the treatment group receiving the new endurance program. At the post-test, their endurance will likely drop considerably compared to the treatment group. Because of this, the new endurance program might emerge as a shining success. In reality, however, the researcher will not know if any differences in endurance between the experimental and control groups are a result of the new endurance program or the control group giving up. Because the control group gave up, there is no longer a comparison group of equivalent others, and the change in endurance among the treatment group members could be attributed to a number of alternative explanations, for example, maturation. If the comparison group behaves normally, the researcher will be able to exclude maturation as a potential explanation. This is because any maturation effects will occur in both groups.

The previous discussion suggests that when the control or comparison group learns about the experiment and the treatment they are denied, potential alternative explanations can arise. Perhaps the best remedy to protect from the alternative explanations just discussed is to make sure the treatment and comparison groups do not have contact with one another. In laboratory experiments this can be ensured, but sometimes this is a problem in criminal justice studies, which are often conducted in the field.

The previous discussion also suggests that there are numerous alternative explanations that can impact the interpretation of results from a study. A careful researcher would know that alternative explanations must be ruled out before reaching a definitive conclusion about the impact of a particular program. The researcher must be attuned to these potential alternative explanations because they can influence results and how results are interpreted. Moreover, the discussion shows that several alternative explanations can occur at the same time. For example, it is possible that selection bias, maturation, attrition, and compensatory rivalry all emerge as alternative explanations in the same study. Knowing about these potential alternative explanations and how they can impact the results of a study is what distinguishes a consumer of research from an educated consumer of research.

Chapter Summary

The primary focus of this chapter was the classic experimental design, the foundation for other types of experimental and quasi-experimental designs. The classic experimental design is perhaps the most useful design when exploring causal relationships. Often, however, researchers cannot employ the classic experimental design to answer a research question. In fact, the classic experimental design is rare in criminal justice and criminology because it is often difficult to ensure random assignment for a variety of reasons. In circumstances where an experimental design is appropriate but not feasible, researchers may turn to one of many quasi-experimental designs. The most important difference between the two is that quasi-experimental designs do not feature random assignment. This can create potential problems for researchers. The main problem is that there is a greater chance the treatment and comparison groups may differ on important characteristics that could influence the results of a study. Although researchers can attempt to prevent imbalances between the groups by matching them on important known characteristics, it is still much more difficult to establish equivalence than it is in the classic experiment. As such, it becomes more difficult to determine what impact a treatment had, if any, as one moves from an experimental to a quasi-experimental design.

Perhaps the most important lesson to be learned in this chapter is that to be an educated consumer of research results requires an understanding of the type of design that produced the results. There are numerous ways experimental and quasi-experimental designs can be structured. This is why much attention was paid to the classic experimental design. In reality, all experimental and quasi-experimental designs are variations of the classic experiment in some way, adding or deleting certain components. If the components and organization and logic of the classic experimental design are understood, consumers of research will have a better understanding of the results produced from any sort of research design. For example, what problems in interpretation arise when a design lacks a pre-test, a control group, or random assignment? Having an answer to this question is a good start toward being an informed consumer of research results produced through experimental and quasi-experimental designs.

Critical Thinking Questions

1. Why is randomization/random assignment preferable to matching? Provide several reasons with explanation.

2. What are some potential reasons a researcher would not be able to utilize random assignment?

3. What is a major limitation of matching?

4. What is the difference between a longitudinal study and a cross-sectional study?

5. Describe a hypothetical study where maturation, and not the treatment, could explain the outcomes of the research.

association (or covariance or correlation): One of three conditions that must be met for establishing cause and effect, or a causal relationship. Association refers to the condition that X and Y must be related for a causal relationship to exist. Association is also referred to as covariance or correlation. Although two variables may be associated (or covary or be correlated), this does not automatically imply that they are causally related

attrition or subject mortality: A threat to internal validity, it refers to the differential loss of subjects between the experimental (treatment) and control (comparison) groups during the course of a study

cause and effect relationship: A cause and effect relationship occurs when one variable causes another, and no other explanation for that relationship exists

classic experimental design or experimental design: A design in a research study that features random assignment to an experimental or control group. Experimental designs can vary tremendously, but a constant feature is random assignment, experimental and control groups, and a post-test. For example, a classic experimental design features random assignment, a treatment, experimental and control groups, and pre- and post-tests

comparison group: The group in a quasi-experimental design that does not receive the treatment. In an experimental design, the comparison group is referred to as the control group

compensatory rivalry: A threat to internal validity, it occurs when the control or comparison group attempts to compete with the experimental or treatment group

control group: In an experimental design, the control group does not receive the treatment. The control group serves as a baseline of comparison to the experimental group. It serves as an example of what happens when a group equivalent to the experimental group does not receive the treatment

cross-sectional designs: A measurement of the pre-test and post-test at one point in time (e.g., six months before and six months after the program)

demoralization: A threat to internal validity closely associated with compensatory rivalry, it occurs when the control or comparison group gives up and changes their normal behavior. While in compensatory rivalry the group members compete, in demoralization, they simply quit. Both are not normal behavioral reactions

dependent variable: Also known as the outcome in a research study. A post-test is a measure of the dependent variable

diffusion of treatment: A threat to internal validity, it occurs when the control or comparison group members learn that they are not getting the treatment and attempt to mimic the behavior of the experimental or treatment group. This mimicking may make it seem as if the treatment is having no effect, when in fact it may be

elimination of alternative explanations: One of three conditions that must be met for establishing cause and effect. Elimination of alternative explanations means that the researcher has ruled out other explanations for an observed relationship between X and Y

experimental group: In an experimental design, the experimental group receives the treatment

history: A threat to internal validity, it refers to any event experienced differently by the treatment and comparison groups, that is, an event that could explain the results other than the supposed cause

independent variable: Also called the cause

instrumentation: A threat to internal validity, it refers to changes in the measuring instrument from pre- to post-test

longitudinal: Refers to repeated measurements of the pre-test and post-test over time, typically for the same group of individuals. This is the opposite of cross-sectional

matching: A process sometimes utilized in some quasi-experimental designs that feature treatment and comparison groups. Matching is a process whereby the researcher attempts to ensure equivalence between the treatment and comparison groups on known information, in the absence of the ability to randomly assign the groups

maturation: A threat to internal validity, maturation refers to the natural biological, psychological, or emotional processes as time passes

negative association: Refers to a negative association between two variables. A negative association is demonstrated when X increases and Y decreases, or X decreases and Y increases. Also known as an inverse relationship, with the variables moving in opposite directions

operationalized or operationalization: Refers to the process of assigning a working definition to a concept. For example, the concept of intelligence can be operationalized or defined as grade point average or score on a standardized exam, among others

pilot program or test: Refers to a smaller test study or pilot to work out problems before a larger study and to anticipate changes needed for a larger study. Similar to a test run

positive association: Refers to a positive association between two variables. A positive association means as X increases, Y increases, or as X decreases, Y decreases

post-test: The post-test is a measure of the dependent variable after the treatment has been administered

pre-test: The pre-test is a measure of the dependent variable or outcome before a treatment is administered

quasi-experiment: A quasi-experiment refers to any number of research design configurations that resemble an experimental design but primarily lack random assignment. In the absence of random assignment, quasi-experimental designs feature matching to attempt equivalence

random assignment: Refers to a process whereby members of the experimental group and control group are assigned to each group through a random and unbiased process

random selection: Refers to selecting a smaller but representative subset from a population. Not to be confused with random assignment

reactivity: A threat to internal validity, it occurs when members of the experimental (treatment) or control (comparison) group change their behavior unnaturally as a result of being part of a study

selection bias: A threat to internal validity, selection bias occurs when the experimental (treatment) group and control (comparison) group are not equivalent. The difference between the groups can be a threat to internal validity, or, an alternative explanation to the findings

spurious: A spurious relationship is one where X and Y appear to be causally related, but in fact the relationship is actually explained by a variable or factor other than X

testing or testing bias: A threat to internal validity, it refers to the potential of study members being biased prior to a treatment, and this bias, rather than the treatment, may explain study results

threat to internal validity: Also known as alternative explanation to a relationship between X and Y. Threats to internal validity are factors that explain Y, or the dependent variable, and are not X, or the independent variable

timing: One of three conditions that must be met for establishing cause and effect. Timing refers to the condition that X must come before Y in time for X to be a cause of Y. While timing is necessary for a causal relationship, it is not sufficient, and considerations of association and eliminating other alternative explanations must be met

treatment: A component of a research design, it is typically denoted by the letter X. In a research study on the impact of teen court on juvenile recidivism, teen court is the treatment. In a classic experimental design, the treatment is given only to the experimental group, not the control group

treatment group: The group in a quasi-experimental design that receives the treatment. In an experimental design, this group is called the experimental group

unit of analysis: Refers to the focus of a research study as being individuals, groups, or other units of analysis, such as prisons or police agencies, and so on

variable(s): A variable is a concept that has been given a working definition and can take on different values. For example, intelligence can be defined as a person’s grade point average and can range from low to high or can be defined numerically by different values such as 3.5 or 4.0

1 Povitsky, W., N. Connell, D. Wilson, & D. Gottfredson. (2008). “An experimental evaluation of teen courts.” Journal of Experimental Criminology, 4, 137–163.

2 Hirschi, T., and H. Selvin (1966). “False criteria of causality in delinquency.” Social Problems, 13, 254–268.

3 Robert Roy Britt, “Churchgoers Live Longer.” April 3, 2006. http://www.livescience.com/health/060403_church_good.html. Retrieved on September 30, 2008.

4 Kalist, D., and D. Yee (2009). “First names and crime: Does unpopularity spell trouble?” Social Science Quarterly, 90 (1), 39–48.

5 Sherman, L. (1992). Policing domestic violence. New York: The Free Press.

6 For historical and interesting reading on the effects of weather on crime and other disorder, see Dexter, E. (1899). “Influence of weather upon crime.” Popular Science Monthly, 55, 653–660 in Horton, D. (2000). Pioneering Perspectives in Criminology. Incline Village, NV: Copperhouse.

7 http://www.escapistmagazine.com/news/view/111191-Less-Crime-in-U-S-Thanks-to-Videogames , retrieved on September 13, 2011. This news article was in response to a study titled “Understanding the effects of violent videogames on violent crime.” See Cunningham, Scott, Engelstätter, Benjamin, and Ward, (April 7, 2011). Available at SSRN: http://ssrn.com/abstract=1804959.

8 Cohn, E. G. (1987). “Changing the domestic violence policies of urban police departments: Impact of the Minneapolis experiment.” Response, 10 (4), 22–24.

9 Schmidt, Janell D., & Lawrence W. Sherman (1993). “Does arrest deter domestic violence?” American Behavioral Scientist, 36 (5), 601–610.

10 Maxwell, Christopher D., Joel H. Garner, & Jeffrey A. Fagan. (2001). The effects of arrest on intimate partner violence: New evidence for the spouse assault replication program. Washington D.C.: National Institute of Justice.

11 Miller, N. (2005). What does research and evaluation say about domestic violence laws? A compendium of justice system laws and related research assessments. Alexandria, VA: Institute for Law and Justice.

12 The sections on experimental and quasi-experimental designs rely heavily on the seminal work of Campbell and Stanley (Campbell, D.T., & J. C. Stanley. (1963). Experimental and quasi-experimental designs for research. Chicago: RandMcNally) and more recently, Shadish, W., T. Cook, & D. Campbell. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin.

13 Povitsky et al. (2008). p. 146, note 9.

14 Shadish, W., T. Cook, & D. Campbell. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York: Houghton Mifflin Company.

15 Ibid, 15.

16 Finckenauer, James O. (1982). Scared straight! and the panacea phenomenon. Englewood Cliffs, N.J.: Prentice Hall.

17 Yarborough, J.C. (1979). Evaluation of JOLT (Juvenile Offenders Learn Truth) as a deterrence program. Lansing, MI: Michigan Department of Corrections.

18 Petrosino, Anthony, Carolyn Turpin-Petrosino, & James O. Finckenauer. (2000). “Well-meaning programs can have harmful effects! Lessons from experiments of programs such as Scared Straight.” Crime and Delinquency, 46, 354–379.

19 “Swearing makes pain more tolerable” retrieved at http://www.livescience.com/health/090712-swearing-pain.html (July 13, 2009). Also see “Bleep! My finger! Why swearing helps ease pain” by Tiffany Sharples, retrieved at http://www.time.com/time/health/article/0,8599,1910691,00.html?xid=rss-health (July 16, 2009).

20 For an excellent discussion of the value of controlled experiments and why they are so rare in the social sciences, see Sherman, L. (1992). Policing domestic violence. New York: The Free Press, 55–74.

21 For discussion, see Weisburd, D., T. Einat, & M. Kowalski. (2008). “The miracle of the cells: An experimental study of interventions to increase payment of court-ordered financial obligations.” Criminology and Public Policy, 7, 9–36.

22 Shadish, Cook, & Campbell. (2002).

24 Kelly, Cathy. (March 15, 2009). “Tickets in the mail: Red-light cameras questioned.” Santa Cruz Sentinel.

25 Retting, Richard, Susan Ferguson, & Charles Farmer. (January 2007). “Reducing red light running through longer yellow signal timing and red light camera enforcement: Results of a field investigation.” Arlington, VA: Insurance Institute for Highway Safety.

26 Shadish, Cook, & Campbell. (2002).

27 See Shadish, Cook, & Campbell. (2002), pp. 54–61 for an excellent discussion of threats to internal validity. Also see Chapter 2 for an extended discussion of all forms of validity considered in research design.

28 Trochim, W. (2001). The research methods knowledge base, 2nd ed. Cincinnati, OH: Atomic Dog.

Applied Research Methods in Criminal Justice and Criminology Copyright © 2022 by University of North Texas is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License , except where otherwise noted.

J Am Med Inform Assoc, v.13(1); Jan-Feb 2006

The Use and Interpretation of Quasi-Experimental Studies in Medical Informatics

Quasi-experimental study designs, often described as nonrandomized, pre-post intervention studies, are common in the medical informatics literature. Yet little has been written about the benefits and limitations of the quasi-experimental approach as applied to informatics studies. This paper outlines a relative hierarchy and nomenclature of quasi-experimental study designs that is applicable to medical informatics intervention studies. In addition, the authors performed a systematic review of two medical informatics journals, the Journal of the American Medical Informatics Association (JAMIA) and the International Journal of Medical Informatics (IJMI), to determine the number of quasi-experimental studies published and how the studies are classified on the above-mentioned relative hierarchy. They hope that future medical informatics studies will implement higher level quasi-experimental study designs that yield more convincing evidence for causal links between medical informatics interventions and outcomes.

Quasi-experimental studies encompass a broad range of nonrandomized intervention studies. These designs are frequently used when it is not logistically feasible or ethical to conduct a randomized controlled trial. Examples of quasi-experimental studies follow. As one example of a quasi-experimental study, a hospital introduces a new order-entry system and wishes to study the impact of this intervention on the number of medication-related adverse events before and after the intervention. As another example, an informatics technology group is introducing a pharmacy order-entry system aimed at decreasing pharmacy costs. The intervention is implemented and pharmacy costs before and after the intervention are measured.

In medical informatics, the quasi-experimental, sometimes called the pre-post intervention, design often is used to evaluate the benefits of specific interventions. The increasing capacity of health care institutions to collect routine clinical data has led to the growing use of quasi-experimental study designs in the field of medical informatics as well as in other medical disciplines. However, little is written about these study designs in the medical literature or in traditional epidemiology textbooks. 1 , 2 , 3 In contrast, the social sciences literature is replete with examples of ways to implement and improve quasi-experimental studies. 4 , 5 , 6

In this paper, we review the different pretest-posttest quasi-experimental study designs, their nomenclature, and the relative hierarchy of these designs with respect to their ability to establish causal associations between an intervention and an outcome. The example of a pharmacy order-entry system aimed at decreasing pharmacy costs will be used throughout this article to illustrate the different quasi-experimental designs. We discuss limitations of quasi-experimental designs and offer methods to improve them. We also perform a systematic review of four years of publications from two informatics journals to determine the number of quasi-experimental studies, classify these studies into their application domains, determine whether the potential limitations of quasi-experimental studies were acknowledged by the authors, and place these studies into the above-mentioned relative hierarchy.

The authors reviewed articles and book chapters on the design of quasi-experimental studies. 4 , 5 , 6 , 7 , 8 , 9 , 10 Most of the reviewed articles referenced two textbooks that were then reviewed in depth. 4 , 6

Key advantages and disadvantages of quasi-experimental studies, as they pertain to the study of medical informatics, were identified. The potential methodological flaws of quasi-experimental medical informatics studies, which have the potential to introduce bias, were also identified. In addition, a summary table outlining a relative hierarchy and nomenclature of quasi-experimental study designs is described. In general, the higher the design is in the hierarchy, the greater the internal validity that the study traditionally possesses because the evidence of the potential causation between the intervention and the outcome is strengthened. 4

We then performed a systematic review of four years of publications from two informatics journals. First, we determined the number of quasi-experimental studies. We then classified these studies on the above-mentioned hierarchy. We also classified the quasi-experimental studies according to their application domain. The categories of application domains employed were based on categorization used by Yearbooks of Medical Informatics 1992–2005 and were similar to the categories of application domains employed by Annual Symposiums of the American Medical Informatics Association. 11 The categories were (1) health and clinical management; (2) patient records; (3) health information systems; (4) medical signal processing and biomedical imaging; (5) decision support, knowledge representation, and management; (6) education and consumer informatics; and (7) bioinformatics. Because the quasi-experimental study design has recognized limitations, we sought to determine whether authors acknowledged the potential limitations of this design. Examples of acknowledgment included mention of lack of randomization, the potential for regression to the mean, the presence of temporal confounders and the mention of another design that would have more internal validity.

All original scientific manuscripts published between January 2000 and December 2003 in the Journal of the American Medical Informatics Association (JAMIA) and the International Journal of Medical Informatics (IJMI) were reviewed. One author (ADH) reviewed all the papers to identify the number of quasi-experimental studies. Other authors (ADH, JCM, JF) then independently reviewed all the studies identified as quasi-experimental. The three authors then convened as a group to resolve any disagreements in study classification, application domain, and acknowledgment of limitations.

Results and Discussion

What Is a Quasi-experiment?

Quasi-experiments are studies that aim to evaluate interventions but that do not use randomization. Similar to randomized trials, quasi-experiments aim to demonstrate causality between an intervention and an outcome. Quasi-experimental studies can use both preintervention and postintervention measurements as well as nonrandomly selected control groups.

Using this basic definition, it is evident that many published studies in medical informatics utilize the quasi-experimental design. Although the randomized controlled trial is generally considered to have the highest level of credibility with regard to assessing causality, in medical informatics, researchers often choose not to randomize the intervention for one or more reasons: (1) ethical considerations, (2) difficulty of randomizing subjects, (3) difficulty of randomizing by location (e.g., by ward), and (4) small available sample size. Each of these reasons is discussed below.

Ethical considerations typically will not allow random withholding of an intervention with known efficacy. Thus, if the efficacy of an intervention has not been established, a randomized controlled trial is the design of choice to determine efficacy. But if the intervention under study incorporates an accepted, well-established therapeutic intervention, or if the intervention has either questionable efficacy or safety based on previously conducted studies, then the ethical issues of randomizing patients are sometimes raised. In the area of medical informatics, it is often believed prior to an implementation that an informatics intervention will likely be beneficial and thus medical informaticians and hospital administrators are often reluctant to randomize medical informatics interventions. In addition, there is often pressure to implement the intervention quickly because of its believed efficacy, thus not allowing researchers sufficient time to plan a randomized trial.

For medical informatics interventions, it is often difficult to randomize the intervention to individual patients or to individual informatics users. So while this randomization is technically possible, it is underused and thus compromises the eventual strength of concluding that an informatics intervention resulted in an outcome. For example, randomly allowing only half of medical residents to use pharmacy order-entry software at a tertiary care hospital is a scenario that hospital administrators and informatics users may not agree to for numerous reasons.

Similarly, informatics interventions often cannot be randomized to individual locations. Using the pharmacy order-entry system example, it may be difficult to randomize use of the system to only certain locations in a hospital or portions of certain locations. For example, if the pharmacy order-entry system involves an educational component, then people may apply the knowledge learned to nonintervention wards, thereby potentially masking the true effect of the intervention. When a design using randomized locations is employed successfully, the locations may be different in other respects (confounding variables), and this further complicates the analysis and interpretation.

In situations where it is known that only a small sample size will be available to test the efficacy of an intervention, randomization may not be a viable option. Randomization is beneficial because on average it tends to evenly distribute both known and unknown confounding variables between the intervention and control group. However, when the sample size is small, randomization may not adequately accomplish this balance. Thus, alternative design and analytical methods are often used in place of randomization when only small sample sizes are available.
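The point about small samples can be illustrated with a short simulation. The sketch below is not from the original study; it assumes a hypothetical binary confounder (for example, "severely ill") present in half of all subjects, randomly splits subjects into two arms, and reports the average absolute difference in the confounder's prevalence between arms. The function name, the 0.5 prevalence, and the trial counts are illustrative assumptions.

```python
import random
import statistics


def mean_imbalance(n_subjects, n_trials=500, seed=0):
    """Average absolute gap in confounder prevalence between two randomized arms."""
    rng = random.Random(seed)
    half = n_subjects // 2
    gaps = []
    for _ in range(n_trials):
        # Hypothetical confounder: each subject is "severely ill" with probability 0.5.
        severe = [rng.random() < 0.5 for _ in range(n_subjects)]
        # Random assignment: shuffle, then the first half is the intervention arm.
        rng.shuffle(severe)
        p_intervention = sum(severe[:half]) / half
        p_control = sum(severe[half:]) / (n_subjects - half)
        gaps.append(abs(p_intervention - p_control))
    return statistics.mean(gaps)


small = mean_imbalance(10)
large = mean_imbalance(1000)
print(f"average imbalance with n=10:   {small:.3f}")
print(f"average imbalance with n=1000: {large:.3f}")
```

Under these assumptions the two arms of a 10-subject study typically differ far more in the confounder than the arms of a 1,000-subject study, which is why randomization alone may not balance confounders when few subjects are available.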

What Are the Threats to Establishing Causality When Using Quasi-experimental Designs in Medical Informatics?

The lack of random assignment is the major weakness of the quasi-experimental study design. Associations identified in quasi-experiments meet one important requirement of causality since the intervention precedes the measurement of the outcome. Another requirement is that the outcome can be demonstrated to vary statistically with the intervention. Unfortunately, statistical association does not imply causality, especially if the study is poorly designed. Thus, in many quasi-experiments, one is most often left with the question: “Are there alternative explanations for the apparent causal association?” If these alternative explanations are credible, then the evidence of causation is less convincing. These rival hypotheses, or alternative explanations, arise from principles of epidemiologic study design.

Shadish et al. 4 describe nine threats to internal validity, which are listed in the table “Threats to Internal Validity.” Internal validity is defined as the degree to which observed changes in outcomes can be correctly inferred to be caused by an exposure or an intervention. In quasi-experimental studies of medical informatics, we believe that the methodological principles that most often result in alternative explanations for the apparent causal effect are (a) difficulty in measuring or controlling for important confounding variables, particularly unmeasured confounding variables, which can be viewed as a subset of the selection threat in that table, and (b) results being explained by the statistical principle of regression to the mean. Each of these two principles is discussed in turn.

Threats to Internal Validity

1. Ambiguous temporal precedence: Lack of clarity about whether intervention occurred before outcome
2. Selection: Systematic differences over conditions in respondent characteristics that could also cause the observed effect
3. History: Events occurring concurrently with intervention could cause the observed effect
4. Maturation: Naturally occurring changes over time could be confused with a treatment effect
5. Regression: When units are selected for their extreme scores, they will often have less extreme subsequent scores, an occurrence that can be confused with an intervention effect
6. Attrition: Loss of respondents can produce artifactual effects if that loss is correlated with intervention
7. Testing: Exposure to a test can affect scores on subsequent exposures to that test
8. Instrumentation: The nature of a measurement may change over time or conditions
9. Interactive effects: The impact of an intervention may depend on the level of another intervention

Adapted from Shadish et al. 4

An inability to sufficiently control for important confounding variables arises from the lack of randomization. A variable is a confounding variable if it is associated with the exposure of interest and is also associated with the outcome of interest; the confounding variable leads to a situation where an apparent causal association between a given exposure and an outcome is observed as a result of the influence of the confounding variable. For example, in a study aiming to demonstrate that the introduction of a pharmacy order-entry system led to lower pharmacy costs, there are a number of important potential confounding variables (e.g., severity of illness of the patients, knowledge and experience of the software users, other changes in hospital policy) that may have differed in the preintervention and postintervention time periods (see the figure “Example of confounding”). In a multivariable regression, the first confounding variable could be addressed with severity of illness measures, but the second confounding variable would be difficult if not nearly impossible to measure and control. In addition, potential confounding variables that are unmeasured or immeasurable cannot be controlled for in nonrandomized quasi-experimental study designs and can only be properly controlled by the randomization process in randomized controlled trials.

Example of confounding. To get the true effect of the intervention of interest, we need to control for the confounding variable.
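A small simulation can make the confounding example concrete. The sketch below is illustrative only: it assumes hypothetical costs that depend solely on severity of illness (the order-entry system has no true effect) and a milder case mix in the postintervention period. All numbers and function names are assumptions, not data from the paper.

```python
import random
import statistics

rng = random.Random(1)


def simulate_period(n_patients, p_severe):
    """Return (severe, cost) pairs; cost depends on severity only, not on the intervention."""
    patients = []
    for _ in range(n_patients):
        severe = rng.random() < p_severe
        cost = rng.gauss(2000 if severe else 1000, 100)  # hypothetical dollars
        patients.append((severe, cost))
    return patients


def mean_cost(patients, severe=None):
    """Mean cost overall (severe=None) or within one severity stratum."""
    return statistics.mean(c for s, c in patients if severe is None or s == severe)


pre = simulate_period(5000, p_severe=0.6)   # before order entry: sicker case mix
post = simulate_period(5000, p_severe=0.3)  # after order entry: milder case mix

# Crude pre-post comparison, ignoring the confounder.
naive_effect = mean_cost(post) - mean_cost(pre)

# Comparison within each severity stratum, which controls for the confounder.
stratified_effects = {s: mean_cost(post, s) - mean_cost(pre, s) for s in (True, False)}

print(f"naive pre-post difference: {naive_effect:.1f}")
print(f"difference among severe patients: {stratified_effects[True]:.1f}")
print(f"difference among non-severe patients: {stratified_effects[False]:.1f}")
```

In this constructed scenario the crude comparison shows a large apparent cost reduction, while the within-stratum differences are close to zero. Stratification works here only because the confounder was measured; an unmeasured confounder could not be handled this way.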

Another important threat to establishing causality is regression to the mean. 12 , 13 , 14 This widespread statistical phenomenon can result in wrongly concluding that an effect is due to the intervention when in reality it is due to chance. The phenomenon was first described in 1886 by Francis Galton who measured the adult height of children and their parents. He noted that when the average height of the parents was greater than the mean of the population, the children tended to be shorter than their parents, and conversely, when the average height of the parents was shorter than the population mean, the children tended to be taller than their parents.

In medical informatics, what often triggers the development and implementation of an intervention is a rise in some rate (e.g., of costs or adverse events) above its mean or norm. For example, increasing pharmacy costs and adverse events may prompt hospital informatics personnel to design and implement pharmacy order-entry systems. If this rise in costs or adverse events is really just an extreme observation that is still within the normal range of the hospital's pharmaceutical costs (i.e., the mean pharmaceutical cost for the hospital has not shifted), then the statistical principle of regression to the mean predicts that these elevated rates will tend to decline even without intervention. However, often informatics personnel and hospital administrators cannot wait passively for this decline to occur. Therefore, hospital personnel often implement one or more interventions, and if a decline in the rate occurs, they may mistakenly conclude that the decline is causally related to the intervention. In fact, an alternative explanation for the finding could be regression to the mean.
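Regression to the mean can be demonstrated with a brief simulation. The sketch below assumes a hypothetical hospital whose monthly pharmacy costs fluctuate randomly around a fixed mean, with no intervention at all. It selects the months that would have "triggered" action (costs well above the mean) and looks at the month that follows each one. The mean, noise level, and threshold are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(42)

TRUE_MEAN = 100.0  # hypothetical mean monthly cost (thousands of dollars)
NOISE_SD = 10.0    # hypothetical month-to-month random variation

# A stable process: every month is an independent draw around the same mean.
costs = [rng.gauss(TRUE_MEAN, NOISE_SD) for _ in range(20000)]

# "Trigger" months: costs at least 1.5 standard deviations above the mean.
threshold = TRUE_MEAN + 1.5 * NOISE_SD
pairs = [(costs[i], costs[i + 1]) for i in range(len(costs) - 1) if costs[i] > threshold]

mean_trigger = statistics.mean(p[0] for p in pairs)
mean_next = statistics.mean(p[1] for p in pairs)

print(f"mean cost in trigger months:    {mean_trigger:.1f}")
print(f"mean cost in the month after:   {mean_next:.1f}")
```

Because nothing changes in this simulated hospital, the drop from the trigger month to the following month is produced entirely by selecting extreme observations. An intervention introduced in the trigger month would appear to be followed by a decline even if it had no effect.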

What Are the Different Quasi-experimental Study Designs?

In the social sciences literature, quasi-experimental studies are divided into four study design groups 4 , 6 :

  • A. Quasi-experimental designs without control groups
  • B. Quasi-experimental designs that use control groups but no pretest
  • C. Quasi-experimental designs that use control groups and pretests
  • D. Interrupted time-series designs

There is a relative hierarchy within these categories of study designs, with category D studies being sounder than categories C, B, or A in terms of establishing causality. Thus, if feasible from a design and implementation point of view, investigators should aim to design studies that fall into the higher rated categories. Shadish et al. 4 discuss 17 possible designs, with seven designs falling into category A, three designs in category B, six designs in category C, and one major design in category D. In our review, we determined that most medical informatics quasi-experiments could be characterized by 11 of the 17 designs (six in category A, one in category B, three in category C, and one in category D) because the other study designs were not used or feasible in the medical informatics literature. Thus, for simplicity, we have summarized the 11 study designs most relevant to medical informatics research in the table “Relative Hierarchy of Quasi-experimental Designs.”

Relative Hierarchy of Quasi-experimental Designs

| Quasi-experimental Study Designs | Design Notation |
|---|---|
| A. Quasi-experimental designs without control groups | |
| 1. The one-group posttest-only design | X O1 |
| 2. The one-group pretest-posttest design | O1 X O2 |
| 3. The one-group pretest-posttest design using a double pretest | O1 O2 X O3 |
| 4. The one-group pretest-posttest design using a nonequivalent dependent variable | (O1a, O1b) X (O2a, O2b) |
| 5. The removed-treatment design | O1 X O2 O3 removeX O4 |
| 6. The repeated-treatment design | O1 X O2 removeX O3 X O4 |
| B. Quasi-experimental designs that use a control group but no pretest | |
| 1. Posttest-only design with nonequivalent groups | Intervention group: X O1; Control group: O2 |
| C. Quasi-experimental designs that use control groups and pretests | |
| 1. Untreated control group with dependent pretest and posttest samples | Intervention group: O1a X O2a; Control group: O1b O2b |
| 2. Untreated control group design with dependent pretest and posttest samples using a double pretest | Intervention group: O1a O2a X O3a; Control group: O1b O2b O3b |
| 3. Untreated control group design with dependent pretest and posttest samples using switching replications | Intervention group: O1a X O2a O3a; Control group: O1b O2b X O3b |
| D. Interrupted time-series design | |
| 1. Multiple pretest and posttest observations spaced at equal intervals of time | O1 O2 O3 O4 O5 X O6 O7 O8 O9 O10 |

O = Observational Measurement; X = Intervention Under Study. Time moves from left to right.

The nomenclature and relative hierarchy were used in the systematic review of four years of JAMIA and the IJMI. Similar to the relative hierarchy that exists in the evidence-based literature that assigns a hierarchy to randomized controlled trials, cohort studies, case-control studies, and case series, the hierarchy in the table “Relative Hierarchy of Quasi-experimental Designs” is not absolute in that in some cases, it may be infeasible to perform a higher level study. For example, there may be instances where an A6 design establishes stronger causality than a B1 design. 15 , 16 , 17

Quasi-experimental Designs without Control Groups

The One-Group Posttest-Only Design

X O1

Here, X is the intervention and O is the outcome variable (this notation is continued throughout the article). In this study design, an intervention (X) is implemented and a posttest observation (O1) is taken. For example, X could be the introduction of a pharmacy order-entry intervention and O1 could be the pharmacy costs following the intervention. This design is the weakest of the quasi-experimental designs that are discussed in this article. Without any pretest observations or a control group, there are multiple threats to internal validity. Unfortunately, this study design is often used in medical informatics when new software is introduced since it may be difficult to have pretest measurements due to time, technical, or cost constraints.

The One-Group Pretest-Posttest Design

O1 X O2

This is a commonly used study design. A single pretest measurement is taken (O1), an intervention (X) is implemented, and a posttest measurement is taken (O2). In this instance, period O1 frequently serves as the “control” period. For example, O1 could be pharmacy costs prior to the intervention, X could be the introduction of a pharmacy order-entry system, and O2 could be the pharmacy costs following the intervention. Including a pretest provides some information about what the pharmacy costs would have been had the intervention not occurred.

The One-Group Pretest-Posttest Design Using a Double Pretest

O1 O2 X O3

The advantage of this study design over A2 is that adding a second pretest prior to the intervention helps provide evidence that can be used to refute the phenomenon of regression to the mean and confounding as alternative explanations for any observed association between the intervention and the posttest outcome. For example, in a study where a pharmacy order-entry system led to lower pharmacy costs (O3 lower than both O1 and O2), if one had two preintervention measurements of pharmacy costs (O1 and O2) and they were both elevated, this would suggest that there was a decreased likelihood that O3 is lower due to confounding and regression to the mean. Similarly, extending this study design by increasing the number of measurements postintervention could also help to provide evidence against confounding and regression to the mean as alternative explanations for observed associations.

The One-Group Pretest-Posttest Design Using a Nonequivalent Dependent Variable

(O1a, O1b) X (O2a, O2b)

This design involves the inclusion of a nonequivalent dependent variable ( b ) in addition to the primary dependent variable ( a ). Variables a and b should assess similar constructs; that is, the two measures should be affected by similar factors and confounding variables except for the effect of the intervention. Variable a is expected to change because of the intervention X, whereas variable b is not. Taking our example, variable a could be pharmacy costs and variable b could be the length of stay of patients. If our informatics intervention is aimed at decreasing pharmacy costs, we would expect to observe a decrease in pharmacy costs but not in the average length of stay of patients. However, a number of important confounding variables, such as severity of illness and knowledge of software users, might affect both outcome measures. Thus, if the average length of stay did not change following the intervention but pharmacy costs did, then the data are more convincing than if just pharmacy costs were measured.

The Removed-Treatment Design

O1 X O2 O3 removeX O4

This design adds a second posttest measurement (O3) to the one-group pretest-posttest design and then removes the intervention before a final measure (O4) is made. The advantage of this design is that it allows one to test hypotheses about the outcome in the presence of the intervention and in the absence of the intervention. Thus, if one predicts a decrease in the outcome between O1 and O2 (after implementation of the intervention), then one would predict an increase in the outcome between O3 and O4 (after removal of the intervention). One caveat is that if the intervention is thought to have persistent effects, then O4 needs to be measured after these effects are likely to have disappeared. For example, a study would be more convincing if it demonstrated that pharmacy costs decreased after pharmacy order-entry system introduction (O2 and O3 less than O1) and that when the order-entry system was removed or disabled, the costs increased (O4 greater than O2 and O3 and closer to O1). In addition, there are often ethical issues in this design in terms of removing an intervention that may be providing benefit.

The Repeated-Treatment Design

O1 X O2 removeX O3 X O4

The advantage of this design is that it demonstrates reproducibility of the association between the intervention and the outcome. For example, the association is more likely to be causal if one demonstrates that a pharmacy order-entry system results in decreased pharmacy costs when it is first introduced and again when it is reintroduced following an interruption of the intervention. As with design A5, the assumption must be made that the effect of the intervention is transient, which is most often applicable to medical informatics interventions. Because subjects may serve as their own controls in this design, it may yield greater statistical efficiency with fewer subjects.

Quasi-experimental Designs That Use a Control Group but No Pretest

Posttest-Only Design with Nonequivalent Groups

Intervention group: X O1
Control group: O2

An intervention X is implemented for one group and compared to a second group. The use of a comparison group helps guard against certain threats to validity and makes it possible to adjust statistically for confounding variables. Because the two groups in this study design may not be equivalent (assignment to the groups is not by randomization), confounding may exist. For example, suppose that a pharmacy order-entry intervention was instituted in the medical intensive care unit (MICU) and not the surgical intensive care unit (SICU). O1 would be pharmacy costs in the MICU after the intervention and O2 would be pharmacy costs in the SICU after the intervention. The absence of a pretest makes it difficult to know whether a change has occurred in the MICU. Also, the absence of pretest measurements comparing the SICU to the MICU makes it difficult to know whether differences in O1 and O2 are due to the intervention or due to other differences in the two units (confounding variables).

Quasi-experimental Designs That Use Control Groups and Pretests

The reader should note that with all the studies in this category, the intervention is not randomized. The control groups chosen are comparison groups. Obtaining pretest measurements on both the intervention and control groups allows one to assess the initial comparability of the groups. The assumption is that the more similar the intervention and control groups are at the pretest, the smaller the likelihood that important confounding variables differ between the two groups.

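Untreated Control Group Design with Dependent Pretest and Posttest Samples

Intervention group: O1a X O2a
Control group: O1b O2b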

The use of both a pretest and a comparison group makes it easier to avoid certain threats to validity. However, because the two groups are nonequivalent (assignment to the groups is not by randomization), selection bias may exist. Selection bias exists when selection results in differences in unit characteristics between conditions that may be related to outcome differences. For example, suppose that a pharmacy order-entry intervention was instituted in the MICU and not the SICU. If preintervention pharmacy costs in the MICU (O1a) and SICU (O1b) are similar, it suggests that it is less likely that there are differences in the important confounding variables between the two units. If MICU postintervention costs (O2a) are less than preintervention MICU costs (O1a), but SICU costs (O1b) and (O2b) are similar, this suggests that the observed outcome may be causally related to the intervention.
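The comparison described in the MICU and SICU example can be written as a small calculation. The sketch below uses a difference-in-differences estimate, a common way to summarize this design that the paper itself does not name, so it should be read as one possible analysis rather than the authors' method. The cost figures and function name are hypothetical.

```python
def difference_in_differences(o1a, o2a, o1b, o2b):
    """Change in the intervention group minus change in the comparison group.

    o1a, o2a: pretest and posttest outcome in the intervention group
    o1b, o2b: pretest and posttest outcome in the comparison group
    """
    return (o2a - o1a) - (o2b - o1b)


# Hypothetical mean monthly pharmacy costs (arbitrary units).
micu_pre, micu_post = 120.0, 95.0    # O1a, O2a: unit that received order entry
sicu_pre, sicu_post = 118.0, 116.0   # O1b, O2b: comparison unit

# Similar pretest values suggest the units were comparable at baseline.
baseline_gap = micu_pre - sicu_pre

estimate = difference_in_differences(micu_pre, micu_post, sicu_pre, sicu_post)

print(f"baseline gap between units: {baseline_gap:.1f}")
print(f"difference-in-differences estimate: {estimate:.1f}")
```

The estimate attributes to the intervention only the part of the MICU change that exceeds the change seen in the SICU. It relies on the assumption that, without the intervention, costs in the two units would have followed similar trends, which cannot be verified from a single pretest.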

O1a  O2a  X  O3a
O1b  O2b     O3b

In this design, the pretests are administered at two different times. The main advantage of this design is that it controls for potentially different time-varying confounding effects in the intervention group and the comparison group. In our example, measuring at points O1 and O2 would allow for the assessment of time-dependent preintervention changes in pharmacy costs (e.g., due to differences in the experience of residents) in the intervention and control groups, and of whether these changes were similar or different.

O1a  X  O2a     O3a
O1b     O2b  X  O3b

With this study design, the researcher administers an intervention at a later time to a group that initially served as a nonintervention control. The advantage of this design over design C2 is that it demonstrates reproducibility in two different settings. This study design is not limited to two groups; in fact, the study results have greater validity if the intervention effect is replicated in different groups at multiple times. In the example of a pharmacy order-entry system, one could implement the intervention in the MICU and then, at a later time, in the SICU. This latter design is often very applicable to medical informatics, where new technology and new software are often introduced or made available gradually.

Interrupted Time-Series Designs

O1  O2  O3  O4  O5  X  O6  O7  O8  O9  O10

An interrupted time-series design is one in which a string of consecutive observations equally spaced in time is interrupted by the imposition of a treatment or intervention. The advantage of this design is that with multiple measurements both pre- and postintervention, it is easier to address and control for confounding and regression to the mean. In addition, statistically, there is a more robust analytic capability, and there is the ability to detect changes in the slope or intercept as a result of the intervention in addition to a change in the mean values. 18 A change in intercept could represent an immediate effect while a change in slope could represent a gradual effect of the intervention on the outcome. In the example of a pharmacy order-entry system, O1 through O5 could represent monthly pharmacy costs preintervention and O6 through O10 monthly pharmacy costs after the introduction of the pharmacy order-entry system. Interrupted time-series designs also can be further strengthened by incorporating many of the design features previously mentioned in other categories (such as removal of the treatment, inclusion of a nondependent outcome variable, or the addition of a control group).
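As a rough illustration of the intercept and slope changes described above, the following Python sketch fits separate least-squares lines to the preintervention and postintervention observations and compares them at the first postintervention time point. The ten monthly cost values are invented, noise-free numbers chosen for this illustration, and the function names are hypothetical. A real analysis would also need to account for autocorrelation and seasonality, which this sketch ignores.

```python
def fit_line(times, values):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return v_mean - slope * t_mean, slope


def interrupted_time_series(observations, n_pre):
    """Estimate the immediate (level) and gradual (slope) change after the intervention."""
    times = list(range(1, len(observations) + 1))
    pre_intercept, pre_slope = fit_line(times[:n_pre], observations[:n_pre])
    post_intercept, post_slope = fit_line(times[n_pre:], observations[n_pre:])
    first_post = times[n_pre]
    # Counterfactual: where the preintervention trend would have been at the first post point.
    expected = pre_intercept + pre_slope * first_post
    fitted = post_intercept + post_slope * first_post
    return {"level_change": fitted - expected, "slope_change": post_slope - pre_slope}


# Hypothetical monthly pharmacy costs: O1..O5 before, O6..O10 after the intervention.
costs = [102, 104, 106, 108, 110, 102, 101, 100, 99, 98]
result = interrupted_time_series(costs, n_pre=5)
print(result)
```

Here `level_change` corresponds to the immediate effect (change in intercept) and `slope_change` to the gradual effect (change in slope).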

Systematic Review Results

The results of the systematic review are shown in the table below. In the four-year period of JAMIA publications that the authors reviewed, 25 quasi-experimental studies among 22 articles were published. Of these 25, 15 studies were of category A, five studies were of category B, two studies were of category C, and no studies were of category D. Although there were no studies of category D (interrupted time-series analyses), three of the studies classified as category A had data collected that could have been analyzed as an interrupted time-series analysis. Nine of the 25 studies (36%) mentioned at least one of the potential limitations of the quasi-experimental study design. In the four-year period of IJMI publications reviewed by the authors, nine quasi-experimental studies among eight manuscripts were published. Of these nine, five studies were of category A, one of category B, one of category C, and two of category D. Two of the nine studies (22%) mentioned at least one of the potential limitations of the quasi-experimental study design.

Systematic Review of Four Years of Quasi-designs in JAMIA and IJMI

| Study | Journal | Informatics Topic Category | Quasi-experimental Design | Limitation of Quasi-design Mentioned in Article |
| --- | --- | --- | --- | --- |
| Staggers and Kobus | JAMIA | 1 | Counterbalanced study design | Yes |
| Schriger et al. | JAMIA | 1 | A5 | Yes |
| Patel et al. | JAMIA | 2 | A5 (study 1, phase 1) | No |
| Patel et al. | JAMIA | 2 | A2 (study 1, phase 2) | No |
| Borowitz | JAMIA | 1 | A2 | No |
| Patterson and Harasym | JAMIA | 6 | C1 | Yes |
| Rocha et al. | JAMIA | 5 | A2 | Yes |
| Lovis et al. | JAMIA | 1 | Counterbalanced study design | No |
| Hersh et al. | JAMIA | 6 | B1 | No |
| Makoul et al. | JAMIA | 2 | B1 | Yes |
| Ruland | JAMIA | 3 | B1 | No |
| DeLusignan et al. | JAMIA | 1 | A1 | No |
| Mekhjian et al. | JAMIA | 1 | A2 (study design 1) | Yes |
| Mekhjian et al. | JAMIA | 1 | B1 (study design 2) | Yes |
| Ammenwerth et al. | JAMIA | 1 | A2 | No |
| Oniki et al. | JAMIA | 5 | C1 | Yes |
| Liederman and Morefield | JAMIA | 1 | A1 (study 1) | No |
| Liederman and Morefield | JAMIA | 1 | A2 (study 2) | No |
| Rotich et al. | JAMIA | 2 | A2 | No |
| Payne et al. | JAMIA | 1 | A1 | No |
| Hoch et al. | JAMIA | 3 | A2 | No |
| Laerum et al. | JAMIA | 1 | B1 | Yes |
| Devine et al. | JAMIA | 1 | Counterbalanced study design | |
| Dunbar et al. | JAMIA | 6 | A1 | |
| Lenert et al. | JAMIA | 6 | A2 | |
| Koide et al. | IJMI | 5 | D4 | No |
| Gonzalez-Hendrich et al. | IJMI | 2 | A1 | No |
| Anantharaman and Swee Han | IJMI | 3 | B1 | No |
| Chae et al. | IJMI | 6 | A2 | No |
| Lin et al. | IJMI | 3 | A1 | No |
| Mikulich et al. | IJMI | 1 | A2 | Yes |
| Hwang et al. | IJMI | 1 | A2 | Yes |
| Park et al. | IJMI | 1 | C2 | No |
| Park et al. | IJMI | 1 | D4 | No |

JAMIA = Journal of the American Medical Informatics Association; IJMI = International Journal of Medical Informatics.

In addition, three studies from JAMIA were based on a counterbalanced design. A counterbalanced design is a higher order study design than other studies in category A. The counterbalanced design is sometimes referred to as a Latin-square arrangement. In this design, all subjects receive all the different interventions but the order of intervention assignment is not random. 19 This design can only be used when the intervention is compared against some existing standard, for example, if a new PDA-based order entry system is to be compared to a computer terminal–based order entry system. In this design, all subjects receive the new PDA-based order entry system and the old computer terminal-based order entry system. The counterbalanced design is a within-participants design, where the order of the intervention is varied (e.g., one group is given software A followed by software B and another group is given software B followed by software A). The counterbalanced design is typically used when the available sample size is small, thus preventing the use of randomization. This design also allows investigators to study the potential effect of ordering of the informatics intervention.
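A cyclic Latin square is one simple way to lay out the intervention orders for a counterbalanced design, so that every intervention appears once in every position. The Python sketch below is only an illustration under that assumption. The function name is hypothetical, and the source does not say how the reviewed studies actually generated their orders.

```python
def latin_square_orders(conditions):
    """Return one intervention order per row; each condition appears once per position."""
    n = len(conditions)
    return [[conditions[(row + pos) % n] for pos in range(n)] for row in range(n)]


# The two order-entry systems from the example in the text.
systems = ["PDA-based order entry", "terminal-based order entry"]
orders = latin_square_orders(systems)

# Each group of subjects follows one row of the square.
for group_number, order in enumerate(orders, start=1):
    print(f"Group {group_number}: " + " then ".join(order))
```

With two interventions this reduces to the familiar A-then-B and B-then-A arrangement. With more interventions, the same function yields one order per group.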

Although quasi-experimental study designs are ubiquitous in the medical informatics literature, as evidenced by 34 studies in the past four years of the two informatics journals, little has been written about the benefits and limitations of the quasi-experimental approach. As we have outlined in this paper, a relative hierarchy and nomenclature of quasi-experimental study designs exist, with some designs being more likely than others to permit causal interpretations of observed associations. Strengths and limitations of a particular study design should be discussed when presenting data collected in the setting of a quasi-experimental study. Future medical informatics investigators should choose the strongest design that is feasible given the particular circumstances.

Supplementary Material

Dr. Harris was supported by NIH grants K23 AI01752-01A1 and R01 AI60859-01A1. Dr. Perencevich was supported by a VA Health Services Research and Development Service (HSR&D) Research Career Development Award (RCD-02026-1). Dr. Finkelstein was supported by NIH grant RO1 HL71690.

Statistics By Jim

Making statistics intuitive

Experimental Design: Definition and Types

By Jim Frost

What is Experimental Design?

An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental design as the design of experiments (DOE). Both terms are synonymous.


Ultimately, the design of experiments helps ensure that your procedures and data will evaluate your research question effectively. Without an experimental design, you might waste your efforts in a process that, for many potential reasons, can’t answer your research question. In short, it helps you trust your results.

Learn more about Independent and Dependent Variables.

Design of Experiments: Goals & Settings

Experiments occur in many settings, including psychology, the social sciences, medicine, physics, engineering, and the industrial and service sectors. Typically, experimental goals are to discover a previously unknown effect, confirm a known effect, or test a hypothesis.

Effects represent causal relationships between variables. For example, in a medical experiment, does the new medicine cause an improvement in health outcomes? If so, the medicine has a causal effect on the outcome.

An experimental design’s focus depends on the subject area and can include the following goals:

  • Understanding the relationships between variables.
  • Identifying the variables that have the largest impact on the outcomes.
  • Finding the input variable settings that produce an optimal result.

For example, psychologists have conducted experiments to understand how conformity affects decision-making. Sociologists have performed experiments to determine whether ethnicity affects the public reaction to staged bike thefts. These experiments map out the causal relationships between variables, and their primary goal is to understand the role of various factors.

Conversely, in a manufacturing environment, the researchers might use an experimental design to find the factors that most effectively improve their product’s strength, identify the optimal manufacturing settings, and do all that while accounting for various constraints. In short, a manufacturer’s goal is often to use experiments to improve their products cost-effectively.

In a medical experiment, the goal might be to quantify the medicine’s effect and find the optimum dosage.

Developing an Experimental Design

Developing an experimental design involves planning that maximizes the potential to collect data that is both trustworthy and able to detect causal relationships. Specifically, these studies aim to see effects when they exist in the population the researchers are studying, preferentially favor causal effects, isolate each factor’s true effect from potential confounders, and produce conclusions that you can generalize to the real world.

To accomplish these goals, experimental designs carefully manage data validity and reliability, and internal and external experimental validity. When your experiment is valid and reliable, you can expect your procedures and data to produce trustworthy results.

An excellent experimental design involves the following:

  • Lots of preplanning.
  • Developing experimental treatments.
  • Determining how to assign subjects to treatment groups.

The remainder of this article focuses on how experimental designs incorporate these essential items to accomplish their research goals.

Learn more about Data Reliability vs. Validity and Internal and External Experimental Validity.

Preplanning, Defining, and Operationalizing for Design of Experiments

A literature review is crucial for the design of experiments.

This phase of the design of experiments helps you identify critical variables, know how to measure them while ensuring reliability and validity, and understand the relationships between them. The review can also help you find ways to reduce sources of variability, which increases your ability to detect treatment effects. Notably, the literature review allows you to learn how similar studies designed their experiments and the challenges they faced.

Operationalizing a study involves taking your research question, using the background information you gathered, and formulating an actionable plan.

This process should produce a specific and testable hypothesis using data that you can reasonably collect given the resources available to the experiment. For example, a study of a jumping exercise intervention might test the following hypotheses:

  • Null hypothesis: The jumping exercise intervention does not affect bone density.
  • Alternative hypothesis: The jumping exercise intervention affects bone density.

To learn more about this early phase, read Five Steps for Conducting Scientific Studies with Statistical Analyses.

Formulating Treatments in Experimental Designs

In an experimental design, treatments are variables that the researchers control. They are the primary independent variables of interest. Researchers administer the treatment to the subjects or items in the experiment and want to know whether it causes changes in the outcome.

As the name implies, a treatment can be medical in nature, such as a new medicine or vaccine. But it’s a general term that applies to other things such as training programs, manufacturing settings, teaching methods, and types of fertilizers. I helped run an experiment where the treatment was a jumping exercise intervention that we hoped would increase bone density. All these treatment examples are things that potentially influence a measurable outcome.

Even when you know your treatment generally, you must carefully consider the amount. How large of a dose? If you’re comparing three different temperatures in a manufacturing process, how far apart are they? For my bone mineral density study, we had to determine how frequently the exercise sessions would occur and how long each lasted.

How you define the treatments in the design of experiments can affect your findings and the generalizability of your results.

Assigning Subjects to Experimental Groups

A crucial decision for all experimental designs is determining how researchers assign subjects to the experimental conditions: the treatment and control groups. The control group is often, but not always, the lack of a treatment. It serves as a basis for comparison by showing outcomes for subjects who don’t receive a treatment. Learn more about Control Groups.

How your experimental design assigns subjects to the groups affects how confident you can be that the findings represent true causal effects rather than mere correlation caused by confounders. Indeed, the assignment method influences how you control for confounding variables. This is the difference between correlation and causation.

Imagine a study finds that vitamin consumption correlates with better health outcomes. As a researcher, you want to be able to say that vitamin consumption causes the improvements. However, with the wrong experimental design, you might only be able to say there is an association. A confounder, and not the vitamins, might actually cause the health benefits.

Let’s explore some of the ways to assign subjects in design of experiments.

Completely Randomized Designs

A completely randomized experimental design randomly assigns all subjects to the treatment and control groups. You simply take each participant and use a random process to determine their group assignment. You can flip coins, roll a die, or use a computer. Randomized experiments must be prospective studies because they need to be able to control group assignment.
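If you use a computer for the random process, a minimal Python sketch might look like the one below. It shuffles the participants and deals them out in turn, which keeps the group sizes equal. That is a design choice made for this illustration rather than a requirement of a completely randomized design, and the function name and participant IDs are hypothetical.

```python
import random


def randomly_assign(subjects, groups=("control", "treatment"), seed=None):
    """Shuffle the subjects, then deal them into the groups in turn."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(subject)
    return assignment


# Twenty hypothetical participant IDs.
assigned = randomly_assign(range(1, 21), seed=42)
for group, members in assigned.items():
    print(group, sorted(members))
```

Recording the seed lets you document and reproduce exactly how the assignment was made.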

Random assignment in the design of experiments helps ensure that the groups are roughly equivalent at the beginning of the study. This equivalence at the start increases your confidence that any differences you see at the end were caused by the treatments. The randomization tends to equalize confounders between the experimental groups and, thereby, cancels out their effects, leaving only the treatment effects.

For example, in a vitamin study, the researchers can randomly assign participants to either the control or vitamin group. Because the groups are approximately equal when the experiment starts, if the health outcomes are different at the end of the study, the researchers can be confident that the vitamins caused those improvements.

Statisticians consider randomized experimental designs to be the best for identifying causal relationships.

If you can’t randomly assign subjects but want to draw causal conclusions about an intervention, consider using a quasi-experimental design.

Learn more about Randomized Controlled Trials and Random Assignment in Experiments.

Randomized Block Designs

Nuisance factors are variables that can affect the outcome, but they are not the researcher’s primary interest. Unfortunately, they can hide or distort the treatment results. When experimenters know about specific nuisance factors, they can use a randomized block design to minimize their impact.

This experimental design takes subjects with a shared “nuisance” characteristic and groups them into blocks. The participants in each block are then randomly assigned to the experimental groups. This process allows the experiment to control for known nuisance factors.

Blocking in the design of experiments reduces the impact of nuisance factors on experimental error. The analysis assesses the effects of the treatment within each block, which removes the variability between blocks. The result is that blocked experimental designs can reduce the impact of nuisance variables, increasing the ability to detect treatment effects accurately.

Suppose you’re testing various teaching methods. Because grade level likely affects educational outcomes, you might use grade level as a blocking factor. To use a randomized block design for this scenario, divide the participants by grade level and then randomly assign the members of each grade level to the experimental groups.
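The teaching-methods scenario can be sketched in Python as follows. The code groups students into blocks by grade level and then randomizes within each block. The student records, method labels, and function name are all made up for this illustration.

```python
import random
from collections import defaultdict


def randomized_block_assignment(subjects, block_of, groups=("method A", "method B"), seed=None):
    """Randomly assign subjects to groups separately within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)  # e.g., one block per grade level
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomize only within the block
        for index, subject in enumerate(members):
            assignment[subject] = groups[index % len(groups)]
    return assignment


# Hypothetical students as (name, grade level) pairs, four per grade.
students = [(f"student{i}", grade) for i, grade in enumerate([3, 4, 5] * 4, start=1)]
assignment = randomized_block_assignment(students, block_of=lambda s: s[1], seed=1)

for student in sorted(students, key=lambda s: s[1]):
    print(student, "->", assignment[student])
```

Because the shuffle happens inside each grade, every grade level contributes equally to both teaching methods, so grade cannot be confounded with the treatment.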

A standard guideline for an experimental design is to “Block what you can, randomize what you cannot.” Use blocking for a few primary nuisance factors. Then use random assignment to distribute the unblocked nuisance factors equally between the experimental conditions.

You can also use covariates to control nuisance factors. Learn about Covariates: Definition and Uses.

Observational Studies

In some experimental designs, randomly assigning subjects to the experimental conditions is impossible or unethical. The researchers simply can’t assign participants to the experimental groups. However, they can observe them in their natural groupings, measure the essential variables, and look for correlations. These observational studies are also known as quasi-experimental designs. Retrospective studies must be observational in nature because they look back at past events.

Imagine you’re studying the effects of depression on an activity. Clearly, you can’t randomly assign participants to the depression and control groups. But you can observe participants with and without depression and see how their task performance differs.

Observational studies let you perform research when you can’t control the treatment. However, quasi-experimental designs increase the problem of confounding variables. For this design of experiments, correlation does not necessarily imply causation. While special procedures can help control confounders in an observational study, you’re ultimately less confident that the results represent causal findings.

Learn more about Observational Studies.

For a good comparison, learn about the differences and tradeoffs between Observational Studies and Randomized Experiments.

Between-Subjects vs. Within-Subjects Experimental Designs

When you think of the design of experiments, you probably picture a treatment and control group. Researchers assign participants to only one of these groups, so each group contains entirely different subjects than the other groups. Analysts compare the groups at the end of the experiment. Statisticians refer to this method as a between-subjects, or independent measures, experimental design.

In a between-subjects design, you can have more than one treatment group, but each subject is exposed to only one condition, the control group or one of the treatment groups.

A potential downside to this approach is that differences between groups at the beginning can affect the results at the end. As you’ve read earlier, random assignment can reduce those differences, but it is imperfect. There will always be some variability between the groups.

In a within-subjects experimental design, also known as repeated measures, subjects experience all treatment conditions and are measured for each. Each subject acts as their own control, which reduces variability and increases the statistical power to detect effects.

In this experimental design, you minimize pre-existing differences between the experimental conditions because they all contain the same subjects. However, the order of treatments can affect the results. Beware of practice and fatigue effects. Learn more about Repeated Measures Designs.

| Between-subjects | Within-subjects |
| --- | --- |
| Assigned to one experimental condition | Participates in all experimental conditions |
| Requires more subjects | Fewer subjects |
| Differences between subjects in the groups can affect the results | Uses same subjects in all conditions |
| No order of treatment effects | Order of treatments can affect results |

Design of Experiments Examples

For example, a bone density study has three experimental groups—a control group, a stretching exercise group, and a jumping exercise group.

In a between-subjects experimental design, scientists randomly assign each participant to one of the three groups.

In a within-subjects design, all subjects experience the three conditions sequentially while the researchers measure bone density repeatedly. The procedure can switch the order of treatments for the participants to help reduce order effects.

Matched Pairs Experimental Design

A matched pairs experimental design is a between-subjects study that uses pairs of similar subjects. Researchers use this approach to reduce pre-existing differences between experimental groups. It’s yet another design of experiments method for reducing sources of variability.

Researchers identify variables likely to affect the outcome, such as demographics. When they pick a subject with a set of characteristics, they try to locate another participant with similar attributes to create a matched pair. Scientists randomly assign one member of a pair to the treatment group and the other to the control group.
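Here is a simplified Python sketch of that pairing process. It matches on a single variable (age) by sorting and pairing neighbors, then randomly assigns one member of each pair to each group. Real studies usually match on several characteristics at once, and the participant data and function name here are invented for illustration.

```python
import random


def matched_pairs_assignment(subjects, key, seed=None):
    """Pair subjects with adjacent values of `key`, then randomize within each pair."""
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)
    pairs = []
    # Step through the sorted list two at a time; an odd subject at the end stays unpaired.
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # random choice of who receives the treatment
        pairs.append({"treatment": pair[0], "control": pair[1]})
    return pairs


# Hypothetical participants as (id, age) pairs.
participants = [("p1", 34), ("p2", 61), ("p3", 35), ("p4", 59), ("p5", 47), ("p6", 45)]
pairs = matched_pairs_assignment(participants, key=lambda p: p[1], seed=0)

for pair in pairs:
    print(pair)
```

Each pair contains two participants of similar age, and chance alone decides which one receives the treatment.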

On the plus side, this process creates two similar groups, and it doesn’t create treatment order effects. While matched pairs do not produce the perfectly matched groups of a within-subjects design (which uses the same subjects in all conditions), it aims to reduce variability between groups relative to a between-subjects study.

On the downside, finding matched pairs is very time-consuming. Additionally, if one member of a matched pair drops out, the other subject must leave the study too.

Learn more about Matched Pairs Design: Uses & Examples.

Another consideration is whether you’ll use a cross-sectional design (one point in time) or use a longitudinal study to track changes over time.

A case study is a research method that often serves as a precursor to a more rigorous experimental design by identifying research questions, variables, and hypotheses to test. Learn more about What is a Case Study? Definition & Examples.

In conclusion, the design of experiments is extremely sensitive to subject area concerns and the time and resources available to the researchers. Developing a suitable experimental design requires balancing a multitude of considerations. A successful design is necessary to obtain trustworthy answers to your research question and to have a reasonable chance of detecting treatment effects when they exist.


Experimental Design: Types, Examples & Methods

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

Probably the most common way to design an experiment in psychology is to divide the participants into two groups, the experimental group and the control group, and then introduce a change to the experimental group, not the control group.

The researcher must decide how they will allocate their sample to the different experimental groups. For example, if there are 10 participants, will all 10 participants participate in both groups (e.g., repeated measures), or will the participants be split in half and take part in only one group each?

Three types of experimental designs are commonly used:

1. Independent Measures

Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

This should be done by random allocation, ensuring that each participant has an equal chance of being assigned to each group.

Independent measures involve using two separate groups of participants, one in each condition.

  • Con: More people are needed than with the repeated measures design (i.e., more time-consuming).
  • Pro: Avoids order effects (such as practice or fatigue) as people participate in one condition only. If a person is involved in several conditions, they may become bored, tired, and fed up by the time they come to the second condition or become wise to the requirements of the experiment!
  • Con: Differences between participants in the groups may affect results, for example, variations in age, gender, or social background. These differences are known as participant variables (i.e., a type of extraneous variable).
  • Control: After the participants have been recruited, they should be randomly assigned to their groups. This should ensure the groups are similar, on average (reducing participant variables).

2. Repeated Measures Design

Repeated measures design is an experimental design where the same participants participate in each independent variable condition. This means that each experiment condition includes the same group of participants.

Repeated measures design is also known as within-groups or within-subjects design.

  • Pro: As the same participants are used in each condition, participant variables (i.e., individual differences) are reduced.
  • Con: There may be order effects. Order effects refer to the order of the conditions affecting the participants’ behavior. Performance in the second condition may be better because the participants know what to do (i.e., practice effect). Or their performance might be worse in the second condition because they are tired (i.e., fatigue effect). This limitation can be controlled using counterbalancing.
  • Pro: Fewer people are needed as they participate in all conditions (i.e., saves time).
  • Control: To combat order effects, the researcher counterbalances the order of the conditions for the participants, alternating the order in which participants perform the different conditions of an experiment.


Suppose we used a repeated measures design in which all of the participants first learned words in “loud noise” and then learned them in “no noise.”

We would expect the participants to learn better in “no noise” partly because of order effects, such as practice. However, a researcher can control for order effects using counterbalancing.

The sample would be split into two groups, and each group would complete the two conditions, experimental (A) and control (B), in a different order. For example, group 1 does ‘A’ then ‘B,’ and group 2 does ‘B’ then ‘A.’ This is to balance out order effects.

Although order effects occur for each participant, they balance each other out in the results because they occur equally in both groups.
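To see why the order effects balance out, consider the toy Python calculation below. It assumes, purely for illustration, that every participant scores 5 points higher in “no noise” and gains a further 3 points on whichever condition they do second. The numbers and the function name are invented, and real practice effects are rarely this tidy.

```python
def mean_difference(orders, noise_effect=5.0, practice_bonus=3.0, base=50.0):
    """Mean of (no-noise score minus loud-noise score) across participants."""
    differences = []
    for order in orders:
        scores = {"loud": base, "quiet": base + noise_effect}
        scores[order[1]] += practice_bonus  # the second condition benefits from practice
        differences.append(scores["quiet"] - scores["loud"])
    return sum(differences) / len(differences)


# Everyone does loud noise first, then no noise.
fixed_order = [("loud", "quiet")] * 10
# Half the participants do each order.
counterbalanced = [("loud", "quiet")] * 5 + [("quiet", "loud")] * 5

print(mean_difference(fixed_order))      # 8.0: true effect plus practice
print(mean_difference(counterbalanced))  # 5.0: practice cancels out on average
```

With a fixed order, the practice bonus is mixed in with the effect of noise. With counterbalancing, the bonus helps each condition equally often, so the average difference reflects only the noise effect.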


3. Matched Pairs Design

A matched pairs design is an experimental design where pairs of participants are matched in terms of key variables, such as age or socioeconomic status. One member of each pair is then placed into the experimental group and the other member into the control group.

One member of each matched pair must be randomly assigned to the experimental group and the other to the control group.


  • Con: If one participant drops out, you lose the data of two participants.
  • Pro: Reduces participant variables because the researcher has tried to pair up the participants so that each condition has people with similar abilities and characteristics.
  • Con: Very time-consuming trying to find closely matched pairs.
  • Pro: It avoids order effects, so counterbalancing is not necessary.
  • Con: Impossible to match people exactly unless they are identical twins!
  • Control: Members of each pair should be randomly assigned to conditions. However, this does not solve all these problems.

Experimental design refers to how participants are allocated to an experiment’s different conditions (or IV levels). There are three types:

1. Independent measures / between-groups: Different participants are used in each condition of the independent variable.

2. Repeated measures / within-groups: The same participants take part in each condition of the independent variable.

3. Matched pairs: Each condition uses different participants, but they are matched in terms of important characteristics, e.g., gender, age, intelligence, etc.

Learning Check

Read about each of the experiments below. For each experiment, identify (1) which experimental design was used; and (2) why the researcher might have used that design.

1. To compare the effectiveness of two different types of therapy for depression, depressed patients were assigned to receive either cognitive therapy or behavior therapy for a 12-week period.

The researchers attempted to ensure that the patients in the two groups had similar severity of depressed symptoms by administering a standardized test of depression to each participant, then pairing them according to the severity of their symptoms.

2. To assess the difference in reading comprehension between 7 and 9-year-olds, a researcher recruited each group from a local primary school. They were given the same passage of text to read and then asked a series of questions to assess their understanding.

3. To assess the effectiveness of two different ways of teaching reading, a group of 5-year-olds was recruited from a primary school. Their level of reading ability was assessed, and then they were taught using scheme one for 20 weeks.

At the end of this period, their reading was reassessed, and a reading improvement score was calculated. They were then taught using scheme two for a further 20 weeks, and another reading improvement score for this period was calculated. The reading improvement scores for each child were then compared.

4. To assess the effect of organization on recall, a researcher randomly assigned student volunteers to two conditions.

Condition one attempted to recall a list of words that were organized into meaningful categories; condition two attempted to recall the same words, randomly grouped on the page.

Experiment Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes), which is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

Variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables which are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of taking part in each condition.

The purpose of random allocation is to avoid bias in how the experiment is carried out and to limit the effects of participant variables.
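As a rough illustration of random allocation, the sketch below shuffles a participant list and deals it into conditions in turn, so each participant is equally likely to end up in either condition and the groups stay the same size. The function name, participant codes, and condition labels are hypothetical and are not taken from any study described here.

```python
import random


def randomly_allocate(participants, conditions=("A", "B"), seed=None):
    """Shuffle participants, then deal them into the conditions in turn."""
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Hypothetical participant codes and condition labels.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_allocate(participants, conditions=("organized", "random"), seed=42)
for condition, members in groups.items():
    print(condition, len(members), members)
```

Shuffling first and then dealing in turn keeps group sizes balanced; tossing a coin for each participant separately would also be random but could produce unequal groups.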

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.

Research Article

Effects of an instructional WhatsApp group on self-care and HbA1c among female patients with Type 2 diabetes mellitus

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – review & editing

Affiliations Faculty of Nursing, Medical/Surgical Department, King Abdulaziz University, Jeddah, Saudi Arabia, Medical Department Rabigh General Hospital, Rabigh, Saudi Arabia

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – review & editing

Affiliations Faculty of Nursing, Medical/Surgical Department, King Abdulaziz University, Jeddah, Saudi Arabia, Faculty of Nursing, Ain Shams University, Cairo, Egypt

Roles Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Affiliation Faculty of Nursing, Medical/Surgical Department, King Abdulaziz University, Jeddah, Saudi Arabia

  • Riham Saud Alhazmy, 
  • Asmaa Hamdi Khalil, 
  • Hayfa Almutary


  Published: September 18, 2024
  • https://doi.org/10.1371/journal.pone.0305845

Aims and objectives

To assess the effect of an instructional WhatsApp group on self-care and HbA1c levels among female patients with type 2 diabetes mellitus (T2DM).

T2DM is a chronic disease that requires effective self-care. WhatsApp is a free application that can be effectively used for patient education.

This study used a quasi-experimental design.

A convenience sample of 62 female participants was recruited from the medical outpatient clinic of a tertiary hospital. The Diabetes Self-Care Scale was used to assess the self-care profiles of the participants pre- and post-intervention. HbA1c samples were also collected at baseline and three months after receiving instructions from the WhatsApp group. Sociodemographic and clinical data were collected during the pre-intervention stage.

The mean HbA1c level decreased from 8.61 ± 1.70 to 7.92 ± 1.60 after implementing the WhatsApp group instructions; the values showed a significant difference (t-value = 5.107 and P-value < 0.001). The post-test mean score of total self-care was higher than the pre-test mean score (t-value = 12.359, P-value < 0.001), indicating a highly significant difference.


The study demonstrated that the instructional WhatsApp group is an effective method for improving self-care and HbA1c levels in patients with T2DM. This study suggests the use of WhatsApp group instructions as a teaching method in the healthcare system for the education and follow-up of patients with T2DM.

Relevance to clinical practice

The findings support the need to initiate effective and dynamic interventional follow-ups through WhatsApp groups for patients with T2DM to improve their self-care and HbA1c levels and ultimately reduce the burden on hospitals and governments.

Citation: Alhazmy RS, Khalil AH, Almutary H (2024) Effects of an instructional WhatsApp group on self-care and HbA1c among female patients with Type 2 diabetes mellitus. PLoS ONE 19(9): e0305845. https://doi.org/10.1371/journal.pone.0305845

Editor: Nimesh Lageju, BP Koirala Institute of Health Sciences, NEPAL

Received: February 19, 2024; Accepted: June 5, 2024; Published: September 18, 2024

Copyright: © 2024 Alhazmy et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.


Diabetes mellitus (DM) is an alarming global health issue that can lead to serious complications if uncontrolled or not managed appropriately. It is one of the fastest growing global health emergencies of the 21st century [ 1 ]. In 2019, DM reached pandemic proportions with a worldwide prevalence of 9% (463 million adults) [ 2 ]. More than half a billion people worldwide have developed DM, and approximately 1 in 10 adults have the disease [ 1 ]. The number of cases has increased over the past two years [ 1 ]. In addition, DM is one of the most common diseases that cause mortality and morbidity in Saudi Arabia. According to a review of national data, DM affects 8.5% of the total adult population of Saudi Arabia [ 3 ].

DM complications are associated with frequent and prolonged hospitalizations, which increase the burden on individuals and the healthcare system [ 4 ]. According to the American Diabetes Association (2018), the total estimated cost of diagnosed diabetes in the United States in 2017 was $327 billion; this value includes medical costs and reduced patient productivity [ 5 ]. The chronic nature of this disease requires self-care and self-management to prevent possible complications.

The digital health revolution provides helpful tools that support healthy practices among people with chronic noncommunicable diseases (NCDs), such as diabetes [ 6 ]. This includes using mobile health applications in patients’ education. A growing number of studies demonstrate the effectiveness of using mobile apps for lifestyle changes and self-management in people with chronic NCDs such as diabetes, hypertension, and cardiac diseases [ 7 , 8 ]. According to a cross-sectional study with 1119 participants, the majority of respondents believed that using mobile health to prevent NCDs would be beneficial (62%), and that it would enable patients to manage their lifestyle modifications (59%) [ 9 ]. In addition, mobile apps were found to be effective tools for those in rural areas [ 8 ] and across all age groups [ 9 ]. However, choosing the appropriate mobile apps, such as WhatsApp, to enhance health still needs more investigation.

Type 2 diabetes mellitus (T2DM) is a metabolic disorder that occurs as a result of insulin resistance and impaired insulin production by islet β cells in the pancreas; this condition leads to elevated blood glucose levels, resulting in increased glycated hemoglobin (HbA1c) levels [ 10 , 11 ]. In 1990, T2DM was the 18th leading cause of mortality and the 9th cause of morbidity. In 2020, it ranked as the 9th cause of worldwide mortality and the 7th cause of morbidity [ 12 ]. According to the World Health Organization (2020), DM is the 7th leading cause of death among women in Saudi Arabia [ 13 ]. Various factors contribute to the increase in the total number of patients with DM, and they include an aging population and a rising obesity rate [ 14 , 15 ]. In addition, the rate of obesity in women is higher than that in men [ 16 ].

HbA1c is a diagnostic tool and objective measure that healthcare providers and researchers use to assess the clinical outcomes of patients with diabetes. It represents the average blood glucose levels of individuals over the previous 2–3 months based on the presumed half-life of red blood cells [ 17 ]. According to the Saudi Diabetes Clinical Practice Guidelines, the normal HbA1c level is 4%–5.6% [ 18 ]. The American Diabetes Association (2021) recommended that A1c should be less than 7.0% in adults with diabetes [ 19 ], and A1c 8% indicates poor diabetes control [ 17 ].

DM is a complex, long-term illness that requires regular medical assistance and multifaceted risk-reduction methods that are beyond glucose management. Given the unavailability of a definitive cure, secondary prevention is the best approach. Appropriate patient education regarding self-care can delay or prevent the onset of acute and chronic complications [ 20 ]. Self-care involves many aspects, such as diet, physical activities, medication adherence, blood glucose monitoring, problem solving, and coping skills [ 21 ]. The critical element for controlling diabetes is patients’ self-care management.

Several recent studies suggested the use of diabetes self-management education (DSME) to control the disease [ 22 – 24 ]. In these studies, the HbA1c levels of patients with T2DM who participated in DSME decreased by 0.71%–1.57% relative to those of patients on standard therapy [ 22 – 24 ].

With the development of technology, mobile applications have been broadly used to communicate and deliver information in a simple and easy manner. WhatsApp is a free messenger application that can be used across multiple platforms such as Android and iPhone devices [ 25 ]. Instructional WhatsApp groups can create a competitive environment that helps decrease glycosylated hemoglobin levels through instructions provided by educators. Additionally, group members can encourage one another to achieve their primary goals [ 26 ]. The recent literature suggests that WhatsApp is an effective medical learning tool [ 27 ]. In Saudi Arabia, 71% of the total population uses WhatsApp, and people spend an average of three hours and two minutes on social media [ 25 ]. In addition, a study conducted in Saudi Arabia showed that Saudi women tend to learn through WhatsApp [ 28 ].

The theoretical framework for this study is based on the trans-theoretical model (TTM) of stages of change established by James Prochaska and Carlo DiClemente in the 1980s [ 29 ]. The model has been used to help people develop healthy behaviors, including weight loss, exercise, and quitting unhealthy behaviors. In addition, it presents a health-promotion strategy that considers behavioral change as a series of steps [ 30 ].

At present, few studies have focused on the effects of instructional WhatsApp groups on self-care and HbA1c levels in female patients with T2DM, especially those in Saudi Arabia. Hence, the current study aimed to assess the effect of instructional WhatsApp groups on self-care and HbA1c levels among female patients with T2DM. The findings of this study may help identify new strategies for managing T2DM.

Research objectives

The aim was achieved through the following objectives:

  • Assessing the level of self-care and HbA1c among type 2 diabetic female patients.
  • Providing diabetes self-care-related instruction through a WhatsApp group.
  • Measuring the effect of the instructional WhatsApp group on self-care and HbA1c levels among female patients with type 2 diabetes.

Research hypothesis

• Self-care among female patients with type 2 diabetes will improve after implementing the WhatsApp group instruction.

• HbA1c will decrease among female patients with type 2 diabetes after implementing the WhatsApp group instruction.

Materials and methods

A quasi-experimental design (pre- and post-test) was used in this study.


The inclusion criteria were female adults with T2DM who could read and write Arabic and use WhatsApp on their cellphones. The exclusion criteria were patients using a different application, pregnant women with gestational diabetes, post-operative patients, patients with hearing or visual disabilities, and those with other comorbidities that would prevent them from participating in the study (e.g., mental illness and cerebrovascular accident).

Data were gathered from the medical outpatient clinic of the Rabigh General Hospital in the Western region of Saudi Arabia. The required sample size was 89, which was calculated using the Raosoft program based on a report of the population size at this hospital in 2020 (114 female patients with T2DM), a level of confidence of 95%, a margin of error of 0.05, and a probability value of 0.5. A total of 70 patients who met the inclusion and exclusion criteria were recruited to participate in the study; 8 of them withdrew from the study during the intervention phase (3 left the WhatsApp group, and 5 did not complete the post-test). Therefore, the final sample comprised 62 patients who completed three months of the WhatsApp intervention and the post-test ( Fig 1 ).
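The figure of 89 can be reproduced with the standard sample-size formula for a proportion with a finite-population correction. The sketch below assumes this is the formula the calculator applies; the function name and its defaults are illustrative and are not part of the study.

```python
import math
from statistics import NormalDist


def required_sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion in a finite population."""
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    x = z ** 2 * proportion * (1 - proportion)
    # Finite-population form: n = N * x / ((N - 1) * E^2 + x)
    n = population * x / ((population - 1) * margin ** 2 + x)
    return math.ceil(n)  # round up to whole participants


# Inputs reported in the text: N = 114, 95% confidence, 0.05 margin, p = 0.5.
print(required_sample_size(114))  # 89
```

With these inputs the unrounded value is about 88.1, which rounds up to 89. The 62 participants who completed the study therefore fall short of the calculated requirement.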


Data collection

The data collection process was divided into three phases: pre-test, intervention, and post-test.

Pre-test phase.

After obtaining ethical approval, the researcher met with the head nurse of the clinic to facilitate data collection. Initially, the medical records of female patients with diabetes were reviewed to identify those who met the inclusion criteria. Patient names, file numbers, and phone numbers were recorded to facilitate the communication with potential participants (files without phone numbers were excluded).

Female patients with T2DM were contacted to check if they could read and write Arabic and if they had a smartphone with the WhatsApp application. A total of 76 females who met the inclusion criteria were invited to participate in the study; 6 of them refused to participate. Those who agreed to participate in the study signed an informed consent form and were then provided with pre-test questionnaires to gather baseline data about their self-care. Blood samples were collected for HbA1c analysis.

Intervention phase.

The duration of the intervention phase was three months. Initially, the WhatsApp group was created and moderated by one researcher who is a registered nurse and diabetes educator and two other researchers who are associate professors in medical–surgical nursing. Instructions regarding diabetes self-care were provided through the WhatsApp group in the form of pictures, videos, and daily messages. The WhatsApp group was open daily from Sundays to Thursdays, from 6:00 p.m. to 8:00 p.m., for discussions, questions, or any clarifications or concerns. All instructions were sent to the group daily, and questions were answered by the moderator accordingly. Private conversations were not permitted. Depending on the amount of information presented during the first month, each topic was presented for 1–3 days. Patients were motivated to follow the instructions in the WhatsApp groups for the next 2 months.

Post-test phase.

After the completion of the intervention phase, an appointment schedule for the post-test and HbA1c analysis was sent to the patients. The patients were divided according to their code numbers, with 15–19 patients scheduled for the post-test and HbA1c analysis per day. The patients were invited again the day before the appointment, and a private message was sent through WhatsApp on the day of the appointment as a reminder. After the patients completed the same questionnaires, a blood sample was drawn for HbA1c analysis. The results were then compared with the pre-test results.

Data were collected using two structured, validated questionnaires. The first one was aimed at assessing the patients’ demographic and clinical data. It included demographic data such as age, marital status, level of education, and working status. It also covered clinical data such as duration of disease, methods used to treat diabetes, family history related to diabetes, comorbidities, and education about self-care for diabetes. The second questionnaire used was the Diabetes Self-Care Scale. This scale was adapted from Lee and Fisher (2005) and modified in the current study to measure self-care practices related to diabetes [ 31 ]. The modified scale includes 28 statements that are related to self-care activities and are grouped into seven domains. These domains are dietary control (five statements); exercise (three statements); blood glucose monitoring (two statements); medication adherence (three statements); follow-up (three statements); foot care (five statements); and other self-care practices related to hygiene, diabetes identification, and avoidance of complications (seven statements). The responses to these statements were rated on a 6-point Likert scale, with the choices ranging from 1 (“strongly disagree”) to 6 (“strongly agree”). The scores for each domain and the total scale were calculated, and the mean scores were categorized as good, moderate, or poor. The intervals between the three categories were calculated by subtracting the lowest value from the highest value for every domain and for the total scale and then dividing the result by 3.
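One possible reading of this categorization rule is sketched below. It assumes that a domain score is the sum of its item ratings, that the "lowest" and "highest" values are the minimum and maximum possible scores, and that a score on a boundary goes to the higher category. The text does not confirm any of these details, and the names are illustrative.

```python
# Number of statements per domain, as listed in the text (28 in total).
DOMAIN_ITEMS = {
    "dietary control": 5,
    "exercise": 3,
    "blood glucose monitoring": 2,
    "medication adherence": 3,
    "follow-up": 3,
    "foot care": 5,
    "other self-care practices": 7,
}
LIKERT_MIN, LIKERT_MAX = 1, 6  # 6-point scale


def categorize(score, n_items):
    """Place a summed score into poor / moderate / good by equal-width thirds."""
    lowest = n_items * LIKERT_MIN
    highest = n_items * LIKERT_MAX
    width = (highest - lowest) / 3  # (highest - lowest) divided by 3
    if score < lowest + width:
        return "poor"
    if score < lowest + 2 * width:
        return "moderate"
    return "good"


total_items = sum(DOMAIN_ITEMS.values())
print(total_items)                                   # 28
print(categorize(17, DOMAIN_ITEMS["foot care"]))     # moderate
print(categorize(150, total_items))                  # good
```

For a five-statement domain the possible range is 5 to 30, so each category spans 25 / 3 ≈ 8.3 points under these assumptions.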

Ethical consideration

Ethical approval was obtained from the Ethics Committee of the Faculty of Nursing, King Abdulaziz University (Ref No. 2M. 79) and from the National Board Review of the Ministry of Health at Jeddah Research Center (IRB No. H-02-J-002) to collect data from the hospital in the Rabigh, Makkah region where the study was conducted.

Written informed consent was obtained from all participants who agreed to participate in the study after the research objective was explained to them during the interviews. A summary of the research purpose, duration, advantages, and disadvantages of the intervention was provided in Arabic. The ethical aspects of the study were based on research ethics and principles. The patients were informed that their participation was voluntary and that they had the right to continue or withdraw. Confidentiality and anonymity were protected by providing a code number for each participant at the data collection stage. In addition, all data gathered during the study were kept confidential, and only the researchers had access to personal information.

Statistical analysis

Data analysis was performed using the Statistical Package for the Social Sciences (SPSS, version 23.0). Descriptive analyses using frequency, percentage, mean, and standard deviation (SD) were performed to determine the distribution of the study participants’ sociodemographic variables and clinical data. Normal distribution was evaluated using the kurtosis and skewness test and the Jarque–Bera test. A paired sample t-test was used to compare the self-care domains and HbA1c before and after implementing the WhatsApp group instructions. The significance of the results was categorized using P-values: P ≤ 0.05 was considered statistically significant, P ≤ 0.01 and P ≤ 0.001 were considered highly statistically significant, and P > 0.05 was considered non-significant. Cohen’s d was used to measure the effect size, with d ≤ 0.2, 0.2 < d < 0.8, and d ≥ 0.8 indicating small, moderate, and large effect sizes, respectively.
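The paired comparison and effect-size classification described above can be sketched in a few lines. The version below uses only the Python standard library rather than SPSS, computes Cohen's d as the mean difference divided by the SD of the differences (one of several common variants, assumed here), and runs on made-up pre/post values that are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t_and_d(pre, post):
    """Return the paired t statistic and Cohen's d (mean diff / SD of diffs)."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)                    # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))  # paired t with n - 1 df
    d = mean_diff / sd_diff
    return t, d


def effect_size_label(d):
    """Apply the cut-offs from the text: <= 0.2 small, < 0.8 moderate, else large."""
    d = abs(d)
    if d <= 0.2:
        return "small"
    if d < 0.8:
        return "moderate"
    return "large"


# Hypothetical HbA1c readings for six patients (illustration only).
pre = [8.9, 7.6, 9.4, 8.1, 10.2, 7.9]
post = [8.1, 7.4, 8.5, 7.9, 9.0, 7.8]
t, d = paired_t_and_d(pre, post)
print(f"t = {t:.3f}, d = {d:.3f}, effect = {effect_size_label(d)}")
```

Because the differences are taken as post minus pre, a drop in HbA1c gives negative t and d; the label uses the absolute value.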

Demographic and clinical characteristics

Table 1 presents the participants’ demographic characteristics. The mean age of the study participants was 47.6 ± 9.74 years. Most study participants were married (66.1%) while a few (8.1%) were single. Regarding education level, 58.1% of the study participants had less than a secondary level of education while 17.7% had a bachelor’s degree. In terms of occupation, 77.4% of the study participants were not employed.



The clinical characteristics of the patients are presented in Table 2 . About one-fifth of the sample (21%) had T2DM for 15 years or more, and only a small percentage (8%) had T2DM for less than a year. In terms of treatment, 58.1% of the participants used oral antidiabetics while 4.8% used diet and exercise. Furthermore, 64.5% of the patients had a family history of diabetes. More than half of them (54.8%) indicated that they had previously received diabetes self-care education.



Diabetes self-care among study participants before and after the implementation of the WhatsApp group instructions

As shown in Table 3 , the paired samples t-test was used to compare the mean scores of the self-care domains among the study participants before and after implementing the WhatsApp group instructions at a significance level of α = 0.05. Moreover, the effect size was calculated using Cohen’s d, with the values d ≤ 0.2, 0.2 < d < 0.8, and d ≥ 0.8 indicating small, moderate, and large effect sizes, respectively.



With regard to dietary control, Table 3 shows that the post-test mean score is higher than the pre-test mean score with a calculated t value = 5.176 and P-value < 0.001, indicating a highly statistically significant difference between them. Measuring the effect size of the implementation of the WhatsApp group instructions on the level of dietary control revealed a Cohen’s d = 0.657 (> 0.2 and < 0.8), indicating that the effect of the implementation was moderate.

Regarding the exercise domain, Table 3 shows that the post-test mean score was higher than the pre-test mean score. The calculated paired t-value = 10.079 and P-value < 0.001 denoted a highly statistically significant difference between the scores. Specifically, the level of exercise among the study participants improved after the implementation of the WhatsApp group instructions, with Cohen’s d = 1.280 > 0.8, which indicated a large effect size.

With regard to blood glucose monitoring, the same table shows that the post-test mean score was higher than the pre-test mean score. The calculated paired t-value was 10.479 while the P-value < 0.001, indicating a highly statistically significant difference between the scores. Measuring the effect size of the implementation of the WhatsApp group instructions on the level of blood glucose monitoring revealed a Cohen’s d = 1.331 > 0.8, indicating a large effect.

Regarding medication adherence as a self-care domain, Table 3 shows that the post-test mean score was higher than the pre-test mean score. The paired t-value = 4.237 and P-value < 0.001 indicated a highly statistically significant difference. Measuring the effect size of the implementation of the WhatsApp group instructions on the level of medication adherence based on Cohen’s d revealed a value of d = 0.538 > 0.2, indicating a moderate effect.

In relation to the participants’ follow-up before and after implementing the WhatsApp group instructions, the same table shows that the post-test mean score was higher than the pre-test mean score, with a t-value of 6.478 and P-value < 0.001, which indicated a highly statistically significant difference between them. Measuring the effect size of the implementation of the WhatsApp group instructions on the level of follow-up using Cohen’s d revealed a value of d = 0.823 > 0.8, which indicated a large effect.

Regarding foot care after implementing the WhatsApp group instructions, Table 3 shows that the post-test mean score was higher than the pre-test mean score. The paired t-value = 6.725 and P-value < 0.001 indicated a highly statistically significant difference between them. The effect size of implementing the WhatsApp group instructions was large, with Cohen’s d = 0.854 > 0.8.

For the other self-care practices related to hygiene, diabetes identification, and avoidance of complications, Table 3 shows that the post-test mean score was higher than the pre-test mean score. The paired t-value was 10.921 while the P-value < 0.001, indicating a highly statistically significant difference. The effect size of implementing the WhatsApp group instructions was large, with Cohen’s d = 1.387 > 0.8.

In relation to total self-care, the post-test mean score was higher than the pre-test mean score. The t-value was 12.359 while P-value < 0.001, indicating a highly statistically significant difference between them. The effect size of implementing the WhatsApp group instructions on the level of total self-care was large, with Cohen’s d = 1.570 > 0.8.

HbA1c among study participants before and after the implementation of the WhatsApp group instructions

Table 4 illustrates the difference between the mean HbA1c levels of the study participants before and after the implementation of the WhatsApp group instructions. The mean HbA1c level among the study participants decreased from 8.61 ± 1.70 to 7.92 ± 1.60 after implementing the WhatsApp group instructions. The t-value = 5.107 and P-value < 0.001, with an absolute reduction of 0.69, indicated a highly statistically significant difference between them. The effect size of implementing the WhatsApp group instructions on the level of HbA1c was moderate given Cohen’s d = 0.649 (> 0.2 and < 0.8).
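As a consistency check on the reported statistics, for a paired design Cohen's d computed from the differences equals t divided by the square root of n. The sketch below applies that identity to the t-values reported above with n = 62. That the authors computed d this way is an inference from the numbers, not something the text states.

```python
import math

N = 62  # final sample size reported in the text

# (t-value, Cohen's d) pairs as reported in the text.
REPORTED = {
    "dietary control": (5.176, 0.657),
    "exercise": (10.079, 1.280),
    "blood glucose monitoring": (10.479, 1.331),
    "medication adherence": (4.237, 0.538),
    "follow-up": (6.478, 0.823),
    "foot care": (6.725, 0.854),
    "other self-care practices": (10.921, 1.387),
    "total self-care": (12.359, 1.570),
    "HbA1c": (5.107, 0.649),
}


def d_from_t(t, n):
    """Cohen's d for paired data recovered from the paired t statistic."""
    return t / math.sqrt(n)


for name, (t, d_reported) in REPORTED.items():
    print(f"{name:26s} t = {t:6.3f}  reported d = {d_reported:.3f}  "
          f"t/sqrt(n) = {d_from_t(t, N):.3f}")
```

Every reported d agrees with t/√62 to three decimal places, which suggests the effect sizes were derived from the paired differences rather than from the pooled pre- and post-test standard deviations.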



The current study demonstrated the effectiveness of using the instructional WhatsApp group on self-care and HbA1c among female patients with T2DM. Subjective and objective data were used to assess the changes in clinical outcomes. The results showed that the instructional WhatsApp group improved all domains of self-care and HbA1c levels.

Regarding dietary control, the mean score of all dietary control items increased, indicating an improvement in self-care under this domain. This finding is in line with those of previous studies [ 22 , 32 ]. Changing one’s lifestyle, particularly eating habits, necessitates regular reminders and increased motivation. In this regard, WhatsApp group instructions appear to be an effective method.

The study showed that the total mean scores in the exercise domain improved significantly after the implementation of the WhatsApp group instructions. This finding is consistent with the study of ElGerges (2020), who used traditional education and followed the patients for three months; that study revealed a significant improvement in the exercise domain [ 22 ]. A similar finding was reported in another study conducted in Saudi Arabia that measured the effect of a WhatsApp-based intervention on promoting physical activity among female college students in Abha [ 33 ]. This study found that social network-based interventions (WhatsApp) contribute to improvements in physical activity. However, some studies revealed that the exercise domain did not improve significantly after participation in the studies [ 32 , 34 , 35 ]. The improvement observed in the current study could be due to the continuous support and encouragement provided through the group, in which participants were asked to download a step-counting application and share photos of their walking areas after receiving the related knowledge.

For blood glucose monitoring, the total mean scores improved significantly after the implementation of the WhatsApp group instructions. These findings are congruent with those of ElGerges (2020) and Zheng et al. (2019), who reported a positive relationship between DSME and blood glucose monitoring and a significant improvement in the post-test mean scores in relation to blood glucose monitoring [ 22 , 36 ]. Meanwhile, Dinar et al. (2019) and Hailu et al. (2019) found no relationship between DSME and blood glucose monitoring [ 32 , 34 ]. The discrepancy in some of the findings across studies may be related to several factors, including the strategies used to remind participants. The findings of the current study may be attributed to the fact that the patients were constantly reminded of the need to monitor their blood glucose, document it, and analyze the readings. In addition, notebooks were distributed to the participants during the pre-test visit to record their blood glucose levels. These practices motivated the participants to follow instructions and change their lifestyle.

Medication adherence in patients with chronic diseases remains challenging. The clinical outcomes of patients with DM are usually related to medication adherence. In this study, the effect of an instructional WhatsApp group on patient adherence to medications was assessed. The results showed that the total mean score of the medication adherence domain improved significantly after the implementation of the WhatsApp group instructions. This finding is consistent with those of ElGerges (2020) and Zheng et al. (2019), who found positive patient outcomes regarding medication adherence after implementing diabetes self-management education [ 22 , 36 ]. Furthermore, a study conducted in the Kingdom of Saudi Arabia (KSA) showed that compliance rates for individuals with diabetes range from 60% to 80% for insulin and from 65% to 85% for oral antidiabetic drugs [ 37 ]. However, Sartori et al. (2020), who used the WhatsApp application to assess the impact of education on medication adherence, reported that the findings were clinically significant but not statistically significant [ 38 ]. In addition, previous studies revealed that medication adherence for T2DM did not improve significantly with the WhatsApp application [ 34 , 35 ]. Regardless of these differences in the findings of previous studies, the recent literature has reported high medication compliance among patients with diabetes after using WhatsApp group instructions [ 37 ]. Increasing knowledge, raising awareness, and correcting misconceptions about medicines through WhatsApp groups may convince patients and contribute to greater adherence to medicines.

This study also found that the intervention had a positive effect on patient follow-up. The total mean score in this domain improved significantly after the implementation of the WhatsApp group instructions; this result is similar to the findings of a previous study [ 34 ]. The use of WhatsApp instructions contributed to increased compliance with follow-ups through online clinical and face-to-face visits. Consultations with physicians when experiencing extremely high or extremely low blood glucose levels also increased. Failure to follow up is usually due to the fear of censure from healthcare providers and concerns about laboratory results. Such issues are often attributable to noncompliance with medical regimens. In the current study, the patients showed high compliance with their regimens. The motivation for patients to visit the clinic may be their enthusiasm for knowing their laboratory results after committing to medication, exercise, and nutrition.

The current study also demonstrated significant improvements in foot care following the implementation of the WhatsApp group instructions. Several studies have reported similar findings [ 22 , 32 , 34 , 36 ]. Patients with DM seem to be interested in this aspect. In addition, KSA is a predominantly Muslim country, and Muslims pray five times a day. Hence, the feet can be inspected up to five times a day during ablution (wudu). During the study intervention, the patients were encouraged to practice foot care through simple instructions and continuous reminders. In addition, the complications associated with diabetic feet were explained to them.

For other self-care practices related to hygiene, diabetes identification, and avoidance of complications, significant improvements were noted in the mean scores following the implementation of the WhatsApp group instructions. In the WhatsApp group, the patients were encouraged to wear diabetes identification, maintain their self-cleaning regimen to prevent infections, and search the Internet or ask a healthcare provider when they have a new issue. In doing so, they may increase their awareness regarding these points.

Overall, self-care improved significantly following the implementation of the WhatsApp group instruction. This result is consistent with studies that lasted for three months [ 22 , 24 , 36 ]. By contrast, Waller et al. (2021) revealed that the total mean self-care score of the patients with T2DM did not improve significantly [ 35 ]. The positive findings regarding self-care in our study may be attributed to the fact that the instructions given through WhatsApp were carefully designed to suit different age groups according to their educational and social levels. The instructions were also validated by specialists in the field (endocrine consultant, medical consultant, and diabetic educator). Each part of the self-care program was provided separately, and feedback was obtained daily to ensure adherence to the recommended instructions. In the group, the patients were encouraged to share their experiences with one another through a group chat where they also shared new food recipes with pictures. Thus, they were motivated to learn and follow the instructions to achieve their goals.

The current study assessed the effect of the instructional WhatsApp group on HbA1c and found a significant improvement in the level of HbA1c with an absolute reduction of 0.69 after implementing the WhatsApp instructions. Previous studies also found a positive impact of using WhatsApp groups on HbA1c levels [ 22 , 23 , 26 , 39 – 41 ]. A few studies also demonstrated an improvement in HbA1c [ 35 ]. Often, commitments in the overall domains (i.e., diet control, exercise, blood glucose monitoring, medication adherence, foot care, and follow-up in self-care) would reflect objective results such as HbA1c results. In addition, sharing blood glucose level readings during daily follow-up could contribute to improving HbA1c levels.

The strength of this study lies in using a convenient teaching method, a WhatsApp group, to give instructions to female patients with type 2 DM and in assessing its effect on their self-care and HbA1c level, an area in which research is scant. However, this study had some limitations. There was no follow-up measurement for self-care or HbA1c after six months or one year. Thus, longitudinal studies are recommended to assess the continued benefit of the intervention. Also, using a small sample size from one clinical site could restrict the generalizability of the study to Saudi Arabia. In addition, quasi-experimental designs may be associated with the Hawthorne effect [ 42 ]. To reduce the possibility of this bias, we used an objective measure (the HbA1c test) to evaluate the effectiveness of the applied intervention.

T2DM is one of the most remarkable diseases of the 21st century, threatening patients’ physical and psychological well-being. The instructional WhatsApp group effectively improved patient self-care and HbA1c levels. We recommend the adoption of WhatsApp group instructions as a teaching method in the healthcare system for the continuous education and follow-up of patients with diabetes.

Nurses in administration, bedside, and clinics play an important role in providing care to females with T2DM. Secondary prevention is required to avoid or prevent complications and involves the development of standardized guidelines to improve self-care practices and HbA1c levels among patients in healthcare centers and hospitals. These guidelines include the following: providing T2DM patients with a recording book before discharge; assessing self-care levels during hospital discharge and during the follow-up period in the clinics; and initiating effective, dynamic interventional follow-up through WhatsApp groups for female patients with T2DM to improve their self-care and HbA1c levels.

  • http://orcid.org/0009-0005-4771-7441 Huihui Song 1 ,
  • http://orcid.org/0000-0002-3907-8396 Anwen Zhang 2 ,
  • http://orcid.org/0000-0002-4208-9475 Benjamin Barr 1 ,
  • Sophie Wickham 1
  • 1 Department of Public Health, Policy and Systems , University of Liverpool , Liverpool , UK
  • 2 Adam Smith Business School , University of Glasgow , Glasgow , UK
  • Correspondence to Dr Huihui Song; hss6c{at}liverpool.ac.uk

Background Child mental health has become an increasingly important issue in the UK, especially in the context of significant welfare reforms. Universal Credit (UC) has introduced substantial changes to the UK’s social security system, significantly impacting low-income families. Our aim was to assess the effects of UC’s introduction on children’s mental health for families eligible for UC versus a comparable non-eligible sample.

Methods Using Understanding Society data from 5806 observations of 4582 children (aged 5 or 8 years) in Great Britain between 2012 and 2018, we created two groups: children whose parents were eligible for UC (intervention group) and children whose parents were ineligible for UC (comparison group). Child mental health was assessed using a parent-reported Strengths and Difficulties Questionnaire. The OR and percentage point change in the prevalence of children experiencing mental health difficulties between the intervention group and the comparison group following the introduction of UC were analysed. We also investigated whether the utilisation of childcare services and changes in household income were mechanisms by which UC impacted children’s mental health.

Conclusions UC has led to an increase in mental health problems among recipient children, particularly for children in larger families and those aged 8. Policymakers should carefully evaluate the potential health consequences for specific demographics when introducing new welfare policies.


Data availability statement

Data are available upon reasonable request.

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See:  https://creativecommons.org/licenses/by/4.0/ .


Previous research has focused on the health implications of Universal Credit for adult populations; however, a gap remains regarding evaluations of the policy’s influence on the mental health of children post implementation.


Employing a quasi-experimental study design, this research identified that the implementation of UC was linked to an increase in children’s mental health issues. These findings suggested that benefits policy shocks affected not only adult recipients but also extended to their children, highlighting the broader impact of such policy changes on family well-being.


In light of the evidence demonstrating the adverse effects of welfare changes on children’s mental health, it is imperative to establish a comprehensive health impact assessment of children’s well-being within any welfare reform evaluation. Furthermore, the more pronounced impact of UC on children in larger families and those aged 8 years underscores the importance of considering household-specific effects in policy implementation. The health outcomes of children should be a central consideration when redesigning welfare systems.


Childhood is a critical phase for mental health, characterised by rapid brain growth and development. 1 During this period, children acquire cognitive skills that shape their future mental well-being and are essential for assuming adult roles in society. This underscores the vital importance of providing children with the best possible start in life, particularly in early childhood. In the UK, there has been a concerning trend of worsening mental health among young children. For example, rates of mental health issues among children aged 5–10 years rose from 9% in 2017 to 14% in 2020. 2 Addressing the causes of this increase is a public health priority.

Policy actions can yield unintended consequences, and welfare reform stands out as a potential contributor to such mental health outcomes. 3 Universal Credit (UC) is arguably the biggest overhaul of the welfare system in the UK since the Beveridge reforms of the 1940s. UC has been gradually implemented across the UK for different groups of people. The rollout of UC commenced in April 2013, with eligibility extended to families with children starting in May 2016 ( figure 1 illustrates the timeline of the national expansion of the UC rollout). The Department for Work and Pensions data show that as of February 2022, there were 3.8 million children in over 2 million households who were receiving UC, accounting for 49% of all households under UC. Three-quarters of families with children on UC had a child of primary school age or younger. 4

Timeline of UC. Note: Live service started in April 2013 in the North West. It did not involve online applications; only single, childless, unemployed adults without housing costs were eligible initially. In April 2016, full service commenced, accepting new claims from all types of claimants and concluded in December 2018. Natural migration refers to situations when existing claimants of legacy benefits and tax credits experienced a change in circumstances, such as unemployment, and were migrated to UC. Managed migration refers to the process of transferring the remaining claimants of legacy benefits and tax credits to UC. Source: National Audit Office (2018). 4 UC, Universal Credit.

UC has been criticised for its digitalised implementation style, wait for first payment and increased use of conditionality and sanctions. Studies have found negative impacts of UC on employment outcomes, 5 debt, 6 food bank usage, 7 housing insecurity 8 and higher crime rates. 9 10 There are several papers, both quantitative and qualitative, that have explored the impact of UC on the mental health and psychological distress of working-age adults, finding that individuals entering UC experienced a deterioration in their mental health. 11–13 There have not, however, to our knowledge, been any previous studies investigating the impact on children. Therefore, understanding the impact of UC on child mental health is now urgently needed, given the rise in child mental illness we are seeing in the UK. 2

In exploring the mechanisms by which UC impacts children’s mental health, there are multiple factors to consider (see figure 2 ). For example, a reduction in household income under UC may be detrimental to children’s well-being and development. 14 Compared with entitlements under the tax credit system, the majority of working families are worse off under UC, experiencing an average loss of entitlement of £41 per week. 15 In addition, the two-child limit introduced under UC has, since April 2017, restricted child-related benefits to the first two children in families that have a third or subsequent child. This means that families lose roughly £237 per child, which reduces the overall income available to larger families and has pushed more families into poverty or deeper into poverty. Moreover, the requirement for compulsory intensive job searches for unemployed or low-income claimants may lead parents to rely more on childcare services, reducing the time spent with their children, which could affect children’s mental health.

Possible pathways to poor mental health outcomes of young children under UC. Note: This simplified figure illustrates potential pathways leading to adverse mental health outcomes in young children under UC. For a more detailed illustration of these pathways, please refer to online supplemental appendix 1 . UC, Universal Credit.

Hence, there is an urgent need to understand both the potential differential effects of UC for different groups of children and the potential mechanisms through which UC impacts children’s mental health to inform future policy implementation. Several important pathways through which UC may impact children’s mental health are depicted in figure 2 .

In this paper, we examined the potential impact of UC on young children’s socioemotional behavioural difficulties, which have been recognised as a critical factor in understanding mental health outcomes, 16 with a growing body of research consistently identifying it as a pivotal marker on the pathway to mental illness. 17 We compared the socioemotional behavioural difficulties of children whose parents were unemployed and therefore eligible for UC to those not eligible for UC before and after UC became available for families with children in 2016. We also explored if there were differential effects of the introduction of UC for younger versus older children and single-child versus multiple-child households. Finally, we explored whether the reduction in household income or changes in childcare service usage were the pathways through which UC impacted children’s socioemotional behavioural difficulties.

Study design and participants

We used data from the UK Household Longitudinal Study (UKHLS). The UKHLS is a large and nationally representative panel survey of approximately 40 000 households. It includes information on households’ social, economic and demographic status, health, employment, and social benefits across the UK from 2009 onwards. 18 The Strengths and Difficulties Questionnaire (SDQ) has been collected in the UKHLS since Wave 3 and is only asked of children aged 5 and 8 years. Therefore, we included data from 2012 to 2018, covering eight waves of data. According to the inclusion and exclusion criteria, data from 5806 observations of 4582 children aged 5 and 8 years were included in the study population. A flowchart of participants and details of the study sample can be found in online supplemental appendix 2 .

Eligibility and policy exposure

From May 2016, households with children became eligible for UC in England, Scotland and Wales. 19 We took a conservative approach to eligibility and classified children’s exposure to UC based on their parents' unemployment status. Children were assigned to the intervention group if at least one of their working-age parents (18–64 years) identified as unemployed and therefore eligible to receive UC. They were assigned to the comparison group in a given wave if their parents identified as anything other than unemployed. Eligibility could vary over time. The interview year was used to determine the period before (<2016) and after policy exposure (≥2016).
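The classification rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the authors' code: the `ParentRecord` type and its field names are hypothetical, and group membership is evaluated separately for each wave because eligibility could vary over time.

```python
from dataclasses import dataclass
from typing import List

POLICY_YEAR = 2016  # interview years >= 2016 are treated as post-policy, as in the text


@dataclass
class ParentRecord:
    """One parent's status in a given wave (hypothetical field names)."""
    age: int
    unemployed: bool  # self-identified employment status


def is_intervention(parents: List[ParentRecord]) -> bool:
    """True if at least one working-age (18-64) parent identifies as unemployed."""
    return any(18 <= p.age <= 64 and p.unemployed for p in parents)


def is_post_policy(interview_year: int) -> bool:
    """True if the interview took place in the post-policy period."""
    return interview_year >= POLICY_YEAR
```

A child whose household contains one unemployed 35-year-old parent would be placed in the intervention group for that wave, whereas a household whose only unemployed parent is aged 70 would not.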

The primary outcome of interest was young children’s socioemotional behavioural difficulties using the parent-reported SDQ. This is a short behavioural screening questionnaire for children and was only asked of parents whose young children were aged 5 and 8 years. The composition of the SDQ is detailed in online supplemental appendix 3 . A total difficulties score was created by summing the first four subscales (range 0–40). We used a dichotomised score and constructed a dummy variable indicating mental difficulty, where 0–16 indicated no difficulties and 17–40 indicated socioemotional behavioural difficulties. 20 The dichotomised score better reflects a clinically meaningful effect on child’s mental health, and it is likely that the effect of the policy on social and behavioural difficulties is non-linear, potentially having a greater effect at higher levels of SDQ score. We tested this assumption using quantile regression (see online supplemental appendix 4 ) and repeated the analysis using the continuous score (see online supplemental appendix 7 ).
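As a minimal sketch of the scoring described above, the following Python snippet sums four difficulty subscales and applies the 17-point cut-off. The subscale names are an assumption (the paper defers the composition of the SDQ to its supplemental appendix), as is the 0 to 10 range per subscale implied by the 0 to 40 total.

```python
from typing import Dict

# Assumed names for the four difficulty subscales, each scored 0-10.
DIFFICULTY_SUBSCALES = ("emotional", "conduct", "hyperactivity", "peer")
CUTOFF = 17  # totals of 17-40 are coded as difficulties, 0-16 as no difficulties


def total_difficulties(subscales: Dict[str, int]) -> int:
    """Sum the four difficulty subscales into a 0-40 total score."""
    total = 0
    for name in DIFFICULTY_SUBSCALES:
        score = subscales[name]
        if not 0 <= score <= 10:
            raise ValueError(f"{name} score out of range: {score}")
        total += score
    return total


def has_difficulties(subscales: Dict[str, int]) -> int:
    """Dummy variable: 1 if the total score is 17 or above, otherwise 0."""
    return 1 if total_difficulties(subscales) >= CUTOFF else 0
```

Any additional keys in the input, such as a fifth subscale, are ignored by this sketch.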

Following the literature, 21 continuous covariates included the logarithm of household inflation‐adjusted income (household income was measured as the logarithm of the contemporaneous monthly net income from the labour market and all other sources taking away any taxes, deductions and benefits in GB 2010 prices) and the mother’s mental health (measured using the 12‐item General Health Questionnaire). 22

Categorical covariates included the child’s age (either 5 or 8), gender (female=0 and male=1), long-term health condition (‘Excellent’ compared with ‘very good’, ‘good’, ‘fair’ and ‘poor’), the mother’s education level (‘Degree’ compared with ‘other higher’, ‘A levels’, ‘GCSE’ and ‘no/other qualification’), and whether there was only one child in the family (only one child in family=1 and additional children in family=0). Childcare utilisation was measured based on maternal reports, with a value of 1 indicating that the mother reported using childcare services, and 0 otherwise.

Statistical analysis

Main analysis.

To understand whether the introduction of UC affected the socioemotional behavioural difficulties of children whose parents were eligible for UC, we first analysed whether the trends in socioemotional behavioural difficulties ran in parallel prior to the intervention. This comparison of trends in the outcome focused on the percentage of children with mental health issues, specifically those with SDQ scores equal to or exceeding 17, and was conducted between the intervention and comparison groups during the preintervention period.

Next, we employed a generalised difference-in-differences framework and logistic models to identify the treatment effect of UC on children’s socioemotional behavioural difficulties between 2012 and 2018. This analysis compared children of parents eligible for UC with those of ineligible parents, adjusting for the covariates described above. Comparing changes in children’s mental health between the two groups, rather than levels, limits potential biases after controlling for covariates.
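The logic of a difference-in-differences comparison can be illustrated with an unadjusted two-group, two-period calculation. This is a simplified sketch, not the covariate-adjusted logistic model used in the paper, and all function names are hypothetical. In a logistic model containing only group, period and their interaction, the exponentiated interaction coefficient equals the ratio of odds ratios computed here.

```python
def odds(p: float) -> float:
    """Convert a prevalence (proportion) into odds."""
    return p / (1.0 - p)


def did_percentage_points(p_int_pre: float, p_int_post: float,
                          p_cmp_pre: float, p_cmp_post: float) -> float:
    """Change in the intervention group minus change in the comparison group,
    expressed in percentage points."""
    return ((p_int_post - p_int_pre) - (p_cmp_post - p_cmp_pre)) * 100.0


def did_odds_ratio(p_int_pre: float, p_int_post: float,
                   p_cmp_pre: float, p_cmp_post: float) -> float:
    """Ratio of the post/pre odds ratio in the intervention group to the
    post/pre odds ratio in the comparison group."""
    or_intervention = odds(p_int_post) / odds(p_int_pre)
    or_comparison = odds(p_cmp_post) / odds(p_cmp_pre)
    return or_intervention / or_comparison
```

With made-up prevalences, for example `did_percentage_points(0.10, 0.20, 0.05, 0.06)`, the first function returns the excess change in the intervention group after the comparison group's change has been subtracted out.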

We conducted several robustness tests to investigate whether our results were sensitive to model specifications. First, we repeated our main analysis using an alternative approach to eligibility. We classified children’s exposure based on their parents’ reports of working-age benefits. Children were assigned to the intervention group if at least one of their parents reported receiving either UC or one of the six legacy benefits and were therefore eligible to receive (or move onto) UC. They were assigned to the comparison group in a given wave if their parents did not report receiving UC or legacy benefits.

Second, we repeated the main analysis using the continuous measure of SDQ as the outcome. For this model, we used a linear rather than logistic regression model. Third, we constructed a ‘stable treatment’ status for children in order to implement canonical difference-in-differences to overcome potential issues around treatment status staggering. Children were assigned to the intervention group if their parents were unemployed for any period. Once assigned, they were considered to belong to the intervention group for the entire period. The comparison group was defined as children whose parents were always employed, which allowed us to construct a time-invariant comparison group. Fourth, we repeated the analysis, excluding families with more than two children, as UC initially only allowed families with two or fewer children to apply. Fifth, we repeated our main analysis and excluded the top 25% of households with the highest income to improve comparability between the intervention and comparison groups. Sixth, we only included children with two or more observations of the outcome in a linear probability model with individual fixed effects to re-estimate the main findings. Seventh, we used propensity score matching with bootstrapping to overcome demographic variation between the intervention and comparison groups. Eighth, we used the linear ramp model to explore the potential temporal variation in the rollout of UC. Finally, to overcome potential bias in the missing data (ie, structural missingness in the outcome variable and other forms of non-random missingness), we have repeated the main analysis using multiple imputation and inverse probability weighting (IPW).

Exploring heterogeneity effects and mechanisms

We conducted two heterogeneity tests to explore if there were differential effects based on child’s age and household composition, specifically whether the household had only one child or multiple children. We repeated the main analysis using subgroups to explore variability in effects.

To explore the mechanism through which UC potentially affected children’s socioemotional behavioural difficulties, we investigated two policy elements embedded in UC. First, we used the utilisation of childcare as a potential proxy for reduced time spent with parents (and potentially the changes in conditionality under UC). Second, we explored changes in household income as a proxy for changes in benefit income. We repeated the main analysis, substituting the socioemotional behavioural outcome with the mechanism variable. A detailed description of all methodologies is provided in online supplemental appendix 4 .

We included 5806 observations from 4582 children (aged 5 or 8 years) in England, Wales and Scotland who participated in the UKHLS between 2012 and 2018. The baseline characteristics of the intervention and comparison groups in the years prior to UC’s introduction are presented in table 1 ( online supplemental appendix 6 outlines the number of observations in the intervention and comparison groups). The socioemotional behavioural difficulty was more prevalent in the intervention group compared with the comparison group. Consistently, the average SDQ scores for the intervention group were 1.8 points higher than those of the comparison group. The comparison group used childcare services more frequently and had a higher household income. There were no large differences between participants in the intervention group and the comparison group in terms of age and gender. Children in the intervention group exhibited worse long-term health conditions and lived in households with more than one child. Additionally, the intervention group had a higher prevalence of mothers experiencing mental health issues and lower levels of educational attainment.

Baseline characteristics in the years before Universal Credit was introduced

The trend in the proportion of children with socioemotional behavioural difficulties in both the intervention and comparison groups before and after the introduction of UC is displayed in figure 3 . Although the intervention and comparison groups differed in their level of difficulties prior to UC, this difference should not introduce bias in the analysis, as the difference between the two groups would persist at the same level in the absence of UC (see online supplemental appendix 5 for full regression results of the parallel trend analysis). The parallel trend graph suggested that a greater number of children in the intervention group experienced mental health issues compared with the comparison group following the implementation of UC.

Graphical representation of socioemotional behavioural difficulties in the intervention and comparison groups before and after Universal Credit was introduced. (Observing parallel trends in the preintervention period.)

The difference-in-difference results in table 2 indicated that UC exacerbated children’s socioemotional behavioural difficulties in households with unemployed parents. UC increased the odds of difficulties among eligible children (OR 2.18, 95% CI 1.14 to 4.18), equivalent to an 8-percentage point increase in prevalence (95% CI 1 to 14).
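An odds ratio translates into a percentage point change only relative to a baseline prevalence. The sketch below shows that arithmetic; the function name and the baseline value in the usage note are hypothetical, and the calculation is not expected to reproduce the paper's 8-percentage point figure, which presumably derives from the fitted, covariate-adjusted model.

```python
def apply_odds_ratio(baseline_prevalence: float, odds_ratio: float) -> float:
    """Return the prevalence implied by applying an odds ratio to a baseline prevalence."""
    if not 0.0 < baseline_prevalence < 1.0:
        raise ValueError("baseline_prevalence must lie strictly between 0 and 1")
    base_odds = baseline_prevalence / (1.0 - baseline_prevalence)
    new_odds = base_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


def percentage_point_change(baseline_prevalence: float, odds_ratio: float) -> float:
    """Implied change in prevalence, in percentage points."""
    return (apply_odds_ratio(baseline_prevalence, odds_ratio) - baseline_prevalence) * 100.0
```

For instance, `percentage_point_change(0.10, 2.18)` applies the reported OR to an assumed 10% baseline. The same OR implies a different percentage point change at a different baseline, which is why the two quantities are reported together.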

Difference-in-difference estimates of the impact of UC on children’s socioemotional behavioural difficulties

The outcomes of a series of robustness analyses are presented in figure 4 . These included the alternative definition of eligibility criteria, adjustments to the scope of the study population, and the utilisation of different model specifications. Except for the analysis limited to families with two or fewer children, which suggested a weaker effect, the results remained consistent across the various checks. More detailed results, including parallel trends and regression outcomes, are provided in online supplemental appendix 7 .

Outcomes of a series of robustness analyses with 95% CIs. Note: When employing continuous SDQ scores to assess children’s mental health conditions, the changes were measured on a different scale rather than in percentage points; thus, this outcome was not included in the graph. However, the results were similar, showing a 1.40 SDQ score increase (95% CI 0.49 to 2.39) for the intervention group after the introduction of Universal Credit. The results are presented in online supplemental appendix 7 . SDQ, Strengths and Difficulties Questionnaire.

The results of the heterogeneity analysis showed that, following the implementation of UC, the odds of socioemotional behavioural difficulties increased (OR 2.40, 95% CI 1.20 to 4.83), equivalent to a 9-percentage point increase in prevalence (95% CI 2 to 16 percentage points), for eligible children in families with two or more children, while there was no statistically significant effect on eligible children in one-child families ( table 3 ). Additionally, the results suggested that UC negatively impacted children aged 8 (95% CI 5 to 26 percentage points).

Heterogeneity results of the impact of Universal Credit on children’s mental health

Our exploration of the mechanism through which UC potentially affected children’s socioemotional behavioural difficulties found that neither the use of childcare services nor a reduction in household income was the main contributor to children’s worse socioemotional behavioural difficulties (see online supplemental appendix 9 ).

This paper has demonstrated that the implementation of UC has exacerbated socioemotional behavioural difficulties in children. This corresponded to an 8-percentage point (95% CI 1 to 14 percentage points) increase in the proportion of children with socioemotional behavioural difficulties among those whose parents were eligible for UC based on their employment status. This is a conservative estimate, as some individuals in the comparison group might also have been eligible to apply for UC for reasons other than unemployment, although they accounted for only 2% of the comparison group. 11 The findings were strengthened by the robustness tests showing similar effects from different model specifications.

This study emphasised that children in larger families and those aged 8 years may be more susceptible to the impacts of welfare reform, underscoring the importance of intervention strategies. To understand why children were affected by UC, we analysed the treatment effects on two mediators linked to the subelements of UC. The results suggested that neither lower household income nor parents’ use of childcare services was the main factor behind the observed deterioration in child mental health. This might be because the income measure, based on survey-reported data, does not fully capture the effect, especially at very low income levels where negative impacts might be concentrated. Additionally, delays in benefit payments, sanctions under UC and the anticipation of moving into employment could be alternative pathways affecting children’s mental health.

Our study adds to the growing body of evidence of the adverse effects of UC on various socioeconomic aspects 5–10 that have focused on the experiences of adults. Regarding the mental health of adults, both quantitative and qualitative studies have explored the impact of UC on working-age adults, consistently finding a decline in mental health among individuals transitioning to UC. 11–13 However, there is a gap in research concerning the impact of UC on children’s mental health.

Our research adds to this literature by providing longitudinal evidence on the mental health consequences of the transition to UC for children with unemployed parents. In doing so, our study underscores that the unintended consequences of UC extend beyond the recipients themselves, also impacting the mental well-being of children.

This study has some limitations. First, the intervention group had a small sample size, introducing potential uncertainty. The CI suggested that the true impact of UC on children’s mental health may range between 1 and 14 percentage points. Second, the implementation of UC in the UK followed a staggered full-service rollout schedule, leading to variations in the timing of application for eligible children. However, estimating the impact of the UC rollout on children’s mental health is constrained by the small sample size within any single district. Third, we used reported unemployment. However, not all unemployed individuals received UC, and some participants in the comparison group may have become eligible over the course of the analysis, although only a small proportion (<2%) of the comparison group was affected. 11 Alternative definitions, however, indicated similar results. Fourth, missing data and attrition posed challenges, as is common in longitudinal datasets and natural policy methodologies. Lastly, our measure of use of childcare services may not have been an accurate measure of time spent in childcare services and did not reflect the quality of those arrangements, thus hindering our ability to determine the potential mechanisms involved.

Considering the adverse influence of UC on children’s mental health, as outlined in this paper, it is imperative for future government policies in the UK and other countries to consider the well-being of children when reforming the welfare system. The mechanisms for this effect remain unclear. Further research should aim to understand the experience of families with children using UC and the potential pathways for negative and positive effects on child well-being, adapting the service to maximise positive benefits. Furthermore, specific policies related to children, including parental conditionality and the benefit cap, require further research to explore their impact on children and young people. Policymakers should give greater consideration to the health impact of changes to welfare systems on children.

Ethics statements

Patient consent for publication.

Not applicable.

Ethics approval

Supplementary materials

Supplementary data.

This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

  • Data supplement 1

Correction notice This article has been corrected since it was first published. The funding statement has been corrected.

Contributors HS is lead author and guarantor. HS, AZ, SW and BB planned the study and led the drafting and revising of the manuscript. HS and AZ analysed the data. HS, AZ, SW and BB contributed to interpreting the data and drafting and revising the manuscript. All authors approved the submitted version of the manuscript.

Funding SW was funded by a Wellcome Trust Society and Ethics Fellowship (200335/Z/15/Z). BB and SW were supported by the UK National Institute for Health Research Public Health Research Programme (NIHR131709).

Competing interests None declared.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.


Compressive Properties and Fracture Behaviours of Ti/Al Interpenetrating Phase Composites with Additive-Manufactured Triply Periodic Minimal Surface Porous Structures

  • Published: 21 September 2024


  • Zhou Li 1 , 2 ,
  • Haotian Mo 1 , 2 ,
  • Jiahao Tian 1 , 2 ,
  • Junhao Li 1 , 2 ,
  • Shiqi Xia 1 , 2 ,
  • Xianshi Jia 1 , 2 ,
  • Libo Zhou 3 &

The triply periodic minimal surfaces (TPMS) structure is regarded as a highly promising artificial design, but the performance of composites constructed using this structure remains unexplored. Two porosity levels of Ti/Al interpenetrating phase composites (IPCs) were fabricated by infiltrating ZL102-Al melt into additive-manufactured TC4-Ti scaffolds with the TPMS porous in this study. The combination of the two-phase alloys exhibits structural integrity at the interfacial region, as evidenced by microscopic surfaces observed in uncompressed IPCs. Quasi-static compression tests were performed to demonstrate that the Young’s modulus, yield stress and maximum compressive stress of IPCs exhibit significant enhancement when compared to the individual TPMS scaffolds, due to the supporting and strengthening effect of the filling phase. In the compression process of IPCs, defects emerge initially at the interface between the ZL102 phase and TC4 phase, triggering the fracture and slip of the ZL102 phase, eventually propagating to involve fracture in the TC4 phase. The deformation behaviours obtained from numerical simulation were combined to support these experimental phenomena. The results show that the corresponding stress concentration region is the central region of the spiral surface, the maximum stress concentration region of the ZL102 phase is the same as that of the TC4 phase, and the ZL102 phase effectively shares part of the loading. The Ti/Al IPCs achieve equivalent load-bearing capacity through a simplified interpenetration process and the utilisation of lighter materials.



The authors wish to gratefully acknowledge the financial support from the National Natural Science Foundation of China (Grant No. 52105418), the Natural Science Foundation of Hunan Province (Grant No. 2023JJ20069), and the key scientific research project of Hunan Provincial Department of Education (Grant No. 23A0001).

Author information

Authors and affiliations.

College of Mechanical and Electrical Engineering, Central South University, Changsha, 410083, China

Zhou Li, Haotian Mo, Jiahao Tian, Junhao Li, Shiqi Xia & Xianshi Jia

State Key Laboratory of Precision Manufacturing for Extreme Service Performance, Changsha, 410083, China

School of Energy and Power Engineering, Changsha University of Science & Technology, Changsha, 410114, China

Welding and Additive Manufacturing Centre, Cranfield University, Bedfordshire, MK43 0AL, UK

Corresponding authors

Correspondence to Shiqi Xia or Xianshi Jia .

Ethics declarations

Competing interests.

There is no conflict of interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Received: 21 March 2024

Accepted: 25 August 2024

Published: 21 September 2024

    This is the most common type of quasi-experimental design. Example: Nonequivalent groups design. You hypothesize that a new after-school program will lead to higher grades. You choose two similar groups of children who attend different schools, one of which implements the new program while the other does not.

  2. Quasi-Experimental Research Design

    The purpose of quasi-experimental design is to investigate the causal relationship between two or more variables when it is not feasible or ethical to conduct a randomized controlled trial (RCT). Quasi-experimental designs attempt to emulate the randomized control trial by mimicking the control group and the intervention group as much as possible.

  3. 7.3 Quasi-Experimental Research

    Describe three different types of quasi-experimental research designs (nonequivalent groups, pretest-posttest, and interrupted time series) and identify examples of each one. The prefix quasi means "resembling." Thus quasi-experimental research is research that resembles experimental research but is not true experimental research.

  4. Quasi-Experimental Design: Definition, Types, Examples

    In traditional experimental designs, randomization is a powerful tool for ensuring that groups are equivalent at the outset of a study. However, quasi-experimental design often involves non-randomization due to the nature of the research. This means that participants are not randomly assigned to treatment and control groups.

  5. Quasi-Experimental Design: Types, Examples, Pros, and Cons

    A quasi-experimental design can be a great option when ethical or practical concerns make true experiments impossible, but the research methodology does have its drawbacks. Learn all the ins and outs of a quasi-experimental design.

  6. Quasi Experimental Design Overview & Examples

    Types of Quasi-Experimental Designs and Examples. Quasi-experimental studies use various methods, depending on the scenario. Natural Experiments. ... The researchers matched two schools with similar demographics, baseline academic performance, and resources. The school using the traditional methodology is the control, while the other uses the ...

  7. Chapter 7 Quasi-Experimental Research

    7.4 Combination Designs. A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest.

  8. 8.4: Quasi-Experimental Research (Summary ...

    Quasi-experimental research involves the manipulation of an independent variable without the random assignment of participants to conditions or counterbalancing of orders of conditions. There are three types of quasi-experimental designs that are within-subjects in nature. These are the one-group posttest only design, the one-group pretest ...

  9. Quasi-experimental Research: What It Is, Types & Examples

    Quasi-experimental research designs are a type of research design that is similar to experimental designs but doesn't give full control over the independent variable(s) like true experimental designs do. In a quasi-experimental design, the researcher changes or watches an independent variable, but the participants are not put into groups at ...

  10. 14

    In this chapter, we discuss the logic and practice of quasi-experimentation. Specifically, we describe four quasi-experimental designs - one-group pretest-posttest designs, non-equivalent group designs, regression discontinuity designs, and interrupted time-series designs - and their statistical analyses in detail.

  11. Quasi-Experimental Design

    Quasi-Experimental Research Designs by Bruce A. Thyer. This pocket guide describes the logic, design, and conduct of the range of quasi-experimental designs, encompassing pre-experiments, quasi-experiments making use of a control or comparison group, and time-series designs. An introductory chapter describes the valuable role these types of ...

  12. Quasi-Experimental Design

    A quasi-experimental design is common in social research when a true experimental design may not be possible. Overall, the design types are very similar, except that quasi-experimental design does ...

  13. Introduction to Experimental and Quasi-Experimental Design

    Abstract. This chapter introduces readers to main concepts in experimental and quasi-experimental design. First, randomized control trials are introduced as the primary example of experimental design. Next, nonexperimental contexts, and particularly the use of propensity score matching to approximate the conditions of randomized control trials ...

  14. Selecting and Improving Quasi-Experimental Designs in Effectiveness and

    Quasi-experimental designs (QEDs) are increasingly employed to achieve a better balance between internal and external validity. Although these designs are often referred to and summarized in terms of logistical benefits versus threats to internal validity, there is still uncertainty about: (1) how to select from among various QEDs, and (2 ...

  15. Experiments and Quasi-Experiments

    There are two basic types of research design: True experiments; Quasi-experiments; The purpose of both is to examine the cause of certain phenomena. True experiments, in which all the important factors that might affect the phenomena of interest are completely controlled, are the preferred design.

  16. Experimental vs Quasi-Experimental Design: Which to Choose?

    A quasi-experimental design is a non-randomized study design used to evaluate the effect of an intervention. The intervention can be a training program, a policy change or a medical treatment. Unlike a true experiment, in a quasi-experimental study the choice of who gets the intervention and who doesn't is not randomized.

  17. Quasi-Experimental Design

    This is the most common type of quasi-experimental design. Example: Nonequivalent groups design. You hypothesise that a new after-school program will lead to higher grades. You choose two similar groups of children who attend different schools, one of which implements the new program while the other does not.

  18. Quasi-Experimental Research

    A type of quasi-experimental design that is generally better than either the nonequivalent groups design or the pretest-posttest design is one that combines elements of both. There is a treatment group that is given a pretest, receives a treatment, and then is given a posttest. ... Practice: Imagine that two professors decide to test the effect ...

  19. 5 Chapter 5: Experimental and Quasi-Experimental Designs

    This section will discuss three types of quasi-experiments: nonequivalent group design, one-group longitudinal design, and two-group longitudinal design. Nonequivalent Group Design The nonequivalent group design is perhaps the most common type of quasi-experiment. 23 Notice that it is very similar to the classic experimental design with the ...

  20. 8.2 Non-Equivalent Groups Designs

    There are three types of quasi-experimental designs that are within-subjects in nature. These are the one-group posttest only design, the one-group pretest-posttest design, and the interrupted time-series design. There are five types of quasi-experimental designs that are between-subjects in nature.

  21. The Use and Interpretation of Quasi-Experimental Studies in Medical

    In medical informatics, the quasi-experimental, sometimes called the pre-post intervention, design often is used to evaluate the benefits of specific interventions. The increasing capacity of health care institutions to collect routine clinical data has led to the growing use of quasi-experimental study designs in the field of medical ...

  22. Experimental Design: Definition and Types

    An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental ...

  23. Experimental Design: Types, Examples & Methods

    Three types of experimental designs are commonly used: 1. Independent Measures. Independent measures design, also known as between-groups, is an experimental design where different participants are used in each condition of the independent variable. This means that each condition of the experiment includes a different group of participants.

  24. 5: Experimental Design

    Experimental design is a discipline within statistics concerned with the analysis and design of experiments. Design is intended to help researchers create experiments such that cause and effect can be established from tests of the hypothesis. We introduced elements of experimental design in Chapter 2.4. Here, we expand our discussion of ...

  25. 14.6: Some other ANOVA designs

    The blocking effect is the individual (see Chapter 14.4), and, therefore, a random effect (see Chapters 12.3 and 14.3) in this type of experimental design. Although straightforward in concept, repeated measure designs have many complications in practice.

  26. Effects of an instructional WhatsApp group on self-care and HbA1c among

    Aims and objectives To assess the effect of an instructional WhatsApp group on self-care and HbA1c levels among female patients with type 2 diabetes mellitus (T2DM). Background T2DM is a chronic disease that requires effective self-care. WhatsApp is a free application that can be effectively used for patient education. Design This study used a quasi-experimental design.

  27. Effect of Universal Credit on young children's mental health: quasi

    Employing a quasi-experimental study design, this research identified that the implementation of UC was linked to an increase in children's mental health issues. These findings suggested that benefits policy shocks affected not only adult recipients but also extended to their children, highlighting the broader impact of such policy changes on ...

  28. Compressive Properties and Fracture Behaviours of Ti/Al

    The triply periodic minimal surfaces (TPMS) structure is regarded as a highly promising artificial design, but the performance of composites constructed using this structure remains unexplored. Two porosity levels of Ti/Al interpenetrating phase composites (IPCs) were fabricated by infiltrating ZL102-Al melt into additive-manufactured TC4-Ti scaffolds with the TPMS porous in this study. The ...
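
Several of the entries above (items 7 and 18) describe a combination design: a treatment group and a nonequivalent control group are each given a pretest and a posttest. One common way to summarise such data is to compare the change in the treatment group with the change in the control group (a difference-in-differences). The sketch below is a minimal illustration under that assumption. The function name and the grade values are hypothetical and invented for this example; none of them come from the sources listed.

```python
from statistics import mean


def difference_in_differences(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Return (mean change in treatment group) - (mean change in control group)."""
    treat_change = mean(treat_post) - mean(treat_pre)
    ctrl_change = mean(ctrl_post) - mean(ctrl_pre)
    return treat_change - ctrl_change


# Hypothetical grades, invented for illustration only.
treat_pre = [70, 72, 68, 74]   # school with the after-school program, before
treat_post = [78, 80, 75, 83]  # same school, after
ctrl_pre = [71, 69, 73, 67]    # comparison school, before
ctrl_post = [73, 70, 76, 69]   # comparison school, after

effect = difference_in_differences(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(f"Estimated effect (difference-in-differences): {effect}")
```

Subtracting the control group's change removes trends shared by both groups, such as general maturation over the school year. Because assignment is not random, the estimate can still be biased if the groups would have changed at different rates without the program.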