Indian Journal of Thoracic and Cardiovascular Surgery

The importance of randomization in clinical research

Varun Sundaram, Padmini Selvaganesan, Mohamad Karnib


Received 2022 Apr 4; Revised 2022 Jul 1; Accepted 2022 Jul 21; Issue date 2022 Sep.

Studies evaluating average treatment effects (ATE) of an intervention can broadly be classified into those with observational and randomized designs. Observational studies are limited by confounding, in addition to selection and information bias, making their evaluation of ATE hypothesis-generating rather than hypothesis-testing. Randomization attempts to reduce the systematic error introduced by observational designs by ensuring equal distribution of prognostic factors between the treatment and control groups, so that any difference in outcomes observed between the two groups can be attributed to the treatment. While randomized controlled trials (RCTs) remain the gold standard in estimating the ATE of therapeutic interventions, they do have inherent limitations due to uncertain external validity. Observational studies can play a complementary role in enhancing the ability of RCTs to inform routine clinical practice. In this review, we focus on the limitations of observational studies, the need for randomization, and the interpretation and limitations of RCTs.

Keywords: Randomization, RCT, Average treatment effects

Introduction

The field of cardiology and cardiac surgery has significantly evolved in the past 3 decades. For instance, the annual mortality of patients with heart failure and reduced ejection fraction (HFrEF) in the placebo group of the Cooperative North Scandinavian Enalapril Survival Study (CONSENSUS) trial (effects of enalapril on mortality in severe congestive heart failure trial) in 1987 was nearly 50% [ 1 ]. By contrast, the annual mortality in the placebo group of the more recently conducted Empagliflozin in Patients with Heart Failure, Reduced Ejection Fraction (EMPEROR-REDUCED) trial was only 7.5% [ 2 ]. This striking improvement in survival in HFrEF over the last 3 decades is largely attributable to robust randomized trials testing innovations in pharmacotherapy and device therapy [ 3 – 5 ]. In this review, we will focus on the limitations of observational studies, the need for randomization, and the interpretation and limitations of randomized controlled trials (RCTs).

Limitations of observational studies

Studies evaluating average treatment effects (ATE) of an intervention can broadly be classified into those with observational and randomized designs. Observational studies evaluating treatment effects are often considered to be hypothesis-generating rather than hypothesis-testing. This is because patients are not randomized to treatment and control groups, resulting in significant bias. Mauri et al. compared long-term outcomes of bare-metal vs. drug-eluting stents in patients with acute myocardial infarction (AMI). The investigators conducted this retrospective observational study on an unselected, population-based cohort of patients presenting with AMI to acute care, nonfederal hospitals in Massachusetts (from the Massachusetts state registry). They observed a significant reduction in the hazard for 2-year mortality in favor of drug-eluting stents [ 6 ]. However, subsequent RCTs evaluating long-term outcomes between bare-metal and drug-eluting stents failed to demonstrate this observed difference in mortality [ 7 , 8 ]. Further investigation revealed that the differences in outcomes between the bare-metal and drug-eluting stents in the observational study were plausibly related to selection bias due to systematic pretreatment differences between the two groups, as opposed to the intervention (i.e., drug-eluting stents being systematically avoided in patients who had a poor prognosis, such as those who were terminally ill or had an increased co-morbidity burden) [ 9 ]. In addition to selection bias, other important methodological limitations of observational research in estimating ATE include confounding and information bias. Confounding is the situation where the apparent association between the treatment and outcome can be entirely or partially explained by a third risk factor associated with both.
While advances and innovations in statistical methodologies have helped minimize confounding, residual confounding (from both measured and unmeasured covariates) is still a major issue in observational epidemiological research. Information bias, also known as observation, classification, or measurement bias, relates to the incorrect measurement of exposures and outcomes. Information on exposure and outcome should be gathered in the same way for both the treatment and control groups. However, in observational studies, the information is often gathered differently for one group than for the other [ 10 , 11 ].

Need for randomization

Most treatments in modern medicine demonstrate modest effect sizes, i.e., 10–30% relative risk reductions, especially for hard clinical outcomes. Despite extensive adjustment and matching techniques, such modest treatment effects can plausibly be missed because of the systematic error inherent in observational studies. Large, randomized studies attempt to reduce systematic error, enabling investigators to detect these modest treatment effects [ 10 ]. Randomization minimizes selection bias by ensuring equal distribution of prognostic factors between the treatment and control groups, so that any difference in outcomes observed between the two groups can be attributed to the treatment. Furthermore, randomization also renders the treatment and control groups comparable with regard to unknown or unmeasured prognostic factors, e.g., genetic factors, which might influence the outcome of interest [ 12 , 13 ]. Common data-gathering protocols for both the treatment and control groups, coupled with blinding in a prospective RCT, mitigate observer bias. However, information bias can still be an issue in surgical or procedural RCTs, as they are often unblinded. Randomized trials are often conducted in a multicenter setting at a national or international level. Multicenter trials can lead to faster recruitment and increased sample size, and usually cover a broader population sample. Efforts should be made to maintain a similar quality of care across all centers regarding diagnostic criteria, treatments, interventions, and follow-up.

Methods of randomization

Randomization can occur using different methods. The main schemes include unrestricted, restricted, and stratified randomization.

“Unrestricted randomization,” also known as “simple randomization,” allocates interventions to study subjects purely by chance, for example from a computer-generated list of random numbers or by flipping a coin. This method might result in an imbalance between groups if the number of participants is not large.
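As an illustrative sketch (a hypothetical two-arm trial, using Python's standard library), simple randomization amounts to an independent coin flip per participant:

```python
import random

def simple_randomization(n_participants, seed=None):
    """Unrestricted (simple) randomization: each participant is
    allocated by an independent fair coin flip, so group sizes
    can drift apart, especially in small trials."""
    rng = random.Random(seed)
    return [rng.choice(["treatment", "control"]) for _ in range(n_participants)]

allocations = simple_randomization(10, seed=42)
```

Because each flip is independent, nothing prevents the list from containing, say, eight "treatment" and two "control" entries in a small trial.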

“Restricted” or “block randomization,” also referred to as “balanced randomization,” is a method by which the allocation sequence is split into blocks, with equal numbers of each allocation (treatment and control) within every block. This ensures that the number of study participants randomized to each arm of the intervention remains balanced. When using this method, it is crucial to conceal the size of the blocks from the investigators, and to keep the trial blinded, to avoid selection bias.
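A sketch of how a permuted-block scheme might be implemented (the block size of 4 is an arbitrary choice for illustration): within each block the two arms appear equally often, so the running imbalance can never exceed half the block size.

```python
import random

def block_randomization(n_participants, block_size=4, seed=None):
    """Restricted (permuted-block) randomization for a 1:1 two-arm
    trial: each block contains equal numbers of each allocation in a
    random order, capping the imbalance at block_size / 2."""
    assert block_size % 2 == 0, "block size must be even for 1:1 allocation"
    rng = random.Random(seed)
    allocations = []
    while len(allocations) < n_participants:
        block = ["treatment"] * (block_size // 2) + ["control"] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        allocations.extend(block)
    return allocations[:n_participants]

allocations = block_randomization(20, block_size=4, seed=1)
```

Note that if an investigator learns the block size, the final allocations of each block become predictable, which is why block sizes should be concealed (or varied).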

In “stratified randomization,” study participants are first divided into sub-groups (strata) according to a selected risk factor believed to be important for the outcome (e.g., gender or age). Random allocation of interventions is then carried out within each sub-group.

Traditionally, RCTs are designed to demonstrate that one intervention is better than the other, labeled as superiority trials. Non-inferiority trials are RCTs where the new treatment is compared to the standard of care to show that it is not an unacceptably worse option.

Multiple RCTs can be combined in a meta-analysis, which can provide a summary of ATE. However, this comes with a significant risk of selection bias, mainly when selecting studies and extracting the data for the meta-analysis.

Interpretation of the strength of evidence of an RCT, using p values and confidence intervals

Two main types of error are encountered in studies investigating ATE. The first is systematic error, which can be mitigated by randomization but cannot be quantified. The second is random error, which can be quantified through estimation of the p value. A well-designed, large RCT should have minimal systematic error and quantifiably small random error (e.g., p < 0.05).

(i) The first step in the interpretation of an RCT is to assess whether the p value is strong enough to reject the null hypothesis. While the scientific community conventionally uses a significance level of 5% for the primary outcome (p < 0.05), a dichotomous interpretation of the p value as < or > 0.05 is superficial and gives no information on the strength of evidence. This is akin to interpreting laboratory values as simply normal or abnormal (e.g., leukocytosis or no leukocytosis). A p value of < 0.001 indicates less than a 1-in-1000 probability that the observed difference arose by chance, signifying evidence beyond reasonable doubt. In short, p values should be interpreted as a continuum, with a decreasing p value indicating increasing evidence against the null hypothesis. (ii) The next step in the interpretation of the strength of evidence involves assessment of the 95% confidence interval (CI). For ratio measures such as hazard ratios, a CI that includes 1 indicates a null effect, and the closer the upper limit of the 95% CI is to 1, the weaker the evidence to reject the null hypothesis. Table 1 demonstrates various scenarios using p values and confidence intervals for the strength of evidence [ 14 – 17 ]. Scenario A is an example of an RCT where the level of evidence is beyond reasonable doubt (p < 0.001 and the upper limit of the 95% CI is 0.92, reasonably far from 1). Scenario B indicates evidence of a moderately significant difference in favor of the treatment group (p = 0.02 and the upper limit of the 95% CI is 0.98, closer to 1). Scenarios C and D are examples of progressively weaker evidence against the null hypothesis.

Table 1. Summary of scenarios based on p values and confidence intervals

HR hazard ratio, CI confidence interval, DM diabetes mellitus
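The interplay between a hazard ratio's 95% CI and its p value can be illustrated numerically. The sketch below uses the standard normal approximation on the log scale to recover an approximate two-sided p value from a 95% CI; the CI limits are invented for illustration and correspond only loosely to the scenarios discussed above.

```python
import math

def p_from_hr_ci(lower, upper):
    """Approximate two-sided p value recovered from the 95% CI of a
    hazard ratio, assuming normality on the log scale. The point
    estimate is taken as the geometric mean of the CI limits."""
    log_hr = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = abs(log_hr) / se
    # two-sided tail area of the standard normal distribution
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

p_strong = p_from_hr_ci(0.70, 0.92)  # upper limit well below 1
p_null = p_from_hr_ci(0.85, 1.10)    # CI crosses 1
```

The closer the upper CI limit sits to 1 (or once it crosses 1), the larger the recovered p value, mirroring the continuum of evidence described above.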

Limitations of randomized trials: clinical effectiveness vs. clinical efficacy

RCTs are unquestionably the gold standard for determining therapeutic efficacy (i.e., the performance of an intervention under ideal circumstances) owing to their strong internal validity [ 12 ]. However, RCTs do have inherent limitations. The role of traditional RCTs in evaluating therapeutic effectiveness (i.e., the performance of a therapy under real-world conditions) may be restricted by their uncertain external validity. Even with careful randomization of treatment allocation, generalization of the ATE should be done with caution, as the studied population may be very different from the population treated in routine practice. Multiple studies have demonstrated that highly efficacious therapies can be less effective in clinical practice. In traditional RCTs, the strict selection criteria used to enroll a defined, homogeneous patient population are considered the most common reason for the gap between therapeutic efficacy and effectiveness. The characteristics and clinical outcomes of patients enrolled in RCTs may differ from those seen in routine clinical practice. For instance, in the Optimal Medical Therapy with or without Percutaneous Coronary Intervention for Stable Coronary Disease (COURAGE) trial, only 10% of the patients initially screened were randomized, thereby excluding a significant proportion of patients seen in clinical practice, i.e., decreasing external validity [ 18 ]. Investigators in the Scandinavian countries have attempted to overcome this limitation by performing studies using randomized registries. The Thrombus Aspiration during ST-Segment Elevation Myocardial Infarction (TASTE) trial and the Norwegian Coronary Stent Trial (NORSTENT) randomized more than 90% of the initially screened patients, thereby increasing external validity without compromising internal validity [ 19 , 20 ].

RCTs are indispensable to clinical research and continue to remain the gold standard in estimating ATE of therapeutic interventions. Randomization continues to be the only reliable mechanism to eliminate or minimize systematic error and ensure uniform distribution of measured and unmeasured confounders between the treatment and control groups. While observational studies cannot replace RCTs, they can be performed to establish the clinical effectiveness, which may considerably enhance the ability of RCTs to inform routine clinical practice, guidelines, and health policy.

Declarations

Ethics approval

Not applicable as this is a review.

Informed consent

Not applicable.

Human and animal rights

Conflict of interest

The authors declare no competing interests.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

  • 1. CONSENSUS Trial Study Group. Effects of enalapril on mortality in severe congestive heart failure. Results of the Cooperative North Scandinavian Enalapril Survival Study (CONSENSUS). N Engl J Med. 1987;316:1429–35. doi: 10.1056/NEJM198706043162301.
  • 2. Packer M, Anker SD, Butler J, et al. Cardiovascular and renal outcomes with empagliflozin in heart failure. N Engl J Med. 2020;383:1413–1424. doi: 10.1056/NEJMoa2022190.
  • 3. Cleland JGF, Daubert J-C, Erdmann E, et al. The effect of cardiac resynchronization on morbidity and mortality in heart failure. N Engl J Med. 2005;352:1539–1549. doi: 10.1056/NEJMoa050496.
  • 4. McMurray JJV, Packer M, Desai AS, et al. Angiotensin-neprilysin inhibition versus enalapril in heart failure. N Engl J Med. 2014;371:993–1004. doi: 10.1056/NEJMoa1409077.
  • 5. Packer M, Bristow MR, Cohn JN, et al. The effect of carvedilol on morbidity and mortality in patients with chronic heart failure. U.S. Carvedilol Heart Failure Study Group. N Engl J Med. 1996;334:1349–55. doi: 10.1056/NEJM199605233342101.
  • 6. Mauri L, Silbaugh TS, Garg P, et al. Drug-eluting or bare-metal stents for acute myocardial infarction. N Engl J Med. 2008;359:1330–1342. doi: 10.1056/NEJMoa0801485.
  • 7. Kastrati A, Dibra A, Spaulding C, et al. Meta-analysis of randomized trials on drug-eluting stents vs. bare-metal stents in patients with acute myocardial infarction. Eur Heart J. 2007;28:2706–13. doi: 10.1093/eurheartj/ehm402.
  • 8. Feinberg J, Nielsen EE, Greenhalgh J, et al. Drug-eluting stents versus bare-metal stents for acute coronary syndrome. Cochrane Database Syst Rev. 2017;8:CD012481. doi: 10.1002/14651858.CD012481.pub2.
  • 9. Wong BYL. Drug-eluting versus bare-metal stents in acute myocardial infarction. N Engl J Med. 2009;360:300. doi: 10.1056/NEJMc082174.
  • 10. Grimes DA, Schulz KF. Bias and causal associations in observational research. Lancet. 2002;359:248–252. doi: 10.1016/S0140-6736(02)07451-2.
  • 11. Sackett DL. Bias in analytic research. J Chronic Dis. 1979;32:51–63. doi: 10.1016/0021-9681(79)90012-2.
  • 12. Rosenberger WF, Uschner D, Wang Y. Randomization: the forgotten component of the randomized clinical trial. Stat Med. 2019;38:1–12. doi: 10.1002/sim.7901.
  • 13. Baggerly K. Experimental design, randomization, and validation. Clin Chem. 2018;64:1534–1535. doi: 10.1373/clinchem.2017.273334.
  • 14. Wallentin L, Becker RC, Budaj A, et al. Ticagrelor versus clopidogrel in patients with acute coronary syndromes. N Engl J Med. 2009;361:1045–1057. doi: 10.1056/NEJMoa0904327.
  • 15. Ruwald ACH, Westergaard B, Sehestedt T, et al. Losartan versus atenolol-based antihypertensive treatment reduces cardiovascular events especially well in elderly patients: the Losartan Intervention For Endpoint reduction in hypertension (LIFE) study. J Hypertens. 2012;30:1252–1259. doi: 10.1097/HJH.0b013e328352f7f6.
  • 16. Pitt B, Pfeffer MA, Assmann SF, et al. Spironolactone for heart failure with preserved ejection fraction. N Engl J Med. 2014;370:1383–1392. doi: 10.1056/NEJMoa1313731.
  • 17. Raz I, Wilson PWF, Strojek K, et al. Effects of prandial versus fasting glycemia on cardiovascular outcomes in type 2 diabetes: the HEART2D trial. Diabetes Care. 2009;32:381–386. doi: 10.2337/dc08-1671.
  • 18. Boden WE, O'Rourke RA, Teo KK, et al. Optimal medical therapy with or without PCI for stable coronary disease. N Engl J Med. 2007;356:1503–1516. doi: 10.1056/NEJMoa070829.
  • 19. Fröbert O, Lagerqvist B, Olivecrona GK, et al. Thrombus aspiration during ST-segment elevation myocardial infarction. N Engl J Med. 2013;369:1587–1597. doi: 10.1056/NEJMoa1308789.
  • 20. Bønaa KH, Mannsverk J, Wiseth R, et al. Drug-eluting or bare-metal stents for coronary artery disease. N Engl J Med. 2016;375:1242–1252. doi: 10.1056/NEJMoa1607991.
Randomisation: What, Why and How?


Zoë Hoare, Randomisation: What, Why and How?, Significance , Volume 7, Issue 3, September 2010, Pages 136–138, https://doi.org/10.1111/j.1740-9713.2010.00443.x


Randomisation is a fundamental aspect of randomised controlled trials, but how many researchers fully understand what randomisation entails or what needs to be taken into consideration to implement it effectively and correctly? Here, for students or for those about to embark on setting up a trial, Zoë Hoare gives a basic introduction to help approach randomisation from a more informed direction.

What is randomisation?

Most trials of new medical treatments, and most other trials for that matter, now implement some form of randomisation. The idea sounds so simple that defining it becomes almost a joke: randomisation is “putting participants into the treatment groups randomly”. If only it were that simple. Randomisation can be a minefield, and not everyone understands what exactly it is or why they are doing it.

A key feature of a randomised controlled trial is that it is genuinely not known whether the new treatment is better than what is currently offered. The researchers should be in a state of equipoise; although they may hope that the new treatment is better, there is no definitive evidence to back this hypothesis up. This evidence is what the trial is trying to provide.

You will have, at its simplest, two groups: patients who are getting the new treatment, and those getting the control or placebo. You do not hand-select which patient goes into which group, because that would introduce selection bias. Instead you allocate your patients randomly. In its simplest form this can be done by the tossing of a fair coin: heads, the patient gets the trial treatment; tails, he gets the control. Simple randomisation is a fair way of ensuring that any differences that occur between the treatment groups arise completely by chance. But – and this is the first but of many here – simple randomisation can lead to unbalanced groups, that is, groups of unequal size. This is particularly true if the trial is only small. For example, tossing a fair coin 10 times will only result in five heads and five tails about 25% of the time. We would have a 66% chance of getting 6 heads and 4 tails, 5 and 5, or 4 and 6; 33% of the time we would get an even larger imbalance, with 7, 8, 9 or even all 10 patients in one group and the other group correspondingly undersized.

The impact of an imbalance like this is far greater for a small trial than for a larger trial. Tossing a fair coin 100 times will produce an imbalance more extreme than 60–40 only about 3.5% of the time. One important part of the trial design process is to state the intention to randomise; we then need to establish which method to use, when it will be used, and whether or not it is in fact random.
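The near-balance probabilities for 10 tosses quoted above (roughly 25% for an exact 5–5 split, roughly 66% for a split between 4–6 and 6–4) can be checked exactly from the binomial distribution:

```python
from math import comb

def prob_heads_between(n, lo, hi):
    """Exact probability that n fair coin tosses produce between
    lo and hi heads inclusive."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n

p_exact_balance = prob_heads_between(10, 5, 5)       # exactly 5-5
p_near_balance = prob_heads_between(10, 4, 6)        # 4-6 heads
p_large_trial = 1 - prob_heads_between(100, 40, 60)  # worse than 60-40
```

The same function shows directly how the risk of a serious imbalance shrinks as the trial grows.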


Why do we randomise?

It is partly true to say that we do it because we have to. The Consolidated Standards of Reporting Trials (CONSORT) 1 , to which we should all adhere, tells us: “Ideally, participants should be assigned to comparison groups in the trial on the basis of a chance (random) process characterized by unpredictability.” The requirement is there for a reason. Randomisation of the participants is crucial because it allows the principles of statistical theory to stand and as such allows a thorough analysis of the trial data without bias. The exact method of randomisation can have an impact on the trial analyses, and this needs to be taken into account when writing the statistical analysis plan.

Ideally, simple randomisation would always be the preferred option. However, in practice there often needs to be some control of the allocations to avoid severe imbalances within treatments or within categories of patient. You would not want, for example, all the males under 30 to be in one group and all the females over 70 in the other. This is where restricted or stratified randomisation comes in.

Restricted randomisation relates to using any method to control the split of allocations to each of the treatment groups based on certain criteria. This can be as simple as generating a random list, such as AAABBBABABAABB …, and allocating each participant as they arrive to the next treatment on the list. At certain points within the allocations we know that the groups will be balanced in numbers – here at the sixth, eighth, tenth and 14th participants – and we can control the maximum imbalance at any one time.

Stratified randomisation sets out to control the balance in certain baseline characteristics of the participants – such as sex or age. This can be thought of as producing an individual randomisation list for each of the characteristics concerned.


Stratification variables are the baseline characteristics that you think might influence the outcome your trial is trying to measure. For example, if you thought gender was going to have an effect on the efficacy of the treatment then you would use it as one of your stratification variables. A stratified randomisation procedure would aim to ensure a balance of the two gender groups between the two treatment groups.

If you also thought age would be affecting the treatment then you could also stratify by age (young/old) with some sensible limits on what old and young are. Once you start stratifying by age and by gender, you have to start taking care. You will need to use a stratified randomisation process that balances at the stratum level (i.e. at the level of those characteristics) to ensure that all four strata (male/young, male/old, female/young and female/old) have equivalent numbers of each of the treatment groups represented.
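One possible implementation of stratum-level balancing, sketched here with permuted blocks of 4 maintained independently within each stratum (the block size and stratum labels are illustrative, not prescriptive):

```python
import random
from collections import defaultdict

def make_stratified_allocator(block_size=4, seed=None):
    """Return an allocate(stratum) function that keeps an independent
    permuted-block list for each stratum (e.g. ('male', 'young')), so
    every stratum stays balanced between the two treatment arms."""
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused allocations, per stratum

    def allocate(stratum):
        if not pending[stratum]:  # this stratum's block is used up
            block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return allocate

allocate = make_stratified_allocator(seed=7)
# eight young male participants arrive: two complete blocks of four
arms = [allocate(("male", "young")) for _ in range(8)]
```

Because each stratum draws from its own block list, balance within any one stratum does not depend on who turns up in the others.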

“Great”, you might think. “I'll just stratify by all my baseline characteristics!” Better not. Stop and consider what this would mean. As the number of stratification variables increases linearly, the number of strata increases exponentially. This reduces the number of participants that would appear in each stratum. In our example above, with our two stratification variables of age and sex we had four strata; if we added, say “blue-eyed” and “overweight” to our criteria to give four stratification variables each with just two levels we would get 16 represented strata. How likely is it that each of those strata will be represented in the population targeted by the trial? In other words, will we be sure of finding a blue-eyed young male who is also overweight among our patients? And would one such overweight possible Adonis be statistically enough? It becomes evident that implementing pre-generated lists within each stratification level or stratum and maintaining an overall balance of group sizes becomes much more complicated with many stratification variables and the uncertainty of what type of participant will walk through the door next.
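The exponential growth described above is easy to quantify: the number of strata is the product of each variable's number of levels, so with two-level variables it doubles with every variable added.

```python
from math import prod

def n_strata(levels_per_variable):
    """Number of distinct strata implied by a list of stratification
    variables, each entry giving that variable's number of levels."""
    return prod(levels_per_variable)

four_strata = n_strata([2, 2])           # age and sex
sixteen_strata = n_strata([2, 2, 2, 2])  # add eye colour and weight
```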

Choosing a randomisation method

Does it matter? There are a wide variety of methods for randomisation, and which one you choose does actually matter. It needs to be able to do everything that is required of it. Ask yourself these questions, and others:

Can the method accommodate enough treatment groups? Some methods are limited to two treatment groups; many trials involve three or more.

What type of randomness, if any, is injected into the method? The level of randomness dictates how predictable a method is.

A deterministic method has no randomness, meaning that with all the previous information you can tell in advance which group the next patient to appear will be allocated to. Allocating alternate participants to the two treatments using ABABABABAB … would be an example.

A static random element means that each allocation is made with a pre-defined probability. The coin-toss method does this.

With a dynamic element the probability of allocation is always changing in relation to the information received, meaning that the probability of allocation can only be worked out with knowledge of the algorithm together with all its settings. A biased coin toss does this where the bias is recalculated for each participant.
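A biased coin of this kind might be sketched as follows. The 2/3 bias toward the under-represented arm follows Efron's classic biased-coin proposal; this is an illustrative sketch, not a validated allocation system.

```python
import random

def biased_coin_allocations(n, p_bias=2 / 3, seed=None):
    """Dynamic allocation via a biased coin: toss a fair coin while the
    arms are level; once one arm is ahead, allocate to the lagging arm
    with probability p_bias. With p_bias = 1 the rule becomes
    deterministic and the imbalance can never exceed 1."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    allocations = []
    for _ in range(n):
        if counts["A"] == counts["B"]:
            arm = rng.choice("AB")
        else:
            lagging = "A" if counts["A"] < counts["B"] else "B"
            other = "B" if lagging == "A" else "A"
            arm = lagging if rng.random() < p_bias else other
        counts[arm] += 1
        allocations.append(arm)
    return allocations

allocations = biased_coin_allocations(100, seed=3)
```

The allocation probability is recalculated at every step from the running counts, which is exactly what makes the method dynamic rather than static.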

Can the method accommodate stratification variables, and if so how many? Not all of them can. And can it cope with continuous stratification variables? Most variables are divided into mutually exclusive categories (e.g. male or female), but sometimes it may be necessary (or preferable) to use a continuous scale of the variable – such as weight, or body mass index.

Can the method use an unequal allocation ratio? Not all trials require equal-sized treatment groups. There are many reasons why it might be wise to have more patients receiving treatment A than treatment B 2 . However, an allocation ratio being something other than 1:1 does impact on the study design and on the calculation of the sample size, so is not something to be changing mid-trial. Not all allocation methods can cope with this inequality.

Is thresholding used in the method? Thresholding handles imbalances in allocation. A threshold is set and if the imbalance becomes greater than the threshold then the allocation becomes deterministic to reduce the imbalance back below the threshold.
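A thresholding rule might be sketched as follows (the threshold of 3 is an arbitrary illustrative choice):

```python
import random

def threshold_allocations(n, threshold=3, seed=None):
    """Simple randomisation with a hard cap on imbalance: toss a fair
    coin unless one arm is ahead by `threshold`, in which case the next
    allocation deterministically goes to the smaller arm."""
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0}
    allocations = []
    for _ in range(n):
        imbalance = counts["A"] - counts["B"]
        if imbalance >= threshold:
            arm = "B"  # A has hit the cap: force the next patient to B
        elif imbalance <= -threshold:
            arm = "A"  # B has hit the cap: force the next patient to A
        else:
            arm = rng.choice("AB")
        counts[arm] += 1
        allocations.append(arm)
    return allocations

allocations = threshold_allocations(200, threshold=3, seed=11)
```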

Can the method be implemented sequentially? In other words, does it require that the total number of participants be known at the beginning of the allocations? Some methods generate lists requiring exactly N participants to be recruited in order to be effective – and recruiting participants is often one of the more problematic parts of a trial.

Is the method complex? If so, then its practical implementation becomes an issue for the day-to-day running of the trial.

Is the method suitable to apply to a cluster randomisation? Cluster randomisations are used when randomising groups of individuals to a treatment rather than the individuals themselves. This can be due to the nature of the treatment, such as a new teaching method for schools or a dietary intervention for families. Using clusters is a big part of the trial design and the randomisation needs to be handled slightly differently.

Should a response-adaptive method be considered? If there is some evidence that one treatment is better than another, then a response-adaptive method works by taking into account the outcomes of previous allocations and works to minimise the number of participants on the “wrong” treatment.
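One classic response-adaptive scheme is the randomised play-the-winner urn; the sketch below uses invented success probabilities purely for illustration.

```python
import random

def play_the_winner(n, success_prob, seed=None):
    """Randomised play-the-winner urn: start with one ball per arm,
    draw the next allocation from the urn, then add a ball for the
    allocated arm on a success (or for the other arm on a failure).
    Allocations drift toward whichever arm is performing better.
    `success_prob` maps each arm to an assumed success probability."""
    rng = random.Random(seed)
    urn = ["A", "B"]
    allocations = []
    for _ in range(n):
        arm = rng.choice(urn)
        allocations.append(arm)
        success = rng.random() < success_prob[arm]
        urn.append(arm if success else ("B" if arm == "A" else "A"))
    return allocations

allocations = play_the_winner(200, {"A": 0.8, "B": 0.2}, seed=5)
```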

For multi-centred trials, how to handle the randomisations across the centres should be considered at this point. Do all centres need to be completely balanced? Are all centres the same size? Considering the various centres as stratification variables is one way of dealing with more than one centre.

Implementing the chosen randomisation method

Once the method of randomisation has been established the next important step is to consider how to implement it. The recommended way is to enlist the services of a central randomisation office that can offer robust, validated techniques with the security and back-up needed to implement many of the methods proposed today. How the method is implemented must be as clearly reported as the method chosen. As part of the implementation it is important to keep the allocations concealed, both those already done and any future ones, from as many people as possible. This helps prevent selection bias: a clinician may withhold a participant if he believes that based on previous allocations the next allocations would not be the “preferred” ones – see the section below on subversion.

Part of the trial design will be to note exactly who should know what about how each participant has been allocated. Researchers and participants may be equally blinded, but that is not always the case.

For example, in a blinded trial there may be researchers who do not know which group the participants have been allocated to. This enables them to conduct the assessments without any bias for the allocation. They may, however, start to guess, on the basis of the results they see. A measure of blinding may be incorporated for the researchers to indicate whether they have remained blind to the treatment allocated. This can be in the form of a simple scale tool for the researcher to indicate how confident they are in knowing which allocated group the participant is in by the end of an assessment. With psychosocial interventions it is often impossible to hide from the participants, let alone the clinicians, which treatment group they have been allocated to.

In a drug trial where a placebo can be prescribed a coded system can ensure that neither patients nor researchers know which group is which until after the analysis stage.

With any level of blinding there may be a need to unblind participants or clinicians at some point in the trial, and there should be a documented procedure for unblinding a particular participant without risking the unblinding of the whole trial. For drug trials in particular, the methods for unblinding a participant must be stated in the trial protocol. Wherever possible the data analysts and statisticians should remain blind to the allocation until after the main analysis has taken place.
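The key property of such a procedure, revealing a single participant's allocation while leaving the rest of the trial blinded and recording who asked and when, can be sketched like this (function and field names are invented for the example):

```python
from datetime import datetime, timezone

def emergency_unblind(key, participant_code, requester, log):
    """Reveal one participant's allocation only, recording who requested
    it and when, so that the rest of the trial remains blinded."""
    arm = key[participant_code]
    log.append({
        "code": participant_code,
        "requested_by": requester,
        "time": datetime.now(timezone.utc).isoformat(),
    })
    return arm

log = []
key = {"KIT-0001": "active", "KIT-0002": "placebo"}
print(emergency_unblind(key, "KIT-0002", "on-call clinician", log))  # -> placebo
```

In practice the audit trail matters as much as the reveal: it lets the trial team show afterwards exactly which allocations were exposed, to whom, and why.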

Blinding should not be confused with allocation concealment. Blinding prevents performance and ascertainment bias within a trial, while allocation concealment prevents selection bias. Bias introduced by poor allocation concealment may be thought of as a predictive bias, trying to influence the results from the outset, while the biases introduced by non-blinding can be thought of as a reactive bias, creating causal links in outcomes because those involved possess information about the treatment group.

In the literature on randomisation there are numerous tales of how allocation schemes have been subverted by clinicians trying to do the best for the trial, for their patient, or both. These include anecdotal tales of clinicians holding sealed envelopes containing the allocations up to X-ray lights, and confessions of breaking into locked filing cabinets to get at the codes.3 This type of behaviour has many explanations and reasons, but it does raise the question of whether these clinicians were in a state of equipoise with regard to the trial, and whether they should therefore have been involved with it at all. Randomisation schemes and their implications must be signed up to by the whole team; they are not something that only the participants need to consent to.


The 2010 CONSORT statement can be found at http://www.consort-statement.org/consort-statement/.

Dumville, J. C., Hahn, S., Miles, J. N. V. and Torgerson, D. J. (2006) The use of unequal randomisation ratios in clinical trials: a review. Contemporary Clinical Trials, 27, 1–12.

Schulz, K. F. (1995) Subverting randomization in controlled trials. Journal of the American Medical Association, 274, 1456–1458.


Online ISSN 1740-9713. Print ISSN 1740-9705. Copyright © 2024 Royal Statistical Society.