
Understanding Data Presentations (Guide + Examples)

Cover for guide on data presentation by SlideModel

In this age of overwhelming information, the skill to effectively convey data has become extremely valuable. Choosing among data presentation types requires thoughtful consideration of the nature of your data and the message you aim to convey. Different types of visualizations serve distinct purposes. Whether you’re developing a report or simply trying to communicate complex information, how you present data influences how well your audience understands and engages with it. This extensive guide walks you through the different ways of presenting data.

What is a Data Presentation?

A data presentation is a slide deck that aims to disclose quantitative information to an audience through the use of visual formats and narrative techniques derived from data analysis, making complex data understandable and actionable. This process requires a series of tools, such as charts, graphs, tables, infographics, dashboards, and so on, supported by concise textual explanations to improve understanding and boost retention rate.

Data presentations require us to distill data into a format that allows the presenter to highlight trends, patterns, and insights so that the audience can act upon the shared information. In short, the goal of data presentations is to enable viewers to grasp complicated concepts or trends quickly, facilitating informed decision-making or deeper analysis.

Data presentations go beyond the mere usage of graphical elements. Seasoned presenters pair visuals with the art of data storytelling, so the speech skillfully connects the points through a narrative that resonates with the audience. The purpose – to inspire, persuade, inform, support decision-making processes, etc. – determines which data presentation format is best suited to the task.

What Should a Data Presentation Include?

To nail your upcoming data presentation, make sure it includes the following elements:

  • Clear Objectives: Understand the intent of your presentation before selecting the graphical layout and metaphors to make content easier to grasp.
  • Engaging introduction: Use a powerful hook from the get-go. For instance, you can ask a big question or present a problem that your data will answer. Take a look at our guide on how to start a presentation for tips & insights.
  • Structured Narrative: Your data presentation must tell a coherent story. This means a beginning where you present the context, a middle section in which you present the data, and an ending that uses a call-to-action. Check our guide on presentation structure for further information.
  • Visual Elements: These are the charts, graphs, and other elements of visual communication we ought to use to present data. This article will cover one by one the different types of data representation methods we can use, and provide further guidance on choosing between them.
  • Insights and Analysis: This is not just showcasing a graph and letting people get an idea about it. A proper data presentation includes the interpretation of that data, the reason why it’s included, and why it matters to your research.
  • Conclusion & CTA: Ending your presentation with a call to action is necessary. Whether you intend to wow your audience into acquiring your services, inspire them to change the world, or whatever the purpose of your presentation, there must be a stage in which you convey all that you shared and show the path to staying in touch. Plan ahead whether you want to use a thank-you slide, a video presentation, or which method is apt and tailored to the kind of presentation you deliver.
  • Q&A Session: After your speech is concluded, allocate 3-5 minutes for the audience to raise any questions about the information you disclosed. This is an extra chance to establish your authority on the topic. Check our guide on questions and answer sessions in presentations here.

Bar Charts

Bar charts are a graphical representation of data using rectangular bars to show quantities or frequencies in an established category. They make it easy for readers to spot patterns or trends. Bar charts can be horizontal or vertical, although the vertical format is commonly known as a column chart. They display categorical, discrete, or continuous variables grouped in class intervals [1]. They include an axis and a set of labeled bars horizontally or vertically. These bars represent the frequencies of variable values or the values themselves. Numbers on the y-axis of a vertical bar chart or the x-axis of a horizontal bar chart are called the scale.

Presentation of the data through bar charts

Real-Life Application of Bar Charts

Let’s say a sales manager is presenting sales results to their audience. Using a bar chart, they follow these steps.

Step 1: Selecting Data

The first step is to identify the specific data you will present to your audience.

The sales manager has highlighted these products for the presentation.

  • Product A: Men’s Shoes
  • Product B: Women’s Apparel
  • Product C: Electronics
  • Product D: Home Decor

Step 2: Choosing Orientation

Opt for a vertical layout for simplicity. Vertical bar charts make it easy to compare categories when there are not too many of them [1]. They can also help show trends. Here, a vertical bar chart is used where each bar represents one of the four chosen products. After plotting the data, the height of each bar directly represents the sales performance of the respective product.

The tallest bar (Electronics – Product C) shows the highest sales, while the shorter bars (Women’s Apparel – Product B and Home Decor – Product D) need attention: they indicate areas that require further analysis or strategies for improvement.

Step 3: Colorful Insights

Different colors are used to differentiate each product. It is essential to show a color-coded chart where the audience can distinguish between products.

  • Men’s Shoes (Product A): Yellow
  • Women’s Apparel (Product B): Orange
  • Electronics (Product C): Violet
  • Home Decor (Product D): Blue

Accurate bar chart representation of data with a color coded legend
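
If you want to reproduce this chart outside of a slide tool, the snippet below is a minimal sketch in Python with matplotlib (the library choice and the sales figures are assumptions for illustration; only the product names and color coding come from the example above).

    # Minimal sketch: vertical bar chart of sales by product (matplotlib assumed installed).
    # The sales figures below are illustrative placeholders, not values from the example.
    import matplotlib.pyplot as plt

    products = ["Men's Shoes\n(A)", "Women's Apparel\n(B)", "Electronics\n(C)", "Home Decor\n(D)"]
    sales = [52000, 31000, 78000, 24000]           # hypothetical sales per product
    colors = ["gold", "orange", "violet", "blue"]  # color coding from the example

    fig, ax = plt.subplots()
    ax.bar(products, sales, color=colors)
    ax.set_ylabel("Sales (USD)")
    ax.set_title("Sales Performance by Product")
    plt.tight_layout()
    plt.show()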

Bar charts are straightforward and easily understandable for presenting data. They are versatile when comparing products or any categorical data [2] . Bar charts adapt seamlessly to retail scenarios. Despite that, bar charts have a few shortcomings. They cannot illustrate data trends over time. Besides, overloading the chart with numerous products can lead to visual clutter, diminishing its effectiveness.

For more information, check our collection of bar chart templates for PowerPoint.

Line Graphs

Line graphs help illustrate data trends, progressions, or fluctuations by connecting a series of data points called ‘markers’ with straight line segments. This provides a straightforward representation of how values change [5]. Their versatility makes them invaluable for scenarios requiring a visual understanding of continuous data, and plotting several lines on the same axes lets us compare multiple datasets over the same timeline. They simplify complex information so the audience can quickly grasp the ups and downs of values. From tracking stock prices to analyzing experimental results, you can use line graphs to show how data changes over a continuous timeline. They show trends with simplicity and clarity.

Real-life Application of Line Graphs

To understand line graphs thoroughly, we will use a real case. Imagine you’re a financial analyst presenting a tech company’s monthly sales for a licensed product over the past year. Investors want insights into sales behavior by month, how market trends may have influenced sales performance, and the reception of the new pricing strategy. To present the data via a line graph, you will complete these steps.

Step 1: Gathering Data

First, you need to gather the data. In this case, your data will be the sales numbers. For example:

  • January: $45,000
  • February: $55,000
  • March: $45,000
  • April: $60,000
  • May: $70,000
  • June: $65,000
  • July: $62,000
  • August: $68,000
  • September: $81,000
  • October: $76,000
  • November: $87,000
  • December: $91,000

Step 2: Choosing Orientation

After choosing the data, the next step is to select the orientation. As with bar charts, you can use vertical or horizontal line graphs. However, we want to keep this simple, so we will keep the timeline (x-axis) horizontal and the sales numbers (y-axis) vertical.

Step 3: Connecting Trends

After adding the data to your preferred software, you will plot a line graph. In the graph, each month’s sales are represented by data points connected by a line.

Line graph in data presentation
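
For readers who prefer to chart the data programmatically, here is a minimal sketch in Python with matplotlib (an assumed tool choice); it plots the twelve monthly sales figures from the example with a marker at each data point.

    # Minimal sketch: monthly sales plotted as a line graph (matplotlib assumed installed).
    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    sales = [45000, 55000, 45000, 60000, 70000, 65000,
             62000, 68000, 81000, 76000, 87000, 91000]  # figures from the example

    fig, ax = plt.subplots()
    ax.plot(months, sales, marker="o")   # markers show the individual data points
    ax.set_xlabel("Month")
    ax.set_ylabel("Sales (USD)")
    ax.set_title("Monthly Sales of Licensed Product")
    plt.tight_layout()
    plt.show()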

Step 4: Adding Clarity with Color

If there are multiple lines, you can also add colors to highlight each one, making it easier to follow.

Line graphs excel at visually presenting trends over time. These presentation aids identify patterns, like upward or downward trends. However, too many data points can clutter the graph, making it harder to interpret. Line graphs work best with continuous data but are not suitable for categories.

For more information, check our collection of line chart templates for PowerPoint and our article about how to make a presentation graph.

Data Dashboards

A data dashboard is a visual tool for analyzing information. Different graphs, charts, and tables are consolidated in a single layout to showcase the information required to achieve one or more objectives. Dashboards help users quickly see Key Performance Indicators (KPIs). You don’t make new visuals in the dashboard; instead, you use it to display visuals you’ve already made in worksheets [3].

Keeping the number of visuals on a dashboard to three or four is recommended; adding too many can make it hard to see the main points [4]. Dashboards can be used for business analytics to analyze sales, revenue, and marketing metrics at the same time. They are also used in the manufacturing industry, as they allow users to grasp the entire production scenario at a glance while tracking the core KPIs for each line.

Real-Life Application of a Dashboard

Consider a project manager presenting a software development project’s progress to a tech company’s leadership team. He follows these steps.

Step 1: Defining Key Metrics

To effectively communicate the project’s status, identify key metrics such as completion status, budget, and bug resolution rates. Then, choose measurable metrics aligned with project objectives.

Step 2: Choosing Visualization Widgets

After finalizing the data, presentation aids that align with each metric are selected. For this project, the project manager chooses a progress bar for the completion status and uses bar charts for budget allocation. Likewise, he implements line charts for bug resolution rates.

Data analysis presentation example

Step 3: Dashboard Layout

Key metrics are prominently placed in the dashboard for easy visibility, and the manager ensures that it appears clean and organized.
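
As a rough illustration of this layout, the sketch below arranges three widgets (a progress bar, a budget bar chart, and a bug-resolution line chart) in one figure using Python and matplotlib. The tool choice and every project figure in it are hypothetical placeholders, not values from the example.

    # Minimal sketch: a three-widget project dashboard laid out with matplotlib subplots.
    # All project figures below are hypothetical placeholders.
    import matplotlib.pyplot as plt

    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3.5))

    # Widget 1: completion status as a horizontal progress bar
    completion = 0.68
    ax1.barh(["Progress"], [completion], color="seagreen")
    ax1.barh(["Progress"], [1 - completion], left=[completion], color="lightgray")
    ax1.set_xlim(0, 1)
    ax1.set_title("Completion: 68%")

    # Widget 2: budget allocation as a bar chart
    ax2.bar(["Personnel", "Tools", "QA"], [120, 40, 25], color="steelblue")
    ax2.set_title("Budget Allocation ($k)")

    # Widget 3: bug resolution rate as a line chart
    weeks = [1, 2, 3, 4, 5]
    ax3.plot(weeks, [55, 60, 72, 80, 88], marker="o", color="indianred")
    ax3.set_xlabel("Week")
    ax3.set_title("Bugs Resolved (%)")

    fig.suptitle("Software Project Status Dashboard")
    plt.tight_layout()
    plt.show()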

Dashboards provide a comprehensive view of key project metrics. Users can interact with data, customize views, and drill down for detailed analysis. However, creating an effective dashboard requires careful planning to avoid clutter. Besides, dashboards rely on the availability and accuracy of underlying data sources.

For more information, check our article on how to design a dashboard presentation, and discover our collection of dashboard PowerPoint templates.

Treemap Chart

Treemap charts represent hierarchical data structured in a series of nested rectangles [6]. As each branch of the ‘tree’ is given a rectangle, smaller tiles can be seen representing sub-branches, meaning elements on a lower hierarchical level than the parent rectangle. Each one of those rectangular nodes is built by representing an area proportional to the specified data dimension.

Treemaps are useful for visualizing large datasets in a compact space. They make it easy to identify patterns, such as which categories are dominant. Common applications of the treemap chart are seen in the IT industry, such as resource allocation, disk space management, website analytics, etc. They can also be used in multiple industries, like healthcare data analysis, market share across different product categories, or even finance to visualize portfolios.

Real-Life Application of a Treemap Chart

Let’s consider a financial scenario where a financial team wants to represent the budget allocation of a company. There is a hierarchy in the process, so it is helpful to use a treemap chart. In the chart, the top-level rectangle could represent the total budget, and it would be subdivided into smaller rectangles, each denoting a specific department. Further subdivisions within these smaller rectangles might represent individual projects or cost categories.

Step 1: Define Your Data Hierarchy

While presenting data on the budget allocation, start by outlining the hierarchical structure. The sequence will be like the overall budget at the top, followed by departments, projects within each department, and finally, individual cost categories for each project.

  • Top-level rectangle: Total Budget
  • Second-level rectangles: Departments (Engineering, Marketing, Sales)
  • Third-level rectangles: Projects within each department
  • Fourth-level rectangles: Cost categories for each project (Personnel, Marketing Expenses, Equipment)

Step 2: Choose a Suitable Tool

It’s time to select a data visualization tool supporting Treemaps. Popular choices include Tableau, Microsoft Power BI, PowerPoint, or even coding with libraries like D3.js. It is vital to ensure that the chosen tool provides customization options for colors, labels, and hierarchical structures.

Here, the team uses PowerPoint for this guide because of its user-friendly interface and robust Treemap capabilities.

Step 3: Make a Treemap Chart with PowerPoint

After opening the PowerPoint presentation, they choose “SmartArt” to form the chart. The SmartArt Graphic window has a “Hierarchy” category on the left. Here, you will see multiple options; you can choose any layout that resembles a Treemap. The “Table Hierarchy” or “Organization Chart” options can be adapted. The team selects the Table Hierarchy as it looks closest to a Treemap.

Step 4: Input Your Data

After that, a new window will open with a basic structure. They add the data one by one by clicking on the text boxes. They start with the top-level rectangle, representing the total budget.  

Treemap used for presenting data

Step 5: Customize the Treemap

By clicking on each shape, they customize its color, size, and label. At the same time, they can adjust the font size, style, and color of labels by using the options in the “Format” tab in PowerPoint. Using different colors for each level enhances the visual difference.
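
If you would rather generate the treemap programmatically instead of adapting SmartArt, here is a minimal sketch using Python with the third-party squarify package alongside matplotlib (both assumed to be installed); the department names echo the hierarchy above, while the budget figures are hypothetical placeholders.

    # Minimal sketch: a department-level treemap using the third-party squarify package
    # (pip install squarify matplotlib). Budget figures are hypothetical placeholders.
    import matplotlib.pyplot as plt
    import squarify

    departments = ["Engineering", "Marketing", "Sales", "Operations"]
    budgets = [450, 250, 200, 100]   # hypothetical budget in $k; areas are proportional to these values

    squarify.plot(sizes=budgets,
                  label=[f"{d}\n${b}k" for d, b in zip(departments, budgets)],
                  color=["#4c72b0", "#dd8452", "#55a868", "#c44e52"],
                  pad=True)
    plt.axis("off")
    plt.title("Total Budget by Department")
    plt.show()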

Treemaps excel at illustrating hierarchical structures. These charts make it easy to understand relationships and dependencies. They efficiently use space, compactly displaying a large amount of data, reducing the need for excessive scrolling or navigation. Additionally, using colors enhances the understanding of data by representing different variables or categories.

In some cases, treemaps might become complex, especially with deep hierarchies, which makes the chart challenging for some users to interpret. At the same time, the space available within each rectangle constrains how much detail can be displayed, potentially limiting the amount of data that can be shown clearly. Without proper labeling and color coding, there’s a risk of misinterpretation.

Heatmaps

A heatmap is a data visualization tool that uses color coding to represent values across a two-dimensional surface. In these, colors replace numbers to indicate the magnitude of each cell. This color-shaded matrix display is valuable for summarizing and understanding data sets at a glance [7]. The intensity of the color corresponds to the value it represents, making it easy to identify patterns, trends, and variations in the data.

As a tool, heatmaps help businesses analyze website interactions, revealing user behavior patterns and preferences to enhance overall user experience. In addition, companies use heatmaps to assess content engagement, identifying popular sections and areas of improvement for more effective communication. They excel at highlighting patterns and trends in large datasets, making it easy to identify areas of interest.

We can implement heatmaps to express multiple data types, such as numerical values, percentages, or even categorical data. Heatmaps help us easily spot areas of high activity, making them useful for identifying clusters [8]. When making these maps, it is important to pick colors carefully: the palette needs to show the differences between groups or levels clearly, and it should remain distinguishable for people with color blindness.
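
As a minimal sketch of these ideas, the snippet below renders a heatmap with Python and matplotlib, using a perceptually uniform colormap that stays readable for most forms of color blindness. The day/hour layout and the click counts are hypothetical placeholders.

    # Minimal sketch: a heatmap of hypothetical website clicks by weekday and hour block,
    # rendered with matplotlib's imshow and a colorblind-friendly colormap.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    hours = ["0-4", "4-8", "8-12", "12-16", "16-20", "20-24"]
    clicks = rng.integers(20, 200, size=(len(days), len(hours)))  # placeholder data

    fig, ax = plt.subplots()
    im = ax.imshow(clicks, cmap="viridis")   # perceptually uniform, colorblind-safe
    ax.set_xticks(range(len(hours)))
    ax.set_xticklabels(hours)
    ax.set_yticks(range(len(days)))
    ax.set_yticklabels(days)
    fig.colorbar(im, ax=ax, label="Clicks")
    ax.set_title("Website Clicks by Day and Time")
    plt.tight_layout()
    plt.show()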

Check our detailed guide on how to create a heatmap here. Also, discover our collection of heatmap PowerPoint templates.

Pie Charts

Pie charts are circular statistical graphics divided into slices to illustrate numerical proportions. Each slice represents a proportionate part of the whole, making it easy to visualize the contribution of each component to the total.

When several pie charts are shown together, the size of each pie is determined by the total of its data points: the pie with the highest total appears largest, whereas the others are proportionally smaller. However, you can present all pies at the same size if proportional representation is not required [9]. Sometimes, pie charts are difficult to read, or additional information is required. A variation of this tool can be used instead, known as the donut chart, which has the same structure but a blank center, creating a ring shape. Presenters can add extra information, and the ring shape helps to declutter the graph.

Pie charts are used in business to show percentage distribution, compare relative sizes of categories, or present straightforward data sets where visualizing ratios is essential.

Real-Life Application of Pie Charts

Consider a scenario where you want to represent the distribution of the data. Each slice of the pie chart would represent a different category, and the size of each slice would indicate the percentage of the total portion allocated to that category.

Step 1: Define Your Data Structure

Imagine you are presenting the distribution of a project budget among different expense categories.

  • Column A: Expense Categories (Personnel, Equipment, Marketing, Miscellaneous)
  • Column B: Budget Amounts ($40,000, $30,000, $20,000, $10,000). Column B represents the values of the categories in Column A.

Step 2: Insert a Pie Chart

You can create a pie chart with any accessible tool; the most convenient options for a presentation are PowerPoint or Google Slides. You will notice that the pie chart assigns each expense category a percentage of the total budget by dividing its amount by the total.

For instance:

  • Personnel: $40,000 / ($40,000 + $30,000 + $20,000 + $10,000) = 40%
  • Equipment: $30,000 / ($40,000 + $30,000 + $20,000 + $10,000) = 30%
  • Marketing: $20,000 / ($40,000 + $30,000 + $20,000 + $10,000) = 20%
  • Miscellaneous: $10,000 / ($40,000 + $30,000 + $20,000 + $10,000) = 10%

You can make a chart out of this or just pull out the pie chart from the data.
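
If you prefer to generate the chart programmatically rather than in PowerPoint or Google Slides, here is a minimal sketch in Python with matplotlib (an assumed tool choice) using the budget figures from the example; the autopct option reproduces the 40/30/20/10% labels computed above.

    # Minimal sketch: the budget-distribution pie chart from the example, drawn with matplotlib.
    import matplotlib.pyplot as plt

    categories = ["Personnel", "Equipment", "Marketing", "Miscellaneous"]
    amounts = [40000, 30000, 20000, 10000]   # figures from the example

    fig, ax = plt.subplots()
    ax.pie(amounts, labels=categories, autopct="%1.0f%%", startangle=90)
    ax.set_title("Project Budget Distribution")
    plt.show()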

Pie chart template in data presentation

3D pie charts and 3D donut charts are quite popular with audiences. They stand out as visual elements in any presentation slide, so let’s take a look at how our pie chart example would look in a 3D pie chart format.

3D pie chart in data presentation

Step 3: Results Interpretation

The pie chart visually illustrates the distribution of the project budget among different expense categories. Personnel constitutes the largest portion at 40%, followed by equipment at 30%, marketing at 20%, and miscellaneous at 10%. This breakdown provides a clear overview of where the project funds are allocated, which helps in informed decision-making and resource management. It is evident that personnel are a significant investment, emphasizing their importance in the overall project budget.

Pie charts provide a straightforward way to represent proportions and percentages. They are easy to understand, even for individuals with limited data analysis experience. These charts work well for small datasets with a limited number of categories.

However, a pie chart can become cluttered and less effective in situations with many categories. Accurate interpretation may be challenging, especially when dealing with slight differences in slice sizes. In addition, these charts are static and do not effectively convey trends over time.

For more information, check our collection of pie chart templates for PowerPoint.

Histograms

Histograms present the distribution of numerical variables. Unlike a bar chart that records each unique response separately, histograms organize numeric responses into bins and show the frequency of responses within each bin [10]. The x-axis of a histogram shows the range of values for a numeric variable, while the y-axis indicates the relative frequencies (percentage of the total counts) for that range of values.

Whenever you want to understand the distribution of your data, check which values are more common, or identify outliers, histograms are your go-to. Think of them as a spotlight on the story your data is telling. A histogram can provide a quick and insightful overview if you’re curious about exam scores, sales figures, or any numerical data distribution.

Real-Life Application of a Histogram

In the histogram data analysis presentation example, imagine an instructor analyzing a class’s grades to identify the most common score range. A histogram could effectively display the distribution. It will show whether most students scored in the average range or if there are significant outliers.

Step 1: Gather Data

He begins by gathering the data: the exam score of each student in the class.

Name | Score
Alice | 78
Bob | 85
Clara | 92
David | 65
Emma | 72
Frank | 88
Grace | 76
Henry | 95
Isabel | 81
Jack | 70
Kate | 60
Liam | 89
Mia | 75
Noah | 84
Olivia | 92

After arranging the scores in ascending order, bin ranges are set.

Step 2: Define Bins

Bins are like categories that group similar values. Think of them as buckets that organize your data. The presenter decides how wide each bin should be based on the range of the values. For instance, the instructor sets the bin ranges based on score intervals: 60-69, 70-79, 80-89, and 90-100.

Step 3: Count Frequency

Now, he counts how many data points fall into each bin. This step is crucial because it tells you how often specific ranges of values occur. The result is the frequency distribution, showing the occurrences of each group.

Here, the instructor counts the number of students in each category.

  • 60-69: 2 students (David, Kate)
  • 70-79: 5 students (Alice, Emma, Grace, Jack, Mia)
  • 80-89: 5 students (Bob, Frank, Isabel, Liam, Noah)
  • 90-100: 3 students (Clara, Henry, Olivia)

Step 4: Create the Histogram

It’s time to turn the data into a visual representation. Draw a bar for each bin on a graph. The width of the bar should correspond to the range of the bin, and the height should correspond to the frequency.  To make your histogram understandable, label the X and Y axes.

In this case, the X-axis should represent the bins (e.g., test score ranges), and the Y-axis represents the frequency.
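
A minimal sketch of this step in Python with matplotlib (an assumed tool choice) is shown below; it uses the fifteen scores from the table and the bin edges defined in Step 2, so the bar heights match the frequency counts above.

    # Minimal sketch: the exam-score histogram with the bins defined above (matplotlib assumed installed).
    import matplotlib.pyplot as plt

    scores = [78, 85, 92, 65, 72, 88, 76, 95, 81, 70, 60, 89, 75, 84, 92]  # from the table
    bins = [60, 70, 80, 90, 100]   # bin edges for 60-69, 70-79, 80-89, 90-100

    fig, ax = plt.subplots()
    ax.hist(scores, bins=bins, edgecolor="black")
    ax.set_xlabel("Score range")
    ax.set_ylabel("Number of students")
    ax.set_title("Distribution of Exam Scores")
    plt.show()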

Histogram in Data Presentation

The histogram of the class grades reveals insightful patterns in the distribution. Most students fall within the 70-79 and 80-89 score ranges, with five students each. The histogram provides a clear visualization of the class’s performance. It showcases a concentration of grades in the middle to upper-middle range, with few outliers at both ends. This analysis helps in understanding the overall academic standing of the class. It also identifies areas for potential improvement or recognition.

Thus, histograms provide a clear visual representation of data distribution. They are easy to interpret, even for those without a statistical background. They apply to various types of data, including continuous and discrete variables. One weak point is that histograms hide individual values, so they do not capture detailed patterns in the data compared to other visualization methods.

Scatter Plots

A scatter plot is a graphical representation of the relationship between two variables. It consists of individual data points on a two-dimensional plane. This plane plots one variable on the x-axis and the other on the y-axis. Each point represents a unique observation. It visualizes patterns, trends, or correlations between the two variables.

Scatter plots are also effective in revealing the strength and direction of relationships. They identify outliers and assess the overall distribution of data points. The points’ dispersion and clustering reflect the relationship’s nature, whether it is positive, negative, or lacks a discernible pattern. In business, scatter plots assess relationships between variables such as marketing cost and sales revenue. They help present data correlations and support decision-making.

Real-Life Application of Scatter Plot

A group of scientists is conducting a study on the relationship between daily hours of screen time and sleep quality. After reviewing the data, they managed to create this table to help them build a scatter plot graph:

Participant ID | Daily Hours of Screen Time | Sleep Quality Rating
1 | 9 | 3
2 | 2 | 8
3 | 1 | 9
4 | 0 | 10
5 | 1 | 9
6 | 3 | 7
7 | 4 | 7
8 | 5 | 6
9 | 5 | 6
10 | 7 | 3
11 | 10 | 1
12 | 6 | 5
13 | 7 | 3
14 | 8 | 2
15 | 9 | 2
16 | 4 | 7
17 | 5 | 6
18 | 4 | 7
19 | 9 | 2
20 | 6 | 4
21 | 3 | 7
22 | 10 | 1
23 | 2 | 8
24 | 5 | 6
25 | 3 | 7
26 | 1 | 9
27 | 8 | 2
28 | 4 | 6
29 | 7 | 3
30 | 2 | 8
31 | 7 | 4
32 | 9 | 2
33 | 10 | 1
34 | 10 | 1
35 | 10 | 1

In the provided example, the x-axis represents Daily Hours of Screen Time, and the y-axis represents the Sleep Quality Rating.

Scatter plot in data presentation

The scientists observe a negative correlation between the amount of screen time and the quality of sleep. This is consistent with their hypothesis that blue light, especially before bedtime, has a significant impact on sleep quality and metabolic processes.
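
A minimal sketch of this analysis in Python with numpy and matplotlib is shown below (the tooling is an assumption); it plots the 35 data pairs from the table above and computes the Pearson correlation coefficient, which quantifies the negative relationship the scientists describe.

    # Minimal sketch: scatter plot of screen time vs. sleep quality, with the Pearson
    # correlation coefficient computed via numpy (data from the table above).
    import numpy as np
    import matplotlib.pyplot as plt

    screen_hours = [9, 2, 1, 0, 1, 3, 4, 5, 5, 7, 10, 6, 7, 8, 9, 4, 5, 4,
                    9, 6, 3, 10, 2, 5, 3, 1, 8, 4, 7, 2, 7, 9, 10, 10, 10]
    sleep_quality = [3, 8, 9, 10, 9, 7, 7, 6, 6, 3, 1, 5, 3, 2, 2, 7, 6, 7,
                     2, 4, 7, 1, 8, 6, 7, 9, 2, 6, 3, 8, 4, 2, 1, 1, 1]

    r = np.corrcoef(screen_hours, sleep_quality)[0, 1]   # Pearson r, strongly negative here

    fig, ax = plt.subplots()
    ax.scatter(screen_hours, sleep_quality)
    ax.set_xlabel("Daily Hours of Screen Time")
    ax.set_ylabel("Sleep Quality Rating (1-10)")
    ax.set_title(f"Screen Time vs. Sleep Quality (r = {r:.2f})")
    plt.show()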

There are a few things to remember when using a scatter plot. Even when a scatter diagram indicates a relationship, it doesn’t mean one variable affects the other; a third factor can influence both variables. The more the plot resembles a straight line, the stronger the relationship is perceived [11]. If the plot suggests no relationship, any apparent pattern might be due to random fluctuations in the data, and it is worth considering whether the data should be stratified.

How to Choose a Data Presentation Type

Choosing the appropriate data presentation type is crucial when making a presentation. Understanding the nature of your data and the message you intend to convey will guide this selection process. For instance, when showcasing quantitative relationships, scatter plots become instrumental in revealing correlations between variables. If the focus is on emphasizing parts of a whole, pie charts offer a concise display of proportions. Histograms, on the other hand, prove valuable for illustrating distributions and frequency patterns.

Bar charts provide a clear visual comparison of different categories. Likewise, line charts excel in showcasing trends over time, while tables are ideal for detailed data examination. Starting a presentation on data presentation types involves evaluating the specific information you want to communicate and selecting the format that aligns with your message. This ensures clarity and resonance with your audience from the beginning of your presentation.

Recommended Data Presentation Templates

1. Fact Sheet Dashboard for Data Presentation


Convey all the data you need to present in this one-pager format, an ideal solution tailored for users looking for presentation aids. Global maps, donut charts, column graphs, and text are neatly arranged in a clean layout, available in light and dark themes.


2. 3D Column Chart Infographic PPT Template


Represent column charts in a highly visual 3D format with this PPT template. A creative way to present data, this template is entirely editable, and we can craft either a one-page infographic or a series of slides explaining what we intend to disclose point by point.

3. Data Circles Infographic PowerPoint Template


An alternative to the pie chart and donut chart diagrams, this template features a series of curved shapes with bubble callouts as ways of presenting data. Expand the information for each arch in the text placeholder areas.

4. Colorful Metrics Dashboard for Data Presentation


This versatile dashboard template helps us in the presentation of the data by offering several graphs and methods to convert numbers into graphics. Implement it for e-commerce projects, financial projections, project development, and more.

5. Animated Data Presentation Tools for PowerPoint & Google Slides


A slide deck filled with most of the tools mentioned in this article: bar charts, column charts, treemap graphs, pie charts, histograms, and more. Animated effects make each slide look dynamic when sharing data with stakeholders.

6. Statistics Waffle Charts PPT Template for Data Presentations


This PPT template helps us present data beyond the typical pie chart representation. It is widely used for demographics, so it’s a great fit for marketing teams, data science professionals, HR personnel, and more.

7. Data Presentation Dashboard Template for Google Slides


A compendium of tools in dashboard format featuring line graphs, bar charts, column charts, and neatly arranged placeholder text areas. 

8. Weather Dashboard for Data Presentation


Share weather data for agricultural presentation topics, environmental studies, or any kind of presentation that requires a highly visual layout for weather forecasting on a single day. Two color themes are available.

9. Social Media Marketing Dashboard Data Presentation Template


Intended for marketing professionals, this dashboard template for data presentation is a tool for presenting data analytics from social media channels. Two slide layouts featuring line graphs and column charts.

10. Project Management Summary Dashboard Template


A tool crafted for project managers to deliver highly visual reports on a project’s completion, the profits it delivered for the company, and expenses/time required to execute it. 4 different color layouts are available.

11. Profit & Loss Dashboard for PowerPoint and Google Slides


A must-have for finance professionals. This typical profit & loss dashboard includes progress bars, donut charts, column charts, line graphs, and everything that’s required to deliver a comprehensive report about a company’s financial situation.

Common Mistakes in Data Presentation

Overwhelming visuals

One of the most common mistakes in data presentation is including too much data or using overly complex visualizations, which can confuse the audience and dilute the key message.

Inappropriate chart types

Choosing the wrong type of chart for the data at hand can lead to misinterpretation. For example, using a pie chart for data that doesn’t represent parts of a whole is misleading.

Lack of context

Failing to provide context or sufficient labeling can make it challenging for the audience to understand the significance of the presented data.

Inconsistency in design

Using inconsistent design elements and color schemes across different visualizations can create confusion and visual disarray.

Failure to provide details

Simply presenting raw data without offering clear insights or takeaways can leave the audience without a meaningful conclusion.

Lack of focus

Not having a clear focus on the key message or main takeaway can result in a presentation that lacks a central theme.

Visual accessibility issues

Overlooking the visual accessibility of charts and graphs can exclude certain audience members who may have difficulty interpreting visual information.

In order to avoid these mistakes in data presentation, presenters can benefit from using presentation templates. These templates provide a structured framework and ensure consistency, clarity, and an aesthetically pleasing design, enhancing the overall impact of data communication.

Understanding and choosing data presentation types are pivotal in effective communication. Each method serves a unique purpose, so selecting the appropriate one depends on the nature of the data and the message to be conveyed. The diverse array of presentation types offers versatility in visually representing information, from bar charts showing values to pie charts illustrating proportions. 

Using the proper method enhances clarity, engages the audience, and ensures that data sets are not just presented but comprehensively understood. By appreciating the strengths and limitations of different presentation types, communicators can tailor their approach to convey information accurately, developing a deeper connection between data and audience understanding.

[1] Government of Canada, Statistics Canada (2021). 5.2 Bar chart. https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch9/bargraph-diagrammeabarres/5214818-eng.htm

[2] Kosslyn, S.M. (1989). Understanding charts and graphs. Applied Cognitive Psychology, 3(3), pp. 185-225. https://apps.dtic.mil/sti/pdfs/ADA183409.pdf

[3] Creating a Dashboard. Tufts University. https://it.tufts.edu/book/export/html/1870

[4] Data Dashboards. Golden West College. https://www.goldenwestcollege.edu/research/data-and-more/data-dashboards/index.html

[5] Line Graphs. MIT. https://www.mit.edu/course/21/21.guide/grf-line.htm

[6] Jadeja, M. and Shah, K. (2015). Tree-Map: A Visualization Tool for Large Data. In GSB@SIGIR (pp. 9-13). https://ceur-ws.org/Vol-1393/gsb15proceedings.pdf#page=15

[7] Heat Maps and Quilt Plots. Columbia University Mailman School of Public Health. https://www.publichealth.columbia.edu/research/population-health-methods/heat-maps-and-quilt-plots

[8] EIU QGIS Workshop: Heatmaps. Eastern Illinois University. https://www.eiu.edu/qgisworkshop/heatmaps.php

[9] About Pie Charts. MIT. https://www.mit.edu/~mbarker/formula1/f1help/11-ch-c8.htm

[10] Histograms. The University of Texas at Austin. https://sites.utexas.edu/sos/guided/descriptive/numericaldd/descriptiven2/histogram/

[11] Scatter Diagram. ASQ. https://asq.org/quality-resources/scatter-diagram


17 Data Visualization Techniques All Professionals Should Know

Data Visualizations on a Page

  • 17 Sep 2019

There’s a growing demand for business analytics and data expertise in the workforce. But you don’t need to be a professional analyst to benefit from data-related skills.

Becoming skilled at common data visualization techniques can help you reap the rewards of data-driven decision-making, including increased confidence and potential cost savings. Learning how to effectively visualize data could be the first step toward using data analytics and data science to your advantage to add value to your organization.

Several data visualization techniques can help you become more effective in your role. Here are 17 essential data visualization techniques all professionals should know, as well as tips to help you effectively present your data.


What Is Data Visualization?

Data visualization is the process of creating graphical representations of information. This process helps the presenter communicate data in a way that’s easy for the viewer to interpret and draw conclusions.

There are many different techniques and tools you can leverage to visualize data, so you want to know which ones to use and when. Here are some of the most important data visualization techniques all professionals should know.

Data Visualization Techniques

The type of data visualization technique you leverage will vary based on the type of data you’re working with, in addition to the story you’re telling with your data.

Here are some important data visualization techniques to know:

  • Pie Chart
  • Bar Chart
  • Histogram
  • Gantt Chart
  • Heat Map
  • Box and Whisker Plot
  • Waterfall Chart
  • Area Chart
  • Scatter Plot
  • Pictogram Chart
  • Timeline
  • Highlight Table
  • Bullet Graph
  • Choropleth Map
  • Word Cloud
  • Network Diagram
  • Correlation Matrix

1. Pie Chart

Pie Chart Example

Pie charts are one of the most common and basic data visualization techniques, used across a wide range of applications. Pie charts are ideal for illustrating proportions, or part-to-whole comparisons.

Because pie charts are relatively simple and easy to read, they’re best suited for audiences who might be unfamiliar with the information or are only interested in the key takeaways. For viewers who require a more thorough explanation of the data, pie charts fall short in their ability to display complex information.

2. Bar Chart

Bar Chart Example

The classic bar chart, or bar graph, is another common and easy-to-use method of data visualization. In this type of visualization, one axis of the chart shows the categories being compared, and the other, a measured value. The length of the bar indicates how each group measures according to the value.

One drawback is that labeling and clarity can become problematic when there are too many categories included. Like pie charts, they can also be too simple for more complex data sets.

3. Histogram

Histogram Example

Unlike bar charts, histograms illustrate the distribution of data over a continuous interval or defined period. These visualizations are helpful in identifying where values are concentrated, as well as where there are gaps or unusual values.

Histograms are especially useful for showing the frequency of a particular occurrence. For instance, if you’d like to show how many clicks your website received each day over the last week, you can use a histogram. From this visualization, you can quickly determine which days your website saw the greatest and fewest number of clicks.

4. Gantt Chart

Gantt Chart Example

Gantt charts are particularly common in project management, as they’re useful in illustrating a project timeline or progression of tasks. In this type of chart, tasks to be performed are listed on the vertical axis and time intervals on the horizontal axis. Horizontal bars in the body of the chart represent the duration of each activity.

Utilizing Gantt charts to display timelines can be incredibly helpful, and enable team members to keep track of every aspect of a project. Even if you’re not a project management professional, familiarizing yourself with Gantt charts can help you stay organized.
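
As a rough illustration, the sketch below builds a simple Gantt chart from horizontal bars in Python with matplotlib (an assumed tool choice); the task names, start weeks, and durations are hypothetical placeholders.

    # Minimal sketch: a simple Gantt chart built with horizontal bars in matplotlib.
    # Task names, start weeks, and durations are hypothetical placeholders.
    import matplotlib.pyplot as plt

    tasks = ["Requirements", "Design", "Implementation", "Testing", "Release"]
    start = [0, 2, 4, 9, 12]       # start week of each task
    duration = [2, 3, 6, 4, 1]     # length of each task in weeks

    fig, ax = plt.subplots()
    ax.barh(tasks, duration, left=start, color="steelblue")
    ax.invert_yaxis()              # list tasks top-to-bottom in project order
    ax.set_xlabel("Week")
    ax.set_title("Project Timeline (Gantt Chart)")
    plt.tight_layout()
    plt.show()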

5. Heat Map

Heat Map Example

A heat map is a type of visualization used to show differences in data through variations in color. These charts use color to communicate values in a way that makes it easy for the viewer to quickly identify trends. Having a clear legend is necessary in order for a user to successfully read and interpret a heatmap.

There are many possible applications of heat maps. For example, if you want to analyze which time of day a retail store makes the most sales, you can use a heat map that shows the day of the week on the vertical axis and time of day on the horizontal axis. Then, by shading in the matrix with colors that correspond to the number of sales at each time of day, you can identify trends in the data that allow you to determine the exact times your store experiences the most sales.

6. Box and Whisker Plot

Box and Whisker Plot Example

A box and whisker plot, or box plot, provides a visual summary of data through its quartiles. First, a box is drawn from the first quartile to the third quartile of the data set. A line within the box represents the median. “Whiskers,” or lines, are then drawn extending from the box to the minimum (lower extreme) and maximum (upper extreme). Outliers are represented by individual points that are in-line with the whiskers.

This type of chart is helpful in quickly identifying whether or not the data is symmetrical or skewed, as well as providing a visual summary of the data set that can be easily interpreted.
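
The sketch below, a minimal example in Python with matplotlib and numpy (an assumed tool choice), draws box plots for two randomly generated samples, one roughly symmetric and one skewed, to show how the box, median line, and whiskers summarize each distribution.

    # Minimal sketch: box-and-whisker plots for two hypothetical samples using matplotlib.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    group_a = rng.normal(loc=50, scale=10, size=200)   # roughly symmetric placeholder data
    group_b = rng.exponential(scale=15, size=200)      # skewed placeholder data

    fig, ax = plt.subplots()
    ax.boxplot([group_a, group_b], labels=["Group A", "Group B"])
    ax.set_ylabel("Value")
    ax.set_title("Box and Whisker Plot: Symmetric vs. Skewed Data")
    plt.show()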

7. Waterfall Chart

Waterfall Chart Example

A waterfall chart is a visual representation that illustrates how a value changes as it’s influenced by different factors, such as time. The main goal of this chart is to show the viewer how a value has grown or declined over a defined period. For example, waterfall charts are popular for showing spending or earnings over time.

8. Area Chart

Area Chart Example

An area chart, or area graph, is a variation on a basic line graph in which the area underneath the line is shaded to represent the total value of each data point. When several data series must be compared on the same graph, stacked area charts are used.

This method of data visualization is useful for showing changes in one or more quantities over time, as well as showing how each quantity combines to make up the whole. Stacked area charts are effective in showing part-to-whole comparisons.
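
Here is a minimal sketch of a stacked area chart in Python using matplotlib's stackplot (an assumed tool choice); the product lines and quarterly revenue figures are hypothetical placeholders.

    # Minimal sketch: a stacked area chart with matplotlib's stackplot.
    # Quarterly revenue figures for each product line are hypothetical placeholders.
    import matplotlib.pyplot as plt

    x = [1, 2, 3, 4]                 # quarters
    product_a = [10, 12, 15, 18]
    product_b = [8, 9, 11, 12]
    product_c = [5, 7, 6, 9]

    fig, ax = plt.subplots()
    ax.stackplot(x, product_a, product_b, product_c,
                 labels=["Product A", "Product B", "Product C"])
    ax.set_xticks(x)
    ax.set_xticklabels(["Q1", "Q2", "Q3", "Q4"])
    ax.legend(loc="upper left")
    ax.set_ylabel("Revenue ($M)")
    ax.set_title("Quarterly Revenue by Product Line (Stacked Area)")
    plt.show()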

9. Scatter Plot

Scatter Plot Example

Another technique commonly used to display data is a scatter plot. A scatter plot displays data for two variables as represented by points plotted against the horizontal and vertical axes. This type of data visualization is useful in illustrating the relationships that exist between variables and can be used to identify trends or correlations in data.

Scatter plots are most effective for fairly large data sets, since it’s often easier to identify trends when there are more data points present. Additionally, the closer the data points are grouped together, the stronger the correlation or trend tends to be.

10. Pictogram Chart

Pictogram Example

Pictogram charts, or pictograph charts, are particularly useful for presenting simple data in a more visual and engaging way. These charts use icons to visualize data, with each icon representing a different value or category. For example, data about time might be represented by icons of clocks or watches. Each icon can correspond to either a single unit or a set number of units (for example, each icon represents 100 units).

In addition to making the data more engaging, pictogram charts are helpful in situations where language or cultural differences might be a barrier to the audience’s understanding of the data.

11. Timeline

Timeline Example

Timelines are the most effective way to visualize a sequence of events in chronological order. They’re typically linear, with key events outlined along the axis. Timelines are used to communicate time-related information and display historical data.

Timelines allow you to highlight the most important events that occurred, or need to occur in the future, and make it easy for the viewer to identify any patterns appearing within the selected time period. While timelines are often relatively simple linear visualizations, they can be made more visually appealing by adding images, colors, fonts, and decorative shapes.

12. Highlight Table

Highlight Table Example

A highlight table is a more engaging alternative to traditional tables. By highlighting cells in the table with color, you can make it easier for viewers to quickly spot trends and patterns in the data. These visualizations are useful for comparing categorical data.

Depending on the data visualization tool you’re using, you may be able to add conditional formatting rules to the table that automatically color cells that meet specified conditions. For instance, when using a highlight table to visualize a company’s sales data, you may color cells red if the sales data is below the goal, or green if sales were above the goal. Unlike a heat map, the colors in a highlight table are discrete and represent a single meaning or value.

13. Bullet Graph

Bullet Graph Example

A bullet graph is a variation of a bar graph that can act as an alternative to dashboard gauges to represent performance data. The main use for a bullet graph is to inform the viewer of how a business is performing in comparison to benchmarks that are in place for key business metrics.

In a bullet graph, the darker horizontal bar in the middle of the chart represents the actual value, while the vertical line represents a comparative value, or target. If the horizontal bar passes the vertical line, the target for that metric has been surpassed. Additionally, the segmented colored sections behind the horizontal bar represent range scores, such as “poor,” “fair,” or “good.”

14. Choropleth Maps

Choropleth Map Example

A choropleth map uses color, shading, and other patterns to visualize numerical values across geographic regions. These visualizations use a progression of color (or shading) on a spectrum to distinguish high values from low.

Choropleth maps allow viewers to see how a variable changes from one region to the next. A potential downside to this type of visualization is that the exact numerical values aren’t easily accessible because the colors represent a range of values. Some data visualization tools, however, allow you to add interactivity to your map so the exact values are accessible.

15. Word Cloud

Word Cloud Example

A word cloud, or tag cloud, is a visual representation of text data in which the size of the word is proportional to its frequency. The more often a specific word appears in a dataset, the larger it appears in the visualization. In addition to size, words often appear bolder or follow a specific color scheme depending on their frequency.

Word clouds are often used on websites and blogs to identify significant keywords and compare differences in textual data between two sources. They are also useful when analyzing qualitative datasets, such as the specific words consumers used to describe a product.

16. Network Diagram

Network Diagram Example

Network diagrams are a type of data visualization that represent relationships between qualitative data points. These visualizations are composed of nodes and links, also called edges. Nodes are singular data points that are connected to other nodes through edges, which show the relationship between multiple nodes.

There are many use cases for network diagrams, including depicting social networks, highlighting the relationships between employees at an organization, or visualizing product sales across geographic regions.

17. Correlation Matrix

Correlation Matrix Example

A correlation matrix is a table that shows correlation coefficients between variables. Each cell represents the relationship between two variables, and a color scale is used to communicate whether the variables are correlated and to what extent.

Correlation matrices are useful to summarize and find patterns in large data sets. In business, a correlation matrix might be used to analyze how different data points about a specific product might be related, such as price, advertising spend, launch date, etc.
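
As a minimal sketch, the snippet below generates three loosely related hypothetical product metrics, computes their Pearson correlation matrix with numpy, and displays it as a color-coded grid with matplotlib; all names and numbers are placeholders.

    # Minimal sketch: a correlation matrix for hypothetical product metrics,
    # computed with numpy and displayed as a color-coded grid in matplotlib.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    n = 100
    price = rng.normal(50, 5, n)
    ad_spend = rng.normal(20, 4, n)
    units_sold = 500 - 4 * price + 10 * ad_spend + rng.normal(0, 30, n)  # correlated placeholder

    data = np.vstack([price, ad_spend, units_sold])
    labels = ["Price", "Ad Spend", "Units Sold"]
    corr = np.corrcoef(data)        # 3x3 matrix of Pearson correlation coefficients

    fig, ax = plt.subplots()
    im = ax.imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    for i in range(len(labels)):
        for j in range(len(labels)):
            ax.text(j, i, f"{corr[i, j]:.2f}", ha="center", va="center")
    fig.colorbar(im, ax=ax, label="Correlation coefficient")
    ax.set_title("Correlation Matrix")
    plt.tight_layout()
    plt.show()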

Other Data Visualization Options

While the examples listed above are some of the most commonly used techniques, there are many other ways you can visualize data to become a more effective communicator. Some other data visualization options include:

  • Bubble clouds
  • Circle views
  • Dendrograms
  • Dot distribution maps
  • Open-high-low-close charts
  • Polar areas
  • Radial trees
  • Ring Charts
  • Sankey diagram
  • Span charts
  • Streamgraphs
  • Wedge stack graphs
  • Violin plots


Tips For Creating Effective Visualizations

Creating effective data visualizations requires more than just knowing how to choose the best technique for your needs. There are several considerations you should take into account to maximize your effectiveness when it comes to presenting data.


One of the most important steps is to evaluate your audience. For example, if you’re presenting financial data to a team that works in an unrelated department, you’ll want to choose a fairly simple illustration. On the other hand, if you’re presenting financial data to a team of finance experts, it’s likely you can safely include more complex information.

Another helpful tip is to avoid unnecessary distractions. Although visual elements like animation can be a great way to add interest, they can also distract from the key points the illustration is trying to convey and hinder the viewer’s ability to quickly understand the information.

Finally, be mindful of the colors you utilize, as well as your overall design. While it’s important that your graphs or charts are visually appealing, there are more practical reasons you might choose one color palette over another. For instance, using low contrast colors can make it difficult for your audience to discern differences between data points. Using colors that are too bold, however, can make the illustration overwhelming or distracting for the viewer.


Visuals to Interpret and Share Information

No matter your role or title within an organization, data visualization is a skill that’s important for all professionals. Being able to effectively present complex data through easy-to-understand visual representations is invaluable when it comes to communicating information with members both inside and outside your business.

There’s no shortage in how data visualization can be applied in the real world. Data is playing an increasingly important role in the marketplace today, and data literacy is the first step in understanding how analytics can be used in business.

Are you interested in improving your analytical skills? Learn more about Business Analytics, our eight-week online course that can help you use data to generate insights and tackle business decisions.

This post was updated on January 20, 2022. It was originally published on September 17, 2019.




Chapter Four: Quantitative Methods (Part 3 - Making Sense of Your Study)

After you have designed your study, collected your data, and analyzed it, you have to figure out what it means and communicate that to potentially interested audiences. This section of the chapter is about how to make sense of your study, in terms of data interpretation, data write-up, and data presentation.



Data Interpretation

Once you have run your statistics, you have to figure out what your findings mean or interpret your data. To do this, you need to tie back your findings to your research questions and/or hypotheses, think about how your findings relate to what you discovered beforehand about the already existing literature, and determine how your findings take the literature or current theory in the field further. Your interpretation of the data you collected will be found in the last section of your paper, what is commonly called the "discussion" section.

Remember Your RQs/Hs

Your research questions and hypotheses, once developed, should guide your study throughout the research process. As you are choosing your research design, choosing how to operationalize your variables, and choosing/conducting your statistical tests, you should always keep your RQs and Hs in mind.

What were you wanting to discover by your study? What were you wanting to test? Make sure you answer these questions clearly for the reader of your study in both the results and discussion section of the paper. (Specific guidelines for these sections will be covered later in this chapter, including the common practice of placing the data as you present it with each research question in the results section.)

Tie Findings to Your Literature Review

As you have seen in chapter 3 and the Appendix, and will see in chapter 7, the literature review is what you use to set up your quantitative study and to show why there is a need for your study. It should start out broad, with the context for your study, and lead into showing what still needs to be known and studied about your topic area, justifying your focus in the study. It will be brought in again in the last section of the paper you write, i.e., the discussion section.

Your paper is like an hourglass – starting out broad and narrowing down in the middle with your actual study and findings, and then moving to broad implications for the larger context of your study near the end.


Think about Relationship of Findings to Theory

One of the things you will write about in your discussion or last section of your paper is the implications of what you found. These implications are both practical and theoretical. Practical implications are how the research can provide practical applications to real-world people and issues. Theoretical implications are how the research takes the current academic literature further, specifically, in relationship to theory-building.

Did any of the research you reviewed for your literature review mention a theory your findings could expand upon? If so, you should think about how your findings relate to this theory. If not, then think about the theories you have already studied in your communication classes. Would any of them provide a possible explanation of what you found? Would your findings help expand that theory to a different context, the context you studied? Does a theory need to be developed in the area of your research? If so, then what aspects of that theory could your findings help explain?

Data Write-Up

All quantitative studies, when written, have four parts. The first part is the introduction and literature review, the second part is the methods section, the third section is the results or findings, and the fourth section is the discussion section. This portion of this chapter will explain what elements you will need to include in each of these sections.

Literature Review

The beginning of your paper sets the tone for your study. Its first few pages tell the reader what the context of your study is and what other people interested in your topic have already studied about it.

There are many ways to organize a literature review, as can be seen at the following resource: Literature Reviews — The Writing Center at UNC-Chapel Hill

After you have done a thorough literature search on your topic, then you have to organize your literature into topics of some kind. Your main goal is to show what has been done and what still needs to be done, to show the need for your study, so at the end of each section of your literature review, you should identify what still needs to be known about that particular area.

For quantitative research, you should do your literature review before coming up with your research questions/hypotheses. Your questions and hypotheses should flow from the literature. This is different from the other two research methods discussed in this book, which do not rely so heavily on a literature review to situate the study before conducting it.

Methods

In the methods section, you should tell your reader how you conducted your study, from start to finish, explaining why you made the choices you did along the way. A reader should be able to replicate your study from the descriptions you provide in this section of your write-up. Common headings in the methods section include a description of the participants, procedures, and analysis.

Participants

For the participants' subheading of the methods section, you should minimally report the demographics of your sample in terms of biological sex (frequencies/percentages), age (range of ages and mean), and ethnicity (frequencies/percentages). If you collected data on other demographics, such as socioeconomic status, religious affiliation, type of occupation, etc., then you can report data for that also in the participants' sub-section.

Procedures

For the procedures sub-section, you report everything you did to collect your data: how you recruited your participants, including what type of sampling you used (probability or non-probability) and informed consent procedures; how you operationalized your variables (including your survey questions, which often are explained in the methods section briefly while the whole survey can be found in an appendix of your paper); the validity and reliability of your survey instrument or methods you used; and what type of study design you had (experimental, quasi-experimental, or non-experimental). For each one of these design issues, in this sub-section of the methods part, you need to explain why you made the decisions you did in order to answer your research questions or test your hypotheses.

Analysis

In this section, you explain how you converted your data for analysis and how you analyzed it. You need to explain what statistics you chose to run for each of your research questions/hypotheses and why.

Results

In this section of your paper, you organize the results by your research questions/hypotheses. For each research question/hypothesis, you should present any descriptive statistic results first and then your inferential statistics results. You do not make any interpretation of what your results mean or why you think you got the results you did. You merely report your results.

Reporting Significant Results

For each of the inferential statistics, there is a typical template you can follow when reporting significant results: reporting the test statistic value, the degrees of freedom[3], and the probability level. Examples follow for each of the statistics we have talked about in this text.

T-test results

"T-tests results show there was a significant difference found between men and women on their levels of self-esteem,  t  (df) = t value,  p  < .05, with men's self-esteem being higher (or lower) (men's mean & standard deviation) than women's self-esteem (women's mean & standard deviation)."

ANOVA results

"ANOVA results indicate there was a significant difference found between [levels of independent variable] on [dependent variable],  F  (df) = F value,  p  < .05."

If doing a factorial ANOVA, you would report the above sentence for all of your independent variables (main effects), as well as for the interaction (interaction effect), with language something like: "ANOVA results indicate a significant main effect for [independent variable] on [dependent variable],  F  (df) = F value,  p  < .05. .... ANOVA results indicate a significant interaction effect between [independent variables] on [dependent variable],  F  (df) = F value,  p  < .05."

See example YouTube tutorial for writing up a two-way ANOVA at the following website.

Factorial Design (Part C): Writing Up Results
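
For a one-way ANOVA, the F value, degrees of freedom, and p value in the template above can be produced the same way. This is a minimal sketch with three hypothetical groups; a factorial ANOVA would need a different routine (for example, statsmodels' anova_lm), which is not shown here.

```python
# Minimal sketch: one-way ANOVA on three hypothetical groups.
from scipy import stats

group_a = [4, 5, 6, 5, 7]
group_b = [6, 7, 8, 7, 9]
group_c = [3, 4, 4, 5, 3]

f, p = stats.f_oneway(group_a, group_b, group_c)

k = 3                                    # number of groups
n = len(group_a) + len(group_b) + len(group_c)
df_between, df_within = k - 1, n - k     # degrees of freedom

print(f"F({df_between}, {df_within}) = {f:.2f}, p = {p:.3f}")
```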

Chi-square results

For goodness of fit results, your write-up would look something like: "Using a chi-square goodness of fit test, there was a significant difference found between observed and expected values of [variable], χ2 (df) = chi-square value, p < .05." For test of independence results, it would look like: "Using a chi-square test of independence, there was a significant interaction between [your two variables], χ2 (df) = chi-square value, p < .05."
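
Both chi-square variants mentioned above can also be run outside SPSS. The sketch below uses SciPy on hypothetical counts: a one-variable goodness of fit test and a two-variable test of independence.

```python
# Minimal sketch: chi-square goodness of fit and test of independence
# on hypothetical observed counts.
from scipy import stats

# Goodness of fit: observed counts compared against equal expected counts
observed = [30, 45, 25]
chi2, p = stats.chisquare(observed)
print(f"Goodness of fit: chi2({len(observed) - 1}) = {chi2:.2f}, p = {p:.3f}")

# Test of independence: a 2 x 2 table of counts for two variables
table = [[20, 30],
         [35, 15]]
chi2, p, df, expected = stats.chi2_contingency(table)
print(f"Independence: chi2({df}) = {chi2:.2f}, p = {p:.3f}")
```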

Correlation results

"Using Pearson's [or Spearman's] correlation coefficient, there was a significant relationship found between [two variables],  r  (df) = r value,  p  < .05." If there are a lot of significant correlation results, these results are often presented in a table form.

For more information on these types of tables, see the following website:  Correlation Tables .
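
If you want to check these values yourself, both Pearson's and Spearman's coefficients are one-line calls in SciPy. The two variables below are hypothetical, chosen only to show where the r value, degrees of freedom, and p value in the template come from.

```python
# Minimal sketch: Pearson's and Spearman's correlations on hypothetical data.
from scipy import stats

hours_online = [2, 4, 6, 8, 10, 12]
self_esteem = [30, 29, 27, 26, 24, 22]

r, p = stats.pearsonr(hours_online, self_esteem)
rho, p_s = stats.spearmanr(hours_online, self_esteem)

df = len(hours_online) - 2      # df for a correlation is N - 2
print(f"Pearson:  r({df}) = {r:.2f}, p = {p:.3f}")
print(f"Spearman: rho({df}) = {rho:.2f}, p = {p_s:.3f}")
```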

Regression results

Reporting regression results is more complicated, but generally, you want to inform the reader about how much variance is accounted for by the regression model, the significance level of the model, and the significance of the predictor variable. For example:

A regression analysis, predicting GPA scores from GRE scores, was statistically significant,  F (1,8) = 10.34,  p  < .05.

Coefficients

Model | B | Std. Error | Beta | t | Sig.
(Constant) | .411 | .907 | | .453 | .662
GRE | .005 | .002 | .751 | 3.216 | .012

The regression equation is: Ŷ = .411 + .005X. For every one-unit increase in GRE score, there is a corresponding increase in GPA of .005 (Walen-Frederick, n.d., p. 4).
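
As a companion to the SPSS output above, the sketch below shows how the same kinds of numbers (the F test, the coefficients, and a prediction from the equation Ŷ = a + bX) might be produced with statsmodels. The GRE and GPA values are invented for illustration and will not reproduce the exact coefficients reported above.

```python
# Minimal sketch: simple regression predicting GPA from GRE with statsmodels.
# The data are hypothetical and will not match the coefficients in the text.
import statsmodels.api as sm

gre = [300, 310, 305, 320, 315, 330, 325, 340, 335, 350]
gpa = [2.1, 2.4, 2.2, 2.9, 2.6, 3.1, 3.0, 3.4, 3.2, 3.6]

X = sm.add_constant(gre)           # adds the intercept (constant) term
model = sm.OLS(gpa, X).fit()

print(model.summary())             # F, df, p, R-squared, B, std. error, t, sig.

a, b = model.params                # intercept and slope
print(f"Predicted GPA for a GRE of 320: {a + b * 320:.2f}")
```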

For more write-up help on regression and other statistics, see the following website location:

Multiple Regression  (pp. 217-220)

Reporting Non-Significant Results

You can follow a similar template when reporting non-significant results for all of the above inferential statistics. It is the same as provided in the above examples, except the word "non-significant" replaces the word "significant," and the  p  values are adjusted to indicate  p > .05.

Discussion

Many times readers of articles do not read the whole article, especially if they are afraid of the statistical sections. When this happens, they often read the discussion section, which makes this a very important section in your writing. You should include the following elements in your discussion section: (a) a summary of your findings, (b) implications, (c) limitations, and (d) future research ideas.

Summary of Findings

You should summarize the answers to your research questions, or what you found when testing your hypotheses, in this sub-section of the discussion section. You should not report any statistical data here; just put your results into narrative form. What did you find out that you did not know before doing your study? Answer that question in this sub-section.

Implications

You need to indicate why your study was important, both theoretically and practically. For the theoretical implications, you should relate what you found to the already existing literature, as discussed earlier when the "hourglass" format was mentioned as a way of conceptualizing your whole paper. If your study added anything to the existing theory on a particular topic, you talk about this here as well.

For practical implications, you need to identify for the reader how this study can help people in their real-world experiences related to your topic. You do not want your study to just be important to academic researchers, but also to other professionals and persons interested in your topic.

Limitations

As you work through conducting your study, you are going to realize there are things you wish you had done differently. Rather than hide these things from the reader, it is better to state them forthrightly. Explain in this sub-section how your study is limited and what you wish you had done differently.

Future Research

The limitations sub-section is usually tied directly to the future research sub-section, since your limitations point to future research that could address them. There may also be other things that could be studied as a result of what you have found. What would other people say are the "gaps" your study left unstudied on your topic? These should be identified, with some suggestions on how they might be studied.

Other Aspects of the Paper

There are other parts of the academic paper you should include in your final write-up. We have provided useful resources for you to consider when including these aspects as part of your paper. For an example paper that uses the required APA format for a research paper write-up, see the following source:  Varying Definitions of Online Communication .

Abstract & Titles
  • Research Abstracts
  • General Format

Tables, References, & Other Materials
  • APA Tables and Figures 1
  • Reference List: Basic Rules

Data Presentation

You will probably be called upon to present your data in other venues besides in writing. Two of the most common venues are oral presentations such as in class or at conferences, and poster presentations, such as what you might find at conferences. You might also be called upon to not write an academic write-up of your study, but rather to provide an executive summary of the results of your study to the "powers that be," who do not have time to read more than 5 pages or so of a summary. There are good resources for doing all of these online, so we have provided these here.

Oral Presentations

  • Oral Presentations
  • Delivering Presentations

Poster Presentations

Executive Summary

  • Executive Summaries
  • Complete the Report
  • Good & Poor Examples of Executive Summaries: http://unilearning.uow.edu.au/report/4bi1.html

Congratulations! You have learned a great deal about how to go about using quantitative methods for your future research projects. You have learned how to design a quantitative study, conduct a quantitative study, and write about a quantitative study. You have some good resources you can take with you when you leave this class. Now, you just have to apply what you have learned to projects that will come your way in the future.

Remember, just because you may not like one method the best does not mean you should not use it. Your research questions/hypotheses should ALWAYS drive your choice of which method you use. And remember also that you can do quantitative methods!

[NOTE: References are not provided for the websites cited in the text, even though, if this were an actual research article, they would need to be cited.]

Baker, E., Baker, W., & Tedesco, J. C. (2007). Organizations respond to phishing: Exploring the public relations tackle box.  Communication Research Reports, 24  (4), 327-339.

Benoit, W. L., & Hansen, G. J. (2004). Presidential debate watching, issue knowledge, character evaluation, and vote choice.  Human Communication Research, 30  (1), 121-144.

Chatham, A. (1991).  Home vs. public schooling: What about relationships in adolescence? Doctoral dissertation, University of Oklahoma.

Cousineau, T. M., Rancourt, D., & Green, T. C. (2006). Web chatter before and after the Women's Health Initiative results: A content analysis of on-line menopause message boards. Journal of Health Communication, 11(2), 133-147.

Derlega, V., Winstead, B. A., Mathews, A., & Braitman, A. L. (2008). Why does someone reveal highly personal information?: Attributions for and against self-disclosure in close relationships. Communication Research Reports, 25, 115-130.

Fischer, J., & Corcoran, K. (2007).  Measures for clinical practice and research: A sourcebook (volumes 1 & 2) . New York: Oxford University Press.

Guay, S., Boisvert, J.-M., & Freeston, M. H. (2003). Validity of three measures of communication for predicting relationship adjustment and stability among a sample of young couples.  Psychological Assessment , 15(3), 392-398.

Holbert, R. L., Tschida, D. A., Dixon, M., Cherry, K., Steuber, K., & Airne, D. (2005). The  West Wing  and depictions of the American Presidency: Expanding the domains of framing in political communication.  Communication Quarterly, 53  (4), 505-522.

Jensen, J. D. (2008). Scientific uncertainty in news coverage of cancer research: Effects of hedging on scientists' and journalists' credibility. Human Communication Research, 34, 347-369.

Keyton, J. (2011).  Communicating research: Asking questions, finding answers . New York: McGraw Hill.

Lenhart, A., Ling, R., Campbell, S., & Purcell, K. (2010, Apr. 10).  Teens and mobile phones . Report from the Pew Internet and American Life Project, retrieved from  http://www.pewinternet.org/Reports/2010/Teens-and-Mobile-Phones.aspx .

Maddy, T. (2008).  Tests: A comprehensive reference for assessments in psychology, education, and business . Austin, TX: Pro-Ed.

McCollum Jr., J. F., & Bryant, J. (2003). Pacing in children's television programming.  Mass Communication and Society, 6  (2), 115-136.

Medved, C. E., Brogan, S. M., McClanahan, A. M., Morris, J. F., & Shepherd, G. J. (2006). Family and work socializing communication: Messages, gender, and ideological implications.  Journal of Family Communication, 6 (3), 161-180.

Moyer-Gusé, E., & Nabi, R. L. (2010). Explaining the effects of narrative in an entertainment television program: Overcoming resistance to persuasion.  Human Communication Research, 36 , 26-52.

Nabi, R. L. (2009). Cosmetic surgery makeover programs and intentions to undergo cosmetic enhancements: A consideration of three models of media effects.  Human Communication Research, 35 , 1-27.

Pearson, J. C., DeWitt, L., Child, J. T., Kahl Jr., D. H., & Dandamudi, V. (2007). Facing the fear: An analysis of speech-anxiety content in public-speaking textbooks. Communication Research Reports, 24(2), 159-168.

Rubin, R. B., Rubin, A. M., Graham, E., Perse, E. M., & Seibold, D. (2009). Communication research measures II: A sourcebook. New York: Routledge.

Serota, K. B., Levine, T. R., & Boster, F. J. (2010). The prevalence of lying in America: Three studies of reported deception. Human Communication Research, 36, 1-24.

Sheldon, P. (2008). The relationship between unwillingness-to-communicate and students' facebook use.  Journal of Media Psychology, 20 (2), 67–75.

Trochim, W. M. K. (2006). Reliability and validity.  Research methods data base , retrieved from  http://www.socialresearchmethods.net/kb/relandval.php .

Walen-Frederick, H. (n.d.).  Help sheet for reading SPSS printouts . Retrieved from  http://www.scribd.com/doc/51982223/help-sheet-for-reading-spss-printouts .

Weaver, A. J., & Wilson, B. J. (2009). The role of graphic and sanitized violence in the enjoyment of television dramas.  Human Communication Research, 35 (3), 442-463.

Weber, K., Corrigan, M., Fornash, B., & Neupauer, N. C. (2003). The effect of interest on recall: An experiment.  Communication Research Reports, 20 (2), 116-123.

Witt, P. L., & Schrodt, P. (2006). The influence of instructional technology use and teacher immediacy on student affect for teacher and course.  Communication Reports, 19 (1), 1-15.

[3] Degrees of freedom (df) relate to your sample size and to the number of groups being compared. SPSS always computes the df for your statistics. For more information on degrees of freedom, see the following web-based resources: http://www.youtube.com/watch?v=wsvfasNpU2s and http://www.creative-wisdom.com/pub/df/index.htm.

Present Your Data Like a Pro

By Joel Schwartzberg


Demystify the numbers. Your audience will thank you.

While a good presentation has data, data alone doesn’t guarantee a good presentation. It’s all about how that data is presented. The quickest way to confuse your audience is by sharing too many details at once. The only data points you should share are those that significantly support your point — and ideally, one point per chart.

To avoid the debacle of sheepishly translating hard-to-see numbers and labels, rehearse your presentation with colleagues sitting as far away as the actual audience would. While you’ve been working with the same chart for weeks or months, your audience will be exposed to it for mere seconds. Give them the best chance of comprehending your data by using simple, clear, and complete language to identify X and Y axes, pie pieces, bars, and other diagrammatic elements. Try to avoid abbreviations that aren’t obvious, and don’t assume labeled components on one slide will be remembered on subsequent slides.

Every valuable chart or pie graph has an “Aha!” zone — a number or range of data that reveals something crucial to your point. Make sure you visually highlight the “Aha!” zone, reinforcing the moment by explaining it to your audience.

With so many ways to spin and distort information these days, a presentation needs to do more than simply share great ideas — it needs to support those ideas with credible data. That’s true whether you’re an executive pitching new business clients, a vendor selling her services, or a CEO making a case for change.


Joel Schwartzberg oversees executive communications for a major national nonprofit, is a professional presentation coach, and is the author of Get to the Point! Sharpen Your Message and Make Your Words Matter and The Language of Leadership: How to Engage and Inspire Your Team.



Quantitative Data Analysis

9 Presenting the Results of Quantitative Analysis

Mikaila Mariel Lemonik Arthur

This chapter provides an overview of how to present the results of quantitative analysis, in particular how to create effective tables for displaying quantitative results and how to write quantitative research papers that effectively communicate the methods used and findings of quantitative analysis.

Writing the Quantitative Paper

Standard quantitative social science papers follow a specific format. They begin with a title page that includes a descriptive title, the author(s)’ name(s), and a 100 to 200 word abstract that summarizes the paper. Next is an introduction that makes clear the paper’s research question, details why this question is important, and previews what the paper will do. After that comes a literature review, which ends with a summary of the research question(s) and/or hypotheses. A methods section, which explains the source of data, sample, and variables and quantitative techniques used, follows. Many analysts will include a short discussion of their descriptive statistics in the methods section. A findings section details the findings of the analysis, supported by a variety of tables, and in some cases graphs, all of which are explained in the text. Some quantitative papers, especially those using more complex techniques, will include equations. Many papers follow the findings section with a discussion section, which provides an interpretation of the results in light of both the prior literature and theory presented in the literature review and the research questions/hypotheses. A conclusion ends the body of the paper. This conclusion should summarize the findings, answering the research questions and stating whether any hypotheses were supported, partially supported, or not supported. Limitations of the research are detailed. Papers typically include suggestions for future research, and where relevant, some papers include policy implications. After the body of the paper comes the works cited; some papers also have an Appendix that includes additional tables and figures that did not fit into the body of the paper or additional methodological details. While this basic format is similar for papers regardless of the type of data they utilize, there are specific concerns relating to quantitative research in terms of the methods and findings that will be discussed here.

In the methods section, researchers clearly describe the methods they used to obtain and analyze the data for their research. When relying on data collected specifically for a given paper, researchers will need to discuss the sample and data collection; in most cases, though, quantitative research relies on pre-existing datasets. In these cases, researchers need to provide information about the dataset, including the source of the data, the time it was collected, the population, and the sample size. Regardless of the source of the data, researchers need to be clear about which variables they are using in their research and any transformations or manipulations of those variables. They also need to explain the specific quantitative techniques that they are using in their analysis; if different techniques are used to test different hypotheses, this should be made clear. In some cases, publications will require that papers be submitted along with any code that was used to produce the analysis (in SPSS terms, the syntax files), which more advanced researchers will usually have on hand. In many cases, basic descriptive statistics are presented in tabular form and explained within the methods section.

The findings sections of quantitative papers are organized around explaining the results as shown in tables and figures. Not all results are depicted in tables and figures—some minor or null findings will simply be referenced—but tables and figures should be produced for all findings to be discussed at any length. If there are too many tables and figures, some can be moved to an appendix after the body of the text and referred to in the text (e.g. “See Table 12 in Appendix A”).

Discussions of the findings should not simply restate the contents of the table. Rather, they should explain and interpret it for readers, and they should do so in light of the hypothesis or hypotheses that are being tested. Conclusions—discussions of whether the hypothesis or hypotheses are supported or not supported—should wait for the conclusion of the paper.

Creating Effective Tables

When creating tables to display the results of quantitative analysis, the most important goals are to create tables that are clear and concise but that also meet standard conventions in the field. This means, first of all, paring down the volume of information produced in the statistical output to just include the information most necessary for interpreting the results, but doing so in keeping with standard table conventions. It also means making tables that are well-formatted and designed, so that readers can understand what the tables are saying without struggling to find information. For example, tables (as well as figures such as graphs) need clear captions; they are typically numbered and referred to by number in the text. Columns and rows should have clear headings. Depending on the content of the table, formatting tools may need to be used to set off header rows/columns and/or total rows/columns; cell-merging tools may be necessary; and shading may be important in tables with many rows or columns.

Here, you will find some instructions for creating tables of results from descriptive, crosstabulation, correlation, and regression analysis that are clear, concise, and meet normal standards for data display in social science. In addition, after the instructions for creating tables, you will find an example of how a paper incorporating each table might describe that table in the text.

Descriptive Statistics

When presenting the results of descriptive statistics, we create one table with columns for each type of descriptive statistic and rows for each variable. Note, of course, that depending on level of measurement only certain descriptive statistics are appropriate for a given variable, so there may be many cells in the table marked with an — to show that this statistic is not calculated for this variable. So, consider the set of descriptive statistics below, for occupational prestige, age, highest degree earned, and whether the respondent was born in this country.

Table 1. SPSS Output: Selected Descriptive Statistics

Statistics | R’s occupational prestige score (2010) | Age of respondent
N (Valid) | 3873 | 3699
N (Missing) | 159 | 333
Mean | 46.54 | 52.16
Median | 47.00 | 53.00
Std. Deviation | 13.811 | 17.233
Variance | 190.745 | 296.988
Skewness | .141 | .018
Std. Error of Skewness | .039 | .040
Kurtosis | -.809 | -1.018
Std. Error of Kurtosis | .079 | .080
Range | 64 | 71
Minimum | 16 | 18
Maximum | 80 | 89
Percentile 25 | 35.00 | 37.00
Percentile 50 | 47.00 | 53.00
Percentile 75 | 59.00 | 66.00

Statistics | R’s highest degree
N (Valid) | 4009
N (Missing) | 23
Median | 2.00
Mode | 1
Range | 4
Minimum | 0
Maximum | 4

R’s highest degree | Frequency | Percent | Valid Percent | Cumulative Percent
less than high school | 246 | 6.1 | 6.1 | 6.1
high school | 1597 | 39.6 | 39.8 | 46.0
associate/junior college | 370 | 9.2 | 9.2 | 55.2
bachelor’s | 1036 | 25.7 | 25.8 | 81.0
graduate | 760 | 18.8 | 19.0 | 100.0
Total (valid) | 4009 | 99.4 | 100.0 |
Missing (System) | 23 | .6 | |
Total | 4032 | 100.0 | |

Statistics | Was r born in this country
N (Valid) | 3960
N (Missing) | 72
Mean | 1.11
Mode | 1

Was r born in this country | Frequency | Percent | Valid Percent | Cumulative Percent
yes | 3516 | 87.2 | 88.8 | 88.8
no | 444 | 11.0 | 11.2 | 100.0
Total (valid) | 3960 | 98.2 | 100.0 |
Missing (System) | 72 | 1.8 | |
Total | 4032 | 100.0 | |

To display these descriptive statistics in a paper, one might create a table like Table 2. Note that for discrete variables, we use the value label in the table, not the value.

Table 2. Descriptive Statistics

Statistic | Occupational Prestige Score | Age | Highest Degree | Born in U.S.
Mean | 46.54 | 52.16 | | 1.11
Median / Mode | 47 | 53 | 1: Associates (9.2%); 2: High School (39.8%) | 1: Yes (88.8%)
Standard Deviation | 13.811 | 17.233 | |
Variance | 190.745 | 296.988 | |
Skewness | 0.141 | 0.018 | |
Kurtosis | -0.809 | -1.018 | |
Range | 64 (16-80) | 71 (18-89) | Less than High School (0) – Graduate (4) |
Percentiles (25th-75th) | 35-59 | 37-66 | |
N (valid) | 3873 | 3699 | 4009 | 3960

If we were then to discuss our descriptive statistics in a quantitative paper, we might write something like this (note that we do not need to repeat every single detail from the table, as readers can peruse the table themselves):

This analysis relies on four variables from the 2021 General Social Survey: occupational prestige score, age, highest degree earned, and whether the respondent was born in the United States. Descriptive statistics for all four variables are shown in Table 2. The median occupational prestige score is 47, with a range from 16 to 80. 50% of respondents had occupational prestige scores between 35 and 59. The median age of respondents is 53, with a range from 18 to 89. 50% of respondents are between ages 37 and 66. Both variables have little skew. Highest degree earned ranges from less than high school to a graduate degree; the median respondent has earned an associate’s degree, while the modal response (given by 39.8% of the respondents) is a high school degree. 88.8% of respondents were born in the United States.
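
For readers working in Python rather than SPSS, a sketch of how the numbers behind a table like Table 2 could be computed with pandas is shown below. The small data frame here is purely illustrative; it is not the 2021 General Social Survey.

```python
# Minimal sketch: descriptive statistics for two continuous and two discrete
# variables. The data are illustrative, not the GSS.
import pandas as pd

df = pd.DataFrame({
    "prestige": [42, 55, 47, 61, 38, 50, 44, 66],
    "age":      [23, 45, 37, 61, 29, 52, 40, 70],
    "degree":   ["high school", "bachelor's", "high school", "graduate",
                 "less than high school", "associate", "bachelor's", "graduate"],
    "born_us":  ["yes", "yes", "no", "yes", "yes", "no", "yes", "yes"],
})

# Continuous variables: center, spread, and shape
print(df[["prestige", "age"]].agg(["mean", "median", "std", "var", "skew", "kurt"]))

# Discrete variables: value labels with their percentages
print(df["degree"].value_counts(normalize=True).round(3))
print(df["born_us"].value_counts(normalize=True).round(3))
```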

Crosstabulation

When presenting the results of a crosstabulation, we simplify the table so that it highlights the most important information—the column percentages—and include the significance and association below the table. Consider the SPSS output below.

Table 3. R’s highest degree * R’s subjective class identification Crosstabulation

R’s highest degree | lower class | working class | middle class | upper class | Total
less than high school: Count | 65 | 106 | 68 | 7 | 246
  % within class identification | 18.8% | 7.1% | 3.4% | 4.2% | 6.2%
high school: Count | 217 | 800 | 551 | 23 | 1591
  % within class identification | 62.9% | 53.7% | 27.6% | 13.9% | 39.8%
associate/junior college: Count | 30 | 191 | 144 | 3 | 368
  % within class identification | 8.7% | 12.8% | 7.2% | 1.8% | 9.2%
bachelor’s: Count | 27 | 269 | 686 | 49 | 1031
  % within class identification | 7.8% | 18.1% | 34.4% | 29.5% | 25.8%
graduate: Count | 6 | 123 | 546 | 84 | 759
  % within class identification | 1.7% | 8.3% | 27.4% | 50.6% | 19.0%
Total: Count | 345 | 1489 | 1995 | 166 | 3995
  % within class identification | 100.0% | 100.0% | 100.0% | 100.0% | 100.0%

Chi-Square Tests | Value | df | Asymptotic Significance (2-sided)
Pearson Chi-Square | 819.579 | 12 | <.001
Likelihood Ratio | 839.200 | 12 | <.001
Linear-by-Linear Association | 700.351 | 1 | <.001
N of Valid Cases | 3995 | |
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 10.22.

Symmetric Measures | Value | Asymptotic Standard Error | Approximate T | Approximate Significance
Interval by Interval: Pearson’s R | .419 | .013 | 29.139 | <.001
Ordinal by Ordinal: Spearman Correlation | .419 | .013 | 29.158 | <.001
N of Valid Cases | 3995 | | |
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
c. Based on normal approximation.

Table 4 shows how a table suitable for inclusion in a paper might look if created from the SPSS output in Table 3. Note that we use asterisks to indicate the significance level of the results: * means p < 0.05; ** means p < 0.01; *** means p < 0.001; and no stars mean p > 0.05 (and thus that the result is not significant). Also note that N is the abbreviation for the number of respondents.

 
Table 4. Highest Degree Earned by Subjective Class Identification (Column Percentages)

Highest degree | Lower class | Working class | Middle class | Upper class | Total
Less than high school | 18.8% | 7.1% | 3.4% | 4.2% | 6.2%
High school | 62.9% | 53.7% | 27.6% | 13.9% | 39.8%
Associate/junior college | 8.7% | 12.8% | 7.2% | 1.8% | 9.2%
Bachelor’s | 7.8% | 18.1% | 34.4% | 29.5% | 25.8%
Graduate | 1.7% | 8.3% | 27.4% | 50.6% | 19.0%
N: 3995; Spearman Correlation 0.419***

If we were going to discuss the results of this crosstabulation in a quantitative research paper, the discussion might look like this:

A crosstabulation of respondents’ class identification and their highest degree earned, with class identification as the independent variable, is significant, with a Spearman correlation of 0.419, as shown in Table 4. Among lower class and working class respondents, more than 50% had earned a high school degree. Less than 20% of lower class respondents and less than 40% of working-class respondents had earned more than a high school degree. In contrast, the majority of middle class and upper class respondents had earned at least a bachelor’s degree. In fact, 50% of upper class respondents had earned a graduate degree.
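
A rough equivalent of the work behind Tables 3 and 4 can be sketched in pandas and SciPy: column percentages from a crosstab, a chi-square test of independence, and a Spearman correlation computed on ordinal codes. Everything below, including the tiny data frame and the category orderings, is an illustrative assumption rather than the GSS analysis above.

```python
# Minimal sketch: column percentages, chi-square, and Spearman's rho
# for two hypothetical categorical variables.
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "class_id": ["working", "middle", "middle", "lower", "upper",
                 "working", "middle", "working", "upper", "lower"],
    "degree":   ["high school", "bachelor's", "graduate", "high school", "graduate",
                 "associate", "bachelor's", "high school", "graduate", "less than HS"],
})

# Column percentages: each class-identification column sums to 100%
col_pct = pd.crosstab(df["degree"], df["class_id"], normalize="columns") * 100
print(col_pct.round(1))

# Chi-square test of independence on the raw counts
counts = pd.crosstab(df["degree"], df["class_id"])
chi2, p, dof, expected = stats.chi2_contingency(counts)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")

# Spearman's rho on ordinal codes (the orderings are assumptions)
class_order = {"lower": 0, "working": 1, "middle": 2, "upper": 3}
degree_order = {"less than HS": 0, "high school": 1, "associate": 2,
                "bachelor's": 3, "graduate": 4}
rho, p_s = stats.spearmanr(df["class_id"].map(class_order),
                           df["degree"].map(degree_order))
print(f"Spearman rho = {rho:.2f}, p = {p_s:.3f}")
```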

Correlation

When presenting a correlation matrix, one of the most important things to note is that we only present half the table, so as not to include duplicated results. Think of the diagonal of cells that represent the correlation between a variable and itself, and include only the triangle of data either above or below that diagonal. Consider the output in Table 5.

Table 5. SPSS Output: Correlations

 | Age of respondent | R’s occupational prestige score (2010) | Highest year of school R completed | R’s family income in 1986 dollars
Age of respondent: Pearson Correlation | 1 | .087 | .014 | .017
  Sig. (2-tailed) | | <.001 | .391 | .314
  N | 3699 | 3571 | 3683 | 3336
R’s occupational prestige score (2010): Pearson Correlation | .087 | 1 | .504 | .316
  Sig. (2-tailed) | <.001 | | <.001 | <.001
  N | 3571 | 3873 | 3817 | 3399
Highest year of school R completed: Pearson Correlation | .014 | .504 | 1 | .360
  Sig. (2-tailed) | .391 | <.001 | | <.001
  N | 3683 | 3817 | 3966 | 3497
R’s family income in 1986 dollars: Pearson Correlation | .017 | .316 | .360 | 1
  Sig. (2-tailed) | .314 | <.001 | <.001 |
  N | 3336 | 3399 | 3497 | 3509
**. Correlation is significant at the 0.01 level (2-tailed).

Table 6 shows what the contents of Table 5 might look like when a table is constructed in a fashion suitable for publication.

Table 6. Correlation Matrix

 | Age | Occupational Prestige | Highest Year of School | Family Income (1986 dollars)
Age | 1 | | |
Occupational Prestige | 0.087*** | 1 | |
Highest Year of School | 0.014 | 0.504*** | 1 |
Family Income (1986 dollars) | 0.017 | 0.316*** | 0.360*** | 1

If we were to discuss the results of this bivariate correlation analysis in a quantitative paper, the discussion might look like this:

Bivariate correlations were run among variables measuring age, occupational prestige, the highest year of school respondents completed, and family income in constant 1986 dollars, as shown in Table 6. Correlations between age and highest year of school completed and between age and family income are not significant. All other correlations are positive and significant at the p<0.001 level. The correlation between age and occupational prestige is weak; the correlations between income and occupational prestige and between income and educational attainment are moderate, and the correlation between education and occupational prestige is strong.
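
The half-matrix convention described above is easy to apply programmatically. The sketch below builds a Pearson correlation matrix in pandas and masks the upper triangle; the four variables are invented stand-ins for those in Table 6.

```python
# Minimal sketch: correlation matrix with only the lower triangle retained.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":       [23, 45, 37, 61, 29, 52, 40, 70],
    "prestige":  [42, 55, 47, 61, 38, 50, 44, 66],
    "education": [12, 16, 14, 18, 12, 16, 14, 20],
    "income":    [28000, 54000, 41000, 69000, 30000, 58000, 45000, 72000],
})

corr = df.corr()                                   # Pearson correlations
mask = np.tril(np.ones(corr.shape, dtype=bool))    # keep diagonal and below
print(corr.where(mask).round(3))                   # upper triangle shows as NaN
```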

Regression

To present the results of a regression, we create one table that includes all of the key information from the multiple tables of SPSS output. This includes the R² and significance of the regression, either the B or the beta values (different analysts have different preferences here) for each variable, and the standard error and significance of each variable. Consider the SPSS output in Table 7.

Table 7. SPSS Output: Regression

Model Summary
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .395 | .156 | .155 | 36729.04841
a. Predictors: (Constant), Highest year of school R completed, Age of respondent, R’s occupational prestige score (2010)

ANOVA
Model | Sum of Squares | df | Mean Square | F | Sig.
1 Regression | 805156927306.583 | 3 | 268385642435.528 | 198.948 | <.001
  Residual | 4351948187487.015 | 3226 | 1349022996.741 | |
  Total | 5157105114793.598 | 3229 | | |
a. Dependent Variable: R’s family income in 1986 dollars
b. Predictors: (Constant), Highest year of school R completed, Age of respondent, R’s occupational prestige score (2010)

Coefficients
Model | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | Tolerance | VIF
1 (Constant) | -44403.902 | 4166.576 | | -10.657 | <.001 | |
  Age of respondent | 9.547 | 38.733 | .004 | .246 | .805 | .993 | 1.007
  R’s occupational prestige score (2010) | 522.887 | 54.327 | .181 | 9.625 | <.001 | .744 | 1.345
  Highest year of school R completed | 3988.545 | 274.039 | .272 | 14.555 | <.001 | .747 | 1.339
a. Dependent Variable: R’s family income in 1986 dollars

The regression output shown in Table 7 contains a lot of information. We do not include all of this information when making tables suitable for publication. As can be seen in Table 8, we include the Beta (or the B), the standard error, and the significance asterisk for each variable; the R² and significance for the overall regression; the degrees of freedom (which tells readers the sample size or N); and the constant; along with the key to p/significance values.

Table 8. Regression Results for Dependent Variable Family Income in 1986 Dollars

Variable | Beta (Std. Error)
Age | 0.004 (38.733)
Occupational Prestige Score | 0.181*** (54.327)
Highest Year of School Completed | 0.272*** (274.039)
R² | 0.156***
Degrees of Freedom | 3229
Constant | -44,403.902
* p < 0.05; ** p < 0.01; *** p < 0.001

If we were to discuss the results of this regression in a quantitative paper, the results might look like this:

Table 8 shows the results of a regression in which age, occupational prestige, and highest year of school completed are the independent variables and family income is the dependent variable. The regression results are significant, and all of the independent variables taken together explain 15.6% of the variance in family income. Age is not a significant predictor of income, while occupational prestige and educational attainment are. Educational attainment has a larger effect on family income than does occupational prestige. For every year of additional education attained, family income goes up on average by $3,988.545; for every one-unit increase in occupational prestige score, family income goes up on average by $522.887. [1]
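
For comparison with the SPSS workflow above, a multivariate OLS regression and the quantities that end up in a table like Table 8 can be obtained with statsmodels. The data frame below is a small invented example, so its coefficients will not match those reported for the GSS.

```python
# Minimal sketch: multivariate regression with statsmodels on invented data.
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "age":       [23, 45, 37, 61, 29, 52, 40, 70, 33, 58],
    "prestige":  [42, 55, 47, 61, 38, 50, 44, 66, 45, 59],
    "education": [12, 16, 14, 18, 12, 16, 14, 20, 13, 17],
    "income":    [28000, 54000, 41000, 69000, 30000, 58000, 45000, 72000, 35000, 63000],
})

X = sm.add_constant(df[["age", "prestige", "education"]])
model = sm.OLS(df["income"], X).fit()

print(round(model.rsquared, 3))         # share of variance explained
print(model.fvalue, model.f_pvalue)     # overall F test for the model
print(model.params.round(3))            # B values (unstandardized coefficients)
print(model.bse.round(3))               # standard errors
```
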
Exercises

  • Choose two discrete variables and three continuous variables from a dataset of your choice. Produce appropriate descriptive statistics on all five of the variables and create a table of the results suitable for inclusion in a paper.
  • Using the two discrete variables you have chosen, produce an appropriate crosstabulation, with significance and measure of association. Create a table of the results suitable for inclusion in a paper.
  • Using the three continuous variables you have chosen, produce a correlation matrix. Create a table of the results suitable for inclusion in a paper.
  • Using the three continuous variables you have chosen, produce a multivariate linear regression. Create a table of the results suitable for inclusion in a paper.
  • Write a methods section describing the dataset, analytical methods, and variables you utilized in questions 1, 2, 3, and 4 and explaining the results of your descriptive analysis.
  • Write a findings section explaining the results of the analyses you performed in questions 2, 3, and 4.
[1] Note that the actual numerical increase comes from the B values, which are shown in the SPSS output in Table 7 but not in the reformatted Table 8.

Social Data Analysis Copyright © 2021 by Mikaila Mariel Lemonik Arthur is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Quantitative Data Analysis 101

The lingo, methods and techniques, explained simply.

By: Derek Jansen (MBA)  and Kerryn Warren (PhD) | December 2020

Quantitative data analysis is one of those things that often strikes fear in students. It’s totally understandable – quantitative analysis is a complex topic, full of daunting lingo , like medians, modes, correlation and regression. Suddenly we’re all wishing we’d paid a little more attention in math class…

The good news is that while quantitative data analysis is a mammoth topic, gaining a working understanding of the basics isn’t that hard , even for those of us who avoid numbers and math . In this post, we’ll break quantitative analysis down into simple , bite-sized chunks so you can approach your research with confidence.

Quantitative data analysis methods and techniques 101

Overview: Quantitative Data Analysis 101

  • What (exactly) is quantitative data analysis?
  • When to use quantitative analysis
  • How quantitative analysis works

The two “branches” of quantitative analysis

  • Descriptive statistics 101
  • Inferential statistics 101
  • How to choose the right quantitative methods
  • Recap & summary

What is quantitative data analysis?

Despite being a mouthful, quantitative data analysis simply means analysing data that is numbers-based – or data that can be easily “converted” into numbers without losing any meaning.

For example, category-based variables like gender, ethnicity, or native language could all be “converted” into numbers without losing meaning – for example, English could equal 1, French 2, etc.
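
As a quick illustration of that idea, here is a minimal sketch of how a category-based variable might be "converted" into numbers using pandas; the particular languages and codes are just examples, and the exact numbers assigned do not matter as long as the mapping is consistent.

```python
# Minimal sketch: turning a categorical variable into numeric codes.
import pandas as pd

languages = pd.Series(["English", "French", "English", "Spanish", "French"])

codes, labels = pd.factorize(languages)
print(codes)    # numeric code for each response, e.g. [0 1 0 2 1]
print(labels)   # which label each code stands for
```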

This contrasts against qualitative data analysis, where the focus is on words, phrases and expressions that can’t be reduced to numbers. If you’re interested in learning about qualitative analysis, check out our post and video here .

What is quantitative analysis used for?

Quantitative analysis is generally used for three purposes.

  • Firstly, it’s used to measure differences between groups . For example, the popularity of different clothing colours or brands.
  • Secondly, it’s used to assess relationships between variables . For example, the relationship between weather temperature and voter turnout.
  • And third, it’s used to test hypotheses in a scientifically rigorous way. For example, a hypothesis about the impact of a certain vaccine.

Again, this contrasts with qualitative analysis , which can be used to analyse people’s perceptions and feelings about an event or situation. In other words, things that can’t be reduced to numbers.

How does quantitative analysis work?

Well, since quantitative data analysis is all about analysing numbers , it’s no surprise that it involves statistics . Statistical analysis methods form the engine that powers quantitative analysis, and these methods can vary from pretty basic calculations (for example, averages and medians) to more sophisticated analyses (for example, correlations and regressions).

Sounds like gibberish? Don’t worry. We’ll explain all of that in this post. Importantly, you don’t need to be a statistician or math wiz to pull off a good quantitative analysis. We’ll break down all the technical mumbo jumbo in this post.


As I mentioned, quantitative analysis is powered by statistical analysis methods . There are two main “branches” of statistical methods that are used – descriptive statistics and inferential statistics . In your research, you might only use descriptive statistics, or you might use a mix of both , depending on what you’re trying to figure out. In other words, depending on your research questions, aims and objectives . I’ll explain how to choose your methods later.

So, what are descriptive and inferential statistics?

Well, before I can explain that, we need to take a quick detour to explain some lingo. To understand the difference between these two branches of statistics, you need to understand two important words. These words are population and sample .

First up, population . In statistics, the population is the entire group of people (or animals or organisations or whatever) that you’re interested in researching. For example, if you were interested in researching Tesla owners in the US, then the population would be all Tesla owners in the US.

However, it’s extremely unlikely that you’re going to be able to interview or survey every single Tesla owner in the US. Realistically, you’ll likely only get access to a few hundred, or maybe a few thousand owners using an online survey. This smaller group of accessible people whose data you actually collect is called your sample .

So, to recap – the population is the entire group of people you’re interested in, and the sample is the subset of the population that you can actually get access to. In other words, the population is the full chocolate cake , whereas the sample is a slice of that cake.

So, why is this sample-population thing important?

Well, descriptive statistics focus on describing the sample , while inferential statistics aim to make predictions about the population, based on the findings within the sample. In other words, we use one group of statistical methods – descriptive statistics – to investigate the slice of cake, and another group of methods – inferential statistics – to draw conclusions about the entire cake. There I go with the cake analogy again…

With that out the way, let’s take a closer look at each of these branches in more detail.

Descriptive statistics vs inferential statistics

Branch 1: Descriptive Statistics

Descriptive statistics serve a simple but critically important role in your research – to describe your data set – hence the name. In other words, they help you understand the details of your sample . Unlike inferential statistics (which we’ll get to soon), descriptive statistics don’t aim to make inferences or predictions about the entire population – they’re purely interested in the details of your specific sample .

When you’re writing up your analysis, descriptive statistics are the first set of stats you’ll cover, before moving on to inferential statistics. But, that said, depending on your research objectives and research questions , they may be the only type of statistics you use. We’ll explore that a little later.

So, what kind of statistics are usually covered in this section?

Some common statistical tests used in this branch include the following:

  • Mean – this is simply the mathematical average of a range of numbers.
  • Median – this is the midpoint in a range of numbers when the numbers are arranged in numerical order. If the data set makes up an odd number, then the median is the number right in the middle of the set. If the data set makes up an even number, then the median is the midpoint between the two middle numbers.
  • Mode – this is simply the most commonly occurring number in the data set.
  • Standard deviation – this indicates how dispersed a range of numbers is, i.e., how spread out the numbers are around the mean. In cases where most of the numbers are quite close to the average, the standard deviation will be relatively low. Conversely, in cases where the numbers are scattered all over the place, the standard deviation will be relatively high.
  • Skewness . As the name suggests, skewness indicates how symmetrical a range of numbers is. In other words, do they tend to cluster into a smooth bell curve shape in the middle of the graph, or do they skew to the left or right?

Feeling a bit confused? Let’s look at a practical example using a small data set.

Descriptive statistics example data

On the left-hand side is the data set. This details the bodyweight of a sample of 10 people. On the right-hand side, we have the descriptive statistics. Let’s take a look at each of them.

First, we can see that the mean weight is 72.4 kilograms. In other words, the average weight across the sample is 72.4 kilograms. Straightforward.

Next, we can see that the median is very similar to the mean (the average). This suggests that this data set has a reasonably symmetrical distribution (in other words, a relatively smooth, centred distribution of weights, clustered towards the centre).

In terms of the mode , there is no mode in this data set. This is because each number is present only once and so there cannot be a “most common number”. If there were two people who were both 65 kilograms, for example, then the mode would be 65.

Next up is the standard deviation. A value of 10.6 indicates that there’s quite a wide spread of numbers. We can see this quite easily by looking at the numbers themselves, which range from 55 to 90, quite a stretch from the mean of 72.4.

And lastly, the skewness of -0.2 tells us that the data is very slightly negatively skewed. This makes sense since the mean and the median are slightly different.

As you can see, these descriptive statistics give us some useful insight into the data set. Of course, this is a very small data set (only 10 records), so we can’t read into these statistics too much. Also, keep in mind that this is not a list of all possible descriptive statistics – just the most common ones.
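
If you'd like to replicate this kind of summary yourself, the sketch below computes the same descriptive statistics with pandas. The ten weights are made up for illustration, so the outputs will be close to, but not exactly, the figures discussed above.

```python
# Minimal sketch: descriptive statistics for a small, made-up weight sample.
import pandas as pd

weights = pd.Series([55, 61, 64, 68, 71, 74, 77, 80, 84, 90])  # kilograms

print("mean:", weights.mean())
print("median:", weights.median())
# Note: when every value appears only once, pandas reports them all as modes,
# which is another way of saying there is no single mode.
print("mode:", weights.mode().tolist())
print("std dev:", round(weights.std(), 1))
print("skewness:", round(weights.skew(), 2))
```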

But why do all of these numbers matter?

While these descriptive statistics are all fairly basic, they’re important for a few reasons:

  • Firstly, they help you get both a macro and micro-level view of your data. In other words, they help you understand both the big picture and the finer details.
  • Secondly, they help you spot potential errors in the data – for example, if an average is way higher than you’d expect, or responses to a question are highly varied, this can act as a warning sign that you need to double-check the data.
  • And lastly, these descriptive statistics help inform which inferential statistical techniques you can use, as those techniques depend on the skewness (in other words, the symmetry and normality) of the data.

Simply put, descriptive statistics are really important , even though the statistical techniques used are fairly basic. All too often at Grad Coach, we see students skimming over the descriptives in their eagerness to get to the more exciting inferential methods, and then landing up with some very flawed results.

Don’t be a sucker – give your descriptive statistics the love and attention they deserve!

Examples of descriptive statistics

Branch 2: Inferential Statistics

As I mentioned, while descriptive statistics are all about the details of your specific data set – your sample – inferential statistics aim to make inferences about the population . In other words, you’ll use inferential statistics to make predictions about what you’d expect to find in the full population.

What kind of predictions, you ask? Well, there are two common types of predictions that researchers try to make using inferential stats:

  • Firstly, predictions about differences between groups – for example, height differences between children grouped by their favourite meal or gender.
  • And secondly, relationships between variables – for example, the relationship between body weight and the number of hours a week a person does yoga.

In other words, inferential statistics (when done correctly), allow you to connect the dots and make predictions about what you expect to see in the real world population, based on what you observe in your sample data. For this reason, inferential statistics are used for hypothesis testing – in other words, to test hypotheses that predict changes or differences.

Inferential statistics are used to make predictions about what you’d expect to find in the full population, based on the sample.

Of course, when you’re working with inferential statistics, the composition of your sample is really important. In other words, if your sample doesn’t accurately represent the population you’re researching, then your findings won’t necessarily be very useful.

For example, if your population of interest is a mix of 50% male and 50% female , but your sample is 80% male , you can’t make inferences about the population based on your sample, since it’s not representative. This area of statistics is called sampling, but we won’t go down that rabbit hole here (it’s a deep one!) – we’ll save that for another post .

What statistics are usually used in this branch?

There are many, many different statistical analysis methods within the inferential branch and it’d be impossible for us to discuss them all here. So we’ll just take a look at some of the most common inferential statistical methods so that you have a solid starting point.

First up are T-Tests. T-tests compare the means (the averages) of two groups of data to assess whether they’re statistically significantly different. In other words, is the difference between the two group means too large to be explained by chance alone?

This type of testing is very useful for understanding just how similar or different two groups of data are. For example, you might want to compare the mean blood pressure between two groups of people – one that has taken a new medication and one that hasn’t – to assess whether they are significantly different.

Kicking things up a level, we have ANOVA, which stands for “analysis of variance”. This test is similar to a T-test in that it compares the means of various groups, but ANOVA allows you to analyse multiple groups, not just two. So it’s basically a t-test on steroids…

Next, we have correlation analysis. This type of analysis assesses the relationship between two variables. In other words, if one variable increases, does the other variable also increase, decrease or stay the same? For example, if the average temperature goes up, do average ice cream sales increase too? We’d expect some sort of relationship between these two variables intuitively, but correlation analysis allows us to measure that relationship scientifically.

Lastly, we have regression analysis – this is quite similar to correlation in that it assesses the relationship between variables, but it goes a step further to understand cause and effect between variables, not just whether they move together. In other words, does the one variable actually cause the other one to move, or do they just happen to move together naturally thanks to another force? Just because two variables correlate doesn’t necessarily mean that one causes the other.

Stats overload…

I hear you. To make this all a little more tangible, let’s take a look at an example of a correlation in action.

Here’s a scatter plot demonstrating the correlation (relationship) between weight and height. Intuitively, we’d expect there to be some relationship between these two variables, which is what we see in this scatter plot. In other words, the results tend to cluster together in a diagonal line from bottom left to top right.

Sample correlation
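
A plot like this can be recreated with a few lines of Python. The sketch below uses made-up height and weight values, draws the scatter plot with matplotlib, and computes Pearson's r with SciPy.

```python
# Minimal sketch: scatter plot plus Pearson correlation on made-up data.
import matplotlib.pyplot as plt
from scipy import stats

height = [155, 160, 165, 170, 175, 180, 185, 190]   # cm
weight = [52, 58, 61, 66, 72, 77, 83, 88]            # kg

r, p = stats.pearsonr(height, weight)
print(f"r = {r:.2f}, p = {p:.3f}")

plt.scatter(height, weight)
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.title(f"Height vs. weight (r = {r:.2f})")
plt.show()
```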

As I mentioned, these are just a handful of inferential techniques – there are many, many more. Importantly, each statistical method has its own assumptions and limitations.

For example, some methods only work with normally distributed (parametric) data, while other methods are designed specifically for non-parametric data. And that’s exactly why descriptive statistics are so important – they’re the first step to knowing which inferential techniques you can and can’t use.

Remember that every statistical method has its own assumptions and limitations,  so you need to be aware of these.

How to choose the right analysis method

To choose the right statistical methods, you need to think about two important factors :

  • The type of quantitative data you have (specifically, level of measurement and the shape of the data). And,
  • Your research questions and hypotheses

Let’s take a closer look at each of these.

Factor 1 – Data type

The first thing you need to consider is the type of data you’ve collected (or the type of data you will collect). By data types, I’m referring to the four levels of measurement – namely, nominal, ordinal, interval and ratio. If you’re not familiar with this lingo, check out the video below.

Why does this matter?

Well, because different statistical methods and techniques require different types of data. This is one of the “assumptions” I mentioned earlier – every method has its assumptions regarding the type of data.

For example, some techniques work with categorical data (for example, yes/no type questions, or gender or ethnicity), while others work with continuous numerical data (for example, age, weight or income) – and, of course, some work with multiple data types.

If you try to use a statistical method that doesn’t support the data type you have, your results will be largely meaningless . So, make sure that you have a clear understanding of what types of data you’ve collected (or will collect). Once you have this, you can then check which statistical methods would support your data types here .

If you haven’t collected your data yet, you can work in reverse and look at which statistical method would give you the most useful insights, and then design your data collection strategy to collect the correct data types.

Another important factor to consider is the shape of your data . Specifically, does it have a normal distribution (in other words, is it a bell-shaped curve, centred in the middle) or is it very skewed to the left or the right? Again, different statistical techniques work for different shapes of data – some are designed for symmetrical data while others are designed for skewed data.

This is another reminder of why descriptive statistics are so important – they tell you all about the shape of your data.
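
One simple way to check the shape of a variable before committing to a parametric technique is to look at its skewness and run a normality test. The sketch below does both with SciPy on a small, made-up set of scores; which cut-offs you treat as "too skewed" is a judgement call that depends on the method you have in mind.

```python
# Minimal sketch: checking skewness and normality before choosing a method.
from scipy import stats

scores = [12, 14, 15, 15, 16, 17, 18, 19, 21, 35]   # note the outlier at 35

print("skewness:", round(stats.skew(scores), 2))

# Shapiro-Wilk test: a small p-value suggests the data depart from normality
w, p = stats.shapiro(scores)
print(f"Shapiro-Wilk: W = {w:.3f}, p = {p:.3f}")
```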

Factor 2: Your research questions

The next thing you need to consider is your specific research questions, as well as your hypotheses (if you have some). The nature of your research questions and research hypotheses will heavily influence which statistical methods and techniques you should use.

If you’re just interested in understanding the attributes of your sample (as opposed to the entire population), then descriptive statistics are probably all you need. For example, if you just want to assess the means (averages) and medians (centre points) of variables in a group of people.

On the other hand, if you aim to understand differences between groups or relationships between variables and to infer or predict outcomes in the population, then you’ll likely need both descriptive statistics and inferential statistics.

So, it’s really important to get very clear about your research aims, research questions and hypotheses before you start looking at which statistical techniques to use.

Never shoehorn a specific statistical technique into your research just because you like it or have some experience with it. Your choice of methods must align with all the factors we’ve covered here.

Time to recap…

You’re still with me? That’s impressive. We’ve covered a lot of ground here, so let’s recap on the key points:

  • Quantitative data analysis is all about analysing number-based data (which includes categorical and numerical data) using various statistical techniques.
  • The two main branches of statistics are descriptive statistics and inferential statistics. Descriptives describe your sample, whereas inferentials make predictions about what you’ll find in the population.
  • Common descriptive statistical methods include mean (average), median, standard deviation and skewness.
  • Common inferential statistical methods include t-tests, ANOVA, correlation and regression analysis.
  • To choose the right statistical methods and techniques, you need to consider the type of data you’re working with, as well as your research questions and hypotheses.


Quantitative Data Analysis: A Comprehensive Guide

By: Ofem Eteng | Published: May 18, 2022


A healthcare giant successfully introduces the most effective drug dosage through rigorous statistical modeling, saving countless lives. A marketing team predicts consumer trends with uncanny accuracy, tailoring campaigns for maximum impact.


These trends and dosages are not just any numbers but are a result of meticulous quantitative data analysis. Quantitative data analysis offers a robust framework for understanding complex phenomena, evaluating hypotheses, and predicting future outcomes.

In this blog, we’ll walk through the concept of quantitative data analysis, the steps required, its advantages, and the methods and techniques that are used in this analysis. Read on!

What is Quantitative Data Analysis?

Quantitative data analysis is a systematic process of examining, interpreting, and drawing meaningful conclusions from numerical data. It involves the application of statistical methods, mathematical models, and computational techniques to understand patterns, relationships, and trends within datasets.

Quantitative data analysis methods typically work with algorithms, mathematical analysis tools, and software to gain insights from the data, answering questions such as how many, how often, and how much. Data for quantitative data analysis is usually collected from close-ended surveys, questionnaires, polls, etc. The data can also be obtained from sales figures, email click-through rates, number of website visitors, and percentage revenue increase. 

Quantitative Data Analysis vs Qualitative Data Analysis

When we talk about data, we usually think about patterns, relationships, and connections within datasets – in short, about analyzing the data. Broadly, there are two types of data analysis: Quantitative Data Analysis and Qualitative Data Analysis.

Quantitative data analysis revolves around numerical data and statistics, which are suitable for functions that can be counted or measured. In contrast, qualitative data analysis includes description and subjective information – for things that can be observed but not measured.

Let us differentiate between Quantitative Data Analysis and Qualitative Data Analysis for a better understanding.

Quantitative data analysis | Qualitative data analysis
Numerical data – statistics, counts, metrics, measurements | Text data – customer feedback, opinions, documents, notes, audio/video recordings
Close-ended surveys, polls and experiments | Open-ended questions, descriptive interviews
What? How much? Why (to a certain extent)? | How? Why? What are individual experiences and motivations?
Statistical programming software like R, Python, SAS and data visualization tools like Tableau, Power BI | NVivo, Atlas.ti for qualitative coding; word processors and highlighters; mind maps and visual canvases
Best used for large sample sizes for quick answers | Best used for small to middle sample sizes for descriptive insights

Data Preparation Steps for Quantitative Data Analysis

Quantitative data has to be gathered and cleaned before it can be analyzed. Below are the steps to prepare data for quantitative analysis:

  • Step 1: Data Collection

Before beginning the analysis process, you need data. Data can be collected through rigorous quantitative research, which includes methods such as interviews, focus groups, surveys, and questionnaires.

  • Step 2: Data Cleaning

Once the data is collected, begin the data cleaning process by scanning through the entire data for duplicates, errors, and omissions. Keep a close eye for outliers (data points that are significantly different from the majority of the dataset) because they can skew your analysis results if they are not removed.

This data-cleaning process ensures data accuracy, consistency and relevancy before analysis.
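
As an illustration of this step, here is a small pandas sketch (the DataFrame and the "order_value" column are hypothetical) that flags outliers with the common 1.5 * IQR rule before deciding what to do with them:

```python
# Sketch of a simple outlier scan during data cleaning, using the 1.5 * IQR rule.
# The DataFrame and the "order_value" column are hypothetical.
import pandas as pd

df = pd.DataFrame({"order_value": [120, 135, 128, 140, 132, 125, 131, 890, 127]})

q1 = df["order_value"].quantile(0.25)
q3 = df["order_value"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = df[(df["order_value"] < lower) | (df["order_value"] > upper)]
print(outliers)          # flags the 890 entry for review before analysis
clean = df[(df["order_value"] >= lower) & (df["order_value"] <= upper)]
```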

  • Step 3: Data Analysis and Interpretation

Now that you have collected and cleaned your data, it is time to carry out the quantitative analysis. There are two broad methods of quantitative data analysis, which we will discuss in the next section.

However, if you have data from multiple sources, collecting and cleaning it can be a cumbersome task. This is where Hevo Data steps in. With Hevo, extracting, transforming, and loading data from source to destination becomes a seamless task, eliminating the need for manual coding. This not only saves valuable time but also enhances the overall efficiency of data analysis and visualization, empowering users to derive insights quickly and with precision.

Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. With integration with 150+ Data Sources (40+ free sources), we help you not only export data from sources & load data to the destinations but also transform & enrich your data, & make it analysis-ready.


Now that you are familiar with what quantitative data analysis is and how to prepare your data for analysis, the focus will shift to the purpose of this article, which is to describe the methods and techniques of quantitative data analysis.

Methods and Techniques of Quantitative Data Analysis

Broadly, quantitative data analysis employs two techniques to extract meaningful insights from datasets. The first method is descriptive statistics, which summarizes and portrays essential features of a dataset, such as the mean, median, and standard deviation.

Inferential statistics, the second method, extrapolates insights and predictions from a sample dataset to make broader inferences about an entire population, such as hypothesis testing and regression analysis.

An in-depth explanation of both the methods is provided below:

  • Descriptive Statistics
  • Inferential Statistics

1) Descriptive Statistics

Descriptive statistics, as the name implies, are used to describe a dataset. They help you understand the details of your data by summarizing it and finding patterns within the specific data sample. Descriptive statistics provide absolute numbers obtained from a sample, but they do not necessarily explain the rationale behind those numbers, and they are mostly used for analyzing single variables. The methods used in descriptive statistics include the following (a short code sketch follows the list):

  • Mean:   This calculates the numerical average of a set of values.
  • Median: This is used to get the midpoint of a set of values when the numbers are arranged in numerical order.
  • Mode: This is used to find the most commonly occurring value in a dataset.
  • Percentage: This is used to express how a value or group of respondents within the data relates to a larger group of respondents.
  • Frequency: This indicates the number of times a value is found.
  • Range: This shows the highest and lowest values in a dataset.
  • Standard Deviation: This is used to indicate how dispersed a range of numbers is, meaning, it shows how close all the numbers are to the mean.
  • Skewness: It indicates how symmetrical a range of numbers is, showing if they cluster into a smooth bell curve shape in the middle of the graph or if they skew towards the left or right.
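
To see these measures side by side, here is a brief Python sketch (the survey ages are invented) that computes each of them with pandas:

```python
# Quick sketch computing the descriptive measures listed above on invented survey ages.
import pandas as pd

ages = pd.Series([23, 25, 25, 27, 29, 31, 31, 31, 36, 44])

print("Mean:", ages.mean())                   # numerical average
print("Median:", ages.median())               # midpoint of the ordered values
print("Mode:", ages.mode().tolist())          # most frequently occurring value(s)
print("Range:", ages.max() - ages.min())      # spread between highest and lowest
print("Std deviation:", round(ages.std(), 2))
print("Skewness:", round(ages.skew(), 2))

# Frequency and percentage of each response:
print(ages.value_counts())
print(ages.value_counts(normalize=True) * 100)
```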

2) Inferential Statistics

In quantitative analysis, the goal is to turn raw numbers into meaningful insight. Descriptive statistics explain the details of a specific dataset, but they do not explain the reasons behind the numbers; hence the need for further analysis using inferential statistics.

Inferential statistics aim to make predictions or highlight possible outcomes from the analyzed data obtained from descriptive statistics. They are used to generalize results and make predictions between groups, show relationships that exist between multiple variables, and are used for hypothesis testing that predicts changes or differences.

There are various statistical analysis methods used within inferential statistics; a few are discussed below.

  • Cross Tabulations: Cross tabulation or crosstab is used to show the relationship that exists between two variables and is often used to compare results by demographic groups. It uses a basic tabular form to draw inferences between different data sets and contains data that is mutually exclusive or has some connection with each other. Crosstabs help understand the nuances of a dataset and factors that may influence a data point.
  • Regression Analysis: Regression analysis estimates the relationship between a set of variables. It shows the correlation between a dependent variable (the variable or outcome you want to measure or predict) and any number of independent variables (factors that may impact the dependent variable). Therefore, the purpose of the regression analysis is to estimate how one or more variables might affect a dependent variable to identify trends and patterns to make predictions and forecast possible future trends. There are many types of regression analysis, and the model you choose will be determined by the type of data you have for the dependent variable. The types of regression analysis include linear regression, non-linear regression, binary logistic regression, etc.
  • Monte Carlo Simulation: Monte Carlo simulation, also known as the Monte Carlo method, is a computerized technique of generating models of possible outcomes and showing their probability distributions. It considers a range of possible outcomes and then tries to calculate how likely each outcome is to occur. Data analysts use it to perform advanced risk analyses to help forecast future events and make decisions accordingly (a short simulation sketch follows this list).
  • Analysis of Variance (ANOVA): This is used to test the extent to which two or more groups differ from each other. It compares the mean of various groups and allows the analysis of multiple groups.
  • Factor Analysis:   A large number of variables can be reduced into a smaller number of factors using the factor analysis technique. It works on the principle that multiple separate observable variables correlate with each other because they are all associated with an underlying construct. It helps in reducing large datasets into smaller, more manageable samples.
  • Cohort Analysis: Cohort analysis can be defined as a subset of behavioral analytics that operates from data taken from a given dataset. Rather than looking at all users as one unit, cohort analysis breaks down data into related groups for analysis, where these groups or cohorts usually have common characteristics or similarities within a defined period.
  • MaxDiff Analysis: This is a quantitative data analysis method used to gauge customer preferences when making a purchase and to determine which parameters rank higher than others in that decision.
  • Cluster Analysis: Cluster analysis is a technique used to identify structures within a dataset. Cluster analysis aims to be able to sort different data points into groups that are internally similar and externally different; that is, data points within a cluster will look like each other and different from data points in other clusters.
  • Time Series Analysis: This is a statistical analytic technique used to identify trends and cycles over time. It is simply the measurement of the same variables at different times, like weekly and monthly email sign-ups, to uncover trends, seasonality, and cyclic patterns. By doing this, the data analyst can forecast how variables of interest may fluctuate in the future. 
  • SWOT analysis: This is a quantitative data analysis method that assigns numerical values to indicate strengths, weaknesses, opportunities, and threats of an organization, product, or service, giving a clearer picture of the competition and fostering better business strategies.
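
As a taste of the Monte Carlo method mentioned above, here is a minimal numpy sketch (the starting value, daily return and volatility figures are all assumed, not real market data) that simulates many possible one-year outcomes for a hypothetical portfolio:

```python
# Minimal Monte Carlo sketch: simulate many possible one-year paths for a
# hypothetical $10,000 portfolio. Daily return assumptions are invented.
import numpy as np

rng = np.random.default_rng(0)
n_simulations = 5_000
n_days = 252
daily_mean, daily_vol = 0.0004, 0.01   # assumed average daily return and volatility

# Each row is one simulated year of daily returns.
returns = rng.normal(daily_mean, daily_vol, size=(n_simulations, n_days))
ending_values = 10_000 * (1 + returns).prod(axis=1)

print("Median ending value:", round(np.median(ending_values), 2))
print("5th percentile (pessimistic case):", round(np.percentile(ending_values, 5), 2))
print("95th percentile (optimistic case):", round(np.percentile(ending_values, 95), 2))
```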

How to Choose the Right Method for your Analysis?

Choosing between Descriptive Statistics and Inferential Statistics can often be confusing. You should consider the following factors before choosing the right method for your quantitative data analysis:

1. Type of Data

The first consideration in data analysis is understanding the type of data you have. Different statistical methods have specific requirements based on these data types, and using the wrong method can render results meaningless. The choice of statistical method should align with the nature and distribution of your data to ensure meaningful and accurate analysis.

2. Your Research Questions

When deciding on statistical methods, it’s crucial to align them with your specific research questions and hypotheses. The nature of your questions will influence whether descriptive statistics alone, which reveal sample attributes, are sufficient or if you need both descriptive and inferential statistics to understand group differences or relationships between variables and make population inferences.

Pros and Cons of Quantitative Data Analysis

Advantages

1. Objectivity and Generalizability:

  • Quantitative data analysis offers objective, numerical measurements, minimizing bias and personal interpretation.
  • Results can often be generalized to larger populations, making them applicable to broader contexts.

Example: A study using quantitative data analysis to measure student test scores can objectively compare performance across different schools and demographics, leading to generalizable insights about educational strategies.

2. Precision and Efficiency:

  • Statistical methods provide precise numerical results, allowing for accurate comparisons and prediction.
  • Large datasets can be analyzed efficiently with the help of computer software, saving time and resources.

Example: A marketing team can use quantitative data analysis to precisely track click-through rates and conversion rates on different ad campaigns, quickly identifying the most effective strategies for maximizing customer engagement.

3. Identification of Patterns and Relationships:

  • Statistical techniques reveal hidden patterns and relationships between variables that might not be apparent through observation alone.
  • This can lead to new insights and understanding of complex phenomena.

Example: A medical researcher can use quantitative analysis to pinpoint correlations between lifestyle factors and disease risk, aiding in the development of prevention strategies.

Disadvantages

1. Limited Scope:

  • Quantitative analysis focuses on quantifiable aspects of a phenomenon, potentially overlooking important qualitative nuances, such as emotions, motivations, or cultural contexts.

Example: A survey measuring customer satisfaction with numerical ratings might miss key insights about the underlying reasons for their satisfaction or dissatisfaction, which could be better captured through open-ended feedback.

2. Oversimplification:

  • Reducing complex phenomena to numerical data can lead to oversimplification and a loss of richness in understanding.

Example: Analyzing employee productivity solely through quantitative metrics like hours worked or tasks completed might not account for factors like creativity, collaboration, or problem-solving skills, which are crucial for overall performance.

3. Potential for Misinterpretation:

  • Statistical results can be misinterpreted if not analyzed carefully and with appropriate expertise.
  • The choice of statistical methods and assumptions can significantly influence results.

This blog discusses the steps, methods, and techniques of quantitative data analysis. It also gives insights into the methods of data collection, the type of data one should work with, and the pros and cons of such analysis.

Gain a better understanding of data analysis with these essential reads:

  • Data Analysis and Modeling: 4 Critical Differences
  • Exploratory Data Analysis Simplified 101
  • 25 Best Data Analysis Tools in 2024

Carrying out successful data analysis requires prepping the data and making it analysis-ready. That is where Hevo steps in.

Want to give Hevo a try? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also have a look at the amazing Hevo price , which will assist you in selecting the best plan for your requirements.

Share your experience of understanding Quantitative Data Analysis in the comment section below! We would love to hear your thoughts.

Ofem Eteng is a seasoned technical content writer with over 12 years of experience. He has held pivotal roles such as System Analyst (DevOps) at Dagbs Nigeria Limited and Full-Stack Developer at Pedoquasphere International Limited. He specializes in data science, data analytics and cutting-edge technologies, making him an expert in the data industry.

Quantitative Data Analysis Guide: Methods, Examples & Uses


This guide will introduce the types of data analysis used in quantitative research, then discuss relevant examples and applications in the finance industry.


An Overview of Quantitative Data Analysis

What is Quantitative Data Analysis and What is it For?

Quantitative data analysis is the process of interpreting meaning and extracting insights from numerical data , which involves mathematical calculations and statistical reviews to uncover patterns, trends, and relationships between variables.

Beyond academic and statistical research, this approach is particularly useful in the finance industry. Financial data, such as stock prices, interest rates, and economic indicators, can all be quantified with statistics and metrics to offer crucial insights for informed investment decisions. To illustrate this, here are some examples of what quantitative data is usually used for:

  • Measuring Differences between Groups: For instance, analyzing historical stock prices of different companies or asset classes can reveal which companies consistently outperform the market average.
  • Assessing Relationships between Variables: An investor could analyze the relationship between a company’s price-to-earnings ratio (P/E ratio) and relevant factors, like industry performance, inflation rates, interest rates, etc., allowing them to predict future stock price growth.
  • Testing Hypotheses: For example, an investor might hypothesize that companies with strong ESG (Environment, Social, and Governance) practices outperform those without. By categorizing these companies into two groups (strong ESG vs. weak ESG practices), they can compare the average return on investment (ROI) between the groups while assessing relevant factors to find evidence for the hypothesis. 

Ultimately, quantitative data analysis helps investors navigate the complex financial landscape and pursue profitable opportunities.

Quantitative Data Analysis VS. Qualitative Data Analysis

Although quantitative data analysis is a powerful tool, it cannot be used to provide context for your research, so this is where qualitative analysis comes in. Qualitative analysis is another common research method that focuses on collecting and analyzing non-numerical data, like text, images, or audio recordings, to gain a deeper understanding of experiences, opinions, and motivations. Here’s a comparison summarizing the key differences between the two approaches:

Aspect | Quantitative data analysis | Qualitative data analysis
Types of data used | Numerical data: numbers, percentages, etc. | Non-numerical data: text, images, audio, narratives, etc.
Perspective | More objective and less prone to bias | More subjective, as it may be influenced by the researcher’s interpretation
Data collection | Closed-ended questions, surveys, polls | Open-ended questions, interviews, observations
Data analysis | Statistical methods, numbers, graphs, charts | Categorization, thematic analysis, verbal communication
Focus | Measuring and comparing | Understanding and interpreting
Best use case | Measuring trends, comparing groups, testing hypotheses | Understanding user experience, exploring consumer motivations, uncovering new ideas

Due to their characteristics, quantitative analysis allows you to measure and compare large datasets; while qualitative analysis helps you understand the context behind the data. In some cases, researchers might even use both methods together for a more comprehensive understanding, but we’ll mainly focus on quantitative analysis for this article.

The 2 Main Quantitative Data Analysis Methods

Once you have your data collected, you have to use descriptive statistics or inferential statistics analysis to draw summaries and conclusions from your raw numbers. 

As its name suggests, the purpose of descriptive statistics is to describe your sample . It provides the groundwork for understanding your data by focusing on the details and characteristics of the specific group you’ve collected data from. 

On the other hand, inferential statistics act as bridges that connect your sample data to the broader population you’re truly interested in, helping you to draw conclusions in your research. Moreover, choosing the right inferential technique for your specific data and research questions is dependent on the initial insights from descriptive statistics, so both of these methods usually go hand-in-hand.

Descriptive Statistics Analysis

With sophisticated descriptive statistics, you can detect potential errors in your data by highlighting inconsistencies and outliers that might otherwise go unnoticed. Additionally, the characteristics revealed by descriptive statistics will help determine which inferential techniques are suitable for further analysis.

Measures in Descriptive Statistics

One of the key measures used in descriptive statistics is central tendency. It consists of the mean, median, and mode, telling you where most of your data points cluster:

  • Mean: It refers to the “average” and is calculated by adding all the values in your data set and dividing by the number of values.
  • Median: The middle value when your data is arranged in ascending or descending order. If you have an odd number of data points, the median is the exact middle value; with even numbers, it’s the average of the two middle values. 
  • Mode: This refers to the most frequently occurring value in your data set, indicating the most common response or observation. Some data can have multiple modes (bimodal) or no mode at all.

Another set of descriptive measures is dispersion, which involves the range and standard deviation, revealing how spread out your data is relative to the central tendency measures:

  • Range: It refers to the difference between the highest and lowest values in your data set. 
  • Standard Deviation (SD): This tells you how the data is distributed within the range, revealing how much, on average, each data point deviates from the mean. Lower standard deviations indicate data points clustered closer to the mean, while higher standard deviations suggest a wider spread.

The shape of the distribution will then be measured through skewness. 

  • Skewness: A statistic that indicates whether your data leans to one side (positive or negative) or is symmetrical (normal distribution). A positive skew suggests more data points concentrated on the lower end, while a negative skew indicates more data points on the higher end.

While the core measures mentioned above are fundamental, there are additional descriptive statistics used in specific contexts, including percentiles and interquartile range.

  • Percentiles: This divides your data into 100 equal parts, revealing what percentage of data falls below a specific value. The 25th percentile (Q1) is the first quartile, the 50th percentile (Q2) is the median, and the 75th percentile (Q3) is the third quartile. Knowing these quartiles can help visualize the spread of your data.
  • Interquartile Range (IQR): This measures the difference between Q3 and Q1, representing the middle 50% of your data.
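
For instance, here is a short numpy sketch (using an invented set of daily returns) that computes the quartiles and interquartile range described above:

```python
# Sketch: quartiles and interquartile range for an invented set of daily returns (%).
import numpy as np

returns = np.array([-1.8, -0.9, -0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.8, 1.1, 1.6, 2.4])

q1, q2, q3 = np.percentile(returns, [25, 50, 75])
iqr = q3 - q1

print(f"Q1 = {q1:.2f}, median (Q2) = {q2:.2f}, Q3 = {q3:.2f}")
print(f"IQR (middle 50% of returns) = {iqr:.2f}")
```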

Example of Descriptive Quantitative Data Analysis 

Let’s illustrate these concepts with a real-world example. Imagine a financial advisor analyzing a client’s portfolio. They have data on the client’s various holdings, including stock prices over the past year. With descriptive statistics they can obtain the following information:

  • Central Tendency: The mean price for each stock reveals its average price over the year. The median price can further highlight if there were any significant price spikes or dips that skewed the mean.
  • Measures of Dispersion: The standard deviation for each stock indicates its price volatility. A high standard deviation suggests the stock’s price fluctuated considerably, while a low standard deviation implies a more stable price history. This helps the advisor assess each stock’s risk profile.
  • Shape of the Distribution: If data allows, analyzing skewness can be informative. A positive skew for a stock might suggest more frequent price drops, while a negative skew might indicate more frequent price increases.

By calculating these descriptive statistics, the advisor gains a quick understanding of the client’s portfolio performance and risk distribution. For instance, they could use correlation analysis to see if certain stock prices tend to move together, helping them identify expansion opportunities within the portfolio.

While descriptive statistics provide a foundational understanding, they should be followed by inferential analysis to uncover deeper insights that are crucial for making investment decisions.

Inferential Statistics Analysis

Inferential statistics analysis is particularly useful for hypothesis testing , as you can formulate predictions about group differences or potential relationships between variables , then use statistical tests to see if your sample data supports those hypotheses.

However, the power of inferential statistics hinges on one crucial factor: sample representativeness . If your sample doesn’t accurately reflect the population, your predictions won’t be very reliable. 

Statistical Tests for Inferential Statistics

Here are some of the commonly used tests for inferential statistics in commerce and finance, which can also be integrated to most analysis software:

  • T-Tests: This compares the means of two groups to assess if they’re statistically different, helping you determine if the observed difference is just a quirk within the sample or a significant reflection of the population (a short sketch follows this list).
  • ANOVA (Analysis of Variance): While T-Tests handle comparisons between two groups, ANOVA focuses on comparisons across multiple groups, allowing you to identify potential variations and trends within the population.
  • Correlation Analysis: This technique tests the relationship between two variables, assessing if one variable increases or decreases with the other. However, it’s important to note that just because two financial variables are correlated and move together, doesn’t necessarily mean one directly influences the other.
  • Regression Analysis: Building on correlation, regression analysis goes a step further to verify the cause-and-effect relationships between the tested variables, allowing you to investigate if one variable actually influences the other.
  • Cross-Tabulation: This breaks down the relationship between two categorical variables by displaying the frequency counts in a table format, helping you to understand how different groups within your data set might behave. The data in cross-tabulation can be mutually exclusive or have several connections with each other. 
  • Trend Analysis: This examines how a variable in quantitative data changes over time, revealing upward or downward trends, as well as seasonal fluctuations. This can help you forecast future trends, and also lets you assess the effectiveness of the interventions in your marketing or investment strategy.
  • MaxDiff Analysis: This is also known as the “best-worst” method. It evaluates customer preferences by asking respondents to choose the most and least preferred options from a set of products or services, allowing stakeholders to optimize product development or marketing strategies.
  • Conjoint Analysis: Similar to MaxDiff, conjoint analysis gauges customer preferences, but it goes a step further by allowing researchers to see how changes in different product features (price, size, brand) influence overall preference.
  • TURF Analysis (Total Unduplicated Reach and Frequency Analysis): This assesses a marketing campaign’s reach and frequency of exposure in different channels, helping businesses identify the most efficient channels to reach target audiences.
  • Gap Analysis: This compares current performance metrics against established goals or benchmarks, using numerical data to represent the factors involved. This helps identify areas where performance falls short of expectations, serving as a springboard for developing strategies to bridge the gap and achieve those desired outcomes.
  • SWOT Analysis (Strengths, Weaknesses, Opportunities, and Threats): This uses ratings or rankings to represent an organization’s internal strengths and weaknesses, along with external opportunities and threats. Based on this analysis, organizations can create strategic plans to capitalize on opportunities while minimizing risks.
  • Text Analysis: This is an advanced method that uses specialized software to categorize and quantify themes, sentiment (positive, negative, neutral), and topics within textual data, allowing companies to obtain structured quantitative data from surveys, social media posts, or customer reviews.
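
To illustrate the T-Test and correlation techniques from the list above, here is a small scipy sketch (all return and P/E figures are invented for demonstration):

```python
# Illustrative sketch of two inferential tests on invented annual returns (%).
from scipy import stats

tech_returns = [12.1, 8.4, 15.2, -3.5, 22.0, 9.8, 17.3, 5.1]
health_returns = [6.2, 7.9, 4.5, 9.1, 5.8, 8.3, 7.0, 6.6]

# Independent-samples t-test: is the difference in mean returns significant?
t_stat, p_value = stats.ttest_ind(tech_returns, health_returns)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Correlation: do P/E ratios move with returns? (values invented)
pe_ratios = [15, 22, 30, 18, 41, 26, 35, 20]
r, p = stats.pearsonr(pe_ratios, tech_returns)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")
```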

Example of Inferential Quantitative Data Analysis

If you’re a financial analyst studying the historical performance of a particular stock, here are some predictions you can make with inferential statistics:

  • The Differences between Groups: You can conduct T-Tests to compare the average returns of stocks in the technology sector with those in the healthcare sector. It can help assess if the observed difference in returns between these two sectors is simply due to random chance or if it’s statistically significant due to a significant difference in their performance.
  • The Relationships between Variables: If you’re curious about the connection between a company’s price-to-earnings ratio (P/E ratios) and its future stock price movements, conducting correlation analysis can let you measure the strength and direction of this relationship. Is there a negative correlation, suggesting that higher P/E ratios might be associated with lower future stock prices? Or is there no significant correlation at all?

Understanding these inferential analysis techniques can help you uncover potential relationships and group differences that might not be readily apparent from descriptive statistics alone. Nonetheless, it’s important to remember that each technique has its own set of assumptions and limitations . Some methods are designed for parametric data with a normal distribution, while others are suitable for non-parametric data. 

Guide to Conduct Data Analysis in Quantitative Research

Now that we have discussed the types of data analysis techniques used in quantitative research, here’s a quick guide to help you choose the right method and grasp the essential steps of quantitative data analysis.

How to Choose the Right Quantitative Analysis Method?

Choosing between all these quantitative analysis methods may seem like a complicated task, but if you consider the 2 following factors, you can definitely choose the right technique:

Factor 1: Data Type

The data used in quantitative analysis can be categorized into two types, discrete data and continuous data, based on how they’re measured. They can also be further differentiated by their measurement scale. The four main types of measurement scales include: nominal, ordinal, interval or ratio. Understanding the distinctions between them is essential for choosing the appropriate statistical methods to interpret the results of your quantitative data analysis accurately.

Discrete data , which is also known as attribute data, represents whole numbers that can be easily counted and separated into distinct categories. It is often visualized using bar charts or pie charts, making it easy to see the frequency of each value. In the financial world, examples of discrete quantitative data include:

  • The number of shares owned by an investor in a particular company
  • The number of customer transactions processed by a bank per day
  • Bond ratings (AAA, BBB, etc.) that represent discrete categories indicating the creditworthiness of a bond issuer
  • The number of customers with different account types (checking, savings, investment) as seen in the pie chart below:

Pie chart illustrating the distribution of customers with different account types (checking, savings, investment, salary)

Discrete data usually use nominal or ordinal measurement scales, which can then be quantified to calculate their mode or median. Here are some examples:

  • Nominal: This scale categorizes data into distinct groups with no inherent order. For instance, data on bank account types can be considered nominal data, as it classifies customers into distinct categories – checking, savings, or investment accounts – with no inherent order or ranking implied by these account types.
  • Ordinal: Ordinal data establishes a rank or order among categories. For example, investment risk ratings (low, medium, high) are ordered based on their perceived risk of loss, making them a type of ordinal data.
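
As a brief illustration, here is a pandas sketch (the account and risk values are made up) showing how nominal and ordinal variables can be encoded so the appropriate summaries fall out naturally:

```python
# Sketch: encoding nominal vs. ordinal financial data with pandas categoricals.
import pandas as pd

# Nominal: account types have no inherent order, so the mode is the natural summary.
accounts = pd.Series(["checking", "savings", "checking", "investment", "checking"],
                     dtype="category")
print(accounts.mode()[0])            # most common account type

# Ordinal: risk ratings carry a rank, so an ordered categorical preserves it.
risk = pd.Series(pd.Categorical(["low", "high", "medium", "medium", "low"],
                                categories=["low", "medium", "high"], ordered=True))
print(risk.sort_values().iloc[len(risk) // 2])   # middle (median-style) rating
```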

Conversely, continuous data can take on any value and fluctuate over time. It is usually visualized using line graphs, effectively showcasing how the values can change within a specific time frame. Examples of continuous data in the financial industry include:

  • Interest rates set by central banks or offered by banks on loans and deposits
  • Currency exchange rates which also fluctuate constantly throughout the day
  • Daily trading volume of a particular stock on a specific day
  • Stock prices that fluctuate throughout the day, as seen in the line graph below:

Line chart illustrating the fluctuating stock prices

Source: Freepik

The measurement scale for continuous data is usually interval or ratio. Here is a breakdown of their differences:

  • Interval: This builds upon ordinal data by having consistent intervals between each unit, and its zero point doesn’t represent a complete absence of the variable. Let’s use credit score as an example. While the scale ranges from 300 to 850, the interval between each score rating is consistent (50 points), and a score of zero wouldn’t indicate an absence of credit history, but rather no credit score available. 
  • Ratio: This scale has all the same characteristics of interval data but also has a true zero point, indicating a complete absence of the variable. Interest rates expressed as percentages are a classic example of ratio data. A 0% interest rate signifies the complete absence of any interest charged or earned, making it a true zero point.

Factor 2: Research Question

You also need to make sure that the analysis method aligns with your specific research questions. If you merely want to focus on understanding the characteristics of your data set, descriptive statistics might be all you need; if you need to analyze the connection between variables, then you have to include inferential statistics as well.

How to Analyze Quantitative Data 

Step 1: Data Collection

Depending on your research question, you might choose to conduct surveys or interviews. Distributing online or paper surveys can reach a broad audience, while interviews allow for deeper exploration of specific topics. You can also choose to source existing datasets from government agencies or industry reports.

Step 2: Data Cleaning

Raw data might contain errors, inconsistencies, or missing values, so data cleaning has to be done meticulously to ensure accuracy and consistency. This might involve removing duplicates, correcting typos, and handling missing information.

Furthermore, you should also identify the nature of your variables and assign them appropriate measurement scales – nominal, ordinal, interval or ratio. This is important because it determines the types of descriptive statistics and analysis methods you can employ later. Once you categorize your data based on these measurement scales, you can arrange the data in each category in a proper order and organize it in a format that is convenient for you.
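
To make this step concrete, here is a minimal pandas sketch (the transactions table and column names are hypothetical) covering duplicates, inconsistent casing, and missing values:

```python
# Sketch of basic cleaning steps on a hypothetical transactions DataFrame.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "customer_id": [101, 102, 102, 103, 104],
    "account_type": ["savings", "checking", "checking", None, "Savings"],
    "balance": [2500.0, 1800.0, 1800.0, 3200.0, np.nan],
})

df = df.drop_duplicates()                              # remove exact duplicate rows
df["account_type"] = df["account_type"].str.lower()    # fix inconsistent casing
df = df.dropna(subset=["account_type"])                # drop rows missing a category
df["balance"] = df["balance"].fillna(df["balance"].median())  # impute missing values

# Assign an explicit measurement scale to the categorical variable.
df["account_type"] = df["account_type"].astype("category")
print(df.dtypes)
```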

Step 3: Data Analysis

Based on the measurement scales of your variables, calculate relevant descriptive statistics to summarize your data. This might include measures of central tendency (mean, median, mode) and dispersion (range, standard deviation, variance). With these statistics, you can identify the pattern within your raw data. 

Then, these patterns can be analyzed further with inferential methods to test out the hypotheses you have developed. You may choose any of the statistical tests mentioned above, as long as they are compatible with the characteristics of your data.

Step 4: Data Interpretation and Communication

Now that you have the results from your statistical analysis, you may draw conclusions based on the findings and incorporate them into your business strategies. Additionally, you should also transform your findings into clear and shareable information to facilitate discussion among stakeholders. Visualization techniques like tables, charts, or graphs can make complex data more digestible so that you can communicate your findings efficiently. 
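
As a simple example of this step, here is a short matplotlib sketch (the account types and satisfaction scores are invented) that turns summary results into a shareable chart:

```python
# Sketch: turning summary results into a simple chart for stakeholders.
# The category labels and mean satisfaction scores are invented.
import matplotlib.pyplot as plt

groups = ["Checking", "Savings", "Investment"]
mean_scores = [3.8, 4.1, 3.2]

fig, ax = plt.subplots()
ax.bar(groups, mean_scores)
ax.set_ylabel("Mean satisfaction score (1-5)")
ax.set_title("Average satisfaction by account type")
plt.tight_layout()
plt.show()
```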

Useful Quantitative Data Analysis Tools and Software 

We’ve compiled some commonly used quantitative data analysis tools and software. Choosing the right one depends on your experience level, project needs, and budget. Here’s a brief comparison: 

Learning curve | Best suited for | Licensing
Easiest | Beginners & basic analysis | One-time purchase with Microsoft Office Suite
Easy | Social scientists & researchers | Paid commercial license
Easy | Students & researchers | Paid commercial license or student discounts
Moderate | Businesses & advanced research | Paid commercial license
Moderate | Researchers & statisticians | Paid commercial license
Moderate (coding optional) | Programmers & data scientists | Free & open-source
Steep (coding required) | Experienced users & programmers | Free & open-source
Steep (coding required) | Scientists & engineers | Paid commercial license
Steep (coding required) | Scientists & engineers | Paid commercial license

Quantitative Data in Finance and Investment

So how does this all affect the finance industry? Quantitative finance (or quant finance) has become a growing trend, with the quant fund market valued at $16,008.69 billion in 2023. This value is expected to increase at the compound annual growth rate of 10.09% and reach $31,365.94 billion by 2031, signifying its expanding role in the industry.

What is Quant Finance?

Quant finance is the process of using massive amounts of financial data and mathematical models to identify market behavior, financial trends, movements, and economic indicators in order to predict future trends. These calculated probabilities can then be leveraged to find potential investment opportunities and to maximize returns while minimizing risks.

Common Quantitative Investment Strategies

There are several common quantitative strategies, each offering unique approaches to help stakeholders navigate the market:

1. Statistical Arbitrage

This strategy aims for high returns with low volatility. It employs sophisticated algorithms to identify minuscule price discrepancies across the market, then capitalize on them at lightning speed, often generating short-term profits. However, its reliance on market efficiency makes it vulnerable to sudden market shifts, posing a risk of disrupting the calculations.

2. Factor Investing 

This strategy identifies and invests in assets based on factors like value, momentum, or quality. By analyzing these factors in quantitative databases , investors can construct portfolios designed to outperform the broader market. Overall, this method offers diversification and potentially higher returns than passive investing, but its success relies on the historical validity of these factors, which can evolve over time.

3. Risk Parity

This approach prioritizes portfolio balance above all else. Instead of allocating assets based on their market value, risk parity distributes them based on their risk contribution to achieve a desired level of overall portfolio risk, regardless of individual asset volatility. Although it is efficient in managing risks while potentially offering positive returns, it is important to note that this strategy’s complex calculations can be sensitive to unexpected market events.

4. Machine Learning & Artificial Intelligence (AI)

Quant analysts are beginning to incorporate these cutting-edge technologies into their strategies. Machine learning algorithms can act as data sifters, identifying complex patterns within massive datasets; whereas AI goes a step further, leveraging these insights to make investment decisions, essentially mimicking human-like decision-making with added adaptability. Despite the hefty development and implementation costs, its superior risk-adjusted returns and uncovering hidden patterns make this strategy a valuable asset.

Pros and Cons of Quantitative Data Analysis

Advantages of Quantitative Data Analysis

Minimum Bias for Reliable Results

Quantitative data analysis relies on objective, numerical data. This minimizes bias and human error, allowing stakeholders to make investment decisions without emotional intuitions that can cloud judgment. In turn, this offers reliable and consistent results for investment strategies.

Precise Calculations for Data-Driven Decisions

Quantitative analysis generates precise numerical results through statistical methods. This allows accurate comparisons between investment options and even predictions of future market behavior, helping investors make informed decisions about where to allocate their capital while managing potential risks.

Generalizability for Broader Insights 

By analyzing large datasets and identifying patterns, stakeholders can generalize the findings from quantitative analysis into broader populations, applying them to a wider range of investments for better portfolio construction and risk management.

Efficiency for Extensive Research

Quantitative research is well suited to analyzing large datasets efficiently, letting companies save valuable time and resources. The software used for quantitative analysis can automate the process of sifting through extensive financial data, facilitating quicker decision-making in the fast-paced financial environment.

Disadvantages of Quantitative Data Analysis

Limited Scope

By focusing on numerical data, quantitative analysis may provide a limited scope, as it can’t capture qualitative context such as emotions, motivations, or cultural factors. Although quantitative analysis provides a strong starting point, neglecting qualitative factors can lead to incomplete insights in the financial industry, impacting areas like customer relationship management and targeted marketing strategies.

Oversimplification 

Breaking down complex phenomena into numerical data could cause analysts to overlook the richness of the data, leading to the issue of oversimplification. Stakeholders who fail to understand the complexity of economic factors or market trends could face flawed investment decisions and missed opportunities.

Reliable Quantitative Data Solution 

In conclusion, quantitative data analysis offers a deeper insight into market trends and patterns, empowering you to make well-informed financial decisions. However, collecting comprehensive data and analyzing them can be a complex task that may divert resources from core investment activity. 

As a reliable provider, TEJ understands these concerns. Our TEJ Quantitative Investment Database offers high-quality financial and economic data for rigorous quantitative analysis. This data captures the true market conditions at specific points in time, enabling accurate backtesting of investment strategies.

Furthermore, TEJ offers diverse data sets that go beyond basic stock prices, encompassing various financial metrics, company risk attributes, and even broker trading information, all designed to empower your analysis and strategy development. Save resources and unlock the full potential of quantitative finance with TEJ’s data solutions today!


Data Collection | Definition, Methods & Examples

Published on June 5, 2020 by Pritha Bhandari. Revised on June 21, 2023.

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem .

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The  aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Frequently asked questions about data collection

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address, and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data :

  • Quantitative data is expressed in numbers and graphs and is analyzed through statistical methods .
  • Qualitative data is expressed in words and analyzed through interpretations and categorizations.

If your aim is to test a hypothesis , measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data. If you have several aims, you can use a mixed methods approach that collects both types of data.

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.


Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews , focus groups , and ethnographies are qualitative methods.
  • Surveys , observations, archival research and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Data collection methods

Method | When to use | How to collect data
Experiment | To test a causal relationship. | Manipulate variables and measure their effects on others.
Survey | To understand the general characteristics or opinions of a group of people. | Distribute a list of questions to a sample online, in person, or over the phone.
Interview/focus group | To gain an in-depth understanding of perceptions or opinions on a topic. | Verbally ask participants open-ended questions in individual interviews or focus group discussions.
Observation | To understand something in its natural setting. | Measure or survey a sample without trying to affect them.
Ethnography | To study the culture of a community or organization first-hand. | Join and participate in a community and record your observations and reflections.
Archival research | To understand current or historical events, conditions, or practices. | Access manuscripts, documents, or records from libraries, depositories, or the internet.
Secondary data collection | To analyze data from populations that you can't access first-hand. | Find existing datasets that have already been collected, from sources such as government agencies or research organizations.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design (e.g., determine inclusion and exclusion criteria ).

Operationalization

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalization means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.

You may need to develop a sampling plan to obtain data systematically. This involves defining a population , the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and timeframe of the data collection.
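As a rough illustration of how a sampling plan can be executed once the population is defined, the sketch below draws a simple random sample and a proportionate stratified sample with pandas. The roster, column names, and sample sizes are hypothetical and not part of the original article.

```python
import pandas as pd

# Hypothetical employee roster; the column names and sizes are assumptions for illustration.
employees = pd.DataFrame({
    "employee_id": range(1, 501),
    "department": ["Sales", "IT", "HR", "Finance", "Ops"] * 100,
})

# Simple random sample of 50 employees (fixed seed so the draw is reproducible).
simple_random = employees.sample(n=50, random_state=42)

# Proportionate stratified sample: 10% of each department.
stratified = employees.groupby("department").sample(frac=0.10, random_state=42)

print(simple_random["department"].value_counts())
print(stratified["department"].value_counts())
```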

Standardizing procedures

If multiple researchers are involved, write a detailed manual to standardize data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorize observations. This helps you avoid common research biases like omitted variable bias or information bias .

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organize and store your data.

  • If you are collecting data from people, you will likely need to anonymize and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimize distortion.
  • You can prevent loss of data by having an organization system that is routinely backed up.

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

The closed-ended questions ask participants to rate their manager’s leadership skills on scales from 1–5. The data produced is numerical and can be statistically analyzed for averages and patterns.
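A minimal sketch of how such closed-ended ratings might be summarized, assuming the responses are loaded into a pandas DataFrame; the column names and values below are invented for illustration.

```python
import pandas as pd

# Hypothetical closed-ended responses (1-5 ratings); column names and values are invented.
responses = pd.DataFrame({
    "department": ["Sales", "Sales", "IT", "IT", "HR", "HR"],
    "office":     ["East", "West", "East", "West", "East", "West"],
    "leadership_rating": [4, 3, 5, 4, 2, 3],
})

# Overall average rating, then averages broken down by department and office location.
print(responses["leadership_rating"].mean())
print(responses.groupby(["department", "office"])["leadership_rating"].mean())
```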

To ensure that high quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess the reliability and validity to get an indication of your data quality.


Frequently asked questions about data collection

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

Source: Bhandari, P. (2023, June 21). Data Collection | Definition, Methods & Examples. Scribbr. https://www.scribbr.com/methodology/data-collection/


Qualitative VS Quantitative Definition – Research Methods and Data


When undertaking any type of research study, the data collected will fall into one of two categories: qualitative or quantitative. But what exactly is the difference between these two data types and research methodologies?

Put simply, quantitative data deals with numbers, objective facts and measurable statistics. For example, quantitative data provides specifics on values like website traffic metrics, sales figures, survey response rates, operational costs, etc.

Qualitative data, on the other hand, reveals deeper insights into people's subjective perspectives, experiences, beliefs and behaviors. Instead of numbers, qualitative findings are expressed through detailed observations, interviews, focus groups and more.

Now let's explore both types of research to understand how and when to apply these methodologies.

Qualitative Research: An In-Depth Perspective

The purpose of qualitative research is to comprehend human behaviors, opinions, motivations and tendencies through an in-depth exploratory approach. Qualitative studies generally seek to answer "why" and "how" questions to uncover deeper meaning and patterns.

Key Features of Qualitative Research

  • Exploratory and open-ended data collection
  • Subjective, experiential and perception-based findings
  • Textual, audio and visual data representation
  • Smaller purposeful sample sizes with participants studied in-depth
  • Findings provide understanding and context around human behaviors

Some examples of popular qualitative methods include:

  • In-depth interviews – Open discussions exploring perspectives
  • Focus groups – Facilitated group discussions
  • Ethnographic research – Observing behaviors in natural environments
  • Content analysis – Studying documents, images, videos, etc.
  • Open-ended surveys or questionnaires – Subjective questions

The benefit of these techniques is collecting elaborate and descriptive qualitative data based on personal experiences rather than just objective facts and figures. This reveals not just what research participants are doing but more importantly, why they think, feel and act in certain ways.

For example, an open-ended survey may find that 52% of respondents felt "happy" about using a particular smartphone brand. But in-depth interviews would help uncover exactly why they feel this way by collecting descriptive details on their user experience.

In essence, qualitative techniques like interviews and ethnographic studies add crucial context . This allows us to delve deeper into research problems to gain meaningful insights.

Quantitative Research: A Data-Driven Approach

Unlike qualitative methods, quantitative research relies primarily on the collection and analysis of objective, measurable numerical data. This structured empirical evidence is then manipulated using statistical, graphical and mathematical techniques to derive patterns, trends and conclusions.

Key Aspects of Quantitative Research

  • Numerical, measurable and quantifiable data
  • Objective facts and empirical evidence
  • Statistical, mathematical or computational analysis
  • Larger randomized sample sizes to generalize findings
  • Research aims to prove, disprove or lend support to existing theories

Some examples of quantitative methods include:

  • Closed-ended surveys with numeric rating scales
  • Multiple choice/dichotomous questionnaires
  • Counting behaviors, events or attributes as frequencies
  • Scientific experiments generating stats and figures
  • Economic and marketing modeling based on historical data

For instance, an online survey may find that 74% of respondents rate a particular laptop 4 or higher on a 5-point scale for quality. Or an experiment might determine that a revised checkout process increases e-commerce conversion rates by 14.5%.
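For illustration, here is a minimal sketch of how headline figures like these could be derived from raw data in Python; the ratings and conversion rates below are invented, not taken from an actual study.

```python
# Hypothetical raw values, invented for illustration only.
ratings = [5, 4, 3, 4, 5, 2, 4, 5, 4, 3]            # 1-5 quality ratings
share_4_plus = sum(r >= 4 for r in ratings) / len(ratings)
print(f"{share_4_plus:.0%} rated the laptop 4 or higher")

# Relative lift in conversion rate after a checkout redesign (rates are made up).
rate_before, rate_after = 0.048, 0.055
lift = (rate_after - rate_before) / rate_before
print(f"Conversion rate increased by {lift:.1%}")
```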

The benefit of quantitative data is that it generates hard numbers and statistics that allow objective measurement and comparison between groups or changes over time. But the limitation is it lacks detailed insights into the subjective reasons and context behind the data.

Qualitative vs. Quantitative: A Comparison

Qualitative | Quantitative
Textual data | Numerical data
In-depth insights | Hard facts/stats
Subjective | Objective
Detailed contexts | Generalizable data
Explores "why/how" | Tests "what/when"
Interviews, focus groups | Surveys, analytics

Is Qualitative or Quantitative Research Better?

Qualitative and quantitative methodologies have differing strengths and limitations. Expert researchers argue both approaches play an invaluable role when combined effectively .

Qualitative research allows rich exploration of perceptions, motivations and ideas through open-ended inquiry. This generates impactful insights but typically with smaller sample sizes focused on depth over breadth.

Quantitative research statistically analyzes empirical evidence to uncover patterns and test hypotheses. This lends generalizable support to relationships between variables but risks losing contextual qualitative detail.

In short, qualitative research informs the human perspective while quantitative research informs the overarching trends. Together, they approach a problem from both a granular and a big-picture level, supporting robust conclusions.

Integrating Mixed Research Methods

Mixing qualitative and quantitative techniques leverages the strengths while minimizing the weaknesses of both approaches. This integration can happen sequentially in phases or concurrently in parallel strands:

Sequential Mixed Methods

  • Initial exploratory qualitative data collection via interviews, ethnography etc.
  • Develop hypotheses and theories based on qualitative findings
  • Follow up with quantitative research to test hypotheses
  • Interpret how quantitative results explain qualitative discoveries

Concurrent Mixed Methods

  • Simultaneously collect both qualitative and quantitative data
  • Merge findings to provide a comprehensive analysis
  • Compare results between sources to cross-validate conclusions

This intermixing provides corroboration between subjective qualitative themes and hard quantitative figures to produce actionable insights.

Let's look at two examples of effective mixed methods research approaches.

Applied Examples of Mixed Methods

Hospital Patient Experience Analysis

A hospital administrator seeks to improve patient satisfaction rates.

Quantitative Data

  • Statistical survey ratings for aspects like room cleanliness, wait times, staff courtesy etc.
  • Rankings benchmarked over time and against other hospitals

Qualitative Data

  • Patient interviews detailing frustrations, likes/dislikes and emotional journey
  • Expert focus groups discussing challenges and brainstorming solutions

Combined Analysis

Statistical survey analysis coupled with patient interview narratives provides a robust perspective into precisely which issues most critically impact patient experience and what solutions may have the greatest impact.

Product Development Research

A technology company designs a new smartphone app prototype.

  • App metric tracking showing feature usage frequencies, conversions, churn rates
  • In-app surveys measuring ease-of-use ratings on numeric scales
  • Moderated focus groups discussing reactions to prototype
  • Diary studies capturing user challenges and delights

Metrics prove what features customers interact with most while qualitative findings explain why they choose to use or abandon certain app functions. This drives effective product refinement.

As demonstrated, thoughtfully blending quantitative and qualitative techniques can provide powerful multifaceted insights.

Tying It All Together: A Nuanced Perspective

Qualitative and quantitative research encompass differing but complementary methodological paradigms for understanding our world through data.

Qualitative research allows inquiry into the depths of human complexities – perceptions, stories, symbols and meanings. Meanwhile, quantitative methods enable us to zoom out and systematically analyze empirical patterns.

Leveraging both modes of discovery provides a nuanced perspective for unlocking insights. As analyst John Tukey noted, "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."

Rather than blindly following statistics alone, factoring in qualitative details allows us to carefully interpret the context and meaning behind the numbers.

In closing, elegantly integrating quantitative precision with qualitative awareness offers a multilayered lens for conducting research and driving data-savvy decisions.





Qualitative vs. Quantitative Research | Differences, Examples & Methods

When it comes to collecting, analyzing, and interpreting research findings, two primary approaches stand out - qualitative and quantitative methods. While both are essential for gaining a deeper understanding of various topics, they differ significantly in their approach, methodology, data collection techniques, analysis procedures, and the insights they provide.



Quantitative Research: Numbers & Statistics

Quantitative research focuses on numerical values to test hypotheses or confirm theories through statistical analyses. This type of research involves collecting large datasets using methods such as surveys with closed-ended questions, experiments where variables are controlled and manipulated, observations in natural environments without control over variables, and literature reviews of works published by other authors.

Quantitative data collection techniques include:

  • Surveys: Distributed to a sample population via online platforms or face-to-face interactions.
  • Experiments: Manipulating independent variables while controlling for confounding factors.
  • Observations: Recording events in natural settings without intervention.

Common quantitative biases and limitations include information bias, omitted variable bias, sampling bias, and selection bias.

Qualitative Research: Words & Meanings

Qualitative research focuses on the realm of words to understand concepts, thoughts, or experiences through open-ended interview questions, observations described verbally, and literature reviews exploring theoretical frameworks.

Key qualitative data collection methods include:

  • Interviews with in-depth questioning and follow-up clarification
  • Observations where events are documented using descriptive language
  • Literature Reviews examining existing theories & conceptualizations

Qualitative research is susceptible to biases such as the Hawthorne effect (observer influence), observer bias, recall bias, and social desirability bias.

The Differences Between Quantitative vs. Qualitative Research Methods

Quantitative and qualitative approaches differ in their data collection methods, analysis procedures, and insights gained.

Data Collection Techniques

Both quantitative and qualitative research employ various techniques for collecting information; however, some are more commonly associated with one type than the other.

Techniques commonly used by both include surveys (which can be open-ended or closed-ended), observational studies (where data can be recorded as numbers, e.g., rating scales), and case studies.

Quantitative data collection methods tend to focus on numerical representations, whereas qualitative approaches emphasize descriptive language.

When Choosing Between Qualitative vs. Quantitative Research Methods

A general guideline for selecting between these two is:

  • Use quantitative research if you aim to confirm or test a hypothesis or theory.
  • Use qualitative research when seeking an in-depth understanding of concepts, thoughts, or experiences.

Most topics allow the use of either; mixed-method approaches that combine both are also viable, depending on your research question(s), whether you take a deductive or inductive approach, and practical considerations such as time and resources.

Research Question

Example: How satisfied are students with their studies?

Quantitative Approach:

Survey 300 university students asking questions like "on a scale from 1-5, how do you rate professors?" Statistical analysis can reveal average ratings (e.g., 4.2).

Qualitative approach:

Conduct in-depth interviews using open-ended queries such as: “How satisfied are you with your studies?” Transcribe & analyze responses to identify common themes.

Mixed Methods Approach:

Combine both approaches by first conducting qualitative research through interviews, then quantifying the findings via a survey.

Analyzing Qualitative and Quantitative Data

Data analysis is crucial for extracting meaningful insights from collected data. The approach differs significantly between quantitative (numbers) and qualitative methods (words).

Quantitative Analysis: Numbers & Statistics

Use statistical software like Excel or SPSS to discover patterns, correlations/causations in numerical datasets.

Common analyses include:

  • Average scores/means
  • Frequency counts of specific answers
  • Correlation coefficient analysis for relationships among variables.
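A minimal pandas sketch of the analyses listed above (means, frequency counts, and a correlation coefficient), assuming a small hypothetical survey dataset; the column names and values are illustrative only.

```python
import pandas as pd

# Hypothetical survey dataset; the column names and values are illustrative only.
df = pd.DataFrame({
    "satisfaction":  [4, 5, 3, 4, 2, 5, 4, 3],
    "hours_studied": [10, 14, 6, 9, 4, 15, 11, 7],
    "answer":        ["yes", "yes", "no", "yes", "no", "yes", "yes", "no"],
})

print(df["satisfaction"].mean())                      # average score / mean
print(df["answer"].value_counts())                    # frequency counts of specific answers
print(df["satisfaction"].corr(df["hours_studied"]))   # Pearson correlation coefficient
```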

Qualitative Data Analysis:

Analyze text-based data through various techniques such as content analysis (word frequency), thematic identification, and discourse examination.

Some common qualitative approaches are: 

  • Qualitative content analysis: examining word occurrences, positions, and meanings.
  • Thematic analysis: closely analyzing the dataset to identify main themes and patterns.
  • Discourse analysis: studying communication dynamics in social contexts.
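As a rough illustration of the word-frequency step of content analysis, the sketch below counts recurring terms in a short invented interview excerpt; thematic interpretation of the counts still rests with the researcher.

```python
import re
from collections import Counter

# Hypothetical interview excerpt; the text is invented for illustration.
transcript = (
    "I felt supported by my manager, but the workload was heavy. "
    "The workload made it hard to feel supported every week."
)

words = re.findall(r"[a-z']+", transcript.lower())
stopwords = {"i", "by", "my", "but", "the", "was", "it", "to", "every", "made", "hard"}
counts = Counter(w for w in words if w not in stopwords)
print(counts.most_common(5))  # e.g. [('supported', 2), ('workload', 2), ...]
```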

Ultimately, understanding when to use each method depends on your research question(s), whether you are taking an inductive or deductive approach, and practical considerations such as time, resources, and access. By combining qualitative and quantitative methods (mixed-method approaches), researchers can gain a more comprehensive view of their subject matter.



Presenting data in tables and charts *

Rodrigo Pereira Duquia (1), João Luiz Bastos (2), Renan Rangel Bonamigo, David Alejandro González-Chica, Jeovany Martínez-Mesa (3)

(1) Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA) - Porto Alegre (RS), Brazil. (2) Universidade Federal de Santa Catarina (UFSC) - Florianópolis (SC), Brazil. (3) Latin American Cooperative Oncology Group (LACOG) - Porto Alegre (RS), Brazil.

The present paper aims to provide basic guidelines to present epidemiological data using tables and graphs in Dermatology. Although simple, the preparation of tables and graphs should follow basic recommendations, which make it much easier to understand the data under analysis and to promote accurate communication in science. Additionally, this paper deals with other basic concepts in epidemiology, such as variable, observation, and data, which are useful both in the exchange of information between researchers and in the planning and conception of a research project.

INTRODUCTION

Among the essential stages of epidemiological research, one of the most important is the identification of data with which the researcher is working, as well as a clear and synthetic description of these data using graphs and tables. The identification of the type of data has an impact on the different stages of the research process, encompassing the research planning and the production/publication of its results. For example, the use of a certain type of data impacts the amount of time it will take to collect the desired information (throughout the field work) and the selection of the most appropriate statistical tests for data analysis.

On the other hand, the preparation of tables and graphs is a crucial tool in the analysis and production/publication of results, given that it organizes the collected information in a clear and summarized fashion. The correct preparation of tables allows researchers to present information about tens or hundreds of individuals efficiently and with significant visual appeal, making the results more easily understandable and thus more attractive to the users of the produced information. Therefore, it is very important for the authors of scientific articles to master the preparation of tables and graphs, which requires previous knowledge of data characteristics and the ability of identifying which type of table or graph is the most appropriate for the situation of interest.

BASIC CONCEPTS

Before evaluating the different types of data that permeate an epidemiological study, it is worth discussing about some key concepts (herein named data, variables and observations):

Data - during field work, researchers collect information by means of questions, systematic observations, and imaging or laboratory tests. All this gathered information represents the data of the research. For example, it is possible to determine the color of an individual's skin according to Fitzpatrick classification or quantify the number of times a person uses sunscreen during summer. 1 , 2 All the information collected during research is generically named "data." A set of individual data makes it possible to perform statistical analysis. If the quality of data is good, i.e., if the way information was gathered was appropriate, the next stages of database preparation, which will set the ground for analysis and presentation of results, will be properly conducted.

Observations - are measurements carried out in one or more individuals, based on one or more variables. For instance, if one is working with the variable "sex" in a sample of 20 individuals and knows the exact amount of men and women in this sample (10 for each group), it can be said that this variable has 20 observations.

Variables - are constituted by data. For instance, an individual may be male or female. In this case, there are 10 observations for each sex, but "sex" is the variable that is referred to as a whole. Another example of variable is "age" in complete years, in which observations are the values 1 year, 2 years, 3 years, and so forth. In other words, variables are characteristics or attributes that can be measured, assuming different values, such as sex, skin type, eye color, age of the individuals under study, laboratory results, or the presence of a given lesion/disease. Variables are specifically divided into two large groups: (a) the group of categorical or qualitative variables, which is subdivided into dichotomous, nominal and ordinal variables; and (b) the group of numerical or quantitative variables, which is subdivided into continuous and discrete variables.

Categorical variables

  • Dichotomous variables, also known as binary variables: are those that have only two categories, i.e., only two response options. Typical examples of this type of variable are sex (male and female) and presence of skin cancer (yes or no).
  • Ordinal variables: are those that have three or more categories with an obvious ordering of the categories (whether in an ascending or descending order). For example, Fitzpatrick skin classification into types I, II, III, IV and V. 1
  • Nominal variables: are those that have three or more categories with no apparent ordering of the categories. Example: blood types A, B, AB, and O, or brown, blue or green eye colors.

Numerical variables

  • Discrete variables: are observations that can only take certain numerical values. An example of this type of variable is subjects' age, when assessed in complete years of life (1 year, 2 years, 3 years, 4 years, etc.) and the number of times a set of patients visited the dermatologist in a year.
  • Continuous variables: are those measured on a continuous scale, i.e., which have as many decimal places as the measuring instrument can record. For instance: blood pressure, birth weight, height, or even age, when measured on a continuous scale.

It is important to point out that, depending on the objectives of the study, data may be collected as discrete or continuous variables and be subsequently transformed into categorical variables to suit the purpose of the research and/or make interpretation easier. However, it is important to emphasize that variables measured on a numerical scale (whether discrete or continuous) are richer in information and should be preferred for statistical analyses. Figure 1 shows a diagram that makes it easier to understand, identify and classify the abovementioned variables.

[Figure 1. Types of variables]

DATA PRESENTATION IN TABLES AND GRAPHS

Firstly, it is worth emphasizing that every table or graph should be self-explanatory, i.e., it should be understandable without the need to read the text that refers to it.

Presentation of categorical variables

In order to analyze the distribution of a variable, data should be organized according to the occurrence of different results in each category. As for categorical variables, frequency distributions may be presented in a table or a graph, including bar charts and pie or sector charts. The term frequency distribution has a specific meaning, referring to the way the observations of a given variable behave in terms of their absolute, relative, or cumulative frequencies.

In order to synthesize information contained in a categorical variable using a table, it is important to count the number of observations in each category of the variable, thus obtaining its absolute frequencies. However, in addition to absolute frequencies, it is worth presenting its percentage values, also known as relative frequencies. For example, table 1 expresses, in absolute and relative terms, the frequency of acne scars in 18-year-old youngsters from a population-based study conducted in the city of Pelotas, Southern Brazil, in 2010. 3

Table 1. Absolute and relative frequencies of acne scar in 18-year-old adolescents (n = 2,414). Pelotas, Brazil, 2010

Acne scar | Absolute frequency (n) | Relative frequency (%)
No | 1,855 | 76.84
Yes | 559 | 23.16
Total | 2,414 | 100.00

The same information from table 1 may be presented as a bar or a pie chart, which can be prepared considering the absolute or relative frequency of the categories. Figures 2 and 3 illustrate the same information shown in table 1, but present it as a bar chart and a pie chart, respectively. It can be observed that, regardless of the form of presentation, the total number of observations must be mentioned, whether in the title or as part of the table or figure. Additionally, appropriate legends should always be included, allowing for the proper identification of each of the categories of the variable and including the type of information provided (absolute and/or relative frequency).

[Figure 2. Absolute frequencies of acne scar in 18-year-old adolescents (n = 2,414). Pelotas, Brazil, 2010]

[Figure 3. Relative frequencies of acne scar in 18-year-old adolescents (n = 2,414). Pelotas, Brazil, 2010]
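A minimal sketch of how the absolute and relative frequencies of a categorical variable, and a corresponding bar chart, might be produced with pandas and matplotlib; the data below are hypothetical and do not reproduce the Pelotas study.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical categorical variable (not the original Pelotas data).
acne_scar = pd.Series(["No"] * 77 + ["Yes"] * 23, name="Acne scar")

absolute = acne_scar.value_counts()                        # absolute frequencies (n)
relative = acne_scar.value_counts(normalize=True) * 100    # relative frequencies (%)
print(pd.DataFrame({"n": absolute, "%": relative.round(2)}))

ax = absolute.plot(kind="bar", title="Acne scar (hypothetical data, n = 100)")
ax.set_ylabel("Absolute frequency (n)")
plt.show()
```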

Presentation of numerical variables

Frequency distributions of numerical variables can be displayed in a table, a histogram chart, or a frequency polygon chart. With regard to discrete variables, it is possible to present the number of observations according to the different values found in the study, as illustrated in table 2 . This type of table may provide a wide range of information on the collected data.

Table 2. Educational level of 18-year-old adolescents (n = 2,199). Pelotas, Brazil, 2010

Educational level (in years of education) | Absolute frequency (n) | Relative frequency (%) | Cumulative relative frequency (%)
0 | 1 | 0.05 | 0.05
1 | 2 | 0.09 | 0.14
2 | 2 | 0.09 | 0.23
3 | 11 | 0.50 | 0.73
4 | 100 | 4.55 | 5.28
5 | 156 | 7.09 | 12.37
6 | 169 | 7.69 | 20.05
7 | 221 | 10.05 | 30.10
8 | 450 | 20.46 | 50.57
9 | 251 | 11.41 | 61.98
10 | 320 | 14.55 | 76.53
11 | 479 | 21.78 | 98.32
12 | 31 | 1.41 | 99.73
13 | 6 | 0.27 | 100.00

Table 2 shows the distribution of educational levels among 18-year-old youngsters from Pelotas, Southern Brazil, with absolute, relative, and cumulative relative frequencies. In this case, absolute and relative frequencies correspond to the absolute number and the percentage of individuals according to their distribution for this variable, respectively, based on complete years of education. It should be noticed that there are 450 adolescents with 8 years of education, which corresponds to 20.5% of the subjects. Tables may also present the cumulative relative frequency of the variable. In this case, it was found that 50.6% of study subjects have up to 8 years of education. It is important to point out that, although the same data were used, each form of presentation (absolute, relative or cumulative frequency) provides different information and may be used to understand frequency distribution from different perspectives.
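For a discrete numerical variable, the absolute, relative, and cumulative relative frequencies shown in table 2 can be computed in a few lines; the sketch below uses a small invented set of values rather than the original dataset.

```python
import pandas as pd

# Hypothetical discrete variable: completed years of education (values invented).
years = pd.Series([8, 11, 8, 10, 9, 8, 7, 11, 10, 8])

absolute = years.value_counts().sort_index()            # absolute frequency (n)
relative = (absolute / absolute.sum() * 100).round(2)   # relative frequency (%)
cumulative = relative.cumsum().round(2)                 # cumulative relative frequency (%)
print(pd.DataFrame({"n": absolute, "%": relative, "cumulative %": cumulative}))
```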

When one wants to evaluate the frequency distribution of continuous variables using tables or graphs, it is necessary to transform the variable into categories, preferably creating categories with the same size (or the same amplitude). However, in addition to this general recommendation, other basic guidelines should be followed, such as: (1) subtracting the highest from the lowest value for the variable of interest; (2) dividing the result of this subtraction by the number of categories to be created (usually from three to ten); and (3) defining category intervals based on this last result.

For example, in order to categorize height (in meters) of a set of individuals, the first step is to identify the tallest and the shortest individual of the sample. Let us assume that the tallest individual is 1.85m tall and the shortest, 1.55m tall, with a difference of 0.3m between these values. The next step is to divide this difference by the number of categories to be created, e.g., five. Thus, 0.3m divided by five equals 0.06m, which means that categories will have exactly this range and will be numerically represented by the following range of values: 1st category - 1.55m to 1.60m; 2nd category - 1.61m to 1.66m; 3rd category - 1.67m to 1.72m; 4th category - 1.73m to 1.78m; 5th category - 1.79m to 1.85m.
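A minimal sketch of this equal-width categorization recipe using pandas; the heights below are invented, and pd.cut is used to build the five intervals.

```python
import pandas as pd

# Hypothetical heights in metres (values invented for illustration).
height = pd.Series([1.55, 1.58, 1.62, 1.66, 1.70, 1.71, 1.74, 1.78, 1.80, 1.85])

n_categories = 5
amplitude = (height.max() - height.min()) / n_categories   # about 0.06 m per category
print(round(amplitude, 2))

# pd.cut builds equal-width intervals, matching the recipe described above.
categories = pd.cut(height, bins=n_categories)
print(categories.value_counts().sort_index())
```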

Table 3 illustrates weight values at 18 years of age in kg (continuous numerical variable) obtained in a study with youngsters from Pelotas, Southern Brazil. 4 , 5 Figure 4 shows a histogram with the variable weight categorized into 20-kg intervals. Therefore, it is possible to observe that data from continuous numerical variables may be presented in tables or graphs.

Table 3. Weight distribution among 18-year-old males (n = 2,194). Pelotas, Brazil, 2010

Weight (kg) | Absolute frequency (n) | Relative frequency (%)
40.5 to 59.9 | 554 | 25.25
60.0 to 65.8 | 543 | 24.75
65.9 to 74.6 | 551 | 25.11
74.7 to 147.8 | 546 | 24.89

[Figure 4. Weight distribution at 18 years of age among youngsters from the city of Pelotas (n = 2,194). Pelotas, Brazil, 2010]

Assessing the relationship between two variables

The forms of data presentation that have been described up to this point illustrated the distribution of a given variable, whether categorical or numerical. In addition, it is possible to present the relationship between two variables of interest, either categorical or numerical.

The relationship between categorical variables may be investigated using a contingency table, which has the purpose of analyzing the association between two or more variables. The lines of this type of table usually display the exposure variable (independent variable), and the columns, the outcome variable (dependent variable). For example, in order to study the effect of sun exposure (exposure variable) on the development of skin cancer (outcome variable), it is possible to place the variable sun exposure on the lines and the variable skin cancer on the columns of a contingency table. Tables may be easier to understand by including total values in lines and columns. These values should agree with the sum of the lines and/or columns, as appropriate, whereas relative values should be in accordance with the exposure variable, i.e., the sum of the values mentioned in the lines should total 100%.

It is such a display of percentage values that makes it possible for risk or exposure groups to be compared with each other, in order to investigate whether individuals exposed to a given risk factor show a higher frequency of the disease of interest. Thus, table 4 shows that 75.0%, 9.0%, and 0.3% of individuals in the study sample who had been working exposed to the sun for 20 years or more, for less than 20 years, and who had never worked exposed to the sun, respectively, developed non-melanoma skin cancer. Another way of interpreting this table is observing that 25.0%, 91.0%, and 99.7% of individuals who had been working exposed to the sun for 20 years or more, for less than 20 years, and who had never worked exposed to the sun did not develop non-melanoma skin cancer. This form of presentation is one of the most commonly used in the literature and makes the table easier to read.

Table 4. Sun exposure during work and non-melanoma skin cancer (hypothetical data).

Work exposed to the sun | Non-melanoma skin cancer: Yes, n (%) | Non-melanoma skin cancer: No, n (%) | Total, n (%)
20 or more years | 30 (75.0) | 10 (25.0) | 40 (100)
<20 years | 9 (9.0) | 90 (91.0) | 99 (100)
Never | 1 (0.3) | 300 (99.7) | 301 (100)
Total | 40 (9.0) | 400 (91.0) | 440 (100)
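A minimal sketch of how a contingency table with row percentages, such as table 4, might be assembled with pandas.crosstab; the exposure and outcome values below are hypothetical and much smaller than the article's example.

```python
import pandas as pd

# Hypothetical exposure and outcome variables (not the article's data).
df = pd.DataFrame({
    "sun_exposure": (["20 or more years"] * 20) + (["<20 years"] * 20) + (["Never"] * 20),
    "skin_cancer":  (["Yes"] * 10 + ["No"] * 10) + (["Yes"] * 2 + ["No"] * 18) + (["No"] * 20),
})

counts = pd.crosstab(df["sun_exposure"], df["skin_cancer"], margins=True)
row_pct = pd.crosstab(df["sun_exposure"], df["skin_cancer"], normalize="index") * 100
print(counts)            # absolute frequencies with line and column totals
print(row_pct.round(1))  # each exposure line sums to 100%
```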

The relationship between two numerical variables or between one numerical variable and one categorical variable may be assessed using a scatter diagram, also known as dispersion diagram. In this diagram, each pair of values is represented by a symbol or a dot, whose horizontal and vertical positions are determined by the value of the first and second variables, respectively. By convention, vertical and horizontal axes should correspond to outcome and exposure variables, respectively. Figure 5 shows the relationship between weight and height among 18-year-old youngsters from Pelotas, Southern Brazil, in 2010. 3 , 4 The diagram presented in figure 5 should be interpreted as follows: the increase in subjects' height is accompanied by an increase in their weight.

[Figure 5. Point diagram for the relationship between weight (kg) and height (cm) among 18-year-old youngsters from the city of Pelotas (n = 2,194). Pelotas, Brazil, 2010.]
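A minimal matplotlib sketch of a scatter (point) diagram following the convention described above, with the exposure variable on the horizontal axis and the outcome on the vertical axis; the height and weight pairs are invented.

```python
import matplotlib.pyplot as plt

# Hypothetical height (cm) and weight (kg) pairs, invented for illustration.
height_cm = [160, 165, 170, 172, 175, 180, 185, 190]
weight_kg = [55, 60, 63, 68, 70, 78, 83, 90]

plt.scatter(height_cm, weight_kg)
plt.xlabel("Height (cm)")   # exposure variable on the horizontal axis
plt.ylabel("Weight (kg)")   # outcome variable on the vertical axis
plt.title("Relationship between height and weight (hypothetical data)")
plt.show()
```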

BASIC RULES FOR THE PREPARATION OF TABLES AND GRAPHS

Ideally, every table should:

  • Be self-explanatory;
  • Present values with the same number of decimal places in all its cells (standardization);
  • Include a title informing what is being described and where, as well as the number of observations (N) and when data were collected;
  • Have a structure formed by three horizontal lines, defining table heading and the end of the table at its lower border;
  • Not have vertical lines at its lateral borders;
  • Provide additional information in table footer, when needed;
  • Be inserted into a document only after being mentioned in the text; and
  • Be numbered by Arabic numerals.

Similarly to tables, graphs should:

  • Include, below the figure, a title providing all relevant information;
  • Be referred to as figures in the text;
  • Identify figure axes by the variables under analysis;
  • Quote the source which provided the data, if required;
  • Demonstrate the scale being used; and
  • Be self-explanatory.

The graph's vertical axis should always start at zero. A usual type of distortion is starting this axis with values higher than zero. Whenever this happens, differences between variables are overestimated, as can be seen in figure 6.

[Figure 6. Graphs in which the Y axis does not start at zero tend to overestimate the differences under analysis: on the left, a graph whose Y axis does not start at zero; on the right, the same data with the Y axis starting at zero.]
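A short matplotlib sketch of the distortion described above: the same hypothetical values plotted with a truncated Y axis and with a Y axis starting at zero.

```python
import matplotlib.pyplot as plt

# Hypothetical group frequencies, invented for illustration.
groups = ["A", "B", "C"]
values = [96, 98, 100]

fig, (ax_bad, ax_good) = plt.subplots(1, 2, figsize=(8, 3))

ax_bad.bar(groups, values)
ax_bad.set_ylim(95, 101)    # truncated axis exaggerates small differences
ax_bad.set_title("Y axis starting at 95")

ax_good.bar(groups, values)
ax_good.set_ylim(0, 110)    # zero baseline keeps the proportions honest
ax_good.set_title("Y axis starting at zero")

plt.tight_layout()
plt.show()
```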

Understanding how to classify the different types of variables and how to present them in tables or graphs is an essential stage of epidemiological research in all areas of knowledge, including Dermatology. Mastering this topic helps to synthesize research results and prevents the misuse or overuse of tables and figures in scientific papers.

Conflict of Interest: None

Financial Support: None

How to cite this article: Duquia RP, Bastos JL, Bonamigo RR, González-Chica DA, Martínez-Mesa J. Presenting data in tables and charts. An Bras Dermatol. 2014;89(2):280-5.

* Work performed at the Dermatology service, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Departamento de Saúde Pública e Departamento de Nutrição da UFSC.


Speaker 1: Welcome to this overview of quantitative research methods. This tutorial will give you the big picture of quantitative research and introduce key concepts that will help you determine if quantitative methods are appropriate for your project study. First, what is educational research? Educational research is a process of scholarly inquiry designed to investigate the process of instruction and learning, the behaviors, perceptions, and attributes of students and teachers, the impact of institutional processes and policies, and all other areas of the educational process. The research design may be quantitative, qualitative, or a mixed methods design. The focus of this overview is quantitative methods. The general purpose of quantitative research is to explain, predict, investigate relationships, describe current conditions, or to examine possible impacts or influences on designated outcomes. Quantitative research differs from qualitative research in several ways. It works to achieve different goals and uses different methods and design. This table illustrates some of the key differences. Qualitative research generally uses a small sample to explore and describe experiences through the use of thick, rich descriptions of detailed data in an attempt to understand and interpret human perspectives. It is less interested in generalizing to the population as a whole. For example, when studying bullying, a qualitative researcher might learn about the experience of the victims and the experience of the bully by interviewing both bullies and victims and observing them on the playground. Quantitative studies generally use large samples to test numerical data by comparing or finding correlations among sample attributes so that the findings can be generalized to the population. If quantitative researchers were studying bullying, they might measure the effects of a bully on the victim by comparing students who are victims and students who are not victims of bullying using an attitudinal survey. In conducting quantitative research, the researcher first identifies the problem. For Ed.D. research, this problem represents a gap in practice. For Ph.D. research, this problem represents a gap in the literature. In either case, the problem needs to be of importance in the professional field. Next, the researcher establishes the purpose of the study. Why do you want to do the study, and what do you intend to accomplish? This is followed by research questions which help to focus the study. Once the study is focused, the researcher needs to review both seminal works and current peer-reviewed primary sources. Based on the research question and on a review of prior research, a hypothesis is created that predicts the relationship between the study's variables. Next, the researcher chooses a study design and methods to test the hypothesis. These choices should be informed by a review of methodological approaches used to address similar questions in prior research. Finally, appropriate analytical methods are used to analyze the data, allowing the researcher to draw conclusions and inferences about the data, and answer the research question that was originally posed. In quantitative research, research questions are typically descriptive, relational, or causal. Descriptive questions constrain the researcher to describing what currently exists. With a descriptive research question, one can examine perceptions or attitudes as well as more concrete variables such as achievement. 
For example, one might describe a population of learners by gathering data on their age, gender, socioeconomic status, and attributes towards their learning experiences. Relational questions examine the relationship between two or more variables. The X variable has some linear relationship to the Y variable. Causal inferences cannot be made from this type of research. For example, one could study the relationship between students' study habits and achievements. One might find that students using certain kinds of study strategies demonstrate greater learning, but one could not state conclusively that using certain study strategies will lead to or cause higher achievement. Causal questions, on the other hand, are designed to allow the researcher to draw a causal inference. A causal question seeks to determine if a treatment variable in a program had an effect on one or more outcome variables. In other words, the X variable influences the Y variable. For example, one could design a study that answered the question of whether a particular instructional approach caused students to learn more. The research question serves as a basis for posing a hypothesis, a predicted answer to the research question that incorporates operational definitions of the study's variables and is rooted in the literature. An operational definition matches a concept with a method of measurement, identifying how the concept will be quantified. For example, in a study of instructional strategies, the hypothesis might be that students of teachers who use Strategy X will exhibit greater learning than students of teachers who do not. In this study, one would need to operationalize learning by identifying a test or instrument that would measure learning. This approach allows the researcher to create a testable hypothesis. Relational and causal research relies on the creation of a null hypothesis, a version of the research hypothesis that predicts no relationship between variables or no effect of one variable on another. When writing the hypothesis for a quantitative question, the null hypothesis and the research or alternative hypothesis use parallel sentence structure. In this example, the null hypothesis states that there will be no statistical difference between groups, while the research or alternative hypothesis states that there will be a statistical difference between groups. Note also that both hypothesis statements operationalize the critical thinking skills variable by identifying the measurement instrument to be used. Once the research questions and hypotheses are solidified, the researcher must select a design that will create a situation in which the hypotheses can be tested and the research questions answered. Ideally, the research design will isolate the study's variables and control for intervening variables so that one can be certain of the relationships being tested. In educational research, however, it is extremely difficult to establish sufficient controls in the complex social settings being studied. In our example of investigating the impact of a certain instructional strategy in the classroom on student achievement, each day the teacher uses a specific instructional strategy. After school, some of the students in her class receive tutoring. Other students have parents that are very involved in their child's academic progress and provide learning experiences in the home. These students may do better because they received extra help, not because the teacher's instructional strategy is more effective. 
Unless the researcher can control for the intervening variable of extra help, it will be impossible to effectively test the study's hypothesis. Quantitative research designs can fall into two broad categories, experimental and quasi-experimental. Classic experimental designs are those that randomly assign subjects to either a control or treatment comparison group. The researcher can then compare the treatment group to the control group to test for an intervention's effect, known as a between-subject design. It is important to note that the control group may receive a standard treatment or may receive a treatment of any kind. Quasi-experimental designs do not randomly assign subjects to groups, but rather take advantage of existing groups. A researcher can still have a control and comparison group, but assignment to the groups is not random. The use of a control group is not required. However, the researcher may choose a design in which a single group is pre- and post-tested, known as a within-subjects design. Or a single group may receive only a post-test. Since quasi-experimental designs lack random assignment, the researcher should be aware of the threats to validity. Educational research often attempts to measure abstract variables such as attitudes, beliefs, and feelings. Surveys can capture data about these hard-to-measure variables, as well as other self-reported information such as demographic factors. A survey is an instrument used to collect verifiable information from a sample population. In quantitative research, surveys typically include questions that ask respondents to choose a rating from a scale, select one or more items from a list, or other responses that result in numerical data. Studies that use surveys or tests need to include strategies that establish the validity of the instrument used. There are many types of validity that need to be addressed. Face validity. Does the test appear at face value to measure what it is supposed to measure? Content validity. Content validity includes both item validity and sampling validity. Item validity ensures that the individual test items deal only with the subject being addressed. Sampling validity ensures that the range of item topics is appropriate to the subject being studied. For example, item validity might be high, but if all the items only deal with one aspect of the subjects, then sampling validity is low. Content validity can be established by having experts in the field review the test. Concurrent validity. Does a new test correlate with an older, established test that measures the same thing? Predictive validity. Does the test correlate with another related measure? For example, GRE tests are used at many colleges because these schools believe that a good grade on this test increases the probability that the student will do well at the college. Linear regression can establish the predictive validity of a test. Construct validity. Does the test measure the construct it is intended to measure? Establishing construct validity can be a difficult task when the constructs being measured are abstract. But it can be established by conducting a number of studies in which you test hypotheses regarding the construct, or by completing a factor analysis to ensure that you have the number of constructs that you say you have. In addition to ensuring the validity of instruments, the quantitative researcher needs to establish their reliability as well. Strategies for establishing reliability include Test retest. 
  • Test-retest: Correlates scores from two different administrations of the same test.
  • Alternate forms: Correlates scores from administrations of two different forms of the same test.
  • Split-half reliability: Treats each half of one test or survey as a separate administration and correlates the results from the two halves.
  • Internal consistency: Uses Cronbach's coefficient alpha to calculate the average of all possible split halves.

Quantitative research almost always relies on a sample that is intended to be representative of a larger population. There are two basic sampling strategies, random and non-random, with a number of specific strategies within each of these approaches.

The next section of this tutorial provides an overview of the procedures for conducting quantitative data analysis. There are specific procedures for collecting the data, preparing for and conducting the analysis, presenting the findings, and connecting them to the body of existing research. This process ensures that the research is a systematic investigation that leads to credible results.

Data come in various sizes and shapes, and it is important to know about these so that the proper analysis can be applied. In 1946, S. S. Stevens first described the properties of measurement systems that allow decisions about the type of measurement and about the attributes of objects that are preserved in numbers. These four levels of measurement are referred to as nominal, ordinal, interval, and ratio.

With nominal data, there is no number value that indicates quantity; instead, a number is assigned to represent a certain attribute, such as 1 to represent male and 2 to represent female. The number is just a label, and numbers could equally be assigned to represent race, religion, or any other categorical information. Nominal data only denote group membership.

With ordinal data, there is again no indication of quantity; rather, numbers are assigned for rank ordering. For example, satisfaction surveys often ask respondents to rank their level of satisfaction with services or programs.

With interval data, there are equal distances between two values, but there is no natural zero. A common example is the Fahrenheit temperature scale: differences between temperature measurements make sense, but ratios do not. For instance, 20 degrees Fahrenheit is not twice as hot as 10 degrees Fahrenheit. Interval data can be added and subtracted, but they cannot be meaningfully multiplied or divided.

Finally, ratio data have all the properties of interval data plus a natural zero that represents the absence of the quantity being measured, so ratios, means, and other numerical calculations are all possible and make sense. Examples of ratio data are height, weight, and speed, or any quantity measured on a scale with a natural zero.

In summary, nominal data can only be counted; ordinal data can be counted and ranked; interval data can also be added and subtracted; and ratio data can additionally be used in ratios and other calculations. Determining what type of data you have is one of the most important aspects of quantitative analysis. Depending on the research question, hypotheses, and research design, the researcher may choose to use descriptive and/or inferential statistics to begin to analyze the data.
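As a concrete illustration of the internal-consistency strategy listed above, the following minimal Python sketch computes Cronbach's coefficient alpha directly from an item-score matrix using the standard item-variance formula; the survey scores below are invented purely for the example.

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's coefficient alpha for an (n_respondents, n_items) score matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = item_scores.shape[1]                               # number of items
    item_variances = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_variance = item_scores.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Illustrative (made-up) data: 6 respondents answering 4 Likert-scale items
scores = np.array([
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 4, 3, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```

Values of alpha closer to 1 indicate that the items behave consistently as a single scale; what counts as acceptable depends on the discipline and the stakes of the instrument.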
Descriptive statistics are best illustrated through everyday examples: sports, the weather, the economy, the stock market, and even our retirement portfolios are all presented using descriptive analysis. The basic terminology of descriptive statistics is already familiar: frequency, mean, median, mode, range, variance, and standard deviation. Simply put, you are describing the data. Some of the most common graphic representations of data are bar graphs, pie graphs, histograms, and box-and-whisker plots.

Attempting to reach conclusions and make inferences beyond these graphic representations or descriptive analyses is referred to as inferential statistics. For example, examining college enrollment over the past decade in a certain geographical region can help estimate what the enrollment for the next year might be.

Frequently in education, the means of two or more groups are compared. When comparing means to help answer a research question, one can use a within-group, between-groups, or mixed-subjects design. In a within-group design, the researcher compares measures of the same subjects across time (hence "within-group") or under different treatment conditions; this can also be referred to as a dependent-groups design. The most basic example of this type of quasi-experimental design is a researcher conducting a pretest of a group of students, subjecting them to a treatment, and then conducting a post-test: the same group has been measured at different points in time. In a between-groups design, subjects are assigned to one of two or more groups, for example control, treatment 1, and treatment 2. Ideally, the sampling and assignment to groups would be random, which would make this an experimental design. The researcher can then compare the means of the treatment groups to the control group to gain insight into the effects of the treatment. In a mixed-subjects design, the researcher tests for significant differences between two or more independent groups while also subjecting them to repeated measures.

Choosing a statistical test to compare groups depends on the number of groups, whether the data are nominal, ordinal, or interval, and whether the data meet the assumptions for parametric tests. Nonparametric tests are typically used with nominal and ordinal data, while parametric tests require interval- or ratio-level data. Parametric tests also make further assumptions: that the data are normally distributed in the population, that participant selection is independent (the selection of one person does not determine the selection of another), and that the variances of the groups being compared are equal. The assumption of independent participant selection cannot be violated, but the others are more flexible.

The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups, and it is especially appropriate for quasi-experimental designs. When choosing a t-test, the assumption is that the data are parametric. The analysis of variance (ANOVA) assesses whether the means of more than two groups are statistically different from each other; it likewise assumes parametric data. The chi-square test can be used when you have nonparametric data and want to compare differences between groups.
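To show how these test choices map onto actual analysis code, here is a short SciPy sketch using simulated scores; the group names, means, and pass/fail counts are invented for illustration and are not part of the original tutorial.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(70, 8, 30)      # simulated post-test scores, control group
treatment_1 = rng.normal(75, 8, 30)  # treatment group 1
treatment_2 = rng.normal(78, 8, 30)  # treatment group 2

# Two groups with parametric (interval/ratio, roughly normal) data: independent-samples t-test
t_stat, p_t = stats.ttest_ind(control, treatment_1)

# More than two groups with parametric data: one-way ANOVA
f_stat, p_f = stats.f_oneway(control, treatment_1, treatment_2)

# Nominal data (counts per category): chi-square test on a contingency table
observed_counts = np.array([[18, 12],    # pass / fail, control
                            [25, 5]])    # pass / fail, treatment
chi2, p_chi, dof, expected = stats.chi2_contingency(observed_counts)

print(f"t-test p = {p_t:.3f}, ANOVA p = {p_f:.3f}, chi-square p = {p_chi:.3f}")
```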
The Kruskal-Wallis test can be used when there are more than two groups and the data are nonparametric.

Correlation analysis is a set of statistical tests used to determine whether there are linear relationships between two or more sets of variables from the same list of items or individuals, for example the achievement and performance of students. The tests provide a statistical yes or no as to whether a significant relationship (correlation) exists between the variables. A correlation test consists of calculating a correlation coefficient between two variables, and again there are parametric and nonparametric choices based on the assumptions of the data. The Pearson r correlation is widely used in statistics to measure the strength of the relationship between linearly related variables. The Spearman rank correlation is a nonparametric test used to measure the degree of association between two variables; it makes no assumptions about the distribution of the data and is used when the Pearson test would give misleading results. Kendall's tau is often included in this list of nonparametric correlation tests and is used to examine the strength of the relationship when there are fewer than 20 ranked items.

Linear regression and correlation are similar and often confused, and sometimes your methodologist will encourage you to examine both calculations. Calculate a linear correlation if you measured both variables, x and y. Use the parametric Pearson correlation coefficient if you are certain you are not violating the test's assumptions; otherwise, choose the nonparametric Spearman correlation coefficient. If either variable has been manipulated using an intervention, do not calculate a correlation. While linear regression, like correlation, indicates the nature of the relationship between two variables, it can also be used to make predictions, because one variable is treated as explanatory while the other is treated as the dependent variable.

Establishing validity is a critical part of quantitative research. As with the rest of quantitative research, there is a defined process for establishing validity, which also supports the transferability of the findings. For a study to be valid, the evidence must support the interpretations of the data, the data must be accurate, and their use in drawing conclusions must be logical and appropriate. Construct validity concerns whether the operationalization of your variables relates to the theoretical concepts you are trying to measure: are you actually measuring what you want to measure? Internal validity means you have evidence that what you did in the study (the program) caused what you observed (the outcome). Conclusion validity is the degree to which conclusions drawn about relationships in the data are reasonable. External validity concerns generalization: the degree to which the conclusions of your study would hold for other persons, in other places, and at other times. Establishing the reliability and validity of your study is one of the most critical elements of the research process.

Once you have decided to embark on a quantitative study, use the following steps to get started. First, review research studies that have been conducted on your topic to determine what methods were used.
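The parametric versus nonparametric correlation choices, and the distinction between correlation and regression, can be tried out with another short SciPy sketch; the study-hours and achievement numbers are simulated for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
study_hours = rng.uniform(0, 20, 50)                           # measured x variable
achievement = 55 + 2.0 * study_hours + rng.normal(0, 8, 50)    # measured y variable

# Parametric choice (assumptions met): Pearson r
r, p_r = stats.pearsonr(study_hours, achievement)

# Nonparametric alternatives: Spearman rank correlation and Kendall's tau
rho, p_rho = stats.spearmanr(study_hours, achievement)
tau, p_tau = stats.kendalltau(study_hours, achievement)

# Linear regression: study_hours treated as explanatory, achievement as dependent
reg = stats.linregress(study_hours, achievement)
predicted_at_15 = reg.intercept + reg.slope * 15    # prediction for 15 hours of study

print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
print(f"y = {reg.intercept:.1f} + {reg.slope:.2f}x; predicted achievement at 15 h = {predicted_at_15:.1f}")
```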
Consider the strengths and weaknesses of the various data collection and analysis methods. Next, review the literature on quantitative research methods. Every aspect of your research has a body of literature associated with it. Just as you would not confine yourself to your course textbooks for your review of research on your topic, you should not limit yourself to your course texts for your review of the methodological literature. Read broadly and deeply from the scholarly literature to gain expertise in quantitative research. Additional self-paced tutorials cover the different methodologies and techniques associated with quantitative research; complete them and review them as often as needed. You will then be prepared to complete a literature review of the specific methodologies and techniques that you will use in your study.


Data Analysis Techniques for Quantitative Study

  • First Online: 27 October 2022


  • Md. Mahsin


This chapter describes the types of data analysis techniques in quantitative research and sampling strategies suitable for quantitative studies, particularly probability sampling, to produce credible and trustworthy explanations of a phenomenon. Initially, it briefly describes the measurement levels of variables. It then provides some statistical analysis techniques for quantitative study with examples using tables and graphs, making it easier for the readers to understand the data presentation techniques in quantitative research. In summary, it will be a beneficial resource for those interested in using quantitative design for their data analysis.


It is called the “Pearson correlation coefficient” in honour of Karl Pearson, a British mathematician who developed the method.


Author information

Md. Mahsin, Department of Mathematics and Statistics, University of Calgary, 2500 University Drive NW, Calgary, Canada (corresponding author)

Editor information

M. Rezaul Islam, Centre for Family and Child Studies, Research Institute of Humanities and Social Sciences, University of Sharjah, Sharjah, United Arab Emirates

Niaz Ahmed Khan, Department of Development Studies, University of Dhaka, Dhaka, Bangladesh

Rajendra Baikady, Department of Social Work, School of Humanities, University of Johannesburg, Johannesburg, South Africa


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Mahsin, M. (2022). Data Analysis Techniques for Quantitative Study. In: Islam, M.R., Khan, N.A., Baikady, R. (eds) Principles of Social Research Methodology. Springer, Singapore. https://doi.org/10.1007/978-981-19-5441-2_16


  • Open access
  • Published: 02 September 2024

Quantitative microbiology with widefield microscopy: navigating optical artefacts for accurate interpretations

  • Georgeos Hardo,
  • Ruizhe Li &
  • Somenath Bakshi

npj Imaging, volume 2, Article number: 26 (2024)


  • Cellular imaging
  • Imaging and sensing
  • Phase-contrast microscopy
  • Single-cell imaging
  • Wide-field fluorescence microscopy

Time-resolved live-cell imaging using widefield microscopy is instrumental in quantitative microbiology research. It allows researchers to track and measure the size, shape, and content of individual microbial cells over time. However, the small size of microbial cells poses a significant challenge in interpreting image data, as their dimensions approach that of the microscope’s depth of field, and they begin to experience significant diffraction effects. As a result, 2D widefield images of microbial cells contain projected 3D information, blurred by the 3D point spread function. In this study, we employed simulations and targeted experiments to investigate the impact of diffraction and projection on our ability to quantify the size and content of microbial cells from 2D microscopic images. This study points to some new and often unconsidered artefacts resulting from the interplay of projection and diffraction effects, within the context of quantitative microbiology. These artefacts introduce substantial errors and biases in size, fluorescence quantification, and even single-molecule counting, making the elimination of these errors a complex task. Awareness of these artefacts is crucial for designing strategies to accurately interpret micrographs of microbes. To address this, we present new experimental designs and machine learning-based analysis methods that account for these effects, resulting in accurate quantification of microbiological processes.


Introduction

Widefield fluorescence microscopy is a cornerstone of quantitative microbiology, allowing for noninvasive, real-time imaging of individual cells. This technique’s capacity to measure the size, shape, and content of individual microbial cells has advanced several areas of quantitative microbiology research, including studies on size regulation and division control in bacteria 1 , 2 , 3 , regulation and noise in gene expression 4 , 5 , 6 , and analysis of interactions between cells and their environments 7 , 8 . Additionally, imaging individual molecules within cells using this technique has enabled the study of the dynamics and organisation of individual genes, mRNAs and proteins, and has facilitated the construction of accurate distributions of their abundance 9 , 10 . In essence, live-cell widefield microscopy plays a pivotal role in developing a comprehensive understanding of biological processes within and between microbial cells, offering insights into their organisation, dynamics, and regulation across different scales.

However, careful scrutiny is required for the extraction of quantitative microbiological information from microscopy data. The size of most microbes (particularly bacterial cells) is comparable to the dimensions of the microscope’s point-spread function (PSF) 11 , resulting in significant diffraction effects on bacterial cell images (blur). Moreover, the thickness of bacterial cells is roughly equivalent to the depth of field (DoF) of a microscope objective. Consequently, 2D widefield images of microbes contain projected 3D information containing diffraction from the 3D PSF. The interplay of projection and diffraction effects can bias the estimation of cell size, shape, and intensity. Additionally, these factors hinder the quantification of low copy-number molecules like mRNAs and transcription factors from single-molecule counting experiments 10 as molecules at varying depths exhibit varying degrees of defocus, overlap in the 2D projection, and coalesce into a single blurred spot due to diffraction. Systematically analysing these effects to understand their impact on the accurate interpretation of microscopy data has proven challenging due to the lack of accurate ground-truth information.

To address this challenge, we utilised SyMBac (Synthetic Micrographs of Bacteria), a virtual microscopy platform (which we introduced in a previous work 12 ) capable of generating synthetic bacterial images under various conditions. This tool allows us to assess diffraction and projection effects through forward simulation. The SyMBac-generated images come with precise ground-truth data, enabling us to accurately quantify errors and biases in different measurements and offer control over a wide range of parameters, encompassing optics, physics, and cell characteristics (size, shape, intensity distribution and fluorescent label type). Consequently, we can analyse how these factors affect image formation and feature extraction. Moreover, the virtual microscopy setup allows us to explore imaging parameters that may be difficult or impractical to realise in actual experiments but are crucial for identifying important variables by amplifying their effects.

In this paper, we use SyMBac to systematically investigate the impact of projection and diffraction on the accurate quantification of three key aspects: (1) cell dimensions, (2) fluorescence intensity of individual cells and (3) counts of individually labelled entities per cell. To validate the findings from our virtual microscopy experiments, we conducted targeted real experiments with variable optical settings. Our analysis revealed previously unrecognised artefacts arising from the interplay of projection and diffraction effects. These artefacts introduce significant errors and biases in the estimation of cell size, intensity, and molecule counts, proving challenging to rectify. Recognising these effects and devising appropriate mitigation strategies is crucial for accurate quantification of microbiological processes from microscopy data. To this end, we have demonstrated that understanding these effects enables designing ‘smart-microscopy’ experiments, along with analytical protocols that minimise their impact while facilitating accurate data interpretation for estimating cell size and content measurements.

Digital widefield fluorescence microscopy experiments

We employed the SyMBac virtual microscopy platform to conduct digital experiments mimicking widefield epifluorescence microscopy, the technique typically used for time-resolved live-cell microbial imaging 13 , 14 . In this configuration, microbial cells are sandwiched between a glass coverslip and a biocompatible material like agarose or PDMS, as they are imaged through the cover glass via the microscope’s objective lens, which can be either upright or inverted (Fig. 1a ).

figure 1

a Schematic optical paths for sample illumination (blue) and emission (green) collection on the camera for image formation in an epi-fluorescence setup. Emitters in the midplane of the cell are in focus. b A stepwise illustration of image formation of a cell uniformly filled with fluorescent emitters. Light from emitters at various planes of the cell is diffracted by the corresponding plane of the 3D point spread function (PSF). Images from multiple planes at various sample depths are projected on top of each other to form the final 2D image of the cell. Further details on the image formation process are given in Supplementary Information 1 . All scale bars are 0.5 μm.

In these settings, the microscope objective has a dual role: it focuses the excitation beam, illuminating the sample through the cover glass and collects emitted photons from the entire sample, focusing them on the camera to form the image (Fig. 1a ). The excitation light illuminates the entire sample, inducing fluorescence throughout. The objective collects emitted light from various planes along the sample’s Z-axis within its depth of field. Each of these planes in the Z-axis introduces blurring due to the 3D PSF, resulting in the projected 2D image, which comprises contributions from each Z-plane. Each contribution is differentially blurred according to its relative distance from the focal plane and the corresponding slice of the 3D PSF (Fig. 1b ). The interplay of projection and diffraction effects in image formation presents significant challenges in accurately extracting the ground-truth distribution of the emitters, as elaborated below.
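The projection-plus-diffraction image-formation process described above (and illustrated in Fig. 1b) can be sketched in a few lines of Python. This is a simplified stand-in that ignores noise, camera sampling, and refractive-index mismatch, and it is not the SyMBac implementation; the array shapes and the assumption that the PSF stack spans the sample's Z-range are ours.

```python
import numpy as np
from scipy.signal import fftconvolve

def widefield_image(emitter_density: np.ndarray, psf_3d: np.ndarray) -> np.ndarray:
    """Form a 2D widefield image from a 3D emitter distribution.

    emitter_density : (Z, Y, X) fluorophore density in the sample
    psf_3d          : (Z, Y, X) point spread function with the same Z spacing,
                      centred so that the in-focus slice sits at index Z // 2
    Each sample plane is blurred by the PSF slice matching its defocus, and all
    blurred planes are summed (projected) to form the final 2D image.
    """
    focus = emitter_density.shape[0] // 2
    psf_focus = psf_3d.shape[0] // 2
    image = np.zeros(emitter_density.shape[1:])
    for z in range(emitter_density.shape[0]):
        psf_slice = psf_3d[psf_focus + (z - focus)]   # PSF slice for this defocus
        image += fftconvolve(emitter_density[z], psf_slice, mode="same")
    return image
```

Toggling the two effects, as in the "nonphysical" digital experiments discussed in the next section, then amounts to either summing the planes without any convolution (projection only) or convolving only the in-focus plane (diffraction only).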

Effects of projection and diffraction on cell size estimation

We start by examining how projection and diffraction impact the quantification of cell size and shape from 2D fluorescence images. The extent of blurring due to “diffraction effects” is linked to the 3D PSF’s size, which depends on the imaging wavelength, the numerical aperture of the objective lens, and any aberrations within the optical system. Consequently, diffraction effects exhibit wavelength-dependent characteristics for a fixed objective lens. In our digital simulations and real experiments, we employed PSFs of different wavelengths to investigate how diffraction impacts error and bias in measurements. Conversely, the manipulation of the depth of field in the imaging setup reveals the influence of “projection effects.” Using SyMBac, we can selectively toggle projection or diffraction effects, thus allowing us to model each effect in isolation by either capturing light from an infinitesimally thin plane or omitting convolution with the PSF, as shown in Fig. 2a, g . These “nonphysical” experiments are instrumental in identifying and understanding each underlying effect and its contribution independently.

figure 2

a Synthetic images of a stained cell membrane demonstrate the independent and combined effects of diffraction and projection on 2D image formation (scale bar = 1 µm). Diffraction effects were simulated using the experimentally measured instrumental PSF (iPSF) of our imaging system. b Radial profiles of intensity (across the cell width) from each panel from a are compared to show the relative shifts caused by projection and diffraction effects. The black dotted line indicates the true position of the cell boundary. c Radial profiles of synthetic-cell images with two wavelengths of iPSFs are shown. Lower wavelengths (green) cause smaller shifts, as expected from a smaller size of the PSF. d Example images of a cell stained with the fluorescent d-amino acids HADA and RADA are shown. As expected, intensity traces across the cell's width show that RADA (red) emission is more diffracted than HADA (blue) emission, and the diffraction is biased towards the centre of the cell. e A plot showing the average measured width of a population of cells stained with HADA and RADA (error bar = 99% CI). Inter-peak distances from radial profiles of RADA images consistently underestimate the width more than HADA images. f Comparison of radial profiles before and after deconvolution shows that deconvolution does not shift and correct the peak position; it only makes the profile sharper. g Synthetic images of a digital cell, uniformly filled with fluorescence emitters, show the effects of diffraction and projection on 2D image formation (scale bar = 1 µm). h We compare the radial intensity profile (across the cell width) with and without projection and diffraction effects corresponding to the panels in g. The black dotted line indicates the true position of the cell boundary. i Trendlines from synthetic data show that the observed/true width ratio is dependent on the cell width, with the error growing rapidly for narrow cells. The trends, however, occur in opposing directions for membrane-stained cells and cytoplasm-labelled cells. Estimated widths are calculated from the interpeak distance in membrane-stained cells and the full width at half maximum (FWHM) of the radial profile of cytoplasm-labelled cells.

Errors and biases in size estimation from membrane-stained images

To define cell boundaries, quantitative microbiologists often use a membrane stain, a fluorescent dye that highlights the cell membrane (or the cell wall, or a tagged fluorescent protein which localises to the cell’s periplasm), creating a bright outline in the image 15 . Researchers have developed algorithms to identify the cell boundary by either setting a threshold on the brightness of the membrane-stained cells or by locating the brightest contour 16 , 17 . To assess the errors and biases in our estimation of cell boundaries from images of membrane-stained cells, we generated digital images of bacteria stained with a membrane marker. When we isolate the effects of projection and diffraction in synthetic images of a membrane-stained cell, we observe that projection causes a notable shift of the intensity distribution towards the cell’s centre. This shift is further illustrated in Fig. 2b , in the corresponding radial intensity profile. Such a shift leads to an underestimation of cell dimensions, especially cell width, which is typically estimated from the interpeak distance. Diffraction further exacerbates this intensity shift towards the centre, resulting in an even greater underestimation of width. The magnitude of this shift is influenced by the imaging wavelength, as expected due to diffraction (as shown in Fig. 2c ). Note: In this discussion, we have focused on cell width estimation from the interpeak distance of radial intensity profiles of such images, due to its pronounced error sensitivity compared to cell length and its quadratic influence on cell volume, significantly affecting overall size estimation.
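For reference, here is a minimal sketch of the interpeak-distance measurement discussed above, assuming a 1D radial intensity profile has already been extracted across the cell width; the prominence threshold and pixel size are placeholder parameters, and, as the text explains, the value it returns is biased towards underestimation.

```python
import numpy as np
from scipy.signal import find_peaks

def width_from_interpeak(radial_profile, pixel_size_um: float):
    """Estimate cell width from a membrane-stain intensity trace across the cell.

    Returns the distance between the two outermost intensity peaks in micrometres,
    or None when two peaks cannot be resolved (as happens for narrow cells imaged
    at long wavelengths).
    """
    profile = np.asarray(radial_profile, dtype=float)
    # the prominence threshold is a heuristic to ignore small ripples in the profile
    peaks, _ = find_peaks(profile, prominence=0.1 * profile.max())
    if len(peaks) < 2:
        return None
    return (peaks[-1] - peaks[0]) * pixel_size_um
```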

Our digital experiments indicate that in a typical widefield imaging setup using a 515 nm emission fluorescent dye and a 1.45 NA objective, the width of a 1 µm wide cell is underestimated by approximately 20%, and that of a 0.5 µm wide cell by 40% (Fig. 2c). The extent of underestimation is higher for dyes with longer emission wavelengths (Fig. 2c), and for cells narrower than 0.6 µm stained with a red fluorescent dye (emission wavelength = 700 nm), two separate peaks are not observable at all due to diffraction blur (Fig. 2i), meaning the width cannot even be estimated. These findings underscore the major biases and limitations of this method for cell size estimation in the existing literature.

To validate the predictions from our digital experiments, we labelled the peptidoglycan layer of individual E. coli cells with two distinct stains emitting at different wavelengths (HADA = 450 nm and RADA = 580 nm), both being fluorescent d-amino acids (FDAAs). We expect these stains to integrate into the same location in the peptidoglycan layer. However, radial intensity profiles revealed a notable inward shift in the intensity peaks of the longer-wavelength dye (RADA) compared to the shorter-wavelength counterpart (HADA) (inset, Fig. 2d), consistent with our simulated profiles in Fig. 2c. Analysing 137 cells, we found that RADA, with the longer wavelength, led to significantly more underestimation of cell width compared to HADA (see Fig. 2e, with additional image examples in Supplementary Information 2). These results validate our prediction about diffraction effects on width estimation. However, since the effects of diffraction on the peak position are dependent on the extent of projection, these effects cannot be eliminated by deconvolution using the PSF, as 2D deconvolution is unable to eliminate the projection effects (shown in Fig. 2f and Supplementary Information 3). 3D deconvolution would partially address this problem, but it is not compatible with the single-plane widefield images that are typically acquired during time-resolved imaging of microcolonies. Instead, using superresolution imaging where diffraction effects are minimised (such as SIM, STED, or PALM 18 , 19 , 20 , 21 ) could help, or employing an imaging system with a shallower depth of field compared to the cell depth (such as a confocal microscope) could also reduce the effects of projection and mitigate the resulting shift from diffraction effects (detailed in Supplementary Information 4).

Errors and biases in size estimation from images of cells uniformly filled with markers

Alternatively, researchers often use thresholding algorithms to segment bacterial cell images based on uniformly distributed fluorescence of molecules within the cytoplasm 22 , 23 , 24 , 25 , 26 , 27 . Various thresholding algorithms are employed to segment cells from their fluorescence images, but each has biases and sensitivities that are challenging to quantify and correct (Supplementary Information 5 ). To quantitatively assess the impact of projection and diffraction on extracting cell dimensions from these types of images, we rely on estimating cell width from the full width at half maximum (FWHM) of the radial profile of the intensity.
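A corresponding sketch of the FWHM-based estimate, again assuming an already-extracted 1D radial profile and a known pixel size; the background handling here (subtracting the profile minimum) is a simplification.

```python
import numpy as np

def width_from_fwhm(radial_profile, pixel_size_um: float):
    """Estimate cell width as the full width at half maximum of a cytoplasmic
    fluorescence profile, interpolating linearly at the half-maximum crossings."""
    profile = np.asarray(radial_profile, dtype=float)
    profile = profile - profile.min()          # crude background subtraction
    half = profile.max() / 2.0
    above = np.where(profile >= half)[0]
    if len(above) < 2:
        return None
    left, right = above[0], above[-1]

    def crossing(i_lo, i_hi):
        # linear interpolation of the index where the profile crosses 'half'
        y_lo, y_hi = profile[i_lo], profile[i_hi]
        return i_lo + (half - y_lo) / (y_hi - y_lo)

    x_left = crossing(left - 1, left) if left > 0 else float(left)
    x_right = crossing(right, right + 1) if right < len(profile) - 1 else float(right)
    return (x_right - x_left) * pixel_size_um
```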

Unlike in our previous analysis of membrane-stained cells, projection and diffraction have opposite effects on size quantification for cytoplasm-stained cells. Projection effects cause the intensity distribution to shift toward the centre, leading to a bias towards underestimation of cell width from FWHM, while diffraction effects result in light bleeding out, making the radial profile wider than their projected version (see Fig. 2g, h ). We demonstrate that increasing the depth of projection leads to an underestimation of cell dimensions beyond a critical cell width (Supplementary Information 4 ), while higher imaging wavelengths result in increased image blur from diffraction, leading to a bias towards overestimated dimensions. The diffraction effect is also apparent in brightfield/phase-contrast images. In Supplementary Information 6 , we compare radial profiles of phase-contrast images of the same cell collected with different emission filters. The results reveal that phase-contrast images collected with blue emission filters exhibit significantly sharper features and narrower profiles than those collected with red emission filters. Results from image segmentation of phase-contrast or brightfield images of bacteria are affected by such biases and should be corrected for 28 , 29 .

It is important to note that both imaging approaches (membrane-stained or cytoplasm-labelled) exhibit biases in cell dimension estimation that strongly depend on the actual cell dimensions. Figure 2i shows that the relative width-estimation bias from the membrane image decreases as cell width increases, while the estimates from the cytoplasmic marker exhibit an opposite, but less pronounced effect. In the case of membrane-stained images, the shifts from projection and diffraction happen in the same direction, while they oppose each other in case of cytoplasmically-labelled images. An accurate model of the imaging system can be used to calculate correction factors for a given wavelength, which could then be applied to estimate the true dimensions from the observed profile. Virtual microscopy platforms, such as SyMBac, could be utilised to simulate these effects to computationally estimate such correction factors (Supplementary Information 7 ). However, it is difficult to recover the outline of an individual cell to accurately estimate size and shape using this approach. In the following section, we explore methods for incorporating these effects into training deep learning image segmentation models, enabling the models to accurately estimate cell sizes and shapes from 2D images.
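As a sketch of the correction-factor idea mentioned above: if a virtual-microscopy simulation provides observed widths for a grid of known true widths at a given wavelength, the monotonic relationship can be inverted by interpolation. The numbers below are illustrative placeholders, not values from this study.

```python
import numpy as np

# Hypothetical simulation output for one wavelength: observed width for each true width
true_widths_um = np.array([0.5, 0.7, 0.9, 1.1, 1.3])
observed_widths_um = np.array([0.30, 0.49, 0.70, 0.92, 1.13])   # placeholder values

def correct_width(observed_um: float) -> float:
    """Map an observed width back to an estimated true width by inverting the
    simulated observed-versus-true relationship."""
    return float(np.interp(observed_um, observed_widths_um, true_widths_um))

print(correct_width(0.6))   # estimated true width for an observed 0.6 um
```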

Deep learning approaches for precise quantification of cell dimensions

There has been a surge in the popularity of deep learning approaches for cell image segmentation 30 , 31 , 32 , 33 , 34 , 35 . However, the accuracy of these models is inherently linked to the quality of their training data. Generating training data for microscopic images of microbes presents unique challenges compared to standard computer vision tasks. Here, projection and diffraction effects are comparable to object dimensions and, as a result, impede computational boundary identification, as mentioned in our previous work 12 . Manual annotation is also affected because 2D images lack clear boundaries and contain intensity gradients. In essence, the images are blurry. To evaluate the performance of human annotators, we provided them with synthetic images with accurate ground truths and conducted a benchmarking experiment. Supplementary Information 8 details this experiment and the corresponding results, revealing that human annotator performance is not only highly variable but also consistently exhibits an underestimation bias stemming from projection effects.

Inaccuracies and biases in training data, whether originating from computational thresholding or human annotation, compromise the integrity of object-image relationships, thereby leading to the corrupted performance of deep-learning models. The subsequent analysis shows that the highly versatile Omnipose algorithm (specifically bact_fluor_omni) 30 , when trained on human-annotated synthetic fluorescence images, compromises its efficacy in cell segmentation (Fig. 3 ). This phenomenon parallels findings from our recent publication 12 , where we demonstrated that the segmentation outputs from pretrained models inherit biases from their training datasets, resulting in significant variability in segmentation outcomes and marked deviations from the ground-truth distribution.

figure 3

a SyMBac can be used to generate synthetic training data, containing realistic images and accompanying perfect ground truth, to retrain image segmentation models. Illustrative examples of synthetic images and ground-truth pairs are shown for training Omnipose to learn cell masks from images of cytoplasm-stained samples (top) and images of membrane-stained samples (bottom). b Using synthetic data as ground truth, we can check the performance of the pretrained bact_fluor_omni model. To alleviate the effects of human annotation quality, we retrained the model on samples of simulated agar-pad data generated using SyMBac. Examples of validation data, with ground-truth masks and mask outputs from the pretrained and retrained models, are shown. To compare ground-truth masks and output masks, each is coloured based on its total area, and the colourmap is given below (scale bar = 2 µm). c Comparison of the output distribution of cell sizes shows that the pretrained model does not reconstruct the underlying ground truth distribution, whereas the output distribution from the SyMBac-trained model more closely mimics the underlying distribution. d To show that synthetic data also boosts segmentation accuracy on real data, we analysed patches of densely packed cells to find groups of cells aligned across their long axis. Since cell width is tightly controlled, we can use these patches of aligned cells to estimate a value for the true population mean width (full analysis is given in Supplementary Information 10). We then generated training data matching the real data's experimental parameters and retrained Omnipose. The resulting distribution of widths for isolated cells and cells within dense colonies is plotted for both the pretrained and retrained model, showing that retraining on synthetic data makes width estimation more accurate. (Ground truth: 0.94 ± 0.066 μm, pretrained: 1.2 ± 0.10 μm, SyMBac trained: 0.94 ± 0.062 μm). e Using synthetic data of membrane-stained cells as ground truth, we trained an Omnipose model to segment cells. We compared the output widths to those widths measured by calculating the interpeak distance between the labelled cell walls/membranes, as shown in Fig. 2. (Mask colour represents cell width, and the colourmap is given below, scale bar = 2 µm). f The fractional underestimation of a membrane-stained cell's width (given by the interpeak distance) is highly dependent upon the width itself, and the imaging wavelength. This is true for a cell imaged in widefield, where the DoF is approximately equal to its width (Width DoF in the legend). Training Omnipose on synthetic data of membrane-stained cells makes the deep-learning model (DL) insensitive to the scale of the cell, as well as the imaging wavelength, unlike the interpeak distance method (error bar = 1 SD). g Comparison of the output mask width distribution of the two simulated datasets to the ground-truth mask width distribution shows that when trained on appropriate synthetic data, the entire population distribution can be faithfully reconstructed irrespective of the imaging wavelength.

The virtual microscopy pipeline offers an advantage in addressing the issue of user subjectivity and bias in training data. One can generate realistic synthetic microscopic images of microbes accompanied by accurate ground-truth information (Fig. 3a). Training deep-learning models with such synthetic training data enables the models to learn precise object-image relationships (detailed in Supplementary Information 9, Supplementary Information 10, and ref. 12) and mitigates the problem of inaccuracies and user subjectivity in traditional training data. The same Omnipose model, when retrained with synthetic data, produces a segmentation output that more accurately predicts the ground-truth information in the test data, as demonstrated in the ground-truth mask comparisons (Fig. 3b) and input-output size distributions (Fig. 3c). The comparison of cell size distributions indicates that the pretrained Omnipose model's training data contain enlarged cell masks.

To experimentally verify and validate the enhanced performance of the retrained Omnipose model compared to the pretrained version, we devised a new assay that leverages the tight width regulation of bacteria 36 . This involved placing a high density of cells on agar pads, capturing images of both isolated cells and cell clusters, and then estimating the ground-truth widths of individual cells based on their average width in aligned space-filling patches (further explained in Supplementary Information 10 ). The average cell widths estimated from the patches were tightly distributed (0.94 ± 0.066 μm). The estimated mean width from the patch analysis should match the average widths of isolated individuals, as the cells were not grown on the agarose pad and therefore, were not allowed to differentially adjust to the imaging environment. Subsequently, the widths obtained from this analysis were compared with those derived from the segmentation outputs of both the pretrained Omnipose model (1.2 ± 0.10 μm) and the retrained model using synthetic images (0.94 ± 0.062 μm). The results demonstrate that the retrained Omnipose model exhibits both higher precision and accuracy in estimating cell widths compared to its pretrained counterpart (Fig. 3d ). The comparison of masks presented in Supplementary Information 11 reveals that the original Omnipose model generates substantially larger masks for isolated cells than for cells within clusters, resulting in significant variability and bias in the predicted cell width. In contrast, the output masks from the Omnipose model retrained with synthetic data demonstrate robust performance.

Motivated by these results, we explored an additional application of this approach in the analysis of membrane-stained images; we retrained the Omnipose model with pairs of synthetic fluorescent images of membrane-stained cells and corresponding ground truths (Fig. 3a —bottom). The estimated cell outline from the contour of the membrane-stained images (as described in ref. 37 ) significantly underestimated the cell area compared to the ground truth masks (Fig. 3e ). The relative error in width estimation was size and wavelength dependent (Fig. 3f ), consistent with the previous discussion. Conversely, the comparison of output masks from the retrained Omnipose and the ground-truth cell masks illustrates the high accuracy and precision of the deep-learning model. The model robustly learns the offset created by diffraction and projection as a function of size, and the estimated width closely tracks the ground truth across a wide range of input widths and in a wavelength-independent manner (Fig. 3f, g ). The combination of these digital experiments and real experiments illustrates how synthetic training images can capture the subtle effects of projection and diffraction and augment our capabilities of estimating true cell sizes using deep-learning models.

Quantifying fluorescence intensities of individual cells

Next, we address the issue of quantifying the intensity of individual cells from their fluorescence images. Measuring the total fluorescence intensity of labelled molecules within a cell is crucial for estimating their abundance. This capability enables researchers to monitor the dynamics of cellular processes using time-resolved single-cell image data 38 . The variation in signal intensities among individual cells within the population and over time offers insights into the key regulatory variables and noise sources 39 .

Usually, for the sake of experimental simplicity, microcolonies of microbial cells are cultivated on agarose pads 27 . This setup enables the tracking of individual cell intensities over time and the comparison of intensities among colony cells at different time points. However, such experimental designs, including microfluidic devices with densely packed cells 40 , 41 , introduce a significant artefact in single-cell intensity measurements due to a combination of diffraction and projection effects from the imaging system. The PSF of an imaging system disperses light away from its source. In the context of a cell filled with fluorescent emitters, the emitted light extends beyond the true cell boundaries, making solitary cells appear dimmer (see Fig. 4 a, b). In densely packed clusters, the dispersed light is erroneously attributed to neighbouring cells. We previously termed this phenomenon ‘light bleedthrough’ 42 . Light bleedthrough substantially distorts intensity estimates of cells within a colony, leading to misinterpretations of the strength and noise in gene expression levels, as explained below.

figure 4

a The radial intensity profile of a simulated fluorescence cell illustrates how intensity is lost from the cell contours due to diffraction effects (scale bar = 0.5 μm). b Example snapshots of simulated fluorescence images of a growing microcolony using a 100× 1.49 NA objective's instrumental point spread function (iPSF). All cells in the simulation have identical ground truth intensities. Isolated cells and individual cells in small microcolonies show low cell intensities, while as the colony grows larger, cells in dense regions, such as at the centre of the colony, begin receiving more light contributions from surrounding cells, artificially increasing their perceived intensity. (Scale bar = 2 μm). c As colony size increases, the mean observed intensity of each cell in the colony also increases (Error bars = 99% CI). Relative changes in intensity compared to isolated individual cells are shown on the left-hand y-axis. Cells approach the true mean intensity as the colony size increases, as shown on the right-hand y-axis. d A false-coloured image of cells on a real agarose pad showing ‘preformed’ microcolonies of various sizes, along with a single cell (white arrow). The relative intensity scale is shown on the right. Note the similarity to the simulated data shown in (b) (scale bar = 10 μm). e The intensity distribution of cells depends on the size of the clusters they belong to. Isolated individual cells have low mean intensities, while cells from preformed microcolonies with more than 50 cells have 3× higher mean intensities. f Experimental data show that the average intensity of cells increases with the population size of microcolonies (shown in orange). The trend from simulated colonies is shown in blue.

With the SyMBac virtual microscopy system, we can quantify these effects and verify them through experiments on real microcolonies (see example images of synthetic microcolonies in Fig. 4, real examples in Fig. 4d). Crucially, while measuring the instrumental PSF (iPSF) of one's microscope is a standard procedure, it is not typically imaged over a domain large enough to capture the effects of light bleedthrough at long distances (> ~15 μm for high NA objectives), since the signal-to-noise ratio becomes low. Thus, we pursued analytical fits to the iPSF to extend its range for simulating long-range diffraction effects. The most suitable method involves extracting the pupil, followed by reconstructing a phase-retrieved PSF which includes appropriate aberrations 43. However, as detailed in Supplementary Information 12, although the reconstructed PSF effectively captured the aberrations in our system, it failed to replicate the long-range effects observed in the iPSF. A theoretical PSF (tPSF) model 44 gave a much better fit to the entire iPSF (Supplementary Information 13). However, since we are interested in simulating the long-range effects of diffraction, one must extrapolate the function domain of the fitted PSF. We found that the tPSF did not extrapolate well when fitted to a crop of the iPSF. We therefore resorted to an ad hoc empirical function fit, which we call an "effective" PSF (ePSF), which we verified was able to extrapolate to the entire function domain despite being fitted on only a small crop of the iPSF (see Supplementary Information 14). All simulations of light bleedthrough effects in microcolonies were carried out using this ePSF model.
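The ePSF idea can be illustrated with a heavily simplified, hedged sketch: fit an assumed empirical radial function to a small crop of the measured PSF, then evaluate the fitted function over a much larger domain. The functional form, the radial grid, and the stand-in "measured" values below are our own illustrative assumptions, not the model used in the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def epsf_radial(r, a, sigma, b, c):
    """Assumed empirical radial PSF model: a Gaussian core plus a slowly
    decaying tail. This specific form is an illustrative guess."""
    return a * np.exp(-r**2 / (2 * sigma**2)) + b / (1 + (r / c) ** 3)

# Stand-in for the radially averaged crop of a measured iPSF (illustrative only)
r_crop = np.linspace(0.05, 3.0, 60)                   # micrometres
psf_crop = epsf_radial(r_crop, 1.0, 0.12, 1e-3, 0.5)

params, _ = curve_fit(epsf_radial, r_crop, psf_crop, p0=[1.0, 0.1, 1e-3, 0.5])

# Extrapolate the fitted model far beyond the measured domain, e.g. out to 20 um,
# so that long-range light bleedthrough can be simulated
r_far = np.linspace(0.05, 20.0, 400)
epsf_far = epsf_radial(r_far, *params)
```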

Colony size affects single-cell intensity quantification

Our simulations suggest that, due to the loss from light bleedthrough, an isolated cell appears only 30% as bright as its true intensity (Fig. 4c ). 70% of the intensity is lost to the surroundings and can end up in nearby cells. As the microcolony size increases, more neighbouring cells contribute to the intensity of cells within the colony through the light bleedthrough effect. Consequently, the mean intensity of cells within a simulated microcolony rises monotonically with colony size, reaching 70% of the true intensity in very large colonies (>1000 cells, see Fig. 4b, c ). As the colony size tends to infinity, the mean intensity of an individual cell should converge to the true mean intensity, since all the lost intensities are allocated to other cells within the colony. These simulations predict that individuals within a colony of a hundred cells should appear, on average, 2–2.5 times brighter than isolated cells (Fig. 4c ).
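A minimal sketch of how such numbers can be obtained from a simulation: give every cell the same true intensity, blur the emitter map with a long-range 2D PSF, and compare the light recovered inside each cell's mask with the light originally emitted there. This is a simplified stand-in for the SyMBac-based analysis; the mask format and normalisation are our own choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def observed_intensity_fraction(cell_masks: np.ndarray, psf_2d: np.ndarray) -> np.ndarray:
    """Fraction of each cell's true intensity that is measured inside its own mask.

    cell_masks : (N, Y, X) boolean array, one mask per cell in the colony
    psf_2d     : projected 2D PSF, ideally including its long-range tails
    """
    emitters = cell_masks.sum(axis=0).astype(float)        # uniform true intensity per cell pixel
    blurred = fftconvolve(emitters, psf_2d, mode="same")   # image after diffraction
    observed = np.array([blurred[mask].sum() for mask in cell_masks])
    true = np.array([emitters[mask].sum() for mask in cell_masks])
    # In the paper's simulations this ratio is ~0.3 for isolated cells and rises
    # with colony size as neighbours' light bleeds into each mask.
    return observed / true
```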

To validate these predictions, we conducted experiments with microcolonies on agar pads, comparing the intensity of cells within colonies of different sizes with that of isolated cells. To ensure a consistent intensity distribution among all cells, we placed a high density of cells on agar pads and captured instantaneous images of cell clusters, rather than allowing colonies to form gradually, which could introduce temporal intensity variations. This method allowed us to obtain ‘preformed microcolonies’ of varying sizes while maintaining the original cellular content distribution. As expected from the analysis of simulated microcolonies, snapshots of real cell clusters clearly show higher intensity in cells from larger colonies compared to isolated cells (see Fig. 4d ). The intensity distributions of cells in large ‘preformed colonies’ (number of cells >300) do not overlap with those of isolated individuals (see Fig. 4e ). The trend in Fig. 4f , which illustrates the mean intensity of cells relative to their preformed colony sizes, is qualitatively similar to the trend predicted from the digital experiments. Nevertheless, the magnitude of bleedthrough effects witnessed in the experimental microcolonies exceeds that of the simulated colonies. This disparity may arise from the mismatch between the long-distance tails of the ePSF and the actual iPSF of the system, the possible contribution of scattering in the imaging medium, or field-dependent aberrations not captured in our simulations.

Light bleedthrough affects noise and correlations in single-cell intensity measurements

The light bleedthrough effect goes beyond its impact on mean intensity, introducing subtle local variations in individual cell intensities. Since the degree of bleedthrough depends on the number of neighbouring cells, the intensity of individual cells varies based on their position within the colony (Fig. 5). Cells closer to the centre, receiving contributions from more neighbours, appear brighter in images than those near the edges (see Fig. 4b, d). Consistent with the predictions from our digital simulations of microcolonies, the experimental data reveal a correlation between spatial intensity patterns, the number of neighbouring cells, and intra-colony position (see Fig. 5a–c). Additional example images of microcolonies from imaging experiments are shown in Supplementary Information 15.

figure 5

a Schematic representation of a cell (green) some distance from the centre of the microcolony, with its neighbouring cells labelled (lighter green). The intensity of an individual cell depends on its position within the microcolony, given by dc/D, where dc is the distance from the colony centre and D is the colony diameter, and on its number of direct neighbours (Ncell). b Cells closer to the centre of a simulated colony appear brighter than cells at the periphery due to light bleedthrough (Error bars = 99% CI, data sample averaged for colonies of size 20–1000 cells). The position-dependent trend predicted from simulated microcolonies (green) is consistent with experimental results (orange). c Simulated cells with more neighbours appear brighter (Error bars = 99% CI, data sample averaged for colonies of size 20–250), consistent with experiments (orange). d Two simulated colonies with CVs of 0.01 and 0.22, respectively, along with their convolution with the iPSF, showing that their observed CVs are very similar despite markedly different underlying cell intensity distributions. e At low noise (true CV < 0.15), where the underlying cell intensity distribution is uniform, the PSF causes artificial position-dependent variation in cell intensities (increasing the CV). Conversely, when the input CV is high (true CV > 0.15), the PSF acts as a blurring filter, lowering the variance in the population by allocating intensities from brighter cells to neighbouring dimmer cells (lowering the CV).

Such phenomena, where the intensity of individuals appears to be dependent on their position or number of neighbouring cells, can lead to misinterpretations in quantifying intensity correlations and cellular heterogeneity 5 , 8 . It may wrongly suggest that an individual cell’s intensity is influenced by interactions with neighbouring cells, incorrectly implying nonexistent biological mechanisms. In studies using intensity distribution patterns as evidence of cell-cell interactions, researchers should consider the confounding influence of optical effects and implement appropriate controls to differentiate genuine biological interactions from optical artefacts 7 . The use of digital control experiments via virtual microscopy platforms, like SyMBac, can help identify potential artefacts specific to a given experimental design, including optical specifications and sample configurations.

Light bleedthrough effects also cause a major artefact in noise estimation from single-cell data, which is somewhat counterintuitive. In the absence of true population variability (coefficient of variation, CV = 0), positional factors within the microcolony and the number of neighbouring cells can artificially introduce variability and result in a higher estimated CV (Fig. 5d ). Conversely, when substantial variation exists among cells, the PSF acts as a smoothing filter, redistributing intensity from brighter to dimmer cells (see Fig. 5d, e ), leading to an underestimation of inherent variability. These paradoxes emphasise the complexities introduced by diffraction effects in the temporal quantification of gene expression variability. They produce similar results across different underlying distributions (further examples in Supplementary Information 16 ) and present challenges for correction because the effect’s magnitude and direction depend on an unknown ground truth CV, as well as the size and shape of the microcolonies.

It is conceivable that one could leverage deconvolution to correctly assign light to specific pixels in the image. In actual image formation, the PSF has an infinite range, leading to long-range diffraction effects accumulating within dense microcolonies. This results in diffracted light potentially ending up several pixels away from the original point source. However, deconvolution methods applied in the literature sometimes use kernel sizes far smaller than the data and thus merely result in a sharpening of the image, failing to accurately reassign light from beyond the kernel’s boundaries. This phenomenon is illustrated using experimental and simulated images in Supplementary Information 17 and 18 . As shown in Fig. 5e , where deconvolution is performed with a kernel measuring 125 × 125 pixels, only a marginal improvement in noise estimation accuracy is seen. Ideally, the deconvolution should use a kernel size as large as the data being deconvolved. Since such a large iPSF is unattainable, deconvolution with the full ePSF was performed. While it gave a marked improvement over the iPSF, it was unable to fully recover the underlying ground truth intensity distribution in experimental data (Supplementary Information 17 ).
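
To make the kernel-size argument concrete, the following minimal sketch (not the analysis code used in this paper) blurs a synthetic scene with a heavy-tailed PSF and then applies Richardson–Lucy deconvolution with either the full kernel or a small central crop of it; the Gaussian-mixture PSF, image sizes, and iteration count are illustrative assumptions.

```python
# Minimal sketch (not the paper's pipeline): blur a synthetic scene with a
# heavy-tailed PSF, then apply Richardson-Lucy deconvolution with either the
# full kernel or a small central crop of it.
import numpy as np
from scipy.signal import fftconvolve
from skimage.restoration import richardson_lucy

rng = np.random.default_rng(0)

# Synthetic ground truth: a few bright cell-like blobs on a dark field.
truth = np.zeros((256, 256))
for y, x in rng.integers(40, 216, size=(20, 2)):
    truth[y - 2:y + 3, x - 2:x + 3] = rng.uniform(0.5, 1.0)

# Stand-in PSF with heavy tails (narrow core plus a weak, broad component).
yy, xx = np.mgrid[-64:65, -64:65]
r2 = xx**2 + yy**2
psf = np.exp(-r2 / (2 * 2.0**2)) + 0.05 * np.exp(-r2 / (2 * 20.0**2))
psf /= psf.sum()

blurred = np.clip(fftconvolve(truth, psf, mode="same"), 0, None)

# Deconvolve with the full kernel versus a 15 x 15 central crop of it.
psf_crop = psf[57:72, 57:72].copy()
psf_crop /= psf_crop.sum()
deconv_full = richardson_lucy(blurred, psf, 50, clip=False)
deconv_crop = richardson_lucy(blurred, psf_crop, 50, clip=False)

for name, img in [("full PSF", deconv_full), ("cropped PSF", deconv_crop)]:
    print(f"{name}: mean absolute error vs ground truth = "
          f"{np.abs(img - truth).mean():.4f}")
```

In this sketch the cropped kernel mainly sharpens the image: light diffracted beyond the crop's boundary is never reassigned, which is the behaviour described above.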

Microfluidic imaging platforms for robust intensity quantification

Given the challenges associated with accurately knowing the PSF and the exact configuration of cells within a microcolony, the task of estimating the true intensities of individual cells, especially for quantifying noise or correlations, becomes nearly impossible. Additionally, the influence of projection and scattering effects and of potential inhomogeneities in the growth environment 45 is hard to eliminate. We therefore suggest that researchers use their knowledge of diffraction effects to design their experiments differently. For instance, utilising a structured imaging platform, where cells are maintained at a fixed distance from each other, can help minimise the bleedthrough effects.

To systematically analyse the design constraints of such an imaging platform, we simulated an array of digital cells and explored how the extent of intensity bleedthrough depends on the inter-cell distance within the array (Fig. 6a). The percentage bleedthrough contribution from neighbouring cells is plotted as a function of distance along the short and long axes of the cells (x and y, respectively) (Fig. 6b). The heatmap illustrates that in a closely packed array (top-left corner), the intensity of individual cells receives an additional ~100% contribution from neighbours, causing cells to appear roughly twice as bright as isolated cells, a finding consistent with our earlier discussion. To reduce the light bleedthrough effects to <1% of the true intensity, cells need to be at least 10 μm apart from each other (a conservative estimate based on the ePSF).
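
The spacing dependence can be reproduced in outline with a simple 2D toy model. The sketch below is our own illustration, not SyMBac: it tiles identical square 'cells' at a given pitch, convolves the scene with a heavy-tailed Gaussian-mixture stand-in for the ePSF, and reports the apparent extra intensity of the central cell relative to an isolated one. The PSF parameters, cell size, and pixel size are assumptions, so only the qualitative trend should be read from the output.

```python
import numpy as np
from scipy.signal import fftconvolve

PX = 0.065              # assumed pixel size in um
FIELD = (1200, 1200)

def toy_psf(size=301):
    # Narrow core plus a weak, very broad tail as a stand-in for the ePSF.
    half = size // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = (xx**2 + yy**2) * PX**2
    psf = np.exp(-r2 / (2 * 0.1**2)) + 0.02 * np.exp(-r2 / (2 * 2.0**2))
    return psf / psf.sum()

def square_cell(centre, side_um=1.0):
    yy, xx = np.mgrid[:FIELD[0], :FIELD[1]]
    half = side_um / 2
    return ((np.abs(yy - centre[0]) * PX < half) &
            (np.abs(xx - centre[1]) * PX < half)).astype(float)

psf = toy_psf()
c = (FIELD[0] // 2, FIELD[1] // 2)
central = square_cell(c)
central_px = central.astype(bool)
isolated = fftconvolve(central, psf, mode="same")[central_px].mean()

for pitch_um in (1.2, 2.0, 4.0, 6.0, 10.0):
    p = int(round(pitch_um / PX))
    scene = np.zeros(FIELD)
    for i in range(-3, 4):                   # 7 x 7 array of identical cells
        for j in range(-3, 4):
            scene += square_cell((c[0] + i * p, c[1] + j * p))
    observed = fftconvolve(scene, psf, mode="same")[central_px].mean()
    print(f"pitch {pitch_um:4.1f} um: apparent extra intensity "
          f"~ {100 * (observed - isolated) / isolated:5.1f}%")
```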

Figure 6

a A simulated array of cells with controllable x and y cell pitch. The grid corresponds to the heatmap in panel b, with increased x and y spacing between cells lowering the intensity bleedthrough from neighbours. b The heatmap shows the overall intensity bleedthrough percentage for a cell within an ordered array of neighbours as a function of inter-cell distance. c Characterisation of intensity bleedthrough from distant cells within the mother machine microfluidic device. A schematic representation of the mother machine is given, along with a phase-contrast image of the same device with the (mCherry) fluorescence channel overlaid to indicate labelled and unlabelled cells. Varying the spacing between mother machine trenches affects the amount of light bleedthrough incurred. d Bleedthrough within a single mother machine trench: the apparent intensity of an unlabelled cell increases as the number of additional labelled cells in the same trench increases.

Microfluidic devices, such as the ‘mother machine’ 2 , provide a viable solution for maintaining cells at a fixed, constant distance from one another. This device keeps cells in short vertical arrays placed at a specific distance apart. By selecting the appropriate spacing between these arrays through specific device design and optimisation, researchers can effectively eliminate the bleedthrough effect, allowing for accurate estimation of the heterogeneity and fluctuation dynamics of intensities from individual cells 46 .

To assess the effect of bleedthrough in these scenarios, we conducted experiments by mixing unlabelled cells with fluorescently labelled cells in the mother machine (black and red coloured cells in Fig. 6c and d respectively). Details of the experimental design, analysis, and results are shown in Supplementary Information 19 . Intensities of unlabelled cells, as a function of the number and distance of their neighbouring fluorescent cells, were calculated to estimate the percentage bleedthrough effect. The results from this analysis, along with data from a previous paper using a similar approach 46 , show a quantitative match with the simulated trends. A distance of >10 μm between trenches is sufficient to reduce the extent of bleedthrough to below ~1%. Microfluidic device design and performance verification, using digital microscopy experiments, should be routinely employed to estimate and eliminate unwanted optical effects in microscopy data.

Counting single molecules

Quantifying the abundance of low-copy tagged molecules introduces unique challenges. When the collective fluorescent signal from these tagged molecules approaches the background autofluorescence of the cell, interpreting intensity values in terms of abundance becomes complex. For species with moderately low copy numbers (approximately 50–100 copies per cell), some researchers have employed techniques like background deconvolution 4 . However, these methods fall short of achieving single-cell resolution since they deconvolve the entire distribution along with autofluorescence levels. Moreover, these distribution deconvolution techniques are ineffective for species with very low copy numbers (fewer than 20), such as gene loci, transcription factors (TFs), mRNAs and plasmids, which often play critical roles in gene expression regulation 47 . To reliably quantify the abundance of low-copy proteins, it is essential to count individual fluorescently tagged molecules within a cell 48 .

Accurately determining the copy number of labelled molecules remains a formidable task due to the interaction of the diffraction limit and the 2D projection of 3D-distributed emitters. In 2D images, individually tagged molecules manifest as diffraction-limited, blurry spots, with the extent of blur contingent upon their distance from the focal plane. Consequently, when spots are positioned closer to each other than the resolution limit (in the XY plane, regardless of their position in Z, due to projection) they can merge into a single spot in the projected image. Adding to the complexity, the 3D characteristics of the PSF make it difficult to detect out-of-focus spots (Fig. 7a ). Both of these effects collectively contribute to an underestimation of the molecule count, even when there are only two copies per cell.

Figure 7

a Schematic of an epifluorescence single molecule imaging setup, where the position of emitters within the cell determines the extent of defocus in their image and determines their detection probability. b Molecules in a cell exist in three dimensions. Shown are nine molecules in a digital cell and their ground truth positions projected in xy . Upon convolution with the PSF, the resultant image is a projection of all nine molecules, of which only three are in focus. The remaining six molecules are dim and hard to detect, and two of them are very close. c Simulated sampling of points within a digital cell shows ‘naive’ detection bounds and sources of undercounting. If points had an SNR lower than the 99th percentile of the background PSNR, they were considered too dim to detect. Additionally, if two molecules are within 1 Rayleigh criterion of one another, they are considered too close to resolve, and thus these molecules are lost due to diffraction. The remaining population of molecules are considered countable. d Relative contributions of each mode of undercounting (lost to diffraction and lost to defocus) are plotted as a function of the true count of molecules in a narrow cell ( r  = 0.5 μm) and a thicker cell ( r  = 1 μm). The resolvable fraction decreases rapidly with increasing density of molecules, whereas the detectable fraction stays constant.

In Fig. 7b , we illustrate the extent of undercounting from various sources using digital imaging experiments. We employ a “naive counting” criterion, as described in the Methods section, which includes the enumeration of molecules that are out of focus, those that are undercounted due to proximity to diffraction-limited spots, and the cumulative count of both individual and cluster molecules perceived as singular due to the diffraction limit. This approach allows us to identify error sources that are influenced by cell dimensions (spot density and projection) and experimental setups (diffraction and depth of field). The results in Fig. 7c show that, when the count is small ( n  = 5), most spots are isolated, resulting in minimal losses due to diffraction effects. However, increasing the cell thickness (from cell radius = 0.5 μm to radius = 1 μm) leads to a significant fraction of spots becoming undetectable as they get blurred and dimmed due to the defocused PSF. On the other hand, as the molecular count increases, there is a corresponding increase in the proportion of molecules that are undercounted due to the diffraction limit since an increase in spot density leads to an increase in the fraction of spots being unresolvable from their neighbours. However, the proportion of molecules lost due to defocus remains constant, dependent solely on the volume fraction of the cell situated outside the focal plane (Fig. 7d ).

This digital experiment reveals that the combined effects of projection and diffraction lead to substantial undercounting, even for molecules present in very low quantities; in an average-sized bacterium, even when a cell contains only two molecules, a single snapshot counts them as one approximately 5% of the time (Supplementary Information 20 ). The proportion of undercounting escalates rapidly as the number of molecules per cell increases, as depicted in Fig. 7d , and at a copy number of 15 molecules per cell the count is underestimated by roughly 40%.

Smart-microscopy approaches for improving counting performance

The term ‘smart-microscopy approaches’ denotes the use of domain knowledge about a specific imaging system and specimen to craft targeted microscopy solutions, encompassing both acquisition and analysis. To improve counting performance, knowledge of the depth-dependent detection probability and the cell volume can be leveraged to calculate a correction factor that accounts for the loss of molecules due to defocus. At the focal plane, a molecule is most in focus and, provided it is bright enough, will exceed a threshold SNR for detection. The probability of exceeding this SNR threshold decreases as the molecule shifts out of the focal plane. We call this probability function D(z). We derived this empirical depth-dependent detection probability function for our imaging system from the instrumental PSF (shown in Fig. 8a and detailed in Supplementary Information 21 ).

Figure 8

a The depth-dependent probability function D(z) of an imaging system is shown in blue. The cell’s cross-sectional density is given in red as a(z). The overlapping area between these two profiles (given by the cross-correlation a(z) ★ D(z)) gives an estimate of the fraction of the molecules which will be observed. To maximise the number of molecules detected, one can shift the objective by an optimal amount, δz_optimal, which is where the two functions have maximum overlap. b A schematic showing a cell in the optimal focal position relative to the detection probability function, thus detecting the maximum number of molecules possible. The true number of molecules can then be estimated by multiplying the observed count (green) by a correction factor (accounting for the lost molecules, shown in black), which intuitively is the reciprocal of the overlapping area (full derivation in Supplementary Information 21). Another approach to detecting more molecules is to modify a(z) by physically compressing the cell (using Microfluidics-Assisted Cell Screening (MACS)), bringing the entire cell’s volume within the maximum region of D(z). c Applying the correction factor or compressing the cell using MACS improves the counting performance compared to the naive estimate. Both of these approaches reduce the error from defocus, but undercounting errors at higher counts occur due to diffraction effects. d A schematic architecture of the Deep-STORM single molecule localisation network is shown, which was trained using synthetic single molecule images. e Applying Deep-STORM to molecule counting improves performance, but combining it with MACS leads to near-perfect detection and counting up to a higher density of molecules.

Upon closer examination, we observed that this function is offset from the objective’s focal plane because of the asymmetric nature of the iPSF along the Z axis. To maximise the number of detectable molecules within a cell, it is necessary to optimise the overlap between the cell’s cross-section and this function. A cell’s cross-sectional density is given in red as a(z) (Fig. 8a). The integral of a(z) between two z positions within the cell gives the volume fraction, and hence the fraction of molecules, between those positions (assuming a uniform distribution of emitters within the cell). The detection probability D(z) can be shifted by focusing the microscope’s objective up and down. Thus, the overlapping area between these two profiles (given by the cross-correlation a(z) ★ D(z)) gives an estimate of the fraction of the molecules which will be observed, for all shifts of D(z). To maximise the number of molecules detected, one can shift the objective by an optimal amount, δz_optimal, which is where the two functions have maximum overlap (Fig. 8a).
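
The optimisation described here amounts to sliding D(z) over a(z) and picking the shift with maximum overlap; the reciprocal of that overlap is the correction factor used in the next paragraph. The sketch below illustrates the idea with a cylindrical cross-section for a(z) and a Gaussian placeholder for D(z) (the real D(z) is derived from the iPSF), so the widths, offset, and cell radius are assumed values.

```python
# Illustrative sketch of the focal-offset optimisation (not the paper's code).
import numpy as np

dz = 0.01                                   # um, z sampling
z = np.arange(-3, 3 + dz, dz)

r = 0.5                                     # um, assumed cell radius
a = np.where(np.abs(z) < r, 2 * np.sqrt(np.clip(r**2 - z**2, 0, None)), 0.0)
a /= a.sum()                                # fraction of molecules per z bin

sigma_D, z0_D = 0.4, 0.15                   # um, assumed width/offset of D(z)
D = np.exp(-(z - z0_D) ** 2 / (2 * sigma_D ** 2))

# Overlap (detected fraction) for each candidate objective shift.
shifts = np.arange(-1.0, 1.0 + dz, dz)
overlap = np.array([np.sum(a * np.interp(z - s, z, D, left=0, right=0))
                    for s in shifts])

best_shift = shifts[np.argmax(overlap)]
detected_fraction = overlap.max()
correction_factor = 1.0 / detected_fraction  # reciprocal of the overlap area

print(f"optimal focal shift ~ {best_shift:+.2f} um")
print(f"detected fraction   ~ {detected_fraction:.2f}")
print(f"correction factor   ~ {correction_factor:.2f}")
```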

Once the focal plane is adjusted, we can compute the fraction of molecules lost to the out-of-focus volume to find an empirical correction factor (Fig. 8b ). This correction leads to improved counting performance, but only on averaged counts, as demonstrated in Fig. 8d . Alternatively, using microfluidic platforms to flatten cells can bring a larger number of spots into focus (Fig. 8b ). Additionally, the expanded cross-section of the flattened cells in the imaging plane slightly reduces the undercounting effect caused by the diffraction limit (Fig. 8c ), consistent with previous findings in the work of Okumus et al. (Microfluidics-Assisted Cell Screening—MACS) 10 , 49 .

To further improve the counting performance, we explored the potential of designing an image analysis pipeline that leverages knowledge of defocus and spatial patterns from simulated data to enhance counting accuracy. To pursue this, we retrained Deep-STORM, a well-established deep-learning network designed for super-resolving single-molecule images 50 . Deep-STORM leverages a convolutional neural network architecture to super-resolve single molecules based on local intensity patterns and spatial relationships. We have trained the Deep-STORM model using simulated synthetic images, which contain a varying number of spots with appropriate defocus depending on their position within the digital phantom cells (Fig. 8d ). This training enabled the model to consistently and accurately count molecules to a larger copy-number (Fig. 8e ) compared to the naive counting estimates and the performance demonstrated in previous research in this field 10 .

Just as observed with deep-learning models used for cell segmentation (discussed in the “Deep-learning approaches for informed cell segmentation” section), training these models with realistic synthetic data significantly enhances their ability to detect single molecules. Our previous analysis demonstrates that flattening the cells using platforms like MACS reduces the fraction of defocused spots and emulates a situation in which most spots are in focus (Fig. 8b ). Indeed, Deep-STORM models, when implemented in a simulated MACS type scenario, performed reliably up to a very large number of molecules per cell (>~25 molecules/cell, Fig. 8e ).

The advancement of quantitative microbiology relies heavily on the accurate interpretation of microscopy data. We have employed virtual microscopy experiments and targeted real experiments to systematically explore the challenges and potential pitfalls associated with using microscopy data to quantify the size and content of microbial cells. Our focus was on projection and diffraction effects, particularly significant for microbial cells due to their size.

Our findings reveal significant impacts of projection and diffraction on the performance of image segmentation algorithms in accurately identifying cell outlines from fluorescence and brightfield images of bacteria. Both traditional segmentation techniques and machine-learning approaches experience biases in cell size estimation. The extent and direction of this bias depend on various factors, including labelling methodologies, imaging configurations and the cell’s dimensions, which makes it difficult to correct. However, we found that the bias and error can be mitigated when using machine-learning methods trained with synthetic data that incorporates these effects.

Timelapse imaging methodologies, commonly employing agar pads and microfluidic devices, are frequently utilised for investigating live-cell gene expression dynamics and heterogeneity 39 , 40 , 41 , 51 . Using digital image simulations and experimental fluorescence imaging of cell clusters, we found that the accurate quantification of true cellular fluorescence signals in clustered configurations (‘microcolonies’), is difficult due to diffraction-induced misallocation of light intensity from adjacent cells. Such distortions impact both the estimation of expression variation and correlation analyses conducted on these platforms 5 , 7 , 8 , 52 . Deconvolution can improve, but not entirely eliminate, these artefacts and its fidelity to ground truth strictly depends on the precision and size of the deconvolution kernel. In this case, experimental design changes, such as the use of imaging platforms like microfluidic devices, where cells can be kept at specified distances, can reduce such distortions 42 , 46 .

Similar challenges arise in the quantification of low copy number moieties, such as mRNA, plasmids, or proteins, complicating the accurate counting of more than five individual molecules. Caution is warranted when interpreting ‘single-molecule’ images and results from estimated molecular ‘counts.’ To address these challenges, alternative experimental designs and deep-learning-based analysis protocols were proposed and substantial improvements in counting accuracy were demonstrated.

In summary, the analysis presented here underscores the critical importance of understanding the artefacts and aberrations incorporated into microscopy data to extract meaningful information about microbiology, whether it involves the shape and size of cells or their content from intensity measurements or single-molecule counting. We advocate for the routine use of digital experiments with virtual microscopy platforms to test limitations of experimental design and potential optical illusions, ensuring ‘informed’ interpretations of imaging data. This knowledge can further inform the design of ‘smart microscopy’ experiments, leveraging domain knowledge to create appropriate imaging platforms and machine-learning models trained with relevant ‘synthetic images.’ The analysis and discussion presented in this paper should guide improved experiment design and help with quantitative interpretation of microscopy experiments in microbiology.

Computational methods

Virtual fluorescence microscopy using SyMBac

In this study, image simulations were conducted utilising the SyMBac Python library 12 . Unless specified otherwise, all virtual fluorescence microscopy images were generated following a consistent workflow (a sketch of the convolution step is given below):

1. The 3D spherocylindrical hull of a digital cell was positioned within a defined environment—either (a) isolated, (b) among scattered cells, or (c) within a microcolony if colony growth simulation data were available.

2. Fluorescent emitters were uniformly sampled within the cell volume and indexed within a 3D array, the value at each index denoting the emitter count. For cells with homogeneously distributed fluorescence, a “density” value was established, defined as the average number of emitters per volumetric element within a cell. Thus, the total number of molecules within a cell was calculated as the product of this density and the cell volume.

3. Diffraction and projection effects were then simulated through the convolution of this dataset with a point spread function (PSF). Convolution was executed employing either a theoretical, effective, or instrumentally measured PSF (tPSF, ePSF, or iPSF respectively). The PSF_generator class within SyMBac was used to generate synthetic PSFs in accordance with the model from Aguet 44 . To simulate the projection and out-of-focus light characteristic of a widefield fluorescence microscope, the centre of the PSF is assumed to be aligned with the midplane of the cell. Each slice of the PSF is accordingly convolved with the corresponding slice of the cell, as illustrated in Fig. 1b , Supplementary Information 22 , and Supplementary Information 23 . If using a tPSF or ePSF, convolution is done at a high resolution and then downsampled to the pixel size of the simulated camera in order to capture the high-frequency features of the kernel.
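
The layer-by-layer convolution in step 3 can be sketched as follows. This is an illustration of the principle rather than SyMBac's implementation, and the toy emitter volume and Gaussian PSF stack are placeholders; the last two lines also show how a reduced depth of field and an identity kernel (discussed in the next paragraph) are emulated.

```python
# Minimal sketch of per-slice convolution for a widefield image.
import numpy as np
from scipy.signal import fftconvolve

def widefield_image(emitters_3d, psf_3d):
    """emitters_3d: (Z, Y, X) emitter counts; psf_3d: (Z, y, x) PSF stack whose
    central slice corresponds to the midplane of the cell."""
    assert emitters_3d.shape[0] == psf_3d.shape[0]
    image = np.zeros(emitters_3d.shape[1:])
    for cell_slice, psf_slice in zip(emitters_3d, psf_3d):
        image += fftconvolve(cell_slice, psf_slice, mode="same")
    return image

# Toy example: a 21-slice emitter volume and a defocus-broadened Gaussian stack.
Z, Y, X = 21, 128, 128
emitters = np.zeros((Z, Y, X))
emitters[5:16, 54:74, 34:94] = 1.0              # crude rod-shaped cell stand-in

yy, xx = np.mgrid[-32:33, -32:33]
psf = np.stack([np.exp(-(xx**2 + yy**2) / (2 * (1.5 + 0.4 * abs(k - Z // 2))**2))
                for k in range(Z)])
psf /= psf.sum()                                # normalise the whole 3D kernel

image = widefield_image(emitters, psf)

# Projection-free image (reduced depth of field): use only the midplane slice.
in_focus_only = fftconvolve(emitters[Z // 2], psf[Z // 2], mode="same")

# Diffraction-free image: replace the PSF with an identity kernel.
identity = np.zeros_like(psf[Z // 2]); identity[32, 32] = 1.0
no_diffraction = fftconvolve(emitters.sum(axis=0), identity, mode="same")
```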

To artificially modulate the depth of field of the microscope and thereby mitigate projection artefacts, the number of PSF planes convolved with the cell can be truncated. For instance, in a 1 µm wide cell simulated at a pixel size of 0.0038 µm, there would be 263 slices in the Z-direction. To generate an image devoid of projection artefacts, only the middle Z-slice of the cell is convolved with the middle Z-slice of the 3D PSF. This method is applicable for both simulated and empirically measured PSFs. To eliminate diffraction effects, the PSF can be substituted with an identity kernel, which, upon convolution, reproduces the original data. To modulate the effect of diffraction continuously, the frequency of light employed to simulate the kernel can be arbitrarily adjusted.

Simulations of fluorescence images of individual cells

To preclude errors stemming from length underestimation in width assessments, digital cells with a fixed length of 6 μm were utilised. The cell width was manipulated to range between 0.5 and 3.0 μm, while the simulated depth of field was adjusted between 3 μm (for the widest cell) and 0.0038 μm (for a single Z-slice). The PSF generator was configured to “3d fluo” mode for 3D convolution, employing the model from Aguet 44 . Additionally, an identity kernel served as the PSF for a theoretical undiffracted microscope with an imaging wavelength of 0 μm.

To simulate images of cells with cytoplasmic fluorescent markers, 3D spherocylindrical cells were rendered, and emitters sampled as described above. To simulate membrane-labelled cells, fluorescent emitters were sampled only within a single pixel layer corresponding to the outermost cell volume. Images were rendered at a high resolution to allow accurate drawing of the cell membrane; at lower resolutions, even a single pixel would be significantly thicker than the true thickness of the cell membrane.

Simulated images of fluorescent single cells, either with cytoplasmic markers or membrane markers, were generated at different wavelengths, widths and depths of focus. It is important to note here that when the depth of field is changed, this is a simulation of a non–physical effect. In actuality, the volume of out-of-focus light captured by the microscope is determined by the objective lens. By adjusting the number of Z-PSF layers with which the 3D cell volume is convolved, the non-physical manipulation of out-of-focus light collection by the microscope’s objective is simulated. Despite its non-physical nature, this is a valuable exercise for identifying sources of measurement bias and error. A similar argument applies to diffraction: the use of an identity kernel simulates an image devoid of diffraction effects. Though non-physical, this is instrumental in examining how an image is compromised by the microscope’s PSF. In real experiments, both projection and diffraction effects co-occur; hence, the comparative analysis is limited to simulated images that incorporate both phenomena.

Cell size quantification and analysis

Following the simulation of individual cells, errors in size estimation attributable to diffraction and projection were quantified using two methodologies. For cells marked with cytoplasmic fluorescence, a binary mask of the resultant synthetic image was generated employing Otsu’s thresholding algorithm 53 . Dimensions along the two principal axes of the binary object were then calculated to determine length and width. In contrast, for synthetic images featuring fluorescent membrane markers, dimensions were determined by measuring the inter-peak distance along the one-dimensional intensity profile, which was aligned with the two principal axes of the cell.
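
Both measurement routes are straightforward to express with scikit-image. The sketch below is illustrative rather than the exact analysis script: the threshold choice follows Otsu as stated, but the toy test image, pixel size, and the simple two-peak profile measurement are our own assumptions.

```python
# Sketch of the two size-measurement routes described above.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def size_from_cytoplasmic(img, px_um):
    mask = img > threshold_otsu(img)
    props = max(regionprops(label(mask)), key=lambda p: p.area)  # largest object
    return props.major_axis_length * px_um, props.minor_axis_length * px_um

def width_from_membrane_profile(profile, px_um):
    """profile: 1D intensity profile taken across the cell's short axis."""
    mid = len(profile) // 2
    left_peak = int(np.argmax(profile[:mid]))
    right_peak = mid + int(np.argmax(profile[mid:]))
    return (right_peak - left_peak) * px_um

# Toy usage: a blurred rectangle standing in for a cytoplasmically labelled cell.
img = np.zeros((128, 128))
img[54:74, 24:104] = 1.0
img = gaussian_filter(img, 3)
length_um, width_um = size_from_cytoplasmic(img, px_um=0.065)
print(f"length ~ {length_um:.2f} um, width ~ {width_um:.2f} um")
```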

Simulations of fluorescence images of microcolonies of cells

The agar pad simulation feature of SyMBac was used to generate microcolony simulations 12 , each of which was terminated when reaching a size of 1000 cells. All microcolonies are restricted to monolayers. This pipeline leverages the CellModeller platform, which is integrated into SyMBac, for the creation of ground-truth microcolonies 54 . For these simulations, a constant cell width of 1 μm was maintained, with cellular division programmed to occur at a target length of 3.5 μm ±  Uniform (−0.25 μm, 0.25 μm) resulting in sequences of densely packed, proliferating colonies.

Individual cells in each colony simulation contain uniformly distributed fluorescent emitters. To control the coefficient of variation (CV) of intensity among the cells within a microcolony, we fixed the mean density of emitters within a cell but varied the variance of a normal distribution truncated at 0, in order to sample single-cell intensities with a desired CV. The CV was sampled between 0 and 0.3. Synthetic images from these colonies were generated with 3D convolution in the same manner as described in the previous section, but with multiple PSFs: (1) a theoretical PSF rendered for a 1.49 NA objective, a refractive index of 1.518, a working distance of 0.17 mm, and imaging wavelengths of 0.4 μm, 0.55 μm and 0.7 μm for separate simulations; (2) an instrumental PSF captured from a Nikon 100× 1.49 NA Plan Apo Ph3 objective lens with the same parameters described, imaged at 0.558 μm wavelength light (bandwidth 40 nm), convolved with the same ground-truth data; and (3) the effective PSF fit of the instrumental PSF, used to simulate long-range diffraction effects. This generated synthetic microscopy images and corresponding ground-truth masks of microcolonies under varied imaging conditions. More details on the simulations and examples can be seen in Supplementary Video 1 .
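
A minimal sketch of the intensity-sampling step is given below, assuming scipy.stats.truncnorm for the zero-truncated normal; the function and parameter names are ours, and the truncation means the realised mean and CV deviate slightly from the targets at high CV.

```python
# Sample per-cell emitter densities with a target coefficient of variation.
import numpy as np
from scipy.stats import truncnorm

def sample_cell_densities(n_cells, mean_density, target_cv, rng=None):
    sd = target_cv * mean_density
    if sd == 0:
        return np.full(n_cells, float(mean_density))
    a = (0 - mean_density) / sd        # lower truncation at 0, in standard units
    return truncnorm.rvs(a, np.inf, loc=mean_density, scale=sd,
                         size=n_cells, random_state=rng)

densities = sample_cell_densities(1000, mean_density=50, target_cv=0.2,
                                  rng=np.random.default_rng(1))
print("realised CV:", densities.std() / densities.mean())
```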

Colony image intensity quantification and analysis

Quantification of single-cell intensities from synthetic microcolonies

Calculating the intensity of each cell within the synthetic microcolony images did not require segmentation because the ground truth mask positions are available from the simulations. Thus, for each ground truth mask, the average intensity in the corresponding position in the synthetic microscope image was enumerated and used to calculate the CV, which could be compared to the ground truth CV. To assess whether the ground truth CV can be recovered under ideal circumstances, we performed Richardson–Lucy deconvolution (with 100 iterations) using the original PSF 55 , 56 . The deconvolved CV can then be compared to the ground truth CV by enumerating the deconvolved image’s average intensity within each ground truth mask position.
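
Because ground-truth masks are available, the per-cell CV reduces to a labelled mean over the synthetic (or deconvolved) image. A minimal sketch using scipy.ndimage.mean is shown below with a toy label image.

```python
# CV estimate from ground-truth label masks (illustrative).
import numpy as np
from scipy import ndimage

def per_cell_cv(image, labels):
    ids = np.unique(labels)
    ids = ids[ids != 0]                                  # 0 = background
    means = np.asarray(ndimage.mean(image, labels=labels, index=ids))
    return means.std() / means.mean()

# Toy demonstration with two rectangular "cells".
labels = np.zeros((60, 60), dtype=int)
labels[5:15, 5:25] = 1
labels[30:40, 20:40] = 2
image = np.where(labels == 1, 100.0, 0.0) + np.where(labels == 2, 120.0, 0.0)
print("observed CV:", per_cell_cv(image, labels))
```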

The distance of each cell from the centre of the colony was obtained by computing each cell’s Euclidean distance to the mean position of all cells within the colony (the colony centroid). This distance was normalised by dividing by the maximum Feret’s radius of the colony, as calculated by Scikit-image’s regionprops function 57 . The number of neighbours for each cell was calculated by first dilating all cell masks by 4 pixels to ensure that neighbouring cell masks touch. The mask image was then converted into a region adjacency graph, where each node is a cell and edges represent a neighbour connection between two cells (cells that touch). The graph is then traversed and each node’s degree (corresponding to that cell’s neighbour number) is enumerated (Fig. 5c ). A simplified version of this step is sketched below.
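
The sketch below approximates rather than reproduces the published analysis: it dilates each mask directly instead of building a region adjacency graph, and it normalises the centre distance by the maximum centroid distance rather than the maximum Feret radius.

```python
# Simplified neighbour counting and centre-distance calculation.
import numpy as np
from scipy.ndimage import binary_dilation

def neighbour_counts(labels, dilation_px=4):
    struct = np.ones((3, 3), dtype=bool)
    counts = {}
    for cell_id in np.unique(labels):
        if cell_id == 0:
            continue
        dilated = binary_dilation(labels == cell_id, struct,
                                  iterations=dilation_px)
        touched = np.unique(labels[dilated])
        counts[int(cell_id)] = int(((touched != 0) & (touched != cell_id)).sum())
    return counts

def normalised_centre_distances(labels):
    ids = np.unique(labels)
    ids = ids[ids != 0]
    centroids = np.array([np.argwhere(labels == i).mean(axis=0) for i in ids])
    colony_centroid = centroids.mean(axis=0)
    d = np.linalg.norm(centroids - colony_centroid, axis=1)
    return dict(zip(ids.tolist(), (d / d.max()).tolist()))

# usage: counts = neighbour_counts(mask_image)
#        dists  = normalised_centre_distances(mask_image)
```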

Quantification of single-cell intensities from experimental microcolonies

Experimental microcolony data were analysed using the same methods as synthetic microcolonies, but were first segmented in phase contrast using the Omnipose model to generate masks (an example is given in Supplementary Information 24 ; since only the average intensity per cell is required, highly accurate size estimation is not critical). For datasets generated this way, there is no ground-truth intensity estimate. Fluorescence images were first background subtracted by subtracting the mean of the lowest 5% of pixel intensities within the image. Mean cell intensity was defined as the sum of the pixel intensities within each mask, divided by the cell mask area. Deconvolution was performed using the iPSF and the ePSF, and the intensity of deconvolved cells recorded. Since no ground truth exists, we did not estimate the CV of real data but rather focussed on showing the effects of colony size, cell neighbour number, and cell position within a colony on the observed cell intensity. These values were quantified using the same methods as for synthetic colonies.
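
The background-subtraction step (mean of the dimmest 5% of pixels) can be written compactly; the snippet below is a sketch of that step only, with the 5% fraction exposed as a parameter.

```python
# Background subtraction using the dimmest fraction of pixels.
import numpy as np

def background_subtract(fluor, fraction=0.05):
    cutoff = np.quantile(fluor, fraction)
    return fluor - fluor[fluor <= cutoff].mean()

# usage: fluor_bg_sub = background_subtract(fluorescence_frame)
```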

Manual annotation platform and analysis

To assess the effects of projection and diffraction on the performance of manual (human) annotation of cells (for the purposes of training data preparation), five individuals were each asked to annotate the same dataset of 600 simulated cells, where the ground truth was known but not disclosed to the annotators. The cells had an average width of 1 µm and were partitioned into 4 groups in a 2 × 2 design: projection “off” (the imaging DoF is 1 pixel wide) or projection “on” (the depth of field contains the entire 3D cell), crossed with fluorescent emitters emitting at 0.2 µm or 0.6 µm wavelength, distributed with a density of 0.4 emitters per volume element within each cell. One hundred and fifty cells of each type were scattered with uniform random position and orientation on a 2048 × 2048 plane with a pixel size of 0.065 µm. Convolution for the two wavelengths was performed once again with the Aguet PSF model. Camera noise was added after convolution using SyMBac’s camera class, with a baseline intensity of 100, a sensitivity of 2.9 analogue-digital units/electron, and a dark noise variance of 8. Annotators ran a Python script which presented them with a Napari window 58 and a layer upon which to label cells. The order in which the images with various effects were displayed was randomised.
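
We do not reproduce SyMBac's camera class here, but a minimal stand-in consistent with the stated parameters (baseline 100, sensitivity 2.9 ADU/electron, dark-noise variance 8) is sketched below; the exact noise model used internally by SyMBac may differ.

```python
# Simplified camera-noise model: shot noise on the photoelectrons, Gaussian
# dark noise, then gain and a fixed baseline offset.
import numpy as np

def add_camera_noise(electron_image, baseline=100.0, sensitivity=2.9,
                     dark_noise_var=8.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    electrons = rng.poisson(np.clip(electron_image, 0, None)).astype(float)
    electrons += rng.normal(0.0, np.sqrt(dark_noise_var), electron_image.shape)
    return baseline + sensitivity * electrons

# usage: noisy = add_camera_noise(convolved_image_in_photoelectrons)
```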

Deep-learning models for image segmentation

In addition to comparing human annotation accuracy, we sought to test the accuracy of a pretrained model (in this case, Omnipose’s bact_fluor_omni model) on simulated images where the perfect ground-truth dimensions of each cell are known. We generated 200 images containing, on average, 200 synthetic cells per image (approximately 40,000 total cells) according to the same method described in the previous section, but with the areas of synthetic cells varying between 0.15 and 3.5 µm². The PSF model used an imaging wavelength of 600 nm. Ground-truth mask and image pairs were saved. Images were then segmented according to the recommended Omnipose documentation with the bact_fluor_omni model. Ground-truth cells were matched with their predictions, and the IoU for each cell was calculated (see the sketch below).
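
The per-cell IoU matching can be sketched as follows (an illustration, not the benchmarking script): each ground-truth cell is paired with the predicted label that overlaps it most, and cells with no overlapping prediction score zero.

```python
# Per-cell intersection-over-union between ground truth and predicted masks.
import numpy as np

def per_cell_iou(gt_labels, pred_labels):
    ious = {}
    for gid in np.unique(gt_labels):
        if gid == 0:
            continue
        gt_mask = gt_labels == gid
        overlapping = pred_labels[gt_mask]
        overlapping = overlapping[overlapping != 0]
        if overlapping.size == 0:
            ious[int(gid)] = 0.0                     # missed cell
            continue
        pid = np.bincount(overlapping).argmax()      # best-overlapping prediction
        pred_mask = pred_labels == pid
        intersection = np.logical_and(gt_mask, pred_mask).sum()
        union = np.logical_or(gt_mask, pred_mask).sum()
        ious[int(gid)] = intersection / union
    return ious

# usage: ious = per_cell_iou(ground_truth_masks, omnipose_prediction)
```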

We then assessed the benefit of training a model on synthetic data matching one’s experimental setup exactly. The performance gained by training on synthetic images with perfect ground truth was checked by generating an independent set of 200 images according to the aforementioned method. These were used to train a new model according to the Omnipose documentation for 4000 epochs. The 4000th-epoch model was then used to segment the original test data; ground-truth cells were matched with their predictions, and the IoUs calculated once again.

Analysis of cell wall labelled cells on agar pads

After the acquisition of cells labelled with FDAAs, images were segmented with Omnipose. Since size estimation of the cell by segmentation is not important, the pretrained bact_phase_omni model was sufficient to segment the cells. To ensure that all the signal from the fluorescently labelled cell wall was captured, cell masks were binary dilated. After this, all individual cells were cropped, and Scikit-image’s regionprops function 57 was used to calculate the orientation of the cells and rotate them.

Simulations of fluorescence images of individual molecules within cells

Fluorescent single-molecule images of single cells were generated in the same manner as described before but with very few emitters per cell (1–30). These low-density fluorescent cells were convolved with the tPSF (to capture high-frequency information, since long-range effects are not needed for this analysis) using the layer-by-layer technique previously described. All analyses were performed on 3 cell types: (1) A typical 1 µm wide, 1 µm deep, 5 µm long cell, (2) an enlarged 2 µm wide, 2 µm deep, 5 µm long cell, (3) a cell trapped by the MACS 10 platform, 2 µm wide, 0.6 µm deep, 5.5 µm long.

We employed two techniques to count the single molecules in these cells. The first approach, which we term the naive approach, involved sampling fluorescent emitters within the cell and partitioning the emitters into 3 groups: (1) Molecules lost to depth of field; these were defined as molecules more than 0.25 µm away from the centre of the cell. (2) Molecules lost to diffraction; these were defined as molecules residing within 1 Rayleigh diffraction limit of at least one other molecule (with a modification term for the defocus, approximated by the model for the broadening of a Gaussian beam 59 ). (3) Resolved molecules; these are the sum of any remaining resolvable single molecules and clusters of molecules within 1 Rayleigh diffraction limit of another (appearing as a single molecule). Rather than applying image processing techniques to count spots, this approach allowed us to identify and partition different sources of miscounting error.
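
A compact sketch of this naive partition is given below, with a simplified spherocylinder sampler (end caps ignored), an assumed Rayleigh radius of 0.25 µm, and the 0.25 µm depth-of-field cut-off stated above; it omits the Gaussian-beam defocus correction used in the full analysis.

```python
# Naive partition of sampled molecules into defocus-lost, diffraction-lost and
# resolved groups (simplified geometry and thresholds).
import numpy as np
from scipy.spatial.distance import pdist, squareform

def sample_cell_body(n, radius=0.5, length=5.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    pts = []
    while len(pts) < n:                      # rejection-sample inside the body
        x = rng.uniform(-length / 2, length / 2)
        y, z = rng.uniform(-radius, radius, 2)
        if y**2 + z**2 <= radius**2:
            pts.append((x, y, z))
    return np.array(pts)

def naive_count(points, rayleigh_um=0.25, dof_um=0.25):
    in_focus = np.abs(points[:, 2]) <= dof_um
    lost_defocus = int((~in_focus).sum())
    xy = points[in_focus][:, :2]
    if len(xy) > 1:
        close = squareform(pdist(xy)) < rayleigh_um
        np.fill_diagonal(close, False)
        unresolved = close.any(axis=1)
    else:
        unresolved = np.zeros(len(xy), dtype=bool)
    # Molecules within one Rayleigh radius of another are flagged as lost to
    # diffraction; in the full analysis, each such cluster still contributes
    # one apparent spot to the resolved count.
    return {"lost_to_defocus": lost_defocus,
            "lost_to_diffraction": int(unresolved.sum()),
            "resolved": int((~unresolved).sum())}

print(naive_count(sample_cell_body(15, rng=np.random.default_rng(2))))
```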

The second approach we applied was Deep-STORM 50 , a deep learning method for super-resolution single-molecule localisation. A more sophisticated method such as this should perform better than the naive method since it can learn to use defocus, changes in local intensity, and local spatial patterning information to better estimate the number of molecules in a region. We trained Deep-STORM by downloading and modifying the ZeroCostDL4Mic 60 implementation for local use. While Deep-STORM is typically trained on simulated data and comes with its own simulator, it does not take into account thick samples such as the depth of entire bacterial cells where defocus is appreciable. Therefore, we generated our own synthetic training data by reducing the number of fluorescent emitters in each cell to between 1 and 30. Individual models were trained for the regular cell and the MACS cell, both with SNRs (signal-to-noise ratios) of 8, which is typical for a bacterial single-molecule experiment.

All image simulation and image analysis methods made heavy use of scikit-image, NumPy, CuPy and SciPy 57 , 61 , 62 , 63 .

Methods for experimental validation

Strain preparation

For imaging cells labelled with membrane stains, we used the strain E. coli MG1655 7740 ΔmotA with no fluorescent reporter. Cells were grown overnight from a glycerol stock in LB medium at 37 °C with 250 RPM shaking. The following day, cells were diluted by 100× into 1 mL of fresh LB. The fresh LB was supplemented simultaneously with both HADA and RADA to a final concentration of 1 mM each. HADA and RADA get incorporated into the bacterial cell wall during growth, allowing imaging of only the cell outline using fluorescence microscopy 64 , 65 . Cells were allowed to grow in the presence of the FDAAs for 2 h, after which a 300 µL aliquot was spun down and washed with phosphate-buffered saline (PBS) according to the protocol in 66 , taking care to ensure that cells were kept on ice between washes and when not being used.

For imaging microcolonies of fluorescently tagged cells, we used the strain E. coli MG1655 7740 ΔmotA with a constitutively produced cyan fluorescent protein (SCFP3A, FPbase ID: HFE84) under the control of prpsL. Cells were grown overnight from a glycerol stock in LB medium at 37 °C. The following day, cells were diluted by 200× in fresh LB and grown to an OD of 0.1–0.2 to ensure large cell size. Once the desired OD was reached, 1 mL of cells was spun down at 4000× g for 5 min, and the pellet resuspended in PBS for imaging.

Single-cell imaging on agar pad

Agar pads were prepared according to the protocol described in Young et al. 27 . Since only snapshot microscopy was to be performed, agar pads were prepared with PBS instead of growth medium and were kept as consistently thick as possible. Agar pads were cut to approximately 22 × 22 mm and placed upon a 22 × 40 mm coverslip. Cells on the agar pad were imaged using a Nikon ECLIPSE Ti2 inverted microscope with a 100× (NA = 1.49) objective and F-type immersion oil ( n  = 1.518), with a second 1.5× post-magnification lens inserted for an effective magnification of 150×. The camera used was an ORCA-Fusion C14440-20UP digital CMOS camera from Hamamatsu, with a pixel size of 6.5 µm × 6.5 µm. Cells stained with HADA were imaged with 365 nm excitation light, a 435 nm filter, 100% power, and 1 s exposure. RADA was imaged with 561 nm excitation light, a 595 nm filter, 100% power, and 0.5 s exposure. Focussing and field-of-view selection were again done using phase contrast, but special care was taken to account for chromatic aberration by adjusting the Z-plane offset between the focussed phase-contrast image and the RADA and HADA images. This was crucial to ensuring that the cell wall was in focus in each image.

Imaging microcolonies on agar pads

Since cells can change their intensities during growth on an agar pad, we imaged preformed colonies to capture the effects of diffraction on the cells (example images shown in Supplementary Information 11 and Supplementary Information 15 ). To generate preformed microcolonies, a higher OD of cells (0.1–0.2) was preferred. 3 µL of cell suspension was pipetted directly onto the agar pad and allowed to “dry” for 5 min, after which a second 22 × 40 mm coverslip was placed upon it. Agar pads were then immediately imaged using the ECLIPSE Ti2 inverted microscope with a 100× (NA = 1.49) objective. This enabled us to collect samples of cell clusters (preformed colonies) of various sizes. To avoid photobleaching, well-separated fields of view were first selected and focussed in phase contrast. Fluorescence images were captured by excitation with 440 nm light at an LED power of 50% for 900 ms (light source: Lumencor Spectra III Light Engine) and with a filter wavelength of 475 nm. Images were captured as multidimensional 16-bit ND2 files for further analysis.

Imaging cells in the mother machine

The mother machine chips were prepared and loaded with cells according to the protocol described in Bakshi et al. 46 . A single mother machine lane was supplied with fresh LB by a syringe pump at 15 µL/min. Cells in the mother machine were imaged using the Nikon ECLIPSE Ti2 inverted microscope with a 40× (NA = 0.95) objective lens and a 1.5× post-objective magnification lens. The time-lapse images were acquired using the Hamamatsu ORCA-Fusion digital CMOS camera, with a pixel size of 6.5 μm × 6.5 μm. Samples were illuminated with a brightfield light source and a fluorescence light source (Lumencor Spectra III) at 3 min intervals for 5 h. Fluorescence images were captured in fast scan mode with a 594 nm excitation LED at 100% power and a 100 ms exposure time, using a 632 nm filter.

Point spread function acquisition

Our microscope’s (a Nikon ECLIPSE Ti2) point spread function was captured using fluorescent 0.1 µm TetraSpeck Microspheres from Invitrogen. Slides with fluorescent microspheres were prepared according to ref. 67 , with the only changes being a bead dilution of 1000× and the use of Fluoromount-G Mounting Medium from Invitrogen. PSFs were captured using 0.70 NA, 0.95 NA, and 1.49 NA objective lenses, with magnifications of 20×, 40×, and 100×, respectively. PSFs were captured with and without the addition of a 1.5× post-magnification lens. Z-stacks were taken of the beads with 0.05 µm spacing. The most in-focus Z-slice was determined by taking the radial profile of the PSF and finding the slice with the highest peak intensity and narrowest FWHM. Intensity peaks were then found and beads were selected to maximise the crop area. Bead stacks were then centred around the mean peak intensity and averaged to produce a low-noise iPSF.
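
The bead-stack processing can be outlined as below. The focus score (peak intensity divided by the area above half-maximum, a rough proxy for "highest peak, narrowest FWHM") and the alignment scheme are simplified assumptions rather than the exact procedure, and all bead crops are assumed to share the same XY size.

```python
# Outline of bead-stack alignment and averaging into an instrumental PSF.
import numpy as np

def best_focus_index(stack):
    """stack: (Z, Y, X) crop around a single bead."""
    scores = []
    for sl in stack:
        peak = sl.max()
        area_above_half = max(int((sl > peak / 2).sum()), 1)   # ~ FWHM footprint
        scores.append(peak / area_above_half)
    return int(np.argmax(scores))

def average_psf(bead_stacks):
    """Align each bead crop on its best-focus slice and average into one iPSF."""
    half = min(s.shape[0] for s in bead_stacks) // 2
    aligned = []
    for s in bead_stacks:
        k = best_focus_index(s)
        if k - half >= 0 and k + half <= s.shape[0]:
            aligned.append(s[k - half:k + half])
    return np.mean(aligned, axis=0)

# usage: ipsf = average_psf(list_of_bead_z_stacks)
```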

Data availability

Sample datasets, including instrumental point spread functions, microscope images of membrane-stained cells and microcolonies, synthetic benchmarking data and mother machine data have been uploaded to https://zenodo.org/records/10525762 68 .

Code availability

All code written for this paper, and used to generate figures is uploaded to https://github.com/georgeoshardo/projection_diffraction 69 . For backward compatibility, the version of SyMBac used in this paper has been frozen and included in this repository.

Campos, M. et al. A constant size extension drives bacterial cell size homeostasis. Cell 159 , 1433–1446 (2014).

Wang, P. et al. Robust growth of Escherichia coli . Curr. Biol. 20 , 1099–1103 (2010).

Taheri-Araghi, S. et al. Cell-size control and homeostasis in bacteria. Curr. Biol. 25 , 385–391 (2015).

Taniguchi, Y. et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329 , 533–538 (2010).

Dunlop, M. J., Cox, R. S., Levine, J. H., Murray, R. M. & Elowitz, M. B. Regulatory activity revealed by dynamic correlations in gene expression noise. Nat. Genet. 40 , 1493–1498 (2008).

Munsky, B., Neuert, G. & van Oudenaarden, A. Using gene expression noise to understand gene regulation. Science 336 , 183–187 (2012).

Julou, T. et al. Cell–cell contacts confine public goods diffusion inside Pseudomonas aeruginosa clonal microcolonies. Proc. Natl. Acad. Sci. USA 110 , 12577–12582 (2013).

van Vliet, S. et al. Spatially correlated gene expression in bacterial groups: the role of lineage history, spatial gradients, and cell-cell interactions. Cell Syst. 6 , 496–507.e6 (2018).

Golding, I., Paulsson, J., Zawilski, S. M. & Cox, E. C. Real-time kinetics of gene activity in individual bacteria. Cell 123 , 1025–1036 (2005).

Okumus, B. et al. Mechanical slowing-down of cytoplasmic diffusion allows in vivo counting of proteins in individual cells. Nat. Commun. 7 , 11641 (2016).

Portillo, M. C., Leff, J. W., Lauber, C. L. & Fierer, N. Cell size distributions of soil bacterial and archaeal taxa. Appl. Environ. Microbiol. 79 , 7610–7617 (2013).

Hardo, G., Noka, M. & Bakshi, S. Synthetic micrographs of bacteria (SyMBac) allows accurate segmentation of bacterial cells using deep neural networks. BMC Biol. 20 , 263 (2022).

Yao, Z. & Carballido-López, R. Fluorescence imaging for bacterial cell biology: from localization to dynamics, from ensembles to single molecules. Annu. Rev. Microbiol. 68 , 459–476 (2014).

Cambré, A. & Aertsen, A. Bacterial vivisection: how fluorescence-based imaging techniques shed a light on the inner workings of bacteria. Microbiol. Mol. Biol. Rev . https://doi.org/10.1128/mmbr.00008-20 (2020).

Santin, Y. G., Doan, T., Journet, L. & Cascales, E. Cell width dictates type VI secretion tail length. Curr. Biol. 29 , 3707–3713.e3 (2019).

Smit, J. H., Li, Y., Warszawik, E. M., Herrmann, A. & Cordes, T. ColiCoords: a Python package for the analysis of bacterial fluorescence microscopy data. PLoS ONE 14 , e0217524 (2019).

Ursell, T. et al. Rapid, precise quantification of bacterial cellular dimensions across a genomic-scale knockout library. BMC Biol. 15 , 17 (2017).

Rust, M. J., Bates, M. & Zhuang, X. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat. Methods 3 , 793–796 (2006).

Westphal, V. et al. Video-rate far-field optical nanoscopy dissects synaptic vesicle movement. Science 320 , 246–249 (2008).

Betzig, E. et al. Imaging intracellular fluorescent proteins at nanometer resolution. Science 313 , 1642–1645 (2006).

Hess, S. T., Girirajan, T. P. K. & Mason, M. D. Ultra-high resolution imaging by fluorescence photoactivation localization microscopy. Biophys. J. 91 , 4258–4272 (2006).

Goudsmits, J. M. H., van Oijen, A. M. & Robinson, A. A tool for alignment and averaging of sparse fluorescence signals in rod-shaped bacteria. Biophys. J. 110 , 1708–1715 (2016).

Mekterović, I., Mekterović, D. & Maglica, Z. BactImAS: a platform for processing and analysis of bacterial time-lapse microscopy movies. BMC Bioinform. 15 , 251 (2014).

Jones, T. R. et al. CellProfiler analyst: data exploration and analysis software for complex image-based screens. BMC Bioinform. 9 , 482 (2008).

Guberman, J. M., Fay, A., Dworkin, J., Wingreen, N. S. & Gitai, Z. PSICIC: noise and asymmetry in bacterial division revealed by computational image analysis at sub-pixel resolution. PLOS Comput. Biol. 4 , e1000233 (2008).

Paintdakhi, A. et al. Oufti: an integrated software package for high-accuracy, high-throughput quantitative microscopy analysis. Mol. Microbiol. 99 , 767–777 (2016).

Young, J. W. et al. Measuring single-cell gene expression dynamics in bacteria using fluorescence time-lapse microscopy. Nat. Protoc. 7 , 80–88 (2012).

Stylianidou, S., Brennan, C., Nissen, S. B., Kuwada, N. J. & Wiggins, P. A. SuperSegger: robust image segmentation, analysis and lineage tracking of bacterial cells. Mol. Microbiol. 102 , 690–700 (2016).

Smith, A., Metz, J. & Pagliara, S. MMHelper: an automated framework for the analysis of microscopy images acquired with the mother machine. Sci. Rep. 9 , 10123 (2019).

Cutler, K. J. et al. Omnipose: a high-precision morphology-independent solution for bacterial cell segmentation. Nat. Methods 19 , 1438–1448 (2022).

Ollion, J. & Ollion, C. DistNet: deep tracking by displacement regression: application to bacteria growing in the mother machine. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2020 (eds Martel, A. L. et al.) 215–225 (Springer International Publishing, Cham, 2020) https://doi.org/10.1007/978-3-030-59722-1_21 .

O’Connor, O. M., Alnahhas, R. N., Lugagne, J.-B. & Dunlop, M. J. DeLTA 2.0: a deep learning pipeline for quantifying single-cell spatial and temporal dynamics. PLOS Comput. Biol. 18 , e1009797 (2022).

Sauls, J. T. et al. Mother machine image analysis with MM3. https://doi.org/10.1101/810036 (2019).

Spahn, C. et al. DeepBacs for multi-task bacterial image analysis using open-source deep learning approaches. Commun. Biol. 5 , 1–18 (2022).

Valen, D. A. V. et al. Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments. PLOS Comput. Biol. 12 , e1005177 (2016).

Furchtgott, L., Wingreen, N. S. & Huang, K. C. Mechanisms for maintaining cell shape in rod-shaped Gram-negative bacteria. Mol. Microbiol. 81 , 340–353 (2011).

Reshes, G., Vanounou, S., Fishov, I. & Feingold, M. Cell shape dynamics in Escherichia coli . Biophys. J. 94 , 251–264 (2008).

Muzzey, D. & van Oudenaarden, A. Quantitative time-lapse fluorescence microscopy in single cells. Annu. Rev. Cell Dev. Biol. 25 , 301–327 (2009).

Locke, J. C. W. & Elowitz, M. B. Using movies to analyse gene circuit dynamics in single cells. Nat. Rev. Microbiol. 7 , 383–392 (2009).

Ullman, G. et al. High-throughput gene expression analysis at the level of single proteins using a microfluidic turbidostat and automated cell tracking. Philos. Trans. R. Soc. B Biol. Sci. 368 , 20120025 (2013).

Prindle, A. et al. A sensing array of radically coupled genetic ‘biopixels’. Nature 481 , 39–44 (2012).

Hardo, G. & Bakshi, S. Challenges of analysing stochastic gene expression in bacteria using single-cell time-lapse experiments. Essays Biochem. 65 , 67–79 (2021).

Hanser, B. M., Gustafsson, M. G. L., Agard, D. A. & Sedat, J. W. Phase‐retrieved pupil functions in wide‐field fluorescence microscopy. J. Microsc. 216 , 32–48 (2004).

Aguet, F., Geissbühler, S., Märki, I., Lasser, T. & Unser, M. Super-resolution orientation estimation and localization of fluorescent dipoles using 3-D steerable filters. Opt. Express 17 , 6829–6848 (2009).

Warren, M. R. et al. Spatiotemporal establishment of dense bacterial colonies growing on hard agar. eLife 8 , e41093 (2019).

Bakshi, S. et al. Tracking bacterial lineages in complex and dynamic environments with applications for growth control and persistence. Nat. Microbiol. 6 , 783–791 (2021).

Gordon, G. S. et al. Chromosome and low copy plasmid segregation in E. coli : visual evidence for distinct mechanisms. Cell 90 , 1113–1121 (1997).

Skinner, S. O., Sepúlveda, L. A., Xu, H. & Golding, I. Measuring mRNA copy number in individual Escherichia coli cells using single-molecule fluorescent in situ hybridization. Nat. Protoc. 8 , 1100–1113 (2013).

Okumus, B. et al. Single-cell microscopy of suspension cultures using a microfluidics-assisted cell screening platform. Nat. Protoc. 13 , 170–194 (2018).

Nehme, E., Weiss, L. E., Michaeli, T. & Shechtman, Y. Deep-STORM: super-resolution single-molecule microscopy by deep learning. Optica 5 , 458–464 (2018).

Jong, I. G. de, Beilharz, K., Kuipers, O. P. & Veening, J.-W. Live cell imaging of Bacillus subtilis and Streptococcus pneumoniae using automated time-lapse microscopy. J. Vis. Exp . https://doi.org/10.3791/3145 (2011)

Elowitz, M. B. & Leibler, S. A synthetic oscillatory network of transcriptional regulators. Nature 403 , 335–338 (2000).

Otsu, N. A threshold selection method from Gray-Level histograms. IEEE Trans. Syst. Man Cybern. 9 , 62–66 (1979).

Rudge, T. J., Steiner, P. J., Phillips, A. & Haseloff, J. Computational modeling of synthetic microbial biofilms. ACS Synth. Biol. 1 , 345–352 (2012).

Lucy, L. B. An iterative technique for the rectification of observed distributions. Astron. J. 79 , 745 (1974).

Richardson, W. H. Bayesian-based iterative method of image restoration*. JOSA 62 , 55–59 (1972).

van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2 , e453 (2014).

Sofroniew, N. et al. napari: a multi-dimensional image viewer for Python. Zenodo https://doi.org/10.5281/zenodo.7276432 (2022).

Self, S. A. Focusing of spherical Gaussian beams. Appl. Opt. 22 , 658–661 (1983).

von Chamier, L. et al. Democratising deep learning for microscopy with ZeroCostDL4Mic. Nat. Commun. 12 , 2276 (2021).

Okuta, R., Unno, Y., Nishino, D., Hido, S. & Crissman. CuPy: a NumPy-Compatible Library for NVIDIA GPU Calculations (2017).

Harris, C. R. et al. Array programming with NumPy. Nature 585 , 357–362 (2020).

Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17 , 261–272 (2020).

Kuru, E. et al. In situ probing of newly synthesized peptidoglycan in live bacteria with fluorescent D-amino acids. Angew. Chem. Int. Ed. 51 , 12519–12523 (2012).

Hsu, Y.-P. et al. Full color palette of fluorescent D-amino acids for in situ labeling of bacterial cell walls. Chem. Sci. 8 , 6313–6321 (2017).

Kuru, E., Tekkam, S., Hall, E., Brun, Y. V. & Van Nieuwenhze, M. S. Synthesis of fluorescent D-amino acids and their use for probing peptidoglycan synthesis and bacterial growth in situ. Nat. Protoc. 10 , 33–52 (2015).

Measuring a Point Spread Function. iBiology https://www.ibiology.org/talks/measuring-a-point-spread-function/ (2012).

Hardo, G., Li, R. & Bakshi, S. Example data for preprint version of: ‘Projection and Diffraction Affects Accurate Quantification of Microbiology from Microscopy Data’. Zenodo https://doi.org/10.5281/zenodo.10525762 (2024).

Hardo, G. georgeoshardo/projection_diffraction (2024).

Acknowledgements

We thank Prof. Bartlomiej Waclaw, Prof. Ricardo Henriques, Dr. Diana Fusco, Dr. Temur Yusunov, and Kevin J. Cutler for their feedback on this work and the members of Bakshi Lab for their helpful feedback on this study. All figures included in the present paper are original and contain no third party material.

Author information

Authors and Affiliations

Department of Engineering, University of Cambridge, Cambridge, UK

Georgeos Hardo, Ruizhe Li & Somenath Bakshi

Contributions

G.H. and S.B. conceived of the study. G.H. designed the computational models and deep learning tools. G.H. performed the point spread function, microcolony, and single-cell agar pad experiments and analysed the corresponding data. R.L. performed mother machine experiments and analysis. G.H. and S.B. wrote the paper. All co-authors contributed to the final version of the paper.

Corresponding author

Correspondence to Somenath Bakshi .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supporting_movie_1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Hardo, G., Li, R. & Bakshi, S. Quantitative microbiology with widefield microscopy: navigating optical artefacts for accurate interpretations. npj Imaging 2 , 26 (2024). https://doi.org/10.1038/s44303-024-00024-4


Received : 11 February 2024

Accepted : 21 June 2024

Published : 02 September 2024

DOI : https://doi.org/10.1038/s44303-024-00024-4



  • Open access
  • Published: 31 August 2024

Effects of Pecha Kucha presentation pedagogy on nursing students' presentation skills: a quasi-experimental study in Tanzania

Setberth Jonas Haramba, Walter C. Millanzi & Saada A. Seif

BMC Medical Education, volume 24, Article number: 952 (2024)


Introduction

Ineffective and non-interactive learning among nursing students limits opportunities for classroom presentation skills, creativity, and innovation upon completion of their classroom learning activities. Pecha Kucha presentation is a new and promising pedagogy that engages students in learning and improves their speaking skills and other survival skills. It involves the use of 20 slides, each displayed for 20 seconds of the presentation. The current study examined the effect of Pecha Kucha presentation pedagogy on presentation skills among nursing students in Tanzania.

The aim of this study was to compare nursing students' presentation skills between exposure to traditional PowerPoint presentations and exposure to Pecha Kucha presentations.

The study employed an uncontrolled quasi-experimental (pre-post) design with a quantitative research approach among 230 randomly selected nursing students at the respective training institution. An interviewer-administered structured questionnaire, adapted from previous studies, was used to measure presentation skills between June and July 2023. The study involved the training of research assistants, pre-assessment of presentation skills, training of participants, assigning topics to participants, classroom presentations, and post-intervention assessment. A linear regression analysis model was used to determine the effect of the intervention on nursing students' presentation skills using the Statistical Package for the Social Sciences (SPSS) version 26, set at a 95% confidence interval and a 5% significance level.

Findings revealed that 163 (70.87%) participants were aged ≤ 23 years; 151 (65.65%) were male and 189 (82.17%) were undergraduate students. Post-test findings showed a significant change in participants' mean presentation skills score between baseline (M = 4.07, SD = 0.56) and end-line (M = 4.54, SD = 0.59), corresponding to a mean score change of 0.4717 ± 0.7793 (p < 0.0001, 95% CI) and a medium effect size of 0.78. An increase in participants' knowledge of Pecha Kucha presentation was associated with a 0.0239 (p < 0.0001) increase in presentation skills.

Pecha Kucha presentations have a significant effect on nursing students’ presentation skills as they enhance inquiry and mastery of their learning content before classroom presentations. The pedagogical approach appeared to enhance nursing students’ confidence during the classroom presentation. Therefore, there is a need to incorporate Pecha Kucha presentation pedagogy into nursing curricula and nursing education at large to promote student-centered teaching and learning activities and the development of survival skills.

Trial registration

Not applicable, as this was a quasi-experimental study.


Nursing students need to acquire a range of skills during the learning process to enable them to provide quality nursing care and management in society [ 1 ]. The referred nursing care and management practices include identifying, analyzing, synthesizing, and communicating effectively within and between healthcare professions [ 1 ]. Given an increasingly global economy and international competition for jobs and opportunities, the current traditional classroom learning methods are insufficient to meet such 21st-century challenges and demands [ 2 ]. The integration of presentation skills, creativity, innovation, collaboration, information, and media literacy skills helps to overcome the noted challenges among students [ 2 , 3 , 4 ]. The skills in question constitute the survival skills that help students not only in career development and success but also in their personal, social, and public quality of life, as they enable students to overcome 21st-century challenges upon graduation [ 2 ].

To enhance nursing students' participation in learning and to stimulate their presentation skills, critical thinking, creativity, and innovation, a combination of teaching and learning pedagogies should be employed [ 5 , 6 , 7 , 8 ]. Among others, classroom presentations, group discussions, problem-based learning, demonstrations, reflection, and role-play are commonly used for these purposes [ 5 ]. However, ineffective and non-interactive learning, which contributes to limited presentation skills, creativity, and innovation, has been reported by several scholars [ 9 , 10 , 11 ]. For example, poor use and design of student PowerPoint presentations has led to confusing graphics, text-heavy slides, and presenters reading through as many as 80 slides [ 12 , 13 , 14 ]. Indeed, such non-interactive learning becomes boring and tiresome for learners, as evidenced by glazed eyes, long yawns, occasional snoring, phone use, and frequent trips to the bathroom [ 12 , 14 ].

With an increasing number of nursing students in higher education institutions in Tanzania, traditional student presentation pedagogies are insufficient to stimulate presentation skills. They limit nursing students' innovation, creativity, critical thinking, and meaningful learning in attempts to solve health challenges [ 15 , 16 ]. This hinders nursing students' ability to communicate effectively by demonstrating their knowledge and mastery of learning content [ 17 , 18 ]. Furthermore, it affects their future careers by limiting their ability to demonstrate and express their expertise clearly in a variety of workplace settings, such as presenting at scientific conferences, participating in job interviews, giving clinical case reports and handover reports, and giving feedback to clients [ 17 , 18 , 19 ].

Pecha Kucha presentation is a new promising approach for students’ learning in the classroom context as it motivates learners’ self-directed and collaborative learning, learner creativity, and presentation skills [ 20 , 21 , 22 ]. It encourages students to read more materials, enhances cooperative learning among learners, and is interesting and enjoyable among students [ 23 ].

Pecha Kucha takes its name from a Japanese term meaning " chit chat ," and represents a fast-paced presentation style used in different fields, including teaching, marketing, advertising, and design [ 24 , 25 , 26 ]. It involves 20 slides, each displayed for 20 s, thus making a total of 6 min and 40 s for the whole presentation [ 22 ]. For effective learning through Pecha Kucha presentations, the design and format of the presentation should be meaningfully limited to 20 slides, targeted at 20 s per slide, and rich in the content of the presented topic, using high-quality images or pictures attuned to the content knowledge and the message to be delivered to the target audience [ 14 , 16 ]. Each slide should contain a primordial message with well-balanced information. In other words, the message should be simple in the sense that each slide contains only one concept or idea, with neither too much nor too little information, thus making it easy for the audience to grasp [ 14 , 17 , 19 ].

The “true spirit” of Pecha Kucha is that it consists mostly of powerful images and meaningful, specific text rather than text that the presenter reads from the slides; an image and a short phrase should communicate the core idea while the speaker offers well-rehearsed, elaborated commentary [ 22 , 28 ]. The presenter should master the subject matter and incorporate the necessary information from classwork [ 14 , 20 ]. Audience engagement in learning, measured by paying attention and actively listening, was higher during Pecha Kucha presentations than during traditional PowerPoint presentations [ 29 ]. The creativity and collaboration involved in designing and selecting appropriate images and content, rehearsal before the presentation, and discussion after each presentation made students more satisfied with Pecha Kucha presentations than with traditional presentations [ 21 , 22 ]. Time management and students' self-regulation also improved with Pecha Kucha presentations, as students and instructors could appropriately plan the time for classroom instruction [ 22 , 23 ].

However, little is known about Pecha Kucha presentation in nursing education in Sub-Saharan African countries, including Tanzania, since little published research describes its effects on enhancing students' presentation skills. Thus, this study assessed the effect of Pecha Kucha presentation pedagogy on enhancing presentation skills among nursing students. In particular, the study largely focused on nursing students' presentation skills during the preparation and presentation of students' assignments, project works, case reports, or field reports.

The study tested the null hypothesis (H 0 ) that there is no significant difference in nursing students' classroom presentation skills scores between the baseline and end-line assessments. The association between nursing students' presentation skills and participants' sociodemographic characteristics was formulated and analyzed before and after the intervention. This study forms the basis for developing new presentation pedagogy among nursing students in order to stimulate effective learning, the development of presentation skills during the teaching and learning process, and the acquisition of 21st-century skills, which are demanded by an increasingly competitive, knowledge-based society shaped by rapid technological change.

The current study also forms the basis for re-defining classroom practices in an attempt to enhance and transform nursing students' learning experiences. This will cultivate the production of graduate nurses who will share their expertise and practical skills within the healthcare team by attending scientific conferences, presenting clinical cases, and participating in job interviews in the global health market. To achieve this, the study determined the baseline and end-line nursing students' presentation skills during the preparation and presentation of classroom assignments using the traditional PowerPoint presentation and Pecha Kucha presentation formats.

Methods and materials

This study was conducted in health training institutions in Tanzania. Tanzania has a total of 47 registered public and private universities and university colleges that offer health programs ranging from certificate to doctorate degrees [ 24 , 25 ]. A total of seven [ 7 ] out of 47 universities offer a bachelor of science in nursing, and four [ 4 ] universities offer master’s to doctorate degree programs in nursing and midwifery sciences [ 24 , 26 ]. To enhance the representation of nursing students in Tanzania, this study was conducted in Dodoma Municipal Council, which is one of Tanzania’s 30 administrative regions [ 33 ]. Dodoma Region has two [ 2 ] universities that offer nursing programs at diploma and degree levels [ 34 ]. The referred universities host a large number of nursing students compared to the other five [ 5 ] universities in Tanzania, with traditional students’ presentation approaches predominating nursing students’ teaching and learning processes [ 7 , 32 , 35 ].

The two universities under study include the University of Dodoma and St. John’s University of Tanzania, which are located in Dodoma Urban District. The University of Dodoma is a public university that provides 142 training programs at the diploma, bachelor degree, and master’s degree levels with about 28,225 undergraduate students and 724 postgraduate students [ 26 , 27 ]. The University of Dodoma also has 1,031 nursing students pursuing a Bachelor of Science in Nursing and 335 nursing students pursuing a Diploma in Nursing in the academic year 2022–2023 [ 33 ]. The St. John’s University of Tanzania is a non-profit private university that is legally connected with the Christian-Anglican Church [ 36 ]. It has student enrollment ranging from 5000 to 5999 and it provides training programs leading to higher education degrees in a variety of fields, including diplomas, bachelor degrees, and master’s degrees [ 37 ]. It hosts 766 nursing students pursuing a Bachelor of Science in Nursing and 113 nursing students pursuing a Diploma in Nursing in the academic year 2022–2023 [ 30 , 31 ].

Study design and approach

An uncontrolled quasi-experimental design with a quantitative research approach was used to establish quantifiable data on the participants’ socio-demographic profiles and outcome variables under study. The design involved pre- and post-tests to determine the effects of the intervention on the aforementioned outcome variable. The design involved three phases, namely the baseline data collection process (pre-test via a cross-sectional survey), implementation of the intervention (process), and end-line assessment (post-test), as shown in Fig.  1 [ 7 ].

Figure 1. A flow pattern of study design and approach.

Target population

The study involved nursing students pursuing a Diploma in Nursing and a Bachelor of Science in Nursing in Tanzania. This population was expected to demonstrate competences and mastery of different survival and life skills in order to work independently at various levels of health facilities within and outside Tanzania. This cohort of undergraduate nursing students also comprised adult learners who can set goals, develop strategies to achieve their goals, and hence achieve positive professional behavioral outcomes [ 7 ]. Moreover, as per annual data, the average number of graduating nursing students from all colleges and universities in the country ranges from 3,500 to 4,000 [ 38 ].

Study population

The study involved first- and third-year nursing students pursuing a Diploma in Nursing and first-, second-, and third-year nursing students pursuing a Bachelor of Science in Nursing at the University of Dodoma. This population had a large number of enrolled undergraduate nursing students, thus making it an ideal population for intervention, and it served as a reasonable representation of the universities offering nursing programs [ 11 , 29 ].

Inclusion criteria

The study included male and female nursing students pursuing a Diploma in Nursing and a Bachelor of Science in Nursing at the University of Dodoma. The referred students were registered at the University of Dodoma during the time of the study. Such students lived on or off campus and had not been exposed to Pecha Kucha training despite having regular classroom attendance. This enhanced the enrollment of adequate study samples from each study program, the monitoring of the study intervention, and easy control of confounders.

Exclusion criteria

All students recruited in the study were assessed at baseline, exposed to the training package, and assessed for their post-intervention learning experience. None of the study participants dropped out of the study or failed to meet the recruitment criteria.

Sample size determination

A quasi-experimental study on Pecha Kucha as an alternative to traditional PowerPoint presentations at Worcester State University, United States of America, reported significantly higher student engagement during Pecha Kucha presentations than during traditional PowerPoint presentations [ 29 ]. The mean engagement score for the classroom with the traditional PowerPoint presentation was 2.63, while the mean score for the Pecha Kucha presentation was 4.08. This study adopted the formula that was used to calculate the required sample size for an uncontrolled quasi-experimental study among pre-schoolers [ 39 ]. The formula is stated as:

Where: Zα was set at 1.96 from the normal distribution table.

Zβ corresponded to the study power of 80%.

Mean zero (π0) was the mean score of audiences’ engagement in using PowerPoint presentation = 2.63.

Mean one (π1) was the mean score of audience’s engagement in using Pecha Kucha presentation = 4.08.
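The formula itself is not reproduced in this excerpt. As a hedged sketch only, a commonly used sample-size expression for detecting a change between two means in a single-group pre/post design takes the form below; the standard deviation of the paired differences, written here as σd, is an additional input whose value the cited study would supply and which is not stated in this excerpt:

\[
n = \left( \frac{(Z_{\alpha} + Z_{\beta})\,\sigma_{d}}{\pi_{1} - \pi_{0}} \right)^{2}
\]

with Zα = 1.96, Zβ corresponding to 80% power, π0 = 2.63, and π1 = 4.08 as defined above.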

Sampling technique

Given the availability of higher-training institutions in the study area that offer undergraduate nursing programs, a simple random sampling technique was used, whereby two cards, one labelled “University of Dodoma” and the other being labelled “St. Johns University of Tanzania,” were prepared and put in the first pot. The other two cards, one labelled “yes” to represent the study setting and the other being labelled “No” to represent the absence of study setting, were put in the second pot. Two research assistants were asked to select a card from each pot, and consequently, the University of Dodoma was selected as the study setting.

To obtain the target population, the study employed purposive sampling techniques to select the school of nursing and public health at the University of Dodoma. Upon arriving at the School of Nursing and Public Health of the University of Dodoma, the convenience sampling technique was employed to obtain the number of classes for undergraduate nursing students pursuing a Diploma in Nursing and a Bachelor of Science in Nursing. The study sample comprised the students who were available at the time of study. A total of five [ 5 ] classes of Diploma in Nursing first-, second-, and third-years and Bachelor of Science in Nursing first-, second-, and third-years were obtained.

To establish the representation for a minimum sample from each class, the number of students by sex was obtained from each classroom list using the proportionate stratified sampling technique (sample size/population size× stratum size) as recommended by scholars [ 40 ]. To recruit the required sample size from each class by gender, a simple random sampling technique through the lottery method was employed to obtain the required sample size from each stratum. During this phase, the student lists by gender from each class were obtained, and cards with code numbers, which were mixed with empty cards depending on the strata size, were allocated for each class and strata. Both labeled and empty cards were put into different pots, which were labeled appropriately by their class and strata names. Upon arriving at the specific classroom and after the introduction, the research assistant asked each nursing student to pick one card from the respective strata pot. Those who selected cards with code numbers were recruited in the study with their code numbers as their participation identity numbers. The process continued for each class until the required sample size was obtained.
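To make the proportionate allocation and lottery draw described above concrete, the following minimal Python sketch allocates a total sample across strata and then draws participants at random within each stratum. The stratum names, class sizes, and roster identifiers are hypothetical placeholders, not the study's actual class lists.

```python
import random

# Hypothetical class/stratum sizes (illustrative only, not the study's actual rosters).
strata = {
    ("BScN year 1", "male"): 120,
    ("BScN year 1", "female"): 90,
    ("Diploma year 3", "male"): 60,
    ("Diploma year 3", "female"): 55,
}

population_size = sum(strata.values())
total_sample_size = 230  # required sample size from the power calculation

# Proportionate stratified allocation: sample size / population size x stratum size.
allocation = {
    stratum: round(total_sample_size / population_size * size)
    for stratum, size in strata.items()
}

def lottery_draw(roster, n, seed=None):
    """Shuffle the stratum roster and draw the first n entries (lottery method)."""
    rng = random.Random(seed)
    shuffled = list(roster)
    rng.shuffle(shuffled)
    return shuffled[:n]

for stratum, n in allocation.items():
    roster = [f"{stratum[0]}-{stratum[1]}-{i:03d}" for i in range(strata[stratum])]
    print(stratum, n, lottery_draw(roster, n, seed=42)[:3])
```

Rounding can leave the per-stratum totals slightly above or below the target, so a real allocation would adjust one or two of the largest strata to hit the exact sample size.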

To ensure the effective participation of nursing students in the study, the research assistant worked hand in hand with the facilitators and lecturers of the respective classrooms, the head of department, and class representatives. The importance, advantages, and disadvantages of participating in the study were explained to study participants during the recruitment process in order to create awareness and remove possible fears. During the intervention, study participants were also given pens and notebooks to enable them to take notes, and snacks were provided during the training sessions. The number of participants from each classroom and the sampling process are shown in Fig. 2 [ 7 ].

Figure 2. Flow pattern of participants' sampling procedures.

Data collection tools

The study adapted and modified the students’ questionnaire on presentation skills from scholars [ 20 , 23 , 26 , 27 , 28 , 29 ]. The modification involved rephrasing the question statement, breaking down items into specific questions, deleting repeated items that were found to measure the same variables, and improving language to meet the literacy level and cultural norms of study participants.

The data collection tool consisted of 68 question items that assessed the socio-demographic characteristics of the study participants and 33 question items rated on a five-point Likert scale, which ranges from 5 = strongly agree, 4 = agree, 3 = not sure, 2 = disagree, and 1 = strongly disagree. The referred tool was used to assess the students’ skills during the preparation and presentation of the assignments using the traditional PowerPoint presentation and Pecha Kucha presentation formats.

The students’ assessment specifically focused on the students’ ability to prepare the presentation content, master the learning content, share presentation materials, and communicate their understanding to audiences in the classroom context.

Validity and reliability of research instruments

Validity of a research instrument refers to whether the instrument measures the behaviors or qualities that it is intended to measure, and it is a measure of how well the instrument performs its function [ 41 ]. The structured questionnaire, which was intended to assess the participants' presentation skills, was validated for face and content validity. The principal investigator initially adapted the question items for the different domains of students' learning involved in preparing and presenting an assignment in the classroom.

The items were shared with and discussed by two [ 2 ] educationists, two [ 2 ] research experts, one [ 1 ] statistician, and the supervisors in order to ensure clarity, appropriateness, adequacy, and coverage of presentation skills using the Pecha Kucha presentation format. Content validation continued until saturation of the experts' opinions and inputs was achieved. An inter-observer rating scale on a five-point Likert scale, ranging from 5 points = very relevant to 1 point = not relevant, was also used.

The process involved addition, input deletion, correction, and editing for relevance, appropriateness, and scope of the content for the study participants. Some of the question items were broken down into more specific questions, and new domains evolved. Other question items that were found to measure the same variables were also deleted to ease the data collection and analysis. Moreover, the grammar and language issues were improved for clarity based on the literacy level of the study participants.
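One common way to summarise expert relevance ratings of this kind is an item-level content validity index (I-CVI), the proportion of experts rating an item 4 or 5 on the five-point relevance scale. The sketch below is an illustration under that assumption, not necessarily the index the authors computed, and the ratings shown are hypothetical.

```python
import numpy as np

# Hypothetical expert relevance ratings (1 = not relevant ... 5 = very relevant)
# for three questionnaire items rated by five experts. Illustrative only.
ratings = np.array([
    [5, 4, 5, 5, 4],   # item 1
    [3, 4, 2, 3, 4],   # item 2
    [5, 5, 4, 5, 5],   # item 3
])

# Item-level content validity index: proportion of experts rating the item as relevant (4 or 5).
i_cvi = (ratings >= 4).mean(axis=1)
scale_cvi = i_cvi.mean()  # scale-level CVI, averaging approach
print("I-CVI per item:", i_cvi, "S-CVI/Ave:", round(scale_cvi, 2))
```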

Reliability of the research instruments refers to the ability of the research instruments or tools to provide similar and consistent results when applied at different times and circumstances [ 41 ]. This study adapted the tools and question items used by different scholars to assess the impact of PKP on student learning [ 12 , 15 , 18 ].

To ensure the reliability of the tools, a pilot study was conducted in one of the nursing training institutions in order to assess the complexity, readability, clarity, completeness, length, and duration of the tool. Ambiguous and difficult (left unanswered) items were modified or deleted based on the consensus that was reached with the consulted experts and supervisor before subjecting the questionnaires to a pre-test.

The study involved 10% of undergraduate nursing students from an independent geographical location for the pilot study. The findings from the pilot study were subjected to exploratory factor analysis (factor loadings set at ≥ 0.3) and scale analysis in order to determine the internal consistency of the tools, with a Cronbach's alpha of ≥ 0.7 considered reliable [ 42 , 43 , 44 ]. Furthermore, after the data collection, the scale analysis was computed to assess internal consistency using SPSS version 26, whereby the Cronbach's alpha for the question items that assessed the participants' presentation skills was 0.965.
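As an illustration of the internal-consistency check described above, the sketch below computes Cronbach's alpha for a respondent-by-item matrix of Likert scores. The pilot data generated here are random placeholders rather than the study's pilot responses, and the actual analysis was run in SPSS rather than Python.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) matrix of Likert scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical pilot responses: 23 students x 33 five-point Likert items.
# Random data like this will give a low alpha; real correlated items are
# what pushes the value toward the >= 0.7 threshold used in the study.
rng = np.random.default_rng(0)
pilot = rng.integers(1, 6, size=(23, 33))
print(round(cronbach_alpha(pilot), 3))
```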

Data collection method

The study used the researcher-administered questionnaire to collect the participants’ socio-demographic information, co-related factors, and presentation skills as nursing students prepare and present their assignments in the classroom. This enhanced the clarity and participants’ understanding of all question items before providing the appropriate responses. The data were collected by the research assistants in the classroom with the study participants sitting distantly to ensure privacy, confidentiality, and the quality of the information that was provided by the research participants. The research assistant guided and led the study participants to answer the questions and fill in information in the questionnaire for each section, domain, and question item. The research assistant also collected the baseline information (pre-test) before the intervention, which was then compared with the post-intervention information. This was done in the first week of June 2023, after training and orientation of the research assistant on the data collection tools and recruitment of the study participants.

Using the researcher-administered questionnaire, the research assistant also collected the participants' information related to presentation skills as they prepared and presented their given assignments after the intervention, during the second week of July 2023. The participants submitted their presentations to the principal investigator and research assistant for assessment of organization, visual appeal and creativity, content knowledge, and adherence to Pecha Kucha presentation requirements. Furthermore, the participants' ability to share and communicate the given assignment was evaluated by observing the classroom presentation using the Pecha Kucha presentation format.

Definitions of variables

Pecha Kucha presentation

It refers to a specific style of presentation whereby the presenter delivers the content using 20 slides that are dominated by images, pictures, tables, or figures. Each slide is displayed for 20 s, thus making a total of 400 s (6 min and 40 s) for the whole presentation.

Presentation skills in this study

This involved students’ ability to plan, prepare, master learning content, create presentation materials, and share them with peers or the audience in the classroom. They constitute the learning activities that stimulate creativity, innovation, critical thinking, and problem-solving skills.

Measurement of Pecha Kucha preparation and presentation skills

The students’ presentation skills were measured using the four [ 4 ] learning domains. The first domain constituted the students’ ability to plan and prepare the presentation content. It consisted of 17 question items that assessed the students’ ability to gather and select information, search for specific content to be presented in the classroom, find out the learning content from different resources, and search for literature materials for the preparation of the assignment using traditional PowerPoint presentations and Pecha Kucha formats. It also aimed to ascertain a deeper understanding of the contents or topic, learning ownership and motivation to learn the topics with clear understanding and the ability to identify the relevant audience, segregate, and remove unnecessary contents using the Pecha Kucha format.

The second domain constituted the students’ mastery of learning during the preparation and presentation of their assignment before the audience in the classroom. It consisted of six [ 6 ] question items that measured the students’ ability to read several times, rehearse before the classroom presentation, and practice the assignment and presentation harder. It also measures the students’ ability to evaluate the selected information and content before their actual presentation and make revisions to the selected information and content before the presentation using the Pecha Kucha format.

The third domain constituted the students’ ability to prepare the presentation materials. It consisted of six [ 6 ] question items that measured the students’ ability to organize the information and contents, prepare the classroom presentation, revise and edit presentation resources, materials, and contents, and think about the audience and classroom design. The fourth domain constituted the students’ ability to share their learning. It consisted of four [ 4 ] question items that measured the students’ ability to communicate their learning with the audience, present a new understanding to the audience, transfer the learning to the audience, and answer the questions about the topic or assignment given. The variable was measured using a 5-point Likert scale. The average scores were computed for each domain, and an overall mean score was calculated across all domains. Additionally, an encompassing skills score was derived from the cumulative scores of all four domains, thus providing a comprehensive evaluation of the overall skills level.
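The domain scoring just described can be summarised in a short sketch: per-domain means over the Likert items, then an overall mean across the four domains. The item counts follow the text above; the example responses are hypothetical, and the actual scoring was done in SPSS.

```python
import numpy as np

# Likert items per skills domain, as described in the text above.
DOMAINS = {
    "preparation_of_content": 17,
    "mastery_of_content": 6,
    "preparation_of_materials": 6,
    "sharing_of_learning": 4,
}

def score_participant(responses):
    """Return per-domain mean scores plus an overall mean across the four domains."""
    domain_means = {}
    for domain, n_items in DOMAINS.items():
        items = responses[domain]
        if len(items) != n_items:
            raise ValueError(f"expected {n_items} items for {domain}, got {len(items)}")
        domain_means[domain] = float(np.mean(items))
    overall = float(np.mean(list(domain_means.values())))
    domain_means["overall_mean"] = overall
    return domain_means

# Hypothetical responses from one participant on the 1-5 Likert scale.
example = {
    "preparation_of_content": [4] * 17,
    "mastery_of_content": [5, 4, 4, 5, 4, 4],
    "preparation_of_materials": [4, 4, 5, 4, 4, 5],
    "sharing_of_learning": [5, 4, 4, 5],
}
print(score_participant(example))
```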

Implementation of intervention

The implementation of the study involved the training of research assistants, sampling of the study participants, setting of the venue, pre-assessment of the students' presentation skills using traditional PowerPoint presentations, training and demonstration of Pecha Kucha presentations to study participants, and assigning topics to study participants. It also involved the participants' submission of their assignments to the Principal Investigator for evaluation, the participants' presentation of their assigned topics using the Pecha Kucha format, post-intervention assessment of the students' presentation skills, data analysis, and reporting [ 7 ]. The intervention involved the Principal Investigator and two [ 2 ] trained research assistants and was based on the multimedia theory of cognitive learning (MTCL) for enhancing effective learning in the 21st century.

Training of research assistants

Two research assistants were trained with regard to the principles, characteristics, and format of Pecha Kucha presentations using the curriculum from the official Pecha Kucha website. Also, research assistants were oriented to the data collection tools and methods in an attempt to guarantee the relevancy and appropriate collection of the participants’ information.

Schedule and duration of training among research assistants

The PI prepared the training schedule and venue after negotiation and consensus with the research assistants. Moreover, the Principal Investigator trained the research assistants on how to assess the learning, collect the data using the questionnaire, and maintain the privacy and confidentiality of the study participants.

Descriptions of interventions

The intervention was conducted among the nursing students at the University of Dodoma, which is located in Dodoma Region, Tanzania Mainland, after obtaining their consent. The participants were trained regarding the concepts, principles, and characteristics of Pecha Kucha presentations and how to prepare and present their assignments using the Pecha Kucha presentation format. The study participants were also trained regarding the advantages and disadvantages of Pecha Kucha presentations. The training was accompanied by one example of an ideal Pecha Kucha presentation on the concepts of pressure ulcers. The teaching methods included lecturing, brainstorming, and small group discussion. After the training session, the evaluation was conducted to assess the participants’ understanding of the Pecha Kucha conceptualization, its characteristics, and its principles.

Each participant was given a topic as an assignment drawn from the fundamentals of nursing, medical nursing, surgical nursing, community health nursing, mental health nursing, emergency and critical care, pediatric, reproductive, and child health, midwifery, communicable diseases, non-communicable diseases, orthopedics, and cross-cutting issues in nursing, as recommended by scholars [ 21 , 38 ]. The study participants were given 14 days for preparation, rehearsal of their presentation using the Pecha Kucha presentation format, and submission of the prepared slides to the research assistant and principal investigator for evaluation and arrangement before the actual classroom presentation. The evaluation of the participants' assignments considered the number of slides, quality of images used, number of words, organization of content and messages to be delivered, slide transitions, duration of presentation, and the flow and organization of slides.

Afterwards, each participant was given 6 min and 40 s for the presentation and 5 to 10 min for answering questions on the presented topic raised by other participants. On average, four participants had the opportunity to present their assignments in the classroom every hour. After the completion of all presentations, the research assistants assessed the participants' presentation skills using the researcher-administered questionnaire. The collected data were entered into SPSS version 26 and analyzed to compare the mean score of participants' presentation skills with the baseline mean score. The intervention sessions were conducted in selected classrooms that were able to accommodate all participants, at times arranged by the participants' coordinators, institution administrators, and subject facilitators of the University of Dodoma, as described in Table 1 [ 7 ].

Evaluation of intervention

During the classroom presentation, there were 5 to 10 min for classroom discussion and reflection on the content presented, which was guided by the research assistant. During this time, the participants were given the opportunity to ask the questions, get clarification from the presenter, and provide their opinion on how the instructional messages were presented, content coverage, areas of strength and weakness for improvement, and academic growth. After the completion of the presentation sessions, the research assistant provided the questionnaire to participants in order to determine their presentation skills during the preparation of their assignments and classroom presentations using the Pecha Kucha presentation format.

Data analysis

The findings from this study were analyzed using the Statistical Package for the Social Sciences (SPSS) software, version 26. Percentages, frequencies, frequency distributions, means, standard deviations, skewness, and kurtosis were calculated, and the results were presented using figures, tables, and graphs. Mean score analysis was computed, and descriptive statistical analysis was used to analyze the demographic information of the participants in order to determine the frequencies, percentages, and mean scores of their distributions. A paired sample t-test was used to compare the mean score differences in presentation skills within the group before and after the intervention. The mean score differences were determined by comparing baseline scores against post-intervention scores in order to establish any change in presentation skills among the study participants.
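A minimal sketch of this within-group comparison is shown below, using a paired t-test and a paired-samples Cohen's d (mean of the differences divided by their standard deviation). The simulated scores are placeholders seeded from the reported summary statistics; the actual analysis was run in SPSS on the questionnaire data.

```python
import numpy as np
from scipy import stats

# Hypothetical paired skill scores (1-5 scale) for the same students before and after
# the intervention; values are simulated around the reported means, not real data.
rng = np.random.default_rng(1)
baseline = np.clip(rng.normal(4.07, 0.56, size=230), 1, 5)
endline = np.clip(baseline + rng.normal(0.47, 0.78, size=230), 1, 5)

# Paired (dependent-samples) t-test on the pre/post scores.
t_stat, p_value = stats.ttest_rel(endline, baseline)

# Cohen's d for paired data: mean of the differences divided by their standard deviation.
diff = endline - baseline
cohens_d = diff.mean() / diff.std(ddof=1)

print(f"t = {t_stat:.2f}, p = {p_value:.4g}, d = {cohens_d:.2f}")
```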

The association between the Pecha Kucha presentation and the development of participants’ presentation skills was established using linear regression analysis set at a 95% confidence interval and 5% (≤ 0.05) significance level in an attempt to accept or reject the null hypothesis.

However, N − 1 dummy variables were formed for the categorical independent variables so as to run the linear regression for the factors associated with the presentation skills. The linear regression equation with dummy variables is presented as follows:

\[
Y = \beta_0 + \beta_1 X_{1,1} + \beta_2 X_{1,2} + \dots + \beta_{k-1} X_{1,k-1} + \beta_k X_2 + \beta_{k+1} X_3 + \varepsilon
\]

where β0 is the intercept; β1, β2, …, βk−1 are the coefficients corresponding to the dummy variables representing the levels of X1; βk is the coefficient corresponding to the dummy variable representing the levels of X2; βk+1 is the coefficient corresponding to the continuous predictor X3; X1,1, X1,2, …, X1,k−1 are the dummy variables corresponding to the different levels of X1; and ε represents the error term.

The coefficients β1, β2, …, βk indicate the change in the expected value of Y for each category relative to the reference category. A positive estimate for a categorical (dummy) variable means that the corresponding category is associated with a higher value of the outcome variable than the reference category, whereas a positive estimate for a continuous covariate means that the covariate has a directly proportional effect on the outcome variable.
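A sketch of dummy-coded regression of this kind is shown below using pandas and statsmodels; the variable names, categories, and values are hypothetical placeholders, and the study itself used SPSS version 26 rather than Python.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical analysis frame: one row per student with a skills score, a categorical
# predictor (study programme) and a continuous predictor (Pecha Kucha knowledge).
df = pd.DataFrame({
    "skills_score": [4.5, 4.2, 4.8, 4.6, 4.1, 4.7, 4.4, 4.9],
    "programme": ["Diploma", "BScN", "BScN", "Diploma", "Diploma", "BScN", "BScN", "BScN"],
    "pk_knowledge": [12, 15, 18, 13, 11, 17, 16, 19],
})

# C(programme) expands the categorical predictor into N-1 dummy variables,
# with the first level serving as the reference category.
model = smf.ols("skills_score ~ C(programme) + pk_knowledge", data=df).fit()
print(model.params)               # beta estimates relative to the reference category
print(model.conf_int(alpha=0.05)) # 95% confidence intervals
```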

The outcome variables were approximately normally distributed, satisfying one of the requirements for parametric analysis. A paired t-test was performed to compare the presentation skills of nursing students before and after the intervention.

Socio-demographic characteristics of the study participants

The study involved a total of 230 nursing students, of whom 151 (65.65%) were male and the rest were female. The mean age of the study participants was 23.03 ± 2.69 years, with a minimum age of 19 and a maximum age of 37. A total of 163 (70.87%) students, comprising a large proportion of respondents, were aged less than or equal to 23 years; 215 (93.48%) participants were living on campus; and 216 (93.91%) participants were exposed to social media.

A large number of the study participants (82.17%) were pursuing a Bachelor of Science in Nursing, with the majority being first-year students (30.87%). A total of 213 (92.61%) study participants had Form Six education as their entry qualification, and 176 (76.52%) participants had attended public secondary schools and were interested in the nursing profession. Lastly, 121 (52.61%) study participants had never been exposed to any presentation training; 215 (93.48%) students had access to individual classroom presentations; and 227 (98.70%) study participants had access to group presentations during their learning process. The detailed findings for the participants' socio-demographic information are indicated in Table 2 [ 46 ].

Baseline nursing students' presentation skills using traditional PowerPoint presentations

The current study assessed the participants' presentation skills when preparing and presenting materials before an audience using traditional PowerPoint presentations. The study revealed that the overall mean score of the participants' presentation skills was 4.07 ± 0.56, including a mean score of 3.98 ± 0.62 for the participants' presentation skills during the preparation of presentation content before the classroom presentation and a mean score of 4.18 ± 0.78 for the participants' mastery of learning content before the classroom presentation. Moreover, the study revealed a mean score of 4.07 ± 0.71 for the participants' ability to prepare presentation materials for classroom presentations and a mean score of 4.04 ± 0.76 for the participants' ability to share the presentation materials in the classroom, as indicated in Table 3 [ 46 ].

Factors associated with participants' presentation skills through traditional PowerPoint presentation

The current study revealed that the participants' study program had a significant effect on their presentation skills, whereby being a Bachelor of Science in Nursing student was associated with a 0.37561 (p < 0.027) increase in presentation skills. The year of study also had a significant effect on the participants' presentation skills, whereby being a second-year bachelor student was associated with a 0.34771 (p < 0.0022) increase in presentation skills compared to first-year bachelor students and diploma students. Depending on loans as a source of student income reduced presentation skills by 0.24663 (p < 0.0272) compared to those who did not depend on loans as a source of income. Furthermore, exposure to individual presentations had a significant effect on the participants' presentation skills, whereby having the opportunity for individual presentations was associated with a 0.33732 (p = 0.0272) increase in presentation skills through traditional PowerPoint presentations, as shown in Table 4 [ 46 ].

Nursing students' presentation skills through Pecha Kucha presentations

The current study assessed the participants' presentation skills when preparing and presenting materials before an audience using Pecha Kucha presentations. The study revealed that the overall mean score and standard deviation of participants' presentation skills using the Pecha Kucha presentation format were 4.54 ± 0.59, including a mean score of 4.49 ± 0.66 for the participants' presentation skills during preparation of the content before the classroom presentation and a mean score of 4.58 ± 0.65 for the participants' mastery of learning content before the classroom presentation. Moreover, the study revealed a mean score of 4.58 ± 0.67 for the participants' ability to prepare the presentation materials for the classroom presentation and a mean score of 4.51 ± 0.72 for the participants' ability to share the presentation materials in the classroom using the Pecha Kucha presentation format, as indicated in Table 5 [ 46 ].

Comparing mean scores of participants' presentation skills between traditional PowerPoint presentation and Pecha Kucha presentation

The current study computed a paired t-test to compare and determine the mean change, effect size, and significance associated with the participants' presentation skills when using the traditional PowerPoint presentation and Pecha Kucha presentation formats. The study revealed that the mean score of the participants' presentation skills through the Pecha Kucha presentation was 4.54 ± 0.59 (p < 0.0001), compared to a mean score of 4.07 ± 0.56 for the participants' presentation skills using the traditional PowerPoint presentation, with an effect size of 0.78. With regard to presentation skills during the preparation of presentation content before the classroom presentation, the mean score was 4.49 ± 0.66 using the Pecha Kucha presentation compared to a mean score of 3.98 ± 0.62 for the traditional PowerPoint presentation; the mean change was 0.51 ± 0.84 ( p  < .0001) with an effect size of 0.61.

Regarding the participants' mastery of learning content before the classroom presentation, the mean score was 4.58 ± 0.65 when using the Pecha Kucha presentation format, compared to a mean score of 4.18 ± 0.78 when using the traditional PowerPoint presentation; the mean change was 0.40 ± 0.27 ( p  < .0001) with an effect size of 1.48. Regarding the participants' ability to prepare the presentation materials for classroom presentations, the mean score was 4.58 ± 0.67 when using the Pecha Kucha presentation format, compared to 4.07 ± 0.71 when using the traditional PowerPoint presentation; the mean change was 0.51 ± 0.96 ( p  < .0001) with an effect size of 0.53.

Regarding the participants' presentation skills when sharing the presentation material in the classroom, the mean score was 4.51 ± 0.72 when using the Pecha Kucha presentation format, compared to 4.04 ± 0.76 when using the traditional PowerPoint presentation; the mean change was 0.47 ± 0.10, with a large effect size of 4.7. Therefore, Pecha Kucha presentation pedagogy had a more significant effect on the participants' presentation skills than the traditional PowerPoint presentation, as shown in Table 6 [ 46 ].

Factors associated with presentation skills among nursing students through Pecha Kucha presentation

The current study revealed that the participants' presentation skills using the Pecha Kucha presentation format were significantly associated with knowledge of the Pecha Kucha presentation format, whereby an increase in knowledge was associated with a 0.0239 ( p  < .0001) increase in presentation skills. Moreover, presentation skills through the Pecha Kucha presentation format were not positively influenced by the year of study, whereby being a second-year student was associated with a 0.23093 (p = 0.039) decrease in presentation skills, in contrast to the pattern observed with traditional PowerPoint presentations. Other factors are shown in Table 7 [ 46 ].

Socio-demographic characteristic profiles of participants

The proportion of male participants was larger than the proportion of female participants in the current study. This was attributable to the distribution of sex among the nursing students at the university under study, where the number of enrolled male nursing students was higher than that of female students. This demonstrates the high rate of male nursing students' enrolment in higher training institutions to pursue nursing and midwifery education programs. In previous years, by contrast, nursing training institutions were predominantly composed of female students, and female nurses predominated in different settings. This significant increase in male nursing students' enrollment in nursing training institutions predicts a significant increase in the male nursing workforce in different settings in the future.

These findings differ from those of the experimental study among English language students on Pecha Kucha as an alternative to PowerPoint presentations in Massachusetts, where the proportion of female participants was larger than that of male participants [ 29 ]. They also differ from the results of the randomized controlled study among nursing students in Ankara, Turkey, where a large proportion of participants were female nursing students [ 47 ]. This difference in participants' sex may be associated with differences in the socio-cultural beliefs of the study settings and the countries' socio-economic status, which influence participants to join the nursing profession on the basis of securing employment easily, pursuing opportunities abroad, or pressure from peers and parents. Nevertheless, such differences reflect decreased stereotypes towards male nurses in the community and the better performance of male students in science subjects compared to female students in the country.

The study participants were predominantly young adults with advanced secondary education. Their ages reflect adherence to the national education policy regarding the appropriate age of enrollment of pupils in primary and secondary schools, which feed students into higher training institutions. This age range suits the cognitive capability expected from the participants to demonstrate different survival and life skills by setting learning goals and developing strategies to achieve those goals, according to Jean Piaget's theory of cognitive learning [ 41 , 42 ].

Similar age groups were noted among nursing students in a randomized controlled study in Ankara, Turkey, where the average age was 19.05 ± 0.2 [ 47 ]. A similar age group was also found in a randomized controlled study among liberal arts students in Ankara, Turkey, on differences in instructor, presenter, and audience ratings of Pecha Kucha presentations and traditional student presentations, where the ages of the participants ranged between 19 and 22 years [ 49 ].

Lastly, a large proportion of the study participants had opportunities for individual and group presentations in the classroom despite not having been exposed to any presentation training before. This implies that the teaching and learning process in the nursing education program is participatory and student-centered, giving students the opportunity to interact with learning content, peers, experts, webpages, and other learning resources to become knowledgeable. These findings fit with the principle that learning is guided and facilitated by peers and teachers, according to the constructivist theory of learning by Lev Vygotsky [ 48 ].

Effects of Pecha Kucha presentation pedagogy on participants' presentation skills

The participants' presentation skills were higher for Pecha Kucha presentations than for traditional PowerPoint presentations. This indicates that the Pecha Kucha presentation style enables nursing students to prepare the learning content, master it before classroom presentations, create good presentation materials, and present those materials before the audience in the classroom. This finding was similar to that at Padang State University, Indonesia, among first-year English and literature students, whereby the Pecha Kucha presentation format helped the students improve their presentation skills [ 20 ]. Pecha Kucha was also found to facilitate careful selection of the topic, organization and outlining of the students' ideas, selection of appropriate images, preparation of presentations, rehearsal, and delivery of the presentations before the audience in a qualitative study among English language students at the Private University of Manila, Philippines [ 23 ].

The current study found that Pecha Kucha presentations enable the students to perform literature searches from different webpages, journals, and books in an attempt to identify specific contents during the preparation of the classroom presentations more than traditional PowerPoint presentations. This is triggered by the ability of the presentation format to force the students to filter relevant and specific information to be included in the presentation and search for appropriate images, pictures, or figures to be presented before the audience. Pecha Kucha presentations were found to increase the ability to perform literature searches before classroom presentations compared to traditional PowerPoint presentations in an experimental study among English language students at Worcester State University [ 29 ].

The current study revealed that Pecha Kucha presentations enable the students to create a well-structured classroom presentation effectively by designing 20 meaningful, content-rich slides containing 20 images, pictures, or figures with a transitional flow of 20 s per slide, in contrast to the traditional PowerPoint presentation with an unlimited number of slides containing bullet points with many words. Similarly, in a cross-sectional study of medical students in India, Pecha Kucha presentations were found to help first-year undergraduate medical students learn how to organize knowledge in a sequential fashion [ 26 ].

The current study revealed that Pecha Kucha presentations enhance sound mastery of the learning content and presentation materials before the classroom presentation compared with traditional PowerPoint presentations. This is hastened by the fact that there is no slide reading during a classroom Pecha Kucha presentation, which forces students to read the material several times, rehearse, and practice the presentation content and materials harder before the classroom presentation. Pecha Kucha presentation likewise required first-year English and literature students to practice extensively before their classroom presentations in a descriptive qualitative study at Padang State University, Indonesia [ 20 ].

The current study revealed that the participants became more confident in answering questions about the topic during classroom presentations using the Pecha Kucha presentation style than during classroom presentations using the traditional PowerPoint presentation. This is precipitated by the level of mastery of the presentation content and materials achieved through rehearsal, re-reading, and material synthesis before the classroom presentations. Moreover, Pecha Kucha was found to significantly increase the students' confidence during classroom presentation and preparation in a qualitative study among English language students at the Private University of Manila, Philippines [ 23 ].

Hence, there was enough evidence to reject the null hypothesis that there was no significant difference in nursing students' presentation skills between the baseline and end-line. The Pecha Kucha presentation format has a significant effect on nursing students' classroom presentation skills, as it enables them to prepare the learning content, gain good mastery of the learning content, create presentation materials, and confidently share their learning with the audience in the classroom.

The current study's findings complement the available evidence on the effects of Pecha Kucha presentations on students' learning and the development of survival life skills in the 21st century. Pecha Kucha presentations have more significant effects on students' presentation skills compared with traditional PowerPoint presentations. They enable the students to select the topic carefully, organize and outline the presentation ideas, select appropriate images, create presentations, rehearse the presentations, and deliver them confidently before an audience. They also enable the students to select and organize the learning content for classroom presentations more effectively than traditional PowerPoint presentations.

Pecha Kucha presentations enhance the mastery of learning content by encouraging the students to read the content several times, rehearse, and practice hard before the actual classroom presentation. It increases the students’ ability to perform literature searches before the classroom presentation compared to a traditional PowerPoint presentation. Pecha Kucha presentations enable the students to create well-structured classroom presentations more effectively compared to traditional PowerPoint presentations. Furthermore, Pecha Kucha presentations make the students confident during the presentation of their assignments and project works before the audience and during answering the questions.

Lastly, Pecha Kucha presentations enhance creativity among the students by providing the opportunity for them to decide on the learning content to be presented. Specifically, they are able to select the learning content, appropriate images, pictures, or figures, organize and structure the presentation slides into a meaningful and transitional flow of ideas, rehearse and practice individually before the actual classroom presentation.

Strength of the study

This study has addressed a pedagogical gap in nursing training and education by providing new insights into an innovative student presentation format that engages students actively in their learning to bring about meaningful and effective learning. It also managed to recruit, assess, and provide the intended intervention to 230 nursing students without dropout.

Study limitation

The current study has pointed out some of the strengths of Pecha Kucha presentations over traditional student presentations with respect to students' presentation skills. However, the study had the following limitations. It involved one group of nursing students from one public training institution in Tanzania. The use of one university may obscure the interpretation of the size of the intervention effect on the outcome variables of interest, thus limiting the generalization of the study findings to all training institutions in Tanzania. The use of one group of nursing students from one university to explore their learning experience through different presentation formats may also limit the generalization of the study findings to all nursing students in the country, given differences in socio-demographic characteristics, learning environments, and teaching and learning approaches. Therefore, the findings from this study need to be interpreted with these limitations in mind.

Suggestions for future research

Future research should address the current study's limitations and extend the areas assessed to different study settings and different characteristics of nursing students in Tanzania. To rigorously test the effects of Pecha Kucha presentations on nursing students' learning, future studies should involve nursing students from several health training institutions rather than a single institution. Future studies should also include control groups by randomly allocating nursing students or training institutions to an intervention group or a control group in order to compare students' learning experiences with Pecha Kucha presentations and PowerPoint presentations. Lastly, future studies should focus on nursing students' mastery of content knowledge and classroom performance when the Pecha Kucha presentation format is used in the teaching and learning process.

Data availability

The datasets generated and analyzed by this study can be obtained from the corresponding author on reasonable request through [email protected] & [email protected].

Abbreviations

Doctor of Philosophy (PhD)

Multimedia Theory of Cognitive Learning

National Council for Technical and Vocational Education and Training

Principal Investigator

Pecha Kucha presentation

Statistical Package for Social Sciences

Tanzania Commission for Universities

World Health Organization


Acknowledgements

The supervisors at the University of Dodoma, the statisticians, my employer, family members, research assistants, and postgraduate colleagues are acknowledged for their support in facilitating the development and completion of this manuscript.

Funding

This study was funded by the Registrar of the Tanzania Nursing and Midwifery Council (TNMC), the employer of the corresponding author. The funds supported development of the protocol, printing of the questionnaires, and communication during data collection, data analysis, and manuscript preparation.

Author information

Authors and Affiliations

Department of Nursing Management and Education, The University of Dodoma, Dodoma, United Republic of Tanzania

Setberth Jonas Haramba & Walter C. Millanzi

Department of Public and Community Health Nursing, The University of Dodoma, Dodoma, United Republic of Tanzania

Saada A. Seif


Contributions

S.J.H.: conceptualization, proposal development, data collection, data entry, data cleaning and analysis, and writing the original draft of the manuscript. W.C.M.: conceptualization, supervision, and review and editing of the proposal and the final manuscript. S.S.A.: conceptualization, supervision, and review and editing of the proposal and the final manuscript.

Corresponding author

Correspondence to Setberth Jonas Haramba.

Ethics declarations

Ethics approval and consent to participate

All methods were carried out in accordance with the relevant guidelines and regulations. Since the study involved the manipulation of human behaviors and practices and the exploration of students’ internal learning experiences, ethical clearance and permission to conduct the study were obtained from the University of Dodoma (UDOM) Institution of Research Review Ethics Committee (IRREC). Written informed consent was obtained from all participants after explaining to them the purpose of the study, the importance of participating, the significance of the findings for students’ learning, and the confidentiality and privacy of the information they provided. The nursing students who participated in this study benefited from learning about the Pecha Kucha presentation format and how to prepare and present their assignments using it.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Haramba, S.J., Millanzi, W.C. & Seif, S.A. Effects of pecha kucha presentation pedagogy on nursing students’ presentation skills: a quasi-experimental study in Tanzania. BMC Med Educ 24, 952 (2024). https://doi.org/10.1186/s12909-024-05920-2


Received: 16 October 2023

Accepted: 16 August 2024

Published: 31 August 2024

DOI: https://doi.org/10.1186/s12909-024-05920-2


Keywords

  • Nursing students
  • Pecha Kucha presentation pedagogy and presentation skills

