
Linguistics: Research Methods

Selected research methods texts.

  • Methods in Contemporary Linguistics (online)
  • The Oxford Handbook of Linguistic Analysis (online)
  • Research Methods in Sociolinguistics: A Practical Guide (online)
  • The Routledge Encyclopedia of Research Methods in Applied Linguistics (online)
  • The Oxford Handbook of Linguistic Fieldwork (online)


Research Methods Database: SRMO

  • SAGE Research Methods Online (SRM): a database featuring articles, books, case studies, datasets, and video. It covers quantitative, qualitative, and mixed-methods research practices, including literature reviews, interviews, focus groups, dissertations, research design, surveys, case studies, and statistics.




Applied linguistics: Research methods for language teaching

By Netta Avineri


Netta Avineri, Assistant Professor of TESOL at the Middlebury Institute of International Studies at Monterey, offers a step-by-step guide to conducting research in the language classroom.


Introduction

All language teachers are researchers – coming up with questions about our teaching, trying things out, reflecting, and making ongoing changes in response to various factors. Research is engagement in inquiry with the goal of understanding a phenomenon in the world through the systematic collection, analysis, and interpretation of data. This article will give you the basics so you can ACE the research process – conducting research that is Applicable (to your language classroom), Collaborative (integrating you into a teacher-researcher community of practice), and Empowering (for you and the participants in your research).

Why conduct research in your language classroom?

Engaging in research allows you to learn about a range of perspectives on the issues you’re interested in. Research can allow you to have a clear rationale for your teaching choices. Conducting research can have a direct, relevant impact on your classroom, your students, and your teaching. It can also help you to refine your teaching philosophy and pedagogical approach. In addition, research provides you with an opportunity to become part of a teacher-researcher community of practice, which provides you with connections and networks upon which to depend and to which you can contribute.

For more information on action research see: www.teachingenglish.org.uk/article/action-research

Here I present the 11 steps to conducting research in your language classroom, so you can get a clear sense of the what, how, and why of the research process. Throughout every step of the research process, it is essential to be sensitive to the four Rs of ethics (the Reasons, Roles, Responsibilities, and Relationships central to your research). You can use these steps like a ‘to-do’ list throughout your process of inquiry.

1. Area of interest 

The first step in research is deciding what you’re interested in finding out more about – an area of interest or a topic – which can stem from questions you ask yourself about your students and your pedagogical approaches. Some examples of areas of interest are interaction in asynchronous learning environments, error correction, and focus on form in grammar teaching.

Some questions we may consider:

  • Sociolinguistic topics (e.g. which language variety should I use/teach in class?)
  • Linguistic matters (e.g. how helpful is it to give students vocabulary lists?)
  • Methodological concerns (e.g. should I focus on fluency or accuracy?)
  • Classroom management (e.g. how often should I put students in groups?)

Make sure to choose a topic that interests you and that is relevant to your language classroom context.

2. Literature review

The next step is conducting a literature review, so you can have a sense of what relevant academic fields are saying about your topic of interest. This will give you a picture of the state of the field and the kinds of methods that researchers use to conduct research on similar topics to yours, and allow you to see what gaps exist in the literature (i.e. which areas of inquiry still need to be explored). Cast a wide net when conducting the literature review by including peer-reviewed academic articles and books, blogs, documentaries, reports, institutional materials, and personal communication. The literature review process includes six steps: understanding, organizing, dialoguing/critiquing, synthesizing, reporting, and becoming (part of the literature). Once you have conducted your literature review you will have a clear sense of topics and themes in relevant fields.

3. Research questions

The next step is creating research questions, which are the guiding questions for your inquiry. Your research questions should be specific, empirical (data-based), and answerable. These questions can be inductive (open-ended) or deductive (closed-ended). An example of an inductive research question would be: ‘How do students respond when I use their first languages (L1s) in the classroom setting?’ An example of a deductive research question would be ‘When teachers engage in error correction by rephrasing beginning-level Mandarin students’ utterances during class, do the students disengage?’ Throughout the remaining steps of the research process, it is important to remain accountable to your research question, so that you collect, analyze, and interpret data that can answer your question.

4. Research design

Once you have crafted your research question you will select an appropriate research design. For example, if your research question is focused on teacher-student interactions it may be necessary to use classroom observations, field notes, video recordings, and/or transcripts. If your research question is focused on students’ perceptions of teaching methods it may be necessary to conduct focus groups, interviews, and/or questionnaires. It is essential for your research question to be intimately connected with the data collection methods you select.

5. Data collection

You will then begin collecting your data. Examples of data collection approaches are questionnaires, interviews, focus groups, reflections, case studies, ethnography, and visual data. Data collection involves multiple steps and considerations:

A. Selecting a sample population, which involves determining who will be part of the research and why, and interacting with them.

B. Piloting, which means trying out your data collection instruments, getting feedback, and making changes before distributing them to your entire sample.

C. Collecting qualitative data, which can be observed.

D. Collecting quantitative data, which can be measured.
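
For step B in particular, here is a minimal Python sketch (not from the article) of how piloted questionnaire responses might be structured and checked before the instrument is distributed to the full sample; the item names and responses are invented for illustration.

LIKERT_ITEMS = ["l1_use_helps_comprehension", "l1_use_feels_supportive"]  # hypothetical questionnaire items

pilot_responses = [  # invented pilot data from two respondents
    {"l1_use_helps_comprehension": 4, "l1_use_feels_supportive": 5, "comments": "Clear questions."},
    {"l1_use_helps_comprehension": 2, "l1_use_feels_supportive": None, "comments": ""},
]

def pilot_report(responses):
    """Flag Likert items that pilot respondents skipped or answered outside the 1-5 scale."""
    problems = []
    for i, response in enumerate(responses, start=1):
        for item in LIKERT_ITEMS:
            value = response.get(item)
            if value is None:
                problems.append(f"respondent {i}: no answer for '{item}'")
            elif not 1 <= value <= 5:
                problems.append(f"respondent {i}: answer for '{item}' is outside the 1-5 scale")
    return problems

for problem in pilot_report(pilot_responses):
    print(problem)

Feedback of this kind (a skipped item, for example) is exactly what piloting is meant to surface before you collect data from the whole sample.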

6. Data analysis

Once you have collected your data you will begin data analysis, which involves making sense of your data and looking for patterns/themes across the dataset. For qualitative data (e.g. interview responses, observations) this will involve interpretive data analysis, and for quantitative data (e.g. responses to Likert Scale questions, test scores) this may involve statistical analysis.
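
As a concrete illustration (not from the article), the short Python sketch below contrasts the two kinds of analysis described above: descriptive statistics for quantitative Likert-scale responses, and frequency counts of codes assigned during interpretive analysis of qualitative data. All data and code labels are invented.

from statistics import mean, median
from collections import Counter

# Quantitative: Likert-scale responses (1 = strongly disagree ... 5 = strongly agree); invented data
likert_responses = [4, 5, 3, 4, 2, 5, 4]
print(f"mean = {mean(likert_responses):.2f}, median = {median(likert_responses)}")

# Qualitative: codes assigned to interview excerpts during interpretive analysis; invented codes
coded_excerpts = [
    "anxiety_reduction", "peer_support", "anxiety_reduction",
    "teacher_feedback", "peer_support", "anxiety_reduction",
]
for code, count in Counter(coded_excerpts).most_common():
    print(f"{code}: {count}")

Counting coded excerpts is, of course, only a first pass; the interpretive work of deciding what the codes mean remains a qualitative judgement.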

7. Findings

When you have engaged in in-depth data analysis you will identify your findings – the main nuggets of information you have discovered based on themes (synthetic connection points) across the data.

8. Interpretation

Data analysis is ‘data-close’, which involves looking closely at what your data tells you. Interpretation moves beyond the data itself to inferences, hunches, and intuition. The process of interpretation also allows you to connect your findings with what you found in the literature review, to see how your research contributes in unique ways to the field and pedagogical practices.

9. Argument

Based on your identification of findings along with your interpretation you can then build an argument, a discourse intended to persuade other members of your community of practice. This overarching argument will include material from a variety of sources to create a story about your data and participants. The most convincing arguments have sufficient data/evidence to back them up. At this stage, you will also want to return to your research question to make sure you have answered it!

10. Pedagogical implications

As language teachers, we want to be sure that our research is applicable to our own classrooms and hopefully to other teachers’ classrooms too. Therefore, at this stage you can identify the pedagogical implications of your research. This is your opportunity to ask yourself: ‘What should I do with my research results?’ Implications are more actionable and believable when based upon rigorous, thorough, and well-done research. In general, these implications are based on questions that include can or should (e.g. ‘Should I recast utterances for my beginning-level Mandarin students?’). These implications can be shared with others in your community of practice as well.

11. Sharing your findings

Now that you have gone through the previous ten steps of the research process you can share your findings, interpretations, argument, and implications. You may share your research in the form of articles, conference presentations, professional development workshops, research reports, departmental faculty meeting reports, listservs, and social media – to continuously build our communities of practice (e.g. English as a Foreign Language, Learner Autonomy, Technology; www.connectededucators.org, www.eslcafe.com). This process of sharing involves identifying the relevant what, how, and why of our research for different audiences. It can also include giving and receiving relevant feedback as we continue to refine our research stories.

Engaging in applicable research can be empowering and collaborative. Research doesn’t always go as planned (just like our language lessons!), but the process of inquiry can create new possibilities for exploration into meaningful language pedagogy for both teachers and students.

Language Teaching Conferences:

www.tesol.org/attend-and-learn/academies-conferences-symposia/upcoming-regional-conferences

www.actfl.org/convention-expo

www.aaal.org/?page=Conference

www.aila.info/en/congresses.html

Language Teaching Journals:

www.tesol.org/docs/default-source/books/how-to-get-published-in-applied-linguistics-serials.pdf?sfvrsn=4

www.tefl-tips.com/2014/06/list-of-esl-efl-and-linguistic-journals.html?m=1

www.academia.edu/2064493/Choosing_the_right_international_journal_in_tesol_and_applied_linguistics

http://linguistlist.org/pubs/journals/browse-journals.cfm

This article is based on Research Methods for Language Teaching by Netta Avineri. Download the sample below to read the first chapter.

Research Methods for Language Teaching: Sample chapter


Exploring Research Methods in Language Learning-teaching Studies

Vahid Nimehchisalem (Universiti Putra Malaysia). Advances in Language and Literary Studies, 9(6), 27–33, December 2018.



  • General & Introductory Linguistics
  • Second Language Acquisition


Research Methods in Language Teaching and Learning: A Practical Guide

ISBN: 978-1-119-70163-7

Wiley-Blackwell


Kenan Dikilitas, Kate Mastruserio Reynolds, Li Wei

A practical guide to the methodologies used in language teaching and learning research, providing expert advice and real-life examples from leading TESOL researchers

Research Methods in Language Teaching and Learning provides practical guidance on the primary research methods used in second language teaching, learning, and education. Designed to support researchers and students in language education and learning, this highly accessible book covers a wide range of research methodologies in the context of actual practice to help readers fully understand the process of conducting research.

Organized into three parts, the book covers qualitative studies, quantitative studies, and systematic reviews. Contributions by an international team of distinguished researchers and practitioners explain and demonstrate narrative inquiry, discourse analysis, ethnography, heuristic inquiry, mixed methods, experimental and quasi-experimental studies, and more. Each chapter presents an overview of a method of research, an in-depth description of the research framework or data analysis process, and a meta-analysis of choices made and challenges encountered. Offering invaluable insights and hands-on research knowledge to students and early-career practitioners alike, this book:

  • Focuses on the research methods, techniques, tools, and practical aspects of performing research
  • Provides firsthand narratives and case studies to explain the decisions researchers make
  • Compares the relative strengths and weaknesses of different research methods
  • Includes real-world examples for each research method and framework to highlight the context of the study
  • Includes extensive references, further reading suggestions, and end-of-chapter review questions

Part of the Guides to Research Methods in Language and Linguistics series, Research Methods in Language Teaching and Learning is essential reading for students, educators, and researchers in all related fields, including TESOL, second language acquisition, English language teaching, and applied linguistics.

Kate Mastruserio Reynolds is Professor of TESOL and Literacy at Central Washington University, USA. She has authored and edited many works in the field of TESOL, including Introduction to TESOL: Becoming a Language Teaching Professional with Kenan Dikilitas and Steve Close (Wiley Blackwell, 2021). She was Associate Editor of the vocabulary volume of The TESOL Encyclopedia of English Language Teaching (Wiley Blackwell, 2018).


Book contents

  • Research Methods in Language Attitudes
  • Copyright page
  • Contributors
  • Acknowledgements
  • 1 An Introduction to Language Attitudes Research
  • Part 1 Analysis of the Societal Treatment of Language
  • Part 2 Direct Methods of Attitude Elicitation
  • Part 3 Indirect Methods of Attitude Elicitation
  • Part 4 Overarching Issues in Language Attitudes Research

1 - An Introduction to Language Attitudes Research

Published online by Cambridge University Press:  25 June 2022

By providing an introduction to language attitude theory, this chapter serves as a reference point for the subsequent chapters. The chapter begins by considering attitudes in general (their formation, functions, and components) before focusing specifically on language attitudes. The chapter examines the link between language and social identity, the notion of language attitudes as reflections of social mores and the related issue of language attitude change, and the difference between (and inter-relatedness of) language attitudes and ideologies. The chapter then discusses the implications and consequences of language attitudes at the micro as well as the macro level. Subsequently, the chapter covers the key individual and socio-structural factors that influence language attitudes, and it discusses the evaluative dimensions of language attitudes (and how they are connected to the aforementioned socio-structural factors). The chapter introduces the three types of methods by means of which language attitudes can be investigated – that is, the analysis of the societal treatment of language, direct methods, and indirect methods – and the key overarching issues in language attitudes research which are covered in the book (i.e. regarding different community types, different data types, priming, and mixed-methods approaches). The aims of the book, and its structure and contents, are explained.


  • An Introduction to Language Attitudes Research
  • By Ruth Kircher, Lena Zipp
  • Edited by Ruth Kircher, Mercator European Research Centre on Multilingualism and Language Learning, and Fryske Akademy, Netherlands, and Lena Zipp, Universität Zürich
  • Book: Research Methods in Language Attitudes
  • Online publication: 25 June 2022
  • Chapter DOI: https://doi.org/10.1017/9781108867788.002


The Relevance of Language for Scientific Research

  • First Online: 28 April 2021

  • Wenceslao J. Gonzalez

The historical framework of the origin of the relevance of language for scientific research is the previous step for its philosophical analysis, which considers a number of aspects of special importance. (1) Language is one of the constitutive elements of science. It accompanies the other elements that configure science: the structure in which scientific theories are articulated, scientific knowledge, research methods, scientific activity, scientific aims and the values of science. (2) Language has two main roles in the configuration of science. (a) It contributes to establishing scientific thinking (either in natural language or in a formal language, such as mathematics). Thus, language shapes how scientists conceive scientific activity (problems, models and contrasts). (b) Language has a heuristic function, insofar as it allows us to explore new possibilities, create new forms of expression for possible phenomena (in the short, middle and long run). (3) Language allows differences to be shaped between basic science, applied science and application of science. Thus, we have three main options: (i) explanatory and predictive statements; (ii) predictive and prescriptive statements; and (iii) statements oriented to differentiated contexts of use. (4) Scientific language cannot be reduced to structural components (macro-theoretical frameworks, theories, models, hypotheses, etc.), because it encompasses dynamic components of science as well. These dynamic components cannot be condensed merely in terms of processes and evolution, since at least the social sciences and the science of the artificial perform research on phenomena characterized by historicity. (5) In addition to the dimension of language in scientific activity, we should consider the language of this activity as connected with other human activities in social life. This leads to differences with the language of technology and the language of ordinary social life.


This preeminent role is now best appreciated in the sciences of the Internet, within the framework of the sciences of the artificial, where the role of language in Web science is central, as semantic web research has highlighted. Cf. Tiropanis et al. ( 2015 ); and Hendler and Hall ( 2016 ).

According to James Hendler, “the social nature of the Web 2.0 sites primarily allows linking between people, not content, thus creating large, and valuable, social networks, but with impoverished semantic value among the tagged content,” Hendler and Golbeck ( 2008 , p. 15).

This relevance accentuates the semantic or the pragmatic trait according to the approach on the theory of meaning, which is the initial focus for characterizing scientific language. This leads to choices that give priority to the sense and reference of the words or to the meaning conceived as use. In addition, this in turn can lead to giving primacy to the significance of the words rather than to the context or to holistic options, when the meaning is seen to depend on a certain set or whole.

A philosophical characterization of the theory of meaning that has been very influential is found in Dummett ( 1975 ) and Dummett ( 1976 ).

Cf. Frege ( 1892 ). See also Frege ( 1918 ).

Frege “starts from meaning by taking the theory of meaning as the only part of philosophy whose results do not depend upon those of any other part, but which underlies all the rest. By doing this, he effected a revolution in philosophy as great as the similar revolution previously effected by Descartes; and he was able to do this even though there was only one other part of philosophy to which Frege applied the results he obtained in the theory of meaning. We can, therefore, date a whole epoch in philosophy as beginning with the work of Frege, just as we can do with Descartes.” Dummett ([ 1973 ] 1981), p. 669.

Cf. Gadamer ( 1960 ). See also Gadamer ( 1975 ).

Besides the relationship between Frege and Wittgenstein in the initial phase of the Tractatus Logico-Philosophicus , there are also common points with the later period of Philosophiche Untersuchungen , cf. Dummett ( 1981b ).

The “received view” is an expression used by H. Putnam the same year as the publication of Kuhn’s main book. Cf. Putnam ( 1962 ). With it, the historical conjuncture lived at that time by the methodology of science of verificationist inspiration is adequately reflected.

On Kuhn’s stages of philosophical-methodological evolution and the turns in the role of language, see Gonzalez ( 2004b ).

Cf. Popper ( 1935 , 1945a , 1945b , 1957 ). See Gonzalez ( 2004a ).

Other authors paid attention to Frege over the years, which led to the publication of Dummett ( 1981a ). In turn, within the philosophy of language, in general, and in the sphere of reference theory, in particular, there was an abundant number of publications and in various philosophical directions. See, in this regard, Gonzalez ( 1986b ).

Cf. Gonzalez ( 2006a ), especially, pp. 1–19.

Cf. Salmon ( 1992 ), especially, pp. 408–410.

“These contemporary versions of scientific realisms include the following philosophical versions, among others: structural realism, critical realism, referential realism, entity realism, instrumental realism, socially embedded realism, constructive realism, some versions of scientific perspectivism (or perspectivalism), dispositional realism, convergent realism, pragmatic realism, selective realism, minimal realism, and the so-called ‘preservative realism’.” Gonzalez ( 2020a ), p. 4. The main features of these various orientations can be found in Gonzalez ( 2020a ), pp. 6–16.

The role of language, structure, knowledge, methods, activity, ends and values as constitutive elements of science is set out in Gonzalez ( 2013b ), especially, pp. 15–17.

All of them are particularly important for a proper analysis of central scientific issues, such as scientific prediction. See Gonzalez ( 2015a ), pp. 11–13.

On interdisciplinarity, see Niiniluoto ( 2020 ).

This issue is discussed in detail in Gonzalez ( 2021 ).

Ian Hacking has insisted that not everything is constructed. See Hacking ( 1999 ).

The referent is then part of what we consider to be a “fact,” so without a referent we can hardly have anything that we can call “fact.” Regarding what is a fact , Peter F. Strawson wrote: “facts are what statements (when true) state.” Strawson ( 1950 ), p. 136.

On collective morality, see Rescher ( 2003 ).

As science has a role in shaping technology, its values also have a role for values in technology, cf. Gonzalez ( 2015b ).

The use of mathematics for the heuristic function through predictions attracted particular attention during the pandemic generated by the Covid-19 virus. Various mathematical models have been used to anticipate the possible future course of the disease at a general level and in each country. They have been used by the World Health Organization to make recommendations and by health authorities in each nation to make decisions.

The philosophical status of mathematical language has also been discussed. In this regard, there are also various philosophical orientations. See, in this regard, the text presented in the 2008 Biennial meeting of the Philosophy of Science Association: Psillos ( 2010 ). A different approach can be found in Azzouni and Bueno ( 2016 ).

It seems to me important to consider that mathematics is also a human activity, which has philosophical consequences. See Gonzalez ( 1991 ).

The idea of diversity in scientific explanations is already present in the influential book of Nagel ( 1961 ). Although, over the years, Salmon made various proposals for scientific explanations, they usually revolved around three elements: preference for the causal explanation over other types of explanations, emphasis on the role of probability, and recognition of the presence of pragmatic elements that modulate explanations. His final proposals can be found in Salmon ( 2002a ) and Salmon ( 2002b ).

Cf. Gonzalez ( 2015a ), pp. 66, 192, 219, and 251.

A detailed analysis of the distinction between foresight, prediction, forecast and planning is in Gonzalez ( 2015a ), pp. 68–72.

Imre Lakatos insisted on this point. Cf. Lakatos ( 1970 ). See, in this regard, Gonzalez ( 2001 ).

See Sen ( 1986 ); especially, p. 3; and Gonzalez ( 1998a ).

On the various options for observation and experimentation, see Gonzalez ( 2010 ).

“It is a testament to the machinery of science that so much has been learned about covid-19 so rapidly. Since January the number of publications has been doubling every 14 days, reaching 1363 in the past week alone. They have covered everything from the genetics of the virus that causes the disease to computer models of its spread and the scope for vaccines and treatments.” The Economist ( 2020 ).

Current positions on this issue can be found in Gonzalez ( 2020c ).

That science investigates according to a methodological diversity and according to scales of reality, with epistemological differences according to levels is increasingly assumed. Cf. Gonzalez ( 2020d ).

To date, treatment of Covid-19 has often been completely individualized, if not purely ad hoc through trial and error, due to the absence of previous well-founded studies offering well-contrasted solutions to the disease.

de Regt ( 2017 ), p. 12; see also pp. 45 and 88.

There has been a very intense debate on the existence and characteristics of the scientific revolutions. After the very influential book by Thomas Kuhn on The Structure of Scientific Revolutions , an important contribution was made in Thagard ( 1992 ). See, in this regard, Gonzalez ( 2011a ).

Cf. Gonzalez ( 2011b , 2013c ).

In the case of sciences of Internet the novelty is clear, cf. Hall et al. ( 2016 ). The design of the network itself is clearly new, cf. Clark ( 2018 ). Overall, it can be said that we are in a new historical stage, which Luciano Floridi calls “hyperhistory,” cf. Floridi ( 2014 ).

“As the deluge of work on covid-19 has shown, fast, free-flowing scientific information is vital for progress. The virus has changed the way scientists do their work and talk to each other, we hope for good.” The Economist ( 2020 ).

At least three philosophical-methodological stages can be distinguished in Kuhn’s publications, cf. Gonzalez ( 2004b ), especially, pp. 48–66.

As a result of the criticism received in the first stage, Kuhn introduced a series of relevant philosophical-methodological changes in the second period. Among them was the prominent role of exemplars, as characteristic solutions to problems posed and accepted as such by the scientific community. In this regard, Kuhn’s second stage — with the exemplars as a route to learn scientific theories — has been associated with concept characterizations within the framework of cognitive psychology. It is clear that the first stage was under the influence of the Gestalt psychological school he knew. Later, that school moved towards a characterization of concepts more in tune with classical positions, where concepts represent features that are typical of a defined class of objects. Cf. Andersen et al. ( 2006 ). A critical analysis of the book can be found at Thagard ( 2009 ).

Cf. Kuhn ([1962] 1970 ), p. 127.

“Paradigm changes do cause scientists to see the world of their research-engagement differently,” Kuhn ([1962] 1970 ), p. 111.

Analytical philosophers who have dealt with perception include Gottlob Frege and Peter F. Strawson. For the former, see the chapter “Frege on Perception,” in Dummett ( 1993 [reprinted 1998]), pp. 84–98. For the second, see Strawson ( 1961 ) and Strawson ( 1979 ).

The chemical revolution receives special attention in Thagard’s perspective on conceptual change. Cf. Thagard ( 1992 ), pp. 34–61; especially, pp. 39–47. An analysis of Thagard’s conceptual revolutions and the need for new aspects can be found in Gonzalez ( 2011a ), pp. 15–21. A review of Thagard’s book is available in Gonzalez ( 1996 ).

For processes from an ontological viewpoint, see Rescher ( 1996 ).

This is the origin of the friendly controversy with Peter Strawson on the characteristics of the concepts. It started with Gonzalez ( 1998b ). The matter went on with his answer: Strawson ( 1998 ). It was then completed in another subsequent paper: Gonzalez ( 2003 ).

The central role of objectivity in the search for truth in science is emphasized in Gonzalez ( 2020b ).

That we do things with words is something that was initially discussed by several analytical philosophers: Austin ( 1962 ); Strawson ( 1970 ); and Searle ( 1969 ).

It is interesting that John L. Austin translated into English (with reproduction of the German text) a very important Frege book: Frege ( 1884 ). It is also worth remembering the volume that Strawson edited related to thought and action: Strawson ( 1968 ).

On the relations between science and technology, with their consequent philosophical-methodological differences, see Gonzalez ( 2005 ).

This book is added to the books coming from previous congresses, which are grouped in the Gallaecia Series : Studies in Contemporary Philosophy and Methodology of Science : Progreso científico e innovación tecnológica , 1997; El Pensamiento de L. Laudan. Relaciones entre Historia de la Ciencia y Filosofía de la Ciencia 1998; Ciencia y valores éticos , 1999; Problemas filosóficos y metodológicos de la Economía en la Sociedad tecnológica actual , 2000; La Filosofía de Imre Lakatos : Evaluación de sus propuestas , 2001; Diversidad de la explicación científica , 2002; Análisis de Thomas Kuhn : Las revoluciones científicas , 2004; Karl Popper : Revisión de su legado , 2004; Science, Technology and Society : A Philosophical Perspective , 2005; Evolutionism : Present Approaches , 2008; Evolucionismo : Darwin y enfoques actuales , 2009; New Methodological Perspectives on Observation and Experimentation in Science , 2010; Conceptual Revolutions : From Cognitive Science to Medicine , 2011; Scientific Realism and Democratic Society : The Philosophy of Philip Kitcher , 2011; Las Ciencias de la Complejidad : Vertiente dinámica de las Ciencias de Diseño y sobriedad de factores , 2012, Creativity, Innovation, and Complexity in Science , 2013; Bas van Fraassen’s Approach to Representation and Models in Science , 2014; New Perspectives on Technology, Values, and Ethics : Theoretical and Practical, 2015; The Limits of Science : An Analysis from “Barriers” to “Confines” , 2016; Artificial Intelligence and Contemporary Society : The Role of Information , 2017; Philosophy of Psychology : Causality and Psychological Subject. New Reflections on James Woodward’s Contribution , 2018; and Methodological Prospects for Scientific Research : From Pragmatism to Pluralism , 2020.

Andersen, H., Barker, P., & Chen, X. (2006). The cognitive structure of scientific revolutions . Cambridge: Cambridge University Press.


Austin, J. L. (1962). How to do things with words , edited by J. O. Urmson and Marina Sbisà. Oxford: Clarendon Press.


Azzouni, J., & Bueno, O. (2016). True nominalism: Referring versus coding. British Journal for the Philosophy of Science, 67 (3), 781–816.


Carnap, R. (1931). Die logizistische Grundlegung der Mathematik. Erkenntnis, 2 , 91–121.

Chang, H. (2014). Epistemic activities and systems of practice: Units of analysis in philosophy of science after the practice turn. In L. Soler, S. Zwart, M. Lynch, & V. Israel-Jost (Eds.), Science after the practice turn in the philosophy, history and social studies of science (pp. 67–79). New York: Routledge.

Clark, D. D. (2018). Designing an internet . Cambridge, MA: The MIT Press.

de Regt, H. W. (2017). Understanding scientific understanding . Oxford: Oxford University Press.

Dummett, M. (1975). What is a theory of meaning? (I). In S. Guttenplam (Ed.), Mind and language (pp. 97–138). Oxford: Clarendon Press.

Dummett, M. (1976). What is a theory of meaning? (II). In G. Evans & J. McDowell (Eds.), Truth and meaning (pp. 67–137). Oxford: Clarendon Press.

Dummett, M. (1977). Elements of intuitionism . Oxford: Clarendon Press.

Dummett, M. ([1973] 1981). Frege: Philosophy of language . London: Duckworth, 2nd ed. (1st ed. 1973).

Dummett, M. (1981a). The interpretation of Frege’s philosophy . London: Duckworth.

Dummett, M. (1981b). Frege and Wittgenstein. In I. Block (Ed.), Perspectives on the philosophy of Wittgenstein (pp. 31–42). Oxford: Blackwell.

Dummett, M. (1991). Frege: Philosophy of mathematics . London: Duckworth.

Dummett, M. (1993). Origins of analytical philosophy (reprinted 1998). Cambridge, MA: Harvard University Press.

Floridi, L. (2014). The Fourth revolution - How the infosphere is reshaping human reality . Oxford: Oxford University Press.

Frege, G. (1884). Die Grundlagen der Arithmetik. Eine logisch-mathematische Untersuchung über den Begriff der Zahl . Breslau: Koebner. Translated into English by J. L. Austin (1950). The foundations of arithmetic (with reproduction of the German text). Oxford: B. Blackwell, (reprinted in 1978).

Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik , 100 , 25–50. Reprinted in G. Frege (1967). Kleine Schriften , (pp. 143–162), edited by I. Angelelli. Darmstadt: Wissenschaftliche Buchgesellschaft.

Frege, G. (1918). Der Gedanke. Breiträge zur Philosophie des deutschen Idealismus , 1 , 58–77. Reprinted in G. Frege (1967). Kleine Schriften (pp. 342–362) edited by I. Angelelli. Darmstadt: Wissenschaftliche Buchgesellschaft.

Gadamer, H. G. (1960). Wahrheit und Methode . Tübingen: J. C. B. Mohr.

Gadamer, H. G. (1975). Hermeneutics and social science. Cultural Hermeneutics, 2 , 307–316.

Gillies, D. A. (2012). The use of mathematics in physics and economics: A comparison. In D. Dieks, W. J. Gonzalez, S. Hartmann, M. Stöltzner, & M. Weber (Eds.), Probabilities, laws, and structures (pp. 351–362). Dordrecht: Springer.


Gonzalez, W. J. (1986a). La Teoría de la Referencia. Strawson y la Filosofía Analítica . Salamanca-Murcia: Ediciones Universidad de Salamanca y Publicaciones de la Universidad de Murcia.

Gonzalez, W. J. (1986b). El problema de la referencia en la Filosofía Analítica. Estudio bibliográfico. Thémata, 3 , 169–213.

Gonzalez, W. J. (1991). Mathematics as activity. Daimon, 3 , 113–130.

Gonzalez, W. J. (1996). Towards a new framework for revolutions in science. Studies in History and Philosophy of Science, 27 (4), 607–625.

Gonzalez, W. J. (1998a). Prediction and prescription in economics: A philosophical and methodological approach. Theoria: An International Journal for Theory, History and Foundations of Science, 13 (32), 321–345.

Gonzalez, W. J. (1998b). P. F. Strawson’s moderate empiricism: The philosophical basis of his approach in theory of knowledge. In L. E. Hahn (Ed.), The philosophy of P. F. Strawson (pp. 329–358). Open Court, La Salle: The Library of Living Philosophers.

Gonzalez, W. J. (1999). Ciencia y valores éticos: De la posibilidad de la Ética de la Ciencia al problema de la valoración ética de la Ciencia Básica. Arbor, 162 (638), 139–171.

Gonzalez, W. J. (2001). Lakatos’s approach on prediction and novel facts. Theoria: An International Journal for Theory, History and Foundations of Science, 16 (42), 499–518.

Gonzalez, W. J. (2002). Caracterización de la “explicación científica” y tipos de explicaciones científicas. In W. J. Gonzalez (Ed.), Diversidad de la explicación científica (pp. 13–49). Barcelona: Ariel.

Gonzalez, W. J. (2003). El empirismo moderado en Filosofía Analítica: Una réplica a P. F. Strawson. In J. L. Falguera, A. J. T. Zilhão, C. Martínez, & J. M. Sagüillo (Eds.), Palabras y pensamientos: Una mirada analítica / Palavras e Pensamentos: Uma perspectiva analítica (pp. 207–237). Santiago de Compostela: Publicaciones Universidad de Santiago.

Gonzalez, W. J. (2004a). La evolución del pensamiento de Popper. In W. J. Gonzalez (Ed.), Karl Popper: Revisión de su legado (pp. 23–194). Madrid: Unión Editorial.

Gonzalez, W. J. (2004b). Las revoluciones científicas y la evolución de Thomas S. Kuhn. In W. J. Gonzalez (Ed.), Análisis de Thomas Kuhn: Las revoluciones científicas (pp. 15–103). Madrid: Trotta.

Gonzalez, W. J. (2005). The philosophical approach to science, technology and society. In W. J. Gonzalez (Ed.), Science, technology and society: A philosophical perspective (pp. 3–49). A Coruña: Netbiblo.

Gonzalez, W. J. (2006a). Novelty and Continuity in philosophy and methodology of science. In W. J. Gonzalez & J. Alcolea (Eds.), Contemporary perspectives in philosophy and methodology of science (pp. 1–28). A Coruña: Netbiblo.

Gonzalez, W. J. (2006b). Prediction as scientific test of economics. In W. J. Gonzalez & J. Alcolea (Eds.), Contemporary perspectives in philosophy and methodology of science (pp. 83–112). A Coruña: Netbiblo.

Gonzalez, W. J. (2008a). Economic values in the configuration of science. In E. Agazzi, J. Echeverria, & A. Gomez (Eds.), Epistemology and the social . Poznan Studies in the Philosophy of the Sciences and the Humanities (pp. 85–112). Amsterdam: Rodopi.

Gonzalez, W. J. (2008b). Evolutionism from a contemporary viewpoint: The philosophical-methodological approach. In W. J. Gonzalez (Ed.), Evolutionism: Present approaches (pp. 3–59). A Coruña: Netbiblo.

Gonzalez, W. J. (2008c). Rationality and prediction in the sciences of the artificial: Economics as a design science. In M. C. Galavotti, R. Scazzieri, & P. Suppes (Eds.), Reasoning, rationality, and probability (pp. 165–186). Stanford: CSLI Publications.

Gonzalez, W. J. (2010). Recent approaches on observation and experimentation: A philosophical-methodological viewpoint. In W. J. Gonzalez (Ed.), New methodological perspectives on observation and experimentation in science (pp. 9–48). A Coruña: Netbiblo.

Gonzalez, W. J. (2011a). The problem of conceptual revolutions at the present stage. In W. J. Gonzalez (Ed.), Conceptual revolutions: From cognitive science to medicine (pp. 7–38). A Coruña: Netbiblo.

Gonzalez, W. J. (2011b). Conceptual changes and scientific diversity: The role of historicity. In W. J. Gonzalez (Ed.), Conceptual revolutions: From cognitive science to medicine (pp. 39–62). A Coruña: Netbiblo.

Gonzalez, W. J. (2012). Methodological universalism in science and its limits: Imperialism versus complexity. In K. Brzechczyn & K. Paprzycka (Eds.), Thinking about provincialism in thinking (Poznan Studies in the Philosophy of the Sciences and the Humanities, vol. 100) (pp. 155–175). Amsterdam and New York: Rodopi.

Gonzalez, W. J. (2013a). Value Ladenness and the value-free ideal in scientific research. In C. Lütge (Ed.), Handbook of the philosophical foundations of business ethics (pp. 1503–1521). Dordrecht: Springer.

Gonzalez, W. J. (2013b). The roles of scientific creativity and technological innovation in the context of complexity of science. In W. J. Gonzalez (Ed.), Creativity, innovation, and complexity in science (pp. 11–40). A Coruña: Netbiblo.

Gonzalez, W. J. (2013c). The sciences of design as sciences of complexity: The dynamic trait. In H. Andersen, D. Dieks, W. J. Gonzalez, T. Uebel, & G. Wheeler (Eds.), New challenges to philosophy of science (pp. 299–311). Dordrecht: Springer.

Gonzalez, W. J. (2015a). Philosophico-methodological analysis of prediction and its role in economics . Dordrecht: Springer.

Gonzalez, W. J. (2015b). On the role of values in the configuration of technology: From axiology to ethics. In W. J. Gonzalez (Ed.), New perspectives on technology, values, and ethics: Theoretical and practical (Boston Studies in the Philosophy and History of Science) (pp. 3–27). Dordrecht: Springer.

Gonzalez, W. J. (2020a). Novelty in scientific realism: New approaches to an ongoing debate. In W. J. Gonzalez (Ed.), New approaches to scientific realism (pp. 1–23). Boston and Berlin: De Gruyter. https://doi.org/10.1515/9783110664737-001 .

Gonzalez, W. J. (2020b). Pragmatic realism and scientific prediction: The role of complexity. In W. J. Gonzalez (Ed.), New approaches to scientific realism (pp. 251–287). Boston and Berlin: De Gruyter. https://doi.org/10.1515/9783110664737-012 .

Gonzalez, W. J. (2020c). Pragmatism and pluralism as methodological alternatives to monism, reductionism and universalism. In W. J. Gonzalez (Ed.), Methodological prospects for scientific research: From pragmatism to pluralism , Synthese Library (pp. 1–18). Cham: Springer.

Gonzalez, W. J. (2020d). Levels of reality, complexity, and approaches to scientific method. In W. J. Gonzalez (Ed.), Methodological prospects for scientific research: From pragmatism to pluralism , Synthese Library (pp. 21–51). Cham: Springer.

Gonzalez, W. J. (2021). Semantics of science and theory of reference: An analysis of the role of language in basic science and applied science. In W. J. Gonzalez (Ed.), Language and scientific research (pp. 41–92). Cham: Palgrave Macmillan.

Gonzalez, W. J., & Arrojo, M. J. (2019). Complexity in the sciences of the internet and its relation to communication sciences. Empedocles: European Journal for the Philosophy of Communication, 10 (1), 15–33. https://doi.org/10.1386/ejpc.10.1.15_1 .

Hacking, I. (1999). The social construction of what? Cambridge, MA: Harvard University Press.

Hall, W., Hendler, J., & Staab, S. (2016). A manifesto for Web science @10 , 1–4. Retrieved May 16, 2018, from http://www.webscience.org/manifesto .

Hendler, J., & Golbeck, J. (2008). Metcalfe’s law, web 2.0, and the semantic web. Journal Web Semantics: Science, Services and Agents on the World Wide Web, 6 (1), 14–20.

Hendler, J., & Hall, W. (2016). Science of the world wide web. Science, 354 (6313), 703–704.

Hendry, D. F. (2012). Mathematical models and economic forecasting. Some uses and mis-uses of mathematics in economics. In D. Dieks, W. J. Gonzalez, S. Hartmann, M. Stöltzner, & M. Weber (Eds.), Probabilities, laws, and structures (pp. 319–335). Dordrecht: Springer.

Husserl, E. (1901 and 1902). Logische Untersuchungen . Halle a.S.: Max Niemeyer. Erster Teil, 1901; Zweiter Teil, 1902.

Kuhn, Th. S. ([1962] 1970). The structure of scientific revolutions . Chicago: The University of Chicago Press.

Kuhn, Th. S. ([1983] 2000). Commensurability, comparability, communicability. In P. D. Asquith & Th. Nickles (Eds.), PSA 1982. Proceedings of the 1982 biennial meeting of the Philosophy of Science Association (pp. 669–688), vol. 2, Philosophy of Science Association. East Lansing, MI; reprinted in Th. S. Kuhn (2000), The road since structure: Philosophical essays, 1970–1993, with an autobiographical interview (pp. 33–53). Chicago: University of Chicago Press.

Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Ed.), Criticism and the growth of knowledge (pp. 91–196). Cambridge: Cambridge University Press; reprinted in I. Lakatos (1978), The methodology of scientific research programmes. Philosophical papers , vol. 1 (pp. 8–101). Cambridge: Cambridge University Press.

Melo-Martin, I., & Intemann, K. (2018). The fight against doubt: How to bridge the gap between scientists and the public . Oxford: Oxford University Press.

Morrison, M. (2015). Reconstructing reality. Models, mathematics, and simulations . New York: Oxford University Press.

Nagel, E. (1961). The structure of science. Problems in the logic of scientific explanation . New York: Harcourt, Brace and World.

Niiniluoto, I. (1993). The aim and structure of applied research. Erkenntnis, 38 (1), 1–21.

Niiniluoto, I. (1995). Approximation in applied science. Poznan Studies in the Philosophy of the Sciences and the Humanities, 42 , 127–139.

Niiniluoto, I. (2020). Interdisciplinarity from the perspective of critical scientific realism. In W. J. Gonzalez (Ed.), New approaches to scientific realism (pp. 231–250). Boston and Berlin: De Gruyter.

Popper, K. R. (1935). Logik der Forschung . Vienna: Julius Springer Verlag.

Popper, K. R. (1945a). The open society and its enemies . Vol. 1: The spell of Plato . London: George Routledge and Sons.

Popper, K. R. (1945b). The open society and its enemies . Vol. 2: The high tide of prophecy: Hegel, Marx and the aftermath . London: George Routledge and Sons.

Popper, K. R. (1957). The poverty of historicism . London: Routledge and Kegan.

Psillos, S. (2010). Scientific realism: Between Platonism and nominalism. Philosophy of Science, 77 (5), 947–958.

Putnam, H. (1962). What theories are not. In E. Nagel, P. Suppes, & A. Tarski (Eds.), Logic, methodology and philosophy of science (pp. 240–251). Stanford: Stanford University Press.

Rescher, N. (1977). Methodological pragmatism: A systems-theoretical approach to the theory of knowledge . Oxford: Blackwell; New York: New York University Press.

Rescher, N. (1996). Process metaphysics . Albany, NY: State University New York Press.

Rescher, N. (1999). Razón y valores en la Era científico-tecnológica . Barcelona: Paidós.

Rescher, N. (2003). Collective responsibility. In N. Rescher (Ed.), Sensible decisions. Issues of rational decision in personal choice and public policy (pp. 125–138). Lanham, MD: Rowman and Littlefield.

Salmon, M. H. (1992). Philosophy of the social sciences. In M. H. Salmon et al. (Eds.), Introduction to the philosophy of science (pp. 404–425). Englewood Cliffs, NJ: Prentice Hall.

Salmon, W. C. (1990). Four decades of scientific explanation . Minneapolis: University of Minnesota Press.

Salmon, W. C. (1998). Causality and explanation . New York: Oxford University Press.

Salmon, W. C. (2002a). Explicación causal frente a no causal. In W. J. Gonzalez (Ed.), Diversidad de la explicación científica (pp. 97–115). Barcelona: Ariel.

Salmon, W. C. (2002b). Estructura de la explicación causal. In W. J. Gonzalez (Ed.), Diversidad de la explicación científica (pp. 141–159). Barcelona: Ariel.

Sankey, H. (2020). Scientific realism and the conflict with common sense. In W. J. Gonzalez (Ed.), New approaches to scientific realism (pp. 68–83). Boston and Berlin: De Gruyter.

Searle, J. R. (1969). Speech acts: An essay in the philosophy of language . Cambridge: Cambridge University Press.

Sen, A. (1986). Prediction and economic theory. In J. Mason, P. Mathias, & J. H. Westcott (Eds.), Predictability in science and society (pp. 3–23). London: The Royal Society and The British Academy.

Simon, H. A. (1991). Models of my life . New York: Basic Books (reprinted in The MIT Press, Cambridge, MA, 1996).

Simon, H. A. (1996). The sciences of the artificial (3rd ed.). Cambridge, MA: The MIT Press, (1st ed., 1969; 2nd ed., 1981).

Simon, H. A. ([1990] 1997). Prediction and prescription in systems modeling. Operations Research , 38 , 7–14; reprinted in H. A. Simon, Models of bounded rationality . Vol. 3: Empirically grounded economic reason (pp. 115–128). Cambridge, MA: The MIT Press.

Strawson, P. F. (1950). Truth (II). Proceedings of the Aristotelian Society , 24 , 129-156.

Strawson, P. F. (1961). Perception and identification. Proceedings of the Aristotelian Society , 35 , 81–120. Reprinted in P. F. Strawson (1974), Freedom and resentment and other essays (pp. 85–107). London: Methuen.

Strawson, P. F. (Ed.). (1968). Studies in the philosophy of thought and action . Oxford: Oxford University Press.

Strawson, P. F. (1970). Phrase et acte de parole. Langages, 17 , 19–33.

Strawson, P. F. (1979). Perception and its objects. In G. F. Macdonald (Ed.), Perception and identity (pp. 41–60). London: Macmillan.

Strawson, P. F. (1998). Reply to Wenceslao J. Gonzalez. In L. E. Hahn (Ed.), The philosophy of P. F. Strawson. The Library of Living Philosophers (pp. 359–360). La Salle, IL: Open Court.

Suppe, F. (1974). The search for philosophic understanding of scientific theories. In F. Suppe (Ed.), The structure of scientific theories (pp. 1–241). Urbana, IL: University of Illinois Press, (2nd ed. 1977).

Suppes, P. (1981). The plurality of science. In P. Asquith & I. Hacking (Eds.), PSA 1978 , vol. 2 (pp. 3–16). East Lansing, MI: Philosophy of Science Association. Reprinted in P. Suppes (1984), Probabilistic metaphysics (pp. 118–134). Oxford: B. Blackwell (reprinted 1985); and in P. Suppes (1993), Models and methods in the philosophy of science: Selected essays (pp. 41–54). Dordrecht: Kluwer.

Thagard, P. (1992). Conceptual revolutions . Princeton: Princeton University Press.

Thagard, P. (2009). The cognitive structure of scientific revolutions. British Journal for the Philosophy of Science, 60 (4), 843–847.

The Economist. (2020, May 9). High-speed science: The pandemic has caused scientists to work faster. That should be welcomed. Leaders section, p. 10.

Tiropanis, T., Hall, W., Crowcroft, J., Contractor, N., & Tassiulas, L. (2015). Network science, Web science, and Internet science. Communications of ACM, 58 (8), 76–82.

Toulmin, S. E. (1953). The philosophy of science: An introduction . London: Hutchinson University Library (3rd reprint, 1957).

Toulmin, S. E. (1971). From logical systems to conceptual populations. In R. C. Buck & R. S. Cohen (Eds.), In memory of R. Carnap (pp. 552–564). Dordrecht: Reidel.


Author information

Authors and affiliations.

Center for Research in Philosophy of Science and Technology, University of A Coruña, Ferrol, Spain

Wenceslao J. Gonzalez


Corresponding author

Correspondence to Wenceslao J. Gonzalez .


Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Gonzalez, W.J. (2021). The Relevance of Language for Scientific Research. In: Gonzalez, W.J. (eds) Language and Scientific Research. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-60537-7_1


DOI : https://doi.org/10.1007/978-3-030-60537-7_1

Published : 28 April 2021

Publisher Name : Palgrave Macmillan, Cham

Print ISBN : 978-3-030-60536-0

Online ISBN : 978-3-030-60537-7



Research Methods Knowledge Base


Language Of Research

Learning about research is a lot like learning about anything else. To start, you need to learn the jargon people use, the big controversies they fight over, and the different factions that define the major players. We’ll start by considering five really big multi-syllable words that researchers sometimes use to describe what they do. We’ll only do a few for now, to give you an idea of just how esoteric the discussion can get (but not enough to cause you to give up in total despair). We can then take on some of the major issues in research, like the types of questions we can ask in a project, the role of time in research, and the different types of relationships we can estimate. Then we have to consider defining some basic terms like variable, hypothesis, data, and unit of analysis. If you’re like me, you hate learning vocabulary, so we’ll quickly move along to consideration of two of the major fallacies of research, just to give you an idea of how wrong even researchers can be if they’re not careful (of course, there’s always a certain probability that they’ll be wrong even if they’re extremely careful).



What Is Qualitative Research? | Methods & Examples

Published on June 19, 2020 by Pritha Bhandari . Revised on September 5, 2024.

Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research.

Qualitative research is the opposite of quantitative research , which involves collecting and analyzing numerical data for statistical analysis.

Qualitative research is commonly used in the humanities and social sciences, in subjects such as anthropology, sociology, education, health sciences, history, etc.

Examples of qualitative research questions include:

  • How does social media shape body image in teenagers?
  • How do children and adults interpret healthy eating in the UK?
  • What factors influence employee retention in a large organization?
  • How is anxiety experienced around the world?
  • How can teachers integrate social issues into science curriculums?

Table of contents

  • Approaches to qualitative research
  • Qualitative research methods
  • Qualitative data analysis
  • Advantages of qualitative research
  • Disadvantages of qualitative research
  • Other interesting articles
  • Frequently asked questions about qualitative research

Approaches to qualitative research

Qualitative research is used to understand how people experience the world. While there are many approaches to qualitative research, they tend to be flexible and focus on retaining rich meaning when interpreting data.

Common approaches include grounded theory, ethnography , action research , phenomenological research, and narrative research. They share some similarities, but emphasize different aims and perspectives.

Qualitative research approaches:

  • Grounded theory: Researchers collect rich data on a topic of interest and develop theories.
  • Ethnography: Researchers immerse themselves in groups or organizations to understand their cultures.
  • Action research: Researchers and participants collaboratively link theory to practice to drive social change.
  • Phenomenological research: Researchers investigate a phenomenon or event by describing and interpreting participants’ lived experiences.
  • Narrative research: Researchers examine how stories are told to understand how participants perceive and make sense of their experiences.

Note that qualitative research is at risk for certain research biases including the Hawthorne effect , observer bias , recall bias , and social desirability bias . While not always totally avoidable, awareness of potential biases as you collect and analyze your data can prevent them from impacting your work too much.


Qualitative research methods

Each of the research approaches involves using one or more data collection methods. These are some of the most common qualitative methods:

  • Observations: recording what you have seen, heard, or encountered in detailed field notes.
  • Interviews:  personally asking people questions in one-on-one conversations.
  • Focus groups: asking questions and generating discussion among a group of people.
  • Surveys : distributing questionnaires with open-ended questions.
  • Secondary research: collecting existing data in the form of texts, images, audio or video recordings, etc.
For example, in a study of a company’s culture, you might combine several of these methods:

  • You take field notes with observations and reflect on your own experiences of the company culture.
  • You distribute open-ended surveys to employees across all the company’s offices by email to find out if the culture varies across locations.
  • You conduct in-depth interviews with employees in your office to learn about their experiences and perspectives in greater detail.

Qualitative researchers often consider themselves “instruments” in research because all observations, interpretations and analyses are filtered through their own personal lens.

For this reason, when writing up your methodology for qualitative research, it’s important to reflect on your approach and to thoroughly explain the choices you made in collecting and analyzing the data.

Qualitative data analysis

Qualitative data can take the form of texts, photos, videos and audio. For example, you might be working with interview transcripts, survey responses, fieldnotes, or recordings from natural settings.

Most types of qualitative data analysis share the same five steps:

  • Prepare and organize your data. This may mean transcribing interviews or typing up fieldnotes.
  • Review and explore your data. Examine the data for patterns or repeated ideas that emerge.
  • Develop a data coding system. Based on your initial ideas, establish a set of codes that you can apply to categorize your data.
  • Assign codes to the data. For example, in qualitative survey analysis, this may mean going through each participant’s responses and tagging them with codes in a spreadsheet. As you go through your data, you can create new codes to add to your system if necessary.
  • Identify recurring themes. Link codes together into cohesive, overarching themes (see the short sketch after this list).
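As a purely illustrative sketch of steps 4 and 5, the short Python snippet below applies a hypothetical code book to three made-up survey responses and tallies the resulting codes into themes. The responses, keywords, and codes are invented for the example; in practice, coding is an iterative, interpretive process usually done in a spreadsheet or dedicated qualitative analysis software.

```python
from collections import Counter

# Hypothetical open-ended survey responses and a simple code book (keyword -> code).
responses = [
    "I left because the commute was exhausting and management never listened.",
    "Great colleagues, but there was no path to promotion.",
    "Management ignored our feedback for years.",
]
code_book = {
    "commute": "work-life balance",
    "management": "leadership",
    "listened": "leadership",
    "ignored": "leadership",
    "promotion": "career growth",
    "feedback": "leadership",
}

# Step 4: assign codes to each response; Step 5: tally codes to surface recurring themes.
coded = [{code for word, code in code_book.items() if word in r.lower()} for r in responses]
theme_counts = Counter(code for codes in coded for code in codes)

print(coded)
print(theme_counts.most_common())
```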

There are several specific approaches to analyzing qualitative data. Although these methods share similar processes, they emphasize different concepts.

Qualitative data analysis approaches:

  • Content analysis: used to describe and categorize common words, phrases, and ideas in qualitative data. For example, a market researcher could perform content analysis to find out what kind of language is used in descriptions of therapeutic apps.
  • Thematic analysis: used to identify and interpret patterns and themes in qualitative data. For example, a psychologist could apply thematic analysis to travel blogs to explore how tourism shapes self-identity.
  • Textual analysis: used to examine the content, structure, and design of texts. For example, a media researcher could use textual analysis to understand how news coverage of celebrities has changed in the past decade.
  • Discourse analysis: used to study communication and how language is used to achieve effects in specific contexts. For example, a political scientist could use discourse analysis to study how politicians generate trust in election campaigns.

Advantages of qualitative research

Qualitative research often tries to preserve the voice and perspective of participants and can be adjusted as new research questions arise. Qualitative research is good for:

  • Flexibility

The data collection and analysis process can be adapted as new ideas or patterns emerge. They are not rigidly decided beforehand.

  • Natural settings

Data collection occurs in real-world contexts or in naturalistic ways.

  • Meaningful insights

Detailed descriptions of people’s experiences, feelings and perceptions can be used in designing, testing or improving systems or products.

  • Generation of new ideas

Open-ended responses mean that researchers can uncover novel problems or opportunities that they wouldn’t have thought of otherwise.

Disadvantages of qualitative research

Researchers must consider practical and theoretical limitations in analyzing and interpreting their data. Qualitative research suffers from:

  • Unreliability

The real-world setting often makes qualitative research unreliable because of uncontrolled factors that affect the data.

  • Subjectivity

Due to the researcher’s primary role in analyzing and interpreting data, qualitative research cannot be replicated . The researcher decides what is important and what is irrelevant in data analysis, so interpretations of the same data can vary greatly.

  • Limited generalizability

Small samples are often used to gather detailed data about specific contexts. Despite rigorous analysis procedures, it is difficult to draw generalizable conclusions because the data may be biased and unrepresentative of the wider population .

  • Labor-intensive

Although software can be used to manage and record large amounts of text, data analysis often has to be checked or performed manually.

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Chi square goodness of fit test
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Inclusion and exclusion criteria

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Frequently asked questions about qualitative research

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Bhandari, P. (2024, September 05). What Is Qualitative Research? | Methods & Examples. Scribbr. Retrieved September 23, 2024, from https://www.scribbr.com/methodology/qualitative-research/


Published on 25.9.2024 in Vol 26 (2024)

Multimodal Large Language Models in Health Care: Applications, Challenges, and Future Outlook

Authors of this article:


  • Rawan AlSaad 1, PhD
  • Alaa Abd-alrazaq 1, PhD
  • Sabri Boughorbel 2, PhD
  • Arfan Ahmed 1, PhD
  • Max-Antoine Renault 1, PhD
  • Rafat Damseh 3, PhD
  • Javaid Sheikh 1, MD

1 Weill Cornell Medicine-Qatar, Education City, Doha, Qatar

2 Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar

3 Department of Computer Science and Software Engineering, United Arab Emirates University, Al Ain, United Arab Emirates

Corresponding Author:

Rawan AlSaad, PhD

Weill Cornell Medicine-Qatar, Education City

Street 2700

Phone: 974 44928830

Email: [email protected]

In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews), text (eg, clinical notes and research articles), videos (eg, surgical procedures), and omics data (eg, genomics and proteomics). While advancements in large language models (LLMs) have enabled new applications for knowledge retrieval and processing in the medical field, most LLMs remain limited to processing unimodal data, typically text-based content, and often overlook the importance of integrating the diverse data modalities encountered in clinical practice. This paper aims to present a detailed, practical, and solution-oriented perspective on the use of multimodal LLMs (M-LLMs) in the medical field. Our investigation spanned M-LLM foundational principles, current and potential applications, technical and ethical challenges, and future research directions. By connecting these elements, we aimed to provide a comprehensive framework that links diverse aspects of M-LLMs, offering a unified vision for their future in health care. This approach aims to guide both future research and practical implementations of M-LLMs in health care, positioning them as a paradigm shift toward integrated, multimodal data–driven medical practice. We anticipate that this work will spark further discussion and inspire the development of innovative approaches in the next generation of medical M-LLM systems.

Introduction

Large language models (LLMs) are sophisticated machine learning algorithms designed to process, understand, and generate humanlike language, enabling key developments in applications such as automated conversation, text analysis, creative writing, and complex problem-solving [ 1 ]. In health care, LLMs have shown remarkable potential, primarily through their ability to process and analyze textual content [ 2 , 3 ]. These models play a crucial role in assisting with diagnoses as they can efficiently process extensive textual patient histories and vast medical literature, providing clinicians with valuable insights [ 4 - 7 ]. However, most current LLMs are primarily limited to processing and generating textual content. While this unimodal focus on text-based operation has been transformative in the medical field, it does not fully capture the complex and diverse nature of health care practice [ 8 ].

In health care, diagnosing and treating a patient often involves a health care professional engaging in a comprehensive approach: listening to the patient, reviewing their health records, examining medical images, and analyzing laboratory test results—and all this over time. This multidimensional process exceeds the capabilities of current unimodal LLM systems. Moreover, nontextual data types play a crucial role in diagnosis, effective treatment planning, research, and patient care [ 9 - 11 ]. Such data may include medical imaging (eg, x-rays, magnetic resonance imaging [MRI], computed tomography [CT] scans, positron emission tomography scans, and pathology slides), electrophysiological data (eg, electrocardiography, electroencephalography (EEG), and electromyography), sensory data (eg, data from sensors of medical devices, such as pacemakers and continuous glucose monitors), videos (eg, recordings of surgeries, procedures, and patient interactions), omics data (eg, genomics, proteomics, metabolomics, and transcriptomics), and audio data (eg, recordings of patient interviews and heart and respiratory sounds).

The introduction of LLMs has been a key development in the field of artificial intelligence (AI) and natural language processing (NLP). In the 2010s, the emergence of deep learning revolutionized LLMs. Recurrent neural networks (RNNs), particularly long short-term memory (LSTM) networks [ 12 ], allowed models to better capture sequential data and context. However, the major breakthrough occurred in 2017 with the introduction of transformer models [ 13 ], which are widely used for NLP tasks. A transformer is a type of neural network architecture that uses a self-attention mechanism to capture long-range dependencies between words in a sentence. While the computation in architectures such as RNNs and LSTM networks is sequential and slow for long sequences [ 14 ], self-attention can be parallelized and made highly scalable. Transformers have been widely trained using 2 objectives. The first is masked language modeling (MLM), in which the model learns to reconstruct text: a proportion of the words (eg, 10%) is randomly masked, and the transformer weights are updated toward recovering them. Encoder transformers such as Bidirectional Encoder Representations From Transformers (BERT) [ 15 ] have been trained with the MLM objective. The second widely used objective is next word prediction, or causal language modeling. Here, the self-attention mechanism is masked such that, at each position in the sequence, the model can attend only to the words to its left. This modeling approach mimics how text is read by humans, in one direction. The self-attention mechanism allows for the computation of the probability of the next word in a document by attending to the most relevant parts of the input sequence [ 13 , 16 ]. By applying this prediction autoregressively, the transformer model performs text completion by generating multiple words. Interestingly, transformers extend beyond handling natural language data. They can effectively compute representations for various data types provided these can be represented as a sequence of tokens, where tokens are the elementary entities that constitute the sequences and the unique set of tokens forms the vocabulary. For example, in a DNA sequence, each nucleotide can be represented by a token from a vocabulary of 4 tokens: A, C, G, and T. This capability extends to elements such as video frames, audio spectrograms, time-series data, code snippets, and protein sequences. BERT [ 15 ] is among the first major models to use transformers, and a series of medical BERT models were subsequently proposed to accelerate medical research [ 6 , 17 - 20 ].
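To make these 2 training objectives concrete, the minimal Python sketch below (an illustration added here, not code from any of the cited systems) tokenizes a short DNA sequence with the 4-token vocabulary mentioned above and builds one masked language modeling example and a set of next-token (causal) examples; the masking rate and the use of -100 as an "ignore" label mirror common practice but are assumptions of the sketch.

```python
import random

# Toy vocabulary for a DNA sequence model: one token per nucleotide, plus a mask token.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "[MASK]": 4}

def encode(seq):
    """Map each nucleotide to its token id."""
    return [VOCAB[ch] for ch in seq]

def make_mlm_example(token_ids, mask_rate=0.1, seed=0):
    """Masked language modeling: hide roughly 10% of tokens; the model must reconstruct them."""
    rng = random.Random(seed)
    inputs, labels = list(token_ids), [-100] * len(token_ids)  # -100 marks "not a prediction target"
    for i in range(len(token_ids)):
        if rng.random() < mask_rate:
            labels[i] = inputs[i]          # remember the original token
            inputs[i] = VOCAB["[MASK]"]    # replace it with the mask token
    return inputs, labels

def make_causal_examples(token_ids):
    """Causal language modeling: at each position, predict the next token from the left context only."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]

ids = encode("ACGTACGGTA")
print("MLM example (inputs, labels):", make_mlm_example(ids))
print("First causal examples:", make_causal_examples(ids)[:3])
```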

In 2022, OpenAI released ChatGPT (GPT-3.5), a significant iteration in the generative pretrained transformer (GPT) series [ 21 ]. As an LLM, ChatGPT has been trained on a vast collection of text data, which enables it to generate humanlike responses across a broad spectrum of topics and formats. ChatGPT has also shown its potential to become a valuable resource in health care, making significant contributions to various medical applications. It provided opportunities for advancing diagnostic accuracy, personalized treatment planning, and medical research, as well as optimizing health care administration and enhancing communication in patient care [ 22 - 28 ]. In addition, several open-source LLMs such as LLaMA [ 29 ], Flan-T5 [ 30 ], Vicuna [ 31 ], and Alpaca [ 32 ] have substantially driven progress and contributed to the field of LLMs. Although these LLM systems have achieved considerable success, they are predominantly limited to single data types. This limitation makes them less effective for the multimodal nature of medicine, where handling multiple data types is often required. Therefore, considerable efforts have been dedicated to creating LLMs that handle multimodal inputs and tasks, ultimately leading to the development of multimodal LLMs (M-LLMs). In 2023, OpenAI released GPT-4, an M-LLM with the dual capability to process and respond to both text and images. Following the release of GPT-4, several medically adapted versions of this model have been developed [ 33 - 37 ]. These specialized versions of GPT-4 have been tailored to interpret medical data, understand patient queries, and assist in diagnostic processes using both text and image modalities. Building on these insights, M-LLMs are increasingly recognized as systems capable of integrating various data types to facilitate comprehensive patient assessments, ensuring accurate diagnoses. In addition, they hold the potential to streamline operations, significantly improving efficiency in both clinical and administrative tasks. Most importantly, with appropriate oversight, M-LLMs could provide personalized care by tailoring treatment plans to meet the individual needs of patients, thereby enhancing the quality of health care services.

Recent studies [ 38 , 39 ] have explored the capabilities of M-LLMs within the health care sector. However, these studies exhibit several limitations. First, the range of data modalities examined is often restricted to text, images, videos, and audio [ 38 ], with some studies focusing narrowly on a limited number of clinical applications [ 39 ]. Second, the discussion regarding the potential uses of M-LLMs in health care is largely theoretical [ 38 ], leading to a significant gap in demonstrating their application in actual health care environments. Third, although the challenges of integrating diverse data types into M-LLMs are recognized, there is limited exploration of possible solutions or ongoing research aimed at overcoming these technical barriers [ 38 , 39 ].

This paper aims to present a detailed, practical, and solution-oriented perspective on the use of M-LLMs in the medical field. We unify the discussion by focusing on how M-LLMs can serve as a transformative tool that integrates various data modalities to enhance health care outcomes. Specifically, we aim to (1) broaden the analysis of M-LLM applications in health care to include additional data modalities, such as time-series data and omics data, alongside conventional modalities such as images, text, audio, and video; (2) highlight practical examples in which M-LLMs have been or could be effectively applied in health care settings; (3) outline current technological advancements to address the technical and ethical challenges; and (4) propose future research directions to fully exploit the capabilities of M-LLMs. Our unique contribution lies in providing a comprehensive framework that links these diverse aspects, offering a unified vision for the future of M-LLMs in health care.

Multimodal Learning

In the context of M-LLMs, the term multimodal encompasses a range of scenarios in data processing and interpretation. First, it refers to LLMs in which the input and output to the system involve different modalities, such as text-to-image or image-to-text conversions. Second, it describes LLM systems capable of handling inputs from multiple modalities, such as those that can process both text and images simultaneously. Finally, multimodality characterizes systems designed to generate outputs in >1 modality, such as systems capable of producing both textual and image-based content [ 40 ].

Several previous works have developed basic M-LLMs by aligning the well-trained encoders from different modalities with the textual feature space of LLMs. This approach enables LLMs to process inputs other than text, as seen in various examples [ 41 - 44 ]. For instance, Flamingo [ 45 ] uses a cross-attention layer to link a frozen image encoder with LLMs. LLaVA [ 46 ] uses a basic projection method to incorporate image features into the word embedding space. Similarly, models such as Video-Chat [ 47 ] and Video-LLaMA [ 48 ] are designed for video comprehension, whereas SpeechGPT [ 49 ] is tailored for audio processing. A notable example is PandaGPT [ 50 ], which uniquely manages to interpret 6 different modalities at the same time, achieved through the integration of a multimodal encoder known as ImageBind [ 51 ].

Despite numerous efforts focusing on understanding multimodal content at the input side, there is a significant gap in the ability to produce outputs in various modalities beyond textual content. This underscores the importance of developing any-to-any M-LLMs, which are crucial for realizing real artificial general intelligence (AGI) [ 52 , 53 ]. Such models should be capable of receiving inputs in any form and providing responses in the appropriate form of any modality.

From Unimodal Limitations to Multimodal Solutions

Unimodal LLMs generate content in the same modality as that in which they receive inputs, typically text, whereas M-LLMs are capable of processing inputs from various modalities and delivering outputs across multiple modalities, as illustrated in Figure 1 . Despite their remarkable abilities, unimodal LLMs in medicine have inherent limitations that can be effectively overcome by shifting toward multimodal systems. In Table 1 , we summarize these limitations in the medical field and illustrate how the integration of a multimodal approach can address these challenges.


Table 1. Limitations of unimodal (text-only) LLMs in medicine and the corresponding multimodal solutions.

  • Lack of diagnostic imaging context: unimodal LLMs in medicine can only process textual patient data and cannot interpret diagnostic images, which are vital in many clinical scenarios. Multimodal solution (integration of diagnostic imaging data): multimodal models process and integrate diagnostic imaging information (eg, x-rays and magnetic resonance imaging [MRI] scans), improving diagnostic accuracy and patient outcomes.
  • Inability to analyze temporal data: text LLMs often struggle with interpreting time-series data, such as continuous monitoring data or the progression of diseases, which are vital for tracking patient health over time. Multimodal solution (time-series data integration): multimodal systems incorporate and analyze temporal data, such as electrocardiography (ECG) readings or continuous monitoring data, enabling dynamic tracking of patient health and disease progression.
  • Absence of auditory data interpretation: unimodal LLMs grapple with audio analysis, which limits their effectiveness in health care applications that rely on processing spoken interactions or auditory signals. Multimodal solution (audio data processing): multimodal systems can process and understand audio signals, such as patient verbal descriptions and heartbeats, enhancing diagnostic precision.
  • Limited comprehension of complex medical scenarios: unimodal LLMs struggle with interpreting complex medical conditions that require a multisensory understanding beyond text. Multimodal solution (multisensory data integration): by processing clinical notes, diagnostic images, and patient audio, multimodal systems offer more comprehensive analyses of complex medical conditions.
  • Overfitting to clinical textual patterns: sole reliance on clinical texts can lead LLMs to overfit to textual anomalies, potentially overlooking critical patient information. Multimodal solution (diverse clinical data sources): diversifying input types with clinical imaging and audio data allows multimodal systems to increase the number of training data points and, hence, reduce overfitting, enhancing diagnostic reliability.
  • Bias and ethical concerns: unimodal LLMs, especially text-based ones, can inherit biases and misconceptions present in their training data sets, affecting patient care quality. Multimodal solution (richer contextual patient data): multimodal systems use diverse modalities, including patient interviews and diagnostic images, to provide a broader context that can mitigate biases in clinical decision-making.

Foundational Principles of M-LLMs

The field of M-LLMs is evolving rapidly, with new ideas and methodologies being continuously developed. The training of medical M-LLMs is a complex process designed to effectively integrate and interpret the diverse types of data encountered in clinical practice. Typically, the architecture of an M-LLM system encompasses four key stages ( Figure 2 ): (1) modality-specific encoding, (2) embedding alignment and fusion, (3) contextual understanding and cross-modal interactions, and (4) decision-making or output generation. In addition to these stages, pretraining and fine-tuning processes play a crucial role, interacting with and enhancing each of the aforementioned stages.

This section presents the foundational principles that currently guide the development and functioning of medical M-LLMs. Importantly, the specific architecture of an M-LLM might vary significantly to meet particular requirements, such as the types of data it needs to handle, the tasks it is designed to perform, and the level of interpretability and performance required. Therefore, while the stages outlined provide a high-level overview of an M-LLM system’s architecture, actual implementations may vary widely to accommodate the unique demands of each application. As this field progresses, we anticipate that the guiding principles of medical M-LLMs will continue to be shaped by emerging ideas and technological advancements.


Modality-Specific Encoding

The purpose of this stage is to transform raw data from each modality into a format that the model can understand and process. This involves using modality-specific encoders to encode various data types into rich and informative representations that subsequent components of the M-LLM architecture can effectively leverage. These modality-specific encoders are meticulously trained using extensive data sets of unlabeled information to generate embeddings that accurately encapsulate the data’s content. The encoders are trained in an unsupervised manner using a large collection of data sets. Selecting the appropriate encoding architecture and optimizing the training methodology are imperative and often tailored to the specific characteristics of the data and the requirements of the medical task at hand. For example, image encoders (eg, transformers [ 54 ] and convolutional neural networks (CNNs) [ 55 , 56 ]) are designed to capture fine-grained patterns or anomalies crucial for diagnosis, whereas text encoders (BERT [ 15 ]) aim to comprehend complex medical terminology and patient histories. Similarly, audio encoders (such as WaveNet [ 57 ] and DeepSpeech [ 58 ]) are optimized to distinguish subtle variations in sounds, such as differentiating between normal and abnormal heart or lung sounds. Time-series encoders (such as transformer-based models [ 15 , 59 - 61 ] and LSTM [ 12 ]) are intended to detect critical changes over time, signaling the need for urgent medical intervention. Finally, omics encoders (eg, DeepVariant [ 62 ], Basenji [ 63 ], and DeepCpG [ 64 ]) focus on identifying genetic markers or patterns associated with specific diseases, aiding in the development of targeted therapies.
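To make this stage concrete, the sketch below (an illustrative assumption, not an architecture taken from the cited systems) defines two tiny PyTorch encoders, a convolutional one for images and a recurrent one for vital-sign time series, each mapping its input to a shared embedding size; a production M-LLM would instead use large pretrained encoders such as those mentioned above, and all dimensions here are placeholders.

```python
import torch
import torch.nn as nn

EMBED_DIM = 256  # shared embedding size chosen arbitrarily for this sketch


class ImageEncoder(nn.Module):
    """Toy CNN: grayscale image -> EMBED_DIM vector (stand-in for a pretrained vision encoder)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),  # pool to a fixed 8x8 grid regardless of input size
        )
        self.proj = nn.Linear(16 * 8 * 8, EMBED_DIM)

    def forward(self, x):            # x: (batch, 1, height, width)
        return self.proj(self.features(x).flatten(1))   # (batch, EMBED_DIM)


class TimeSeriesEncoder(nn.Module):
    """Toy LSTM: vital-sign sequence -> EMBED_DIM vector (stand-in for a time-series encoder)."""

    def __init__(self, n_channels=4):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, EMBED_DIM, batch_first=True)

    def forward(self, x):            # x: (batch, time_steps, channels)
        _, (h_n, _) = self.lstm(x)
        return h_n[-1]               # final hidden state: (batch, EMBED_DIM)


# Encode a dummy image batch and 60 time steps of 4 vital signs.
img_emb = ImageEncoder()(torch.randn(2, 1, 64, 64))
ts_emb = TimeSeriesEncoder()(torch.randn(2, 60, 4))
print(img_emb.shape, ts_emb.shape)   # torch.Size([2, 256]) torch.Size([2, 256])
```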

Embedding Alignment and Fusion

The purpose of this stage is to harmonize the embeddings from different modality-specific encoders into a combined representation that reflects the combined information from all input modalities. This might involve techniques such as concatenation [ 65 ] and attention mechanisms [ 13 ] or more sophisticated methods such as cross-modal attention [ 66 , 67 ] and tensor fusion [ 68 ]. While modality-specific encoding relies solely on unsupervised data, embedding alignment needs annotated data across modalities. Moreover, the alignment mechanism in medical M-LLMs may require incorporating domain-specific knowledge to enhance its understanding and integration of medical data. For example, it might use known relationships between symptoms and diseases or anatomical correlations to better align and interpret the multimodal data. This results in a more accurate, reliable, and clinically relevant synthesis of information.
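A minimal sketch of the simplest of these options, concatenation followed by a learned projection into a joint space, is given below; the embedding sizes are placeholders, and in practice this single linear layer would be replaced or complemented by attention-based or tensor-fusion mechanisms.

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Fuse per-modality embeddings by concatenation plus a learned projection."""

    def __init__(self, modality_dims, joint_dim=256):
        super().__init__()
        self.proj = nn.Linear(sum(modality_dims), joint_dim)

    def forward(self, embeddings):   # list of (batch, dim_i) tensors, one per modality
        return self.proj(torch.cat(embeddings, dim=-1))   # (batch, joint_dim)

# Dummy image, time-series, and text embeddings for a batch of 2 patients.
fusion = ConcatFusion(modality_dims=[256, 256, 768])
fused = fusion([torch.randn(2, 256), torch.randn(2, 256), torch.randn(2, 768)])
print(fused.shape)   # torch.Size([2, 256])
```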

Contextual Understanding and Cross-Modal Interactions

The objective of this stage is that the M-LLM not only comprehends the individual modalities but also discerns their interrelations and collective contributions to the overall medical analysis or diagnostic task. This necessitates the deployment of advanced neural network architectures, notably, transformers equipped with cross-modal attention mechanisms [ 66 , 67 ]. These mechanisms enable the M-LLM to dynamically prioritize and integrate features across different modalities, enhancing its ability to make informed medical decisions. In addition, attention-based fusion strategies [ 68 ] could be implemented to weigh and integrate information from disparate sources, adjusting the focus of the model according to the contextual relevance of each data point from each modality.
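As an illustration of such a mechanism, the sketch below lets embedded text tokens (eg, from a clinical note) attend to image-region embeddings through a standard multihead attention layer with a residual connection; all shapes and dimensions are assumptions chosen only to make the example runnable.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Text tokens (queries) attend to image-region embeddings (keys and values)."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, image_regions):
        # text_tokens: (batch, n_text, dim); image_regions: (batch, n_regions, dim)
        attended, weights = self.attn(text_tokens, image_regions, image_regions)
        return self.norm(text_tokens + attended), weights   # residual connection + norm

text = torch.randn(2, 32, 256)      # embedded clinical-note tokens
regions = torch.randn(2, 49, 256)   # a 7x7 grid of image-patch embeddings
out, attn_weights = CrossModalAttention()(text, regions)
print(out.shape, attn_weights.shape)   # (2, 32, 256) and (2, 32, 49)
```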

Decision-Making or Output Generation

This component is the actionable end of the model that produces the final output or decision based on the integrated and interpreted multimodal data. This could be a classification layer capable of distinguishing between different medical conditions or a sequence generator for creating detailed medical reports. When encoder architectures are used, the model head layer can be trained for downstream classification tasks. When decoder architectures are used, the model head layer outputs logits of vocabulary tokens that can be applied in an autoregressive manner to synthesize a response. For instance, in diagnostic imaging, the model might analyze combined textual and visual embeddings to identify and categorize pathologies. In treatment recommendation systems, the model could synthesize patient history, current symptoms, and laboratory test results to suggest personalized treatment plans. The effectiveness of this stage depends on the precision of the previous components.
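The two kinds of head described here can be sketched in a few lines; the number of diagnostic categories, the vocabulary size, and the fused dimension below are placeholders, and a real decoder would generate a full report token by token rather than a single greedy token.

```python
import torch
import torch.nn as nn

fused_dim, n_conditions, vocab_size = 256, 5, 32000

# (a) Encoder-style head: map the fused representation to probabilities over conditions.
classifier = nn.Linear(fused_dim, n_conditions)
fused = torch.randn(2, fused_dim)                 # output of the fusion stage
condition_probs = classifier(fused).softmax(dim=-1)
print(condition_probs.shape)                      # torch.Size([2, 5])

# (b) Decoder-style head: logits over the vocabulary, applied autoregressively.
lm_head = nn.Linear(fused_dim, vocab_size)
token_states = torch.randn(2, 10, fused_dim)      # hidden states for 10 generated positions
next_token = lm_head(token_states[:, -1]).argmax(dim=-1)   # greedy choice of the next token
print(next_token.shape)                           # torch.Size([2])
```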

Pretraining and Fine-Tuning

Pretraining and fine-tuning are fundamental processes in the development and optimization of LLMs, including multimodal ones [ 69 ]. They are not just single steps but integral, ongoing processes that influence and enhance all components of an M-LLM system’s architecture. They interact with the 4 previous components of the M-LLM architecture in multiple ways.

Pretraining begins with modality-specific encoders, focusing on learning general features and representations for each modality. For instance, encoders are pretrained on large data sets to understand text, images, or audio before they are combined or applied for specific tasks. Within the embedding alignment and fusion component, pretraining enables models to learn preliminary methods for aligning and integrating embeddings from different modalities, especially in unsupervised or self-supervised setups in which the model is exposed to vast amounts of multimodal data. In the context of understanding and cross-modal interactions, pretraining lays the foundation for learning complex relationships between modalities. As the model is exposed to a wide and varied range of multimodal data, it learns to identify common patterns and interactions. Although pretraining does not directly result in final decisions or outputs for the decision-making or output generation component, it establishes essential capabilities and knowledge. This foundational understanding equips the model to later perform specific tasks more effectively.

Fine-tuning adapts a pretrained model to downstream tasks or data sets. It involves adjusting and optimizing the model’s parameters and architecture components using a smaller, more task-specific data set. The fine-tuned models are capable of following instructions and responding to questions and queries. In the context of M-LLMs, fine-tuning would adjust how individual modalities are encoded, how they are aligned and fused, and how the model makes decisions based on this refined understanding.
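A minimal sketch of this idea is shown below, under the assumption that a pretrained encoder is frozen and only a small task head is tuned on a task-specific labeled batch; parameter-efficient variants (eg, adapters or low-rank updates) follow the same pattern of touching only a small fraction of the weights.

```python
import torch
import torch.nn as nn

# Stand-ins for a pretrained encoder and a downstream task head (placeholders for this sketch).
pretrained_encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU())
task_head = nn.Linear(256, 5)                     # eg, 5 diagnostic categories

for p in pretrained_encoder.parameters():         # freeze the pretrained weights ...
    p.requires_grad = False

optimizer = torch.optim.AdamW(task_head.parameters(), lr=1e-4)   # ... and tune only the head
loss_fn = nn.CrossEntropyLoss()

# One toy fine-tuning step on a small, task-specific labeled batch.
images = torch.randn(8, 1, 64, 64)                # dummy single-channel images
labels = torch.randint(0, 5, (8,))                # dummy diagnostic labels
with torch.no_grad():
    features = pretrained_encoder(images)         # frozen features, no gradients needed here
loss = loss_fn(task_head(features), labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()
print("fine-tuning step loss:", float(loss))
```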

Applications

M-LLMs hold transformative potential for numerous medical applications, demonstrating unparalleled proficiency in processing and integrating diverse data types, as shown in Figure 3 . In this section, we discuss the applications of M-LLMs in clinical practice, organizing them according to data type. These categories include medical images, temporal data (encompassing time-series and event data), audio, video, text, omics data, and any-to-any M-LLMs. This structured approach enables a thorough exploration of how these models can revolutionize health care practices based on their ability to synthesize and analyze complex multimodal information.


Medical Images

M-LLMs, equipped with advanced capabilities to process and interpret various image modalities, can significantly enhance diagnostic accuracy and efficiency in medical imaging applications. Examples of these image modalities include x-rays, MRI scans, CT scans, positron emission tomography scans, ultrasound images, digital pathology slides, and retinal images. Each modality provides unique insights into the body’s internal structures, facilitating comprehensive analysis and aiding in the early detection, diagnosis, and monitoring of diseases. For example, in radiology, M-LLMs are instrumental in analyzing CT and MRI images to offer precise, quantifiable data for identifying and characterizing anomalies such as tumors, fractures, and signs of chronic diseases. In addition, these models support the generation of automated radiological reports that summarize findings and suggest potential diagnoses. It is also possible to use M-LLM embedding to retrieve similar cases based on keyword searching. Conversely, M-LLMs allow for the annotation and tagging of medical images with keywords. This enables additional analytics applications. Similarly, in pathology, M-LLMs interpret tissue sample slides, identifying disease markers that are often subtle and challenging to discern. In dermatology, M-LLMs apply their image analysis processes to assess photos of skin lesions, aiding in the early detection of skin cancers such as melanoma [ 70 ].

Significant progress has been made in the field of general-domain image-text M-LLMs through the development of models such as GLaMM [ 71 ], Qwen-VL [ 72 ], SpatialVLM [ 73 ], InternVL [ 74 ], Osprey [ 75 ], Vary [ 76 ], ShareGPT4V [ 77 ], OtterHD [ 78 ], LION [ 79 ], SPHINX [ 80 ], BLIVA [ 81 ], SVIT [ 82 ], LLaVA [ 46 ], and CoVLM [ 83 ]. These advancements have been made possible by leveraging billions of image-text pairs predominantly sourced from the public web, enabling these models to analyze and integrate visual and textual information to understand and generate complex and contextually relevant responses. Such M-LLMs with vision capabilities can be adapted for medical imaging applications (eg, LLaVA-Med [ 84 ], PMC-VQA [ 85 ], Med-Flamingo [ 86 ], and PeFoMed [ 87 ]). However, an important question arises regarding whether such general-domain models can deeply understand medical images or whether they simply recognize superficial patterns from extensive pretraining. Previous work [ 88 ] evaluated the performance of a general-domain M-LLM in biomedical image classification tasks. The study aimed to determine whether such M-LLMs can develop usable internal representations of medical images and whether these representations could effectively distinguish between various medical subclasses. The results showed that generalist models can inherently understand medical images and, in some medical contexts, even outperform specialized, task-specific pretraining methods. Therefore, using representations from generalist models may offer a data-effective solution for developing classification models in the medical imaging domain.

Temporal Data

M-LLMs with the ability to process and interpret time-stamped sequences of data offer significant potential in areas such as real-time patient status tracking in intensive care units, longitudinal studies for chronic disease management, and predictive analytics for patient risk assessment. M-LLMs designed with temporal dimensions acquire predictive capability and skills in extrapolating the understanding of medical conditions over time. Temporal data include time-series, spatiotemporal, and event data. For the purpose of this paper, our focus will be on time-series and event data.

Time-series data are a sequence of data points collected or recorded at regular time intervals, with each data point being time-stamped. Examples include a patient’s heart rate recorded over time and continuous glucose monitoring (CGM). In critical care settings, M-LLMs can detect early signs of clinical deterioration, such as sepsis or cardiac events, from continuous monitoring of vital signs. In neurology, M-LLMs process EEG data to detect neurological anomalies, such as seizure patterns.

Event data are a record of discrete actions or occurrences at specific time points or over intervals. Unlike time-series data, they do not have to be regularly timed. Examples include electronic health records (EHRs) detailing various discrete events in a patient’s medical history, such as physician visits, hospital admissions, and prescription records, or sensor data recording specific occurrences, such as motion sensors being triggered with movement. Each event is time-stamped but does not occur at regular intervals. M-LLMs are instrumental in extracting meaningful insights from EHRs, which encompass diverse and nonregularly timed medical events [ 89 ]. M-LLMs can analyze the sequence and context of these events, providing a comprehensive understanding of a patient’s medical history. This analysis can lead to more accurate diagnoses, tailored treatment strategies, and improved management of chronic conditions. In addition, M-LLMs can process sensor data, such as motion sensor activations in older adult care settings, offering real-time insights into patient activity and well-being.

Significant advancements have been made in M-LLMs with temporal analysis capabilities, including models such as Time-LLM [ 90 ], LLM4TS [ 91 ], TEMPO [ 92 ], and PromptCast [ 93 ], among others [ 94 , 95 ]. However, there is still a lack of M-LLMs specifically designed for medical temporal data. Some of the existing M-LLMs with temporal capabilities could be adapted for medical applications [ 89 , 96 ], or new models specifically designed and pretrained on medical temporal data can be developed.

Audio

Medical M-LLMs that can process and comprehend audio signals have the potential to significantly enhance health care. These models can analyze vocal patterns and breathing sounds to identify respiratory conditions such as asthma or chronic obstructive pulmonary disease (COPD) early in their development. In addition, M-LLMs can be used in mental health to detect subtle changes in speech patterns, affective tone, and vocal tone that may indicate depression, anxiety, or stress, offering a noninvasive diagnostic tool that complements traditional assessment methods. Moreover, audio-based M-LLMs facilitate continuous monitoring of patients in intensive care unit settings, using sound analysis to alert medical staff to changes in patient condition that might necessitate immediate intervention. Furthermore, these models enhance patient engagement and education by converting medical advice into accessible audio formats tailored to individual patient needs and comprehension levels. They can also aid in the early detection of neurological disorders through speech irregularities, help monitor sleep apnea by analyzing breathing patterns during sleep, and support speech therapy for stroke survivors by tracking progress in speech fluency and pronunciation.

Numerous audio-text M-LLMs, leveraging transformer-based architectures, have integrated text- and audio-based language models, such as AudioPaLM [ 97 ], AudioLM [ 98 ], Pengi [ 99 ], AudioGPT [ 100 ], SpeechGPT [ 49 ], VioLA [ 101 ], and SALMONN [ 102 ], into a unified multimodal architecture. This architecture is capable of processing and generating both text and speech, facilitating applications such as speech recognition and speech-to-speech translation. However, there is a gap in the development of large audio models specifically tailored for medical applications [ 103 ]. Nonetheless, these existing M-LLMs with audio capabilities may be adapted and refined to address the requirements of medical-related tasks.

Text

Although text-based LLMs are not inherently multimodal, integrating text with other data modalities such as images and audio transforms them into the core of M-LLMs. In clinical practice, these text-based components of M-LLMs can be applied in several ways. For instance, they facilitate the automated generation of patient reports by interpreting and summarizing complex medical language and data, including diagnostic imaging and laboratory test results. M-LLMs with additional skills in understanding tabular and other structured textual data are expected to perform better on EHR data. Furthermore, text M-LLMs play a crucial role in analyzing the large volumes of clinical notes routinely available in EHRs to predict clinical outcomes. In addition, they enhance medical education and training by providing simulations and interactive learning experiences based on extensive medical literature and case studies.

There is a growing interest in the development of M-LLMs that incorporate text data, demonstrating the vast potential and ongoing innovations in this field. Examples of biomedical text LLMs include BiMediX [ 104 ], BioBERT [ 105 ], PubMedBERT [ 106 ], and ClinicalBERT [ 20 ]. BioBERT is a biomedical language representation model designed for text mining tasks such as named entity recognition, relation extraction, and question answering in the biomedical domain. PubMedBERT is specifically pretrained from scratch on PubMed articles, ensuring a highly focused approach to understanding medical literature. ClinicalBERT is a BERT model pretrained on generic EHR clinical documentation and discharge summaries. BiMediX is the first bilingual medical LLM with expertise in both English and Arabic, facilitating several medical interactions, including multiturn conversations, multiple-choice queries, and closed question answering.

Video

M-LLMs hold significant promise in transforming the analysis and interpretation of various types of video data within medical settings. In surgical training, M-LLMs can analyze and interpret surgical videos, providing real-time feedback and educational insights. In physical therapy, M-LLMs can analyze patient movement videos, aiding in designing targeted rehabilitation programs and monitoring patient progress. They can also be used in psychiatric evaluations to assess behavioral patterns through video assessments. Furthermore, M-LLMs can be used in internal examinations, interpreting recordings from endoscopic and laparoscopic procedures to identify abnormalities and support real-time decision-making during these procedures. Their applications extend to home health care, allowing for remote patient monitoring through video to track well-being. They are also used in sleep studies, where video recordings assist in diagnosing disorders such as sleep apnea. In dermatology, video analysis of skin conditions over time helps in tracking disease progression.

The progress in M-LLMs for video data analysis, demonstrated by models such as Video-Chat [ 47 ], Video-ChatGPT [ 107 ], Video-LLaMA [ 48 ], LLaMA-VID [ 108 ], MotionGPT [ 109 ], LAVENDER [ 110 ], MovieChat [ 111 ], Vid2Seq [ 112 ], VideoLLM [ 113 ], and VTimeLLM [ 114 ], shows significant promise for the development of models tailored to medical applications. The success of these models in nonmedical settings lays a foundation for similar advancements in the health care sector. However, a critical aspect in applying these models to medicine is the incorporation of domain-specific medical knowledge. Medical videos require not just technical analysis but also contextual interpretation aligned with patient history, presenting symptoms, and potential diagnoses. Furthermore, the operational demands of these models in clinical environments are stringent. They must function in real time or near real time to offer actionable insights during critical medical procedures, such as providing alerts during surgeries or continuous patient monitoring.

Omics Data

M-LLMs leveraging omics data, encompassing genomics, transcriptomics, proteomics, and other omics technologies, have the potential to significantly enhance personalized medicine and clinical diagnostics. By integrating and interpreting complex omics data sets, M-LLMs can uncover novel biomarkers for diseases, predict patient responses to specific treatments, and facilitate the development of targeted therapies. For example, in oncology, these models can analyze genetic mutations and expression patterns to guide cancer treatment strategies. Similarly, in cardiology, omics data analysis can help identify genetic risk factors for heart diseases, enabling preventative interventions. M-LLMs also support drug discovery processes by predicting the efficacy and side effects of potential drugs based on the omics profiles of diverse patient populations.

Several M-LLMs have been developed using omics data for a wide range of biomedical applications [ 115 ]. In genomics, DNA sequence language models are used for a variety of predictive tasks. These tasks include predicting genome-wide variant effects (GPN [ 116 ]; DNABERT [ 117 ]; and its subsequent evolution, DNABERT-2 [ 118 ]), predicting DNA cis-regulatory regions (DNAGPT [ 119 ], DNABERT, and DNABERT-2), predicting DNA-protein interactions (TFBert [ 120 ] and MoDNA [ 121 ]), and determining RNA splice sites from DNA sequences (DNABERT and DNABERT-2). In transcriptomics, RNA sequence language models are used for RNA splicing prediction (SpliceBERT [ 122 ]), assessment of long noncoding RNAs’ coding potential (LncCat [ 123 ]), RNA-binding protein interactions (BERT-RBP [ 124 ]), RNA modification identification (BERT-m7G [ 125 ]), and predictions related to protein expression and messenger RNA degradation (CodonBERT [ 126 ]). In proteomics, protein language models are used for secondary structure and contact prediction (ProtTrans [ 127 ]), protein sequence generation (ProGen [ 128 ]), protein function prediction (ProtST [ 129 ]), major posttranslational modification prediction (ProteinBERT [ 130 ]), biophysical property prediction (PromptProtein [ 131 ]), and advancing the state of the art in proteomics [ 132 , 133 ].

Any-to-Any M-LLMs

Current M-LLMs are primarily limited to multimodal comprehension on the input side, possessing limited capabilities to generate content across various modalities [ 134 , 135 ]. Given that clinicians frequently interact and communicate using a variety of medical modalities, the potential applications of any-to-any M-LLMs, which can accept input in any modality and produce output in any modality, are numerous. For instance, clinicians can provide a combination of textual patient history, radiographic images, and audio recordings of patient symptoms as input to the M-LLM. The M-LLM could then analyze this multimodal input to diagnose the patient’s condition. Subsequently, it could generate a multimodal output that includes a textual report summarizing the diagnosis, annotated images highlighting areas of concern, and an audio explanation that can be easily shared with patients or other medical professionals.

There is an increasing interest in the development of any-to-any M-LLMs, highlighting the significant potential of their applications across various domains. For instance, NExT-GPT [ 136 ] enhances an LLM with multimodal adapters and a range of diffusion decoders, enabling the model to process and generate outputs in any combination of text, images, videos, and audio. Macaw-LLM [ 137 ] integrates images, audio, and textual data using 3 primary components: a modality module for encoding multimodal data, a cognitive module for leveraging pretrained LLMs, and an alignment module for synchronizing diverse representations. OneLLM [ 138 ] incorporates 8 unique modalities within a single framework using a multimodal alignment pipeline, which can be further expanded to include additional data modalities. These models, among others [ 139 , 140 ], can be tailored and fine-tuned to specifically address the unique demands of tasks related to health care.

Use Case Example

In this section, we present a use case that demonstrates the practical application of M-LLMs in health care using the Contrastive Learning From Captions for Histopathology (CONCH) model [ 141 ]. CONCH is a vision-language M-LLM specifically designed for computational histopathology. It is pretrained on the largest histopathology-specific vision-language data set, enabling it to create effective representations for non–H&E (hematoxylin and eosin)-stained images, such as immunohistochemistry and special stains, without relying on large public histology slide collections such as The Cancer Genome Atlas, Pancreatic Cancer AI Platform, and Genotype-Tissue Expression.

For this experiment, we used the pretrained model weights available on Hugging Face [ 141 ] and installed the CONCH package from the official repository [ 142 ]. The experiment was conducted on a Linux machine equipped with an NVIDIA GeForce GTX 1080 Ti graphics card using a web-based demonstration application developed using the Flask web framework. The application created a ChatGPT-like interface for zero-shot cross-modal retrieval, accepting both pathology-related text prompts and pathological images. It computed cosine similarity and provided retrieval scores based on the input data. Figure 4 illustrates how CONCH was used to analyze 2 histopathology slides, providing confidence scores for various diagnostic questions. The model processes both the images and corresponding text prompts, offering a zero-shot cross-modal retrieval approach to assist in diagnosing conditions such as invasive ductal carcinoma, invasive lobular carcinoma, and ulcerative colitis.
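To make the retrieval step concrete, the sketch below reproduces the zero-shot cross-modal scoring logic described above under the assumption of a generic CLIP-style encoder interface. The placeholder embeddings and the absence of the actual CONCH API calls are deliberate; the snippet illustrates only the cosine-similarity scoring, not the package's real interface.

# Minimal sketch of zero-shot cross-modal retrieval scoring, assuming a
# CLIP-style vision-language encoder; the random embeddings stand in for the
# outputs of the pretrained image and text encoders.
import torch
import torch.nn.functional as F


def zero_shot_scores(image_embedding: torch.Tensor,
                     prompt_embeddings: torch.Tensor) -> torch.Tensor:
    """Return softmax-normalized cosine similarities between one image
    embedding of shape (D,) and a bank of text prompt embeddings (N, D)."""
    img = F.normalize(image_embedding, dim=-1)
    txt = F.normalize(prompt_embeddings, dim=-1)
    cosine = txt @ img          # (N,) cosine similarity per prompt
    return cosine.softmax(dim=-1)


if __name__ == "__main__":
    prompts = [
        "invasive ductal carcinoma",
        "invasive lobular carcinoma",
        "ulcerative colitis",
    ]
    torch.manual_seed(0)
    image_emb = torch.randn(512)                 # placeholder slide embedding
    prompt_embs = torch.randn(len(prompts), 512) # placeholder prompt embeddings
    scores = zero_shot_scores(image_emb, prompt_embs)
    for p, s in zip(prompts, scores):
        print(f"{p}: {s.item():.3f}")            # confidence score per diagnosis prompt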

This use case example highlights the potential of M-LLMs such as CONCH to enhance computational pathology by enabling advanced, multimodal data retrieval and analysis even in complex and specialized medical imaging tasks.

Challenges and Limitations

While the potential of M-LLMs is promising, it is crucial to understand the significant technical and ethical challenges and limitations that accompany their development and deployment in health care ( Figure 5 ). From a technical perspective, challenges include integrating diverse data sources (data fusion), meeting extensive data requirements, ensuring scalability and managing computational demands, and improving the interpretability of M-LLMs. Ethically, issues such as bias and fairness, obtaining informed consent, data privacy and security, and the safety and alignment of these models in clinical practice present substantial obstacles. In this section, we discuss these challenges and propose potential solutions to tackle them.


Technical Challenges

Data Fusion

Data fusion in medical M-LLMs is a sophisticated and complex process that requires the integration of heterogeneous data types to create a comprehensive and multidimensional representation of patient health. This integration process encompasses several technical challenges that must be adeptly managed. The first challenge is the temporal and spatial alignment of different data modalities, where aligning data from diverse sources such as medical images, videos, and text-based records is crucial to ensure that all data points are synchronized and that temporal data (showing changes over time) and spatial data (showing anatomical or physiological details) are correctly correlated. Second, handling data sparsity and missingness is vital as it can significantly impact diagnosis and treatment. For example, missing frames in a medical video could miss critical changes in a patient’s condition, incomplete medical images may not fully reveal the extent of a disease, and gaps in EHRs can result in a lack of historical context for patient care, necessitating sophisticated techniques to infer missing information without compromising diagnostic accuracy. Furthermore, normalization and standardization are essential given the varied formats, scales, and resolutions of different data modalities, for example, adjusting the scale of medical images to a standard range, normalizing text data from clinical notes to a uniform format for analysis, and standardizing video data to ensure consistent frame rates and resolutions. These challenges highlight the complexity of integrating diverse data types used in M-LLMs, underscoring the need for advanced computational techniques and algorithms to address these issues effectively.

Potential Solution

Beyond foundational methods for data fusion, a variety of advanced techniques exist that can enable M-LLMs to more effectively integrate different modalities. Prompt-based multimodal fusion [ 143 ] is one such framework that enables bidirectional interaction among different modalities through a 2-stream structure, typically involving parallel construction of the multimodal model through pretrained language and image models. Hybrid fusion [ 144 ] integrates unstructured and structured data along with other multimodal sources via a pretrained language model, capturing a more comprehensive patient representation. Gated fusion [ 145 , 146 ] uses mechanisms such as neural network gates or attention mechanisms to dynamically emphasize or de-emphasize different aspects or modalities of the data based on the context. Finally, tensor fusion [ 68 ] constructs a higher-order tensor representing all feature combinations across modalities, which is then decompressed or factorized to a lower dimension for tractable computation while preserving the depth of multimodal interactions.
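As a small illustration of the gated fusion idea referenced above, the sketch below uses a learned sigmoid gate to weight projected text and image feature vectors before combining them. The dimensions, module names, and activation choices are illustrative assumptions, not the cited implementations.

# Minimal sketch of gated multimodal fusion: a learned gate decides how much
# each modality contributes to the fused representation.
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    def __init__(self, text_dim: int, image_dim: int, fused_dim: int):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, fused_dim)
        self.image_proj = nn.Linear(image_dim, fused_dim)
        # The gate looks at both modalities and outputs per-dimension weights in (0, 1).
        self.gate = nn.Sequential(
            nn.Linear(2 * fused_dim, fused_dim),
            nn.Sigmoid(),
        )

    def forward(self, text_feat: torch.Tensor, image_feat: torch.Tensor) -> torch.Tensor:
        t = torch.tanh(self.text_proj(text_feat))
        v = torch.tanh(self.image_proj(image_feat))
        z = self.gate(torch.cat([t, v], dim=-1))
        return z * t + (1 - z) * v  # convex combination controlled by the gate


if __name__ == "__main__":
    fusion = GatedFusion(text_dim=768, image_dim=512, fused_dim=256)
    text_feat = torch.randn(4, 768)   # eg, clinical-note embeddings
    image_feat = torch.randn(4, 512)  # eg, chest x-ray embeddings
    print(fusion(text_feat, image_feat).shape)  # torch.Size([4, 256])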

Data Requirements

In the pretraining phase of M-LLMs, large and diverse data sets, often with extensive labeling, are required to capture a wide range of general knowledge across different modalities (eg, text, images, and audio). The primary goals of pretraining are to develop robust feature representations and ensure that the model can handle the inherent variability in real-world data. However, such multimodal medical data sets are currently limited, and the acquisition of large-scale labeled data presents logistical, ethical, and privacy challenges [ 147 ]. Existing multimodal medical data sets available for public use [ 84 , 85 , 148 ] are often relatively small in scale and demand the consolidation of numerous resources. For instance, MIMIC-IV [ 148 ] covers a limited range of modalities, namely clinical notes, medical images (chest x-rays in the Digital Imaging and Communications in Medicine [DICOM] format), and time series (diagnostic electrocardiograms and patient records), making it a valuable but constrained resource for training medical M-LLMs. Similarly, PMC-VQA [ 85 ] and LLaVA-Med [ 84 ] include only text and image modalities for medical visual question answering.

It is to be noted that the storage of vast amounts of multimodal data (ie, medical images and scans, videos, and high-resolution audio files) requires substantial storage capacity. Efficient and secure storage solutions are essential to handle these data, ensuring quick access and retrieval while maintaining data integrity and security.

To address the limited data challenge in training medical M-LLMs, a combination of synthetic data generation and federated learning could be used. Synthetic data generation using generative models can create realistic, diverse data sets that mimic real-world multimodal medical scenarios, thus expanding the training data set without compromising privacy or ethical standards [ 149 - 151 ]. In addition, federated learning presents a viable solution for leveraging multimodal data from multiple health care institutions without the need to share the actual data, thus maintaining patient privacy [ 152 - 156 ]. This decentralized approach enables M-LLMs to learn from a vast, distributed data set encompassing a wide range of medical modalities without necessitating centralization of the data.
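To ground the federated learning idea, the sketch below performs a single round of federated averaging over locally trained model weights, weighted by each institution's local sample count. The local training loop is omitted, and the sample counts are synthetic placeholders; this is a simplified illustration of the aggregation step, not a production federated learning system.

# Minimal sketch of one round of federated averaging: each institution trains
# locally and only model weights (never patient data) are aggregated.
from collections import OrderedDict

import torch


def federated_average(client_states: list[OrderedDict],
                      client_sizes: list[int]) -> OrderedDict:
    """Weighted average of client state_dicts, weighted by local data size."""
    total = sum(client_sizes)
    global_state = OrderedDict()
    for key in client_states[0]:
        global_state[key] = sum(
            (n / total) * state[key].float()
            for state, n in zip(client_states, client_sizes)
        )
    return global_state


if __name__ == "__main__":
    model = torch.nn.Linear(10, 2)
    # Pretend three hospitals each trained a local copy of the model.
    clients = [torch.nn.Linear(10, 2).state_dict() for _ in range(3)]
    sizes = [1200, 450, 800]  # local sample counts per hospital (synthetic)
    model.load_state_dict(federated_average(clients, sizes))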

Few-shot learning enables models to generalize from a limited number of examples. By leveraging pretrained knowledge and adapting quickly to new tasks with minimal data, few-shot learning can be particularly useful in medical scenarios in which labeled data are limited. Another approach to reducing computational requirements and addressing the scarcity of labeled data is in-context learning, in which the model performs a task from examples provided directly in the input context, without any fine-tuning of the model weights. This approach can be effective for tasks such as medical image interpretation or clinical note analysis.
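The snippet below sketches how in-context learning might be set up for a simple clinical note triage task: a handful of labeled examples are placed in the prompt and the model is queried without any weight updates. The example notes, labels, and the query_llm function are hypothetical placeholders for whatever model interface is actually available.

# Minimal sketch of in-context (few-shot) prompting for clinical note triage.
# The example notes and labels are synthetic, and query_llm stands in for an
# actual chat or completion API.
FEW_SHOT_EXAMPLES = [
    ("Patient reports crushing chest pain radiating to left arm.", "urgent"),
    ("Routine follow-up, blood pressure well controlled on current dose.", "routine"),
    ("Mild seasonal allergy symptoms, no respiratory distress.", "routine"),
]


def build_prompt(new_note: str) -> str:
    lines = ["Classify each clinical note as 'urgent' or 'routine'.", ""]
    for note, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Note: {note}\nLabel: {label}\n")
    lines.append(f"Note: {new_note}\nLabel:")
    return "\n".join(lines)


def query_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM API; returns a canned answer here."""
    return "urgent"


if __name__ == "__main__":
    prompt = build_prompt("Sudden onset shortness of breath and diaphoresis.")
    print(prompt)
    print("Model output:", query_llm(prompt))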

To address data storage demands when building M-LLMs, cloud-based storage solutions offer a flexible and scalable way to store big data, allowing organizations to expand their storage capacity as needed without upfront investment in physical infrastructure. Other benefits include improved accessibility and cost efficiency, and providers can implement robust security measures (eg, data encryption and access control). Moreover, the combination of cloud-based and distributed storage systems provides a robust and adaptable solution for managing the extensive and complex data sets needed for M-LLMs.

Scalability and Computational Demands

The development and deployment of M-LLMs in the medical field pose significant scalability and computational challenges. During training, such complex M-LLMs require substantial computational power, often involving parallel processing and sophisticated algorithms to manage and analyze the data effectively. Moreover, M-LLMs face memory limitations due to processing vast amounts of data, and their large size necessitates considerable storage capacity. This can also lead to network latency, slowing down model performance and affecting user experience. The scalability issue is further compounded by the need for continuous model updates to incorporate new medical data and knowledge. These factors translate to high operational costs, making the development of medical M-LLMs feasible mainly for large technology corporations with significant resources. Inference, on the other hand, requires minimizing latency and reducing computational load to ensure real-time or near–real-time responses in clinical settings. Both phases pose unique challenges that need to be addressed to facilitate the practical deployment of M-LLMs in health care.

To optimize efficiency during both training and inference, several methods can be used. Parameter-efficient fine-tuning methods such as adapter layers help reduce the computational load by fine-tuning only a subset of the model’s parameters [ 157 , 158 ]. In addition, quantization approaches can address the scalability and computational demands by shifting toward quantized versions of existing models using curated, domain-specific data rather than pretraining from scratch [ 159 ]. This method capitalizes on the foundational strengths of established models, significantly reducing the computational resources needed for initial training [ 160 ]. Knowledge distillation is another approach that involves training a smaller “student” model to replicate the behavior of a larger “teacher” model, requiring less computational power while retaining performance [ 161 ]. Fine-tuning using targeted medical data sets enhances accuracy and relevance in medical applications while also cutting down development time and costs. Furthermore, developing more efficient transformer architectures tailored for multimodal data, such as Kosmos-1 [ 162 ], Muse [ 163 ], and PixArt-α [ 164 ], presents a viable solution. Optimizing algorithms for parallel processing is another approach that promotes more efficient use of computational resources. During inference, quantization and pruning continue to be beneficial by reducing the computational burden and speeding up model execution. Knowledge distillation allows for the use of smaller, faster models that maintain high performance, ideal for real-time applications. Additional optimization techniques, such as model compression [ 165 ] and hardware acceleration using graphics processing units (GPUs) or tensor processing units (TPUs) [ 166 ], further enhance efficiency.
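As one concrete example of the parameter-efficient fine-tuning methods mentioned above, the sketch below implements a bottleneck adapter with a residual connection, the general pattern behind adapter layers, so that only a small number of parameters are trained while the pretrained backbone stays frozen. The dimensions are arbitrary illustrative choices, not any specific cited configuration.

# Minimal sketch of a bottleneck adapter layer for parameter-efficient
# fine-tuning: only the small adapter is trained; the backbone stays frozen.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Down-project, nonlinearity, up-project, plus a residual connection."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


if __name__ == "__main__":
    hidden = torch.randn(2, 16, 768)          # (batch, tokens, hidden)
    adapter = Adapter(hidden_dim=768)
    out = adapter(hidden)
    trainable = sum(p.numel() for p in adapter.parameters())
    print(out.shape, f"trainable adapter parameters: {trainable}")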

Model Interpretability

In contrast to unimodal LLMs, the scale of M-LLMs in terms of parameters and training data introduces a unique set of interpretability challenges alongside potential opportunities in the field of research on model explainability. First, as these models expand in size, the task of understanding and interpreting their decision-making processes becomes increasingly challenging [ 167 ]. This difficulty is amplified by the added internal complexity of M-LLMs and the extensive variety of their training data sets. Moreover, this complexity necessitates substantial computational resources to facilitate the generation of explanations. Such increased complexity poses significant hurdles for in-depth analysis, thereby hindering the debugging and diagnostic processes essential for understanding and improving M-LLMs.

Addressing these interpretability challenges in the context of health care is critical as clinicians—accountable to patients and regulators—should have a reasonable ability to explain how a complex model assists and makes medical recommendations. Choosing between model performance and interpretability can be problematic and is often down to trust (in model development methods, data, metrics, and outcome data, among other things). This challenge necessitates the development of advanced methods for explaining transformer-based language models [ 167 , 168 ], particularly methods for local explanations, such as feature attribution explanation, attention-based explanation, example-based explanation, and natural language explanation [ 169 - 172 ], and global explanations, such as probing-based explanation, neuron activation explanation, concept-based explanation, and mechanistic interpretability [ 168 , 173 , 174 ]. In addition, being able to use these explanations is crucial for debugging and improving M-LLMs. An effective approach is the development of integrated explanation frameworks specifically designed for medical M-LLMs that can integrate both local and global explanations. Such frameworks are essential for handling the multimodal nature of medical data, including the combination of textual and imaging information. In addition, incorporating a human-in-the-loop approach, where clinician feedback on the model’s explanations is used for continuous improvement, can significantly enhance the practical utility and trustworthiness of these M-LLM systems in medical settings [ 167 ].
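To illustrate one of the local explanation families mentioned above, the sketch below computes a simple gradient-times-input attribution score for each input feature of a toy classifier. This is a generic feature attribution method applied to a synthetic model, not an explanation technique proposed for any specific M-LLM.

# Minimal sketch of gradient-times-input feature attribution: features whose
# perturbation most changes the target logit receive the largest scores.
import torch
import torch.nn as nn


def grad_times_input(model: nn.Module, x: torch.Tensor, target: int) -> torch.Tensor:
    """Return per-feature attribution scores for the target class logit."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    logits[0, target].backward()
    return (x.grad * x).detach().squeeze(0)


if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
    x = torch.randn(1, 8)                      # eg, 8 tabular clinical features
    scores = grad_times_input(model, x, target=2)
    print(scores)                              # one attribution per feature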

Ethical Challenges

Bias and Fairness

The potential for bias represents one of the primary ethical challenges in using M-LLMs in health care. Specifically, in the health care domain, data often exhibit bias due to the uneven distribution of demographic attributes, preconceptions held by health care professionals involved in data collection and interpretation, and the varied academic and experiential backgrounds that influence their perspectives [ 175 - 177 ]. If M-LLMs are trained on patient data that contain biases related to gender, ethnicity, socioeconomic status, or geographic location, they may inadvertently cause biases in their predictions or recommendations [ 175 , 178 , 179 ]. For example, a recently developed M-LLM, LLaVA [ 46 ], when asked to analyze an image featuring 2 Black men and 2 gorillas, erroneously identified one of the men as a gorilla. This error suggests the existence of racial bias within the algorithmic framework of the model [ 180 ]. In health care, biased M-LLMs can lead to differential treatment, misdiagnoses, and unequal access to medical resources. For example, an M-LLM analyzing medical images might miss subtle symptoms in darker-skinned individuals due to biases in the training data. One study showed that CNNs, when trained on publicly available chest x-ray data sets, may show a tendency to underdiagnose specific populations, such as individuals from marginalized communities (eg, Black and Hispanic patients), women, and Medicaid recipients [ 181 ].

Potential Solutions

Mitigating bias and improving fairness within medical M-LLMs necessitates a multifaceted approach centered on 3 pillars: data integrity, model refinement, and comprehensive evaluation [ 181 , 182 ]. Essential to this strategy is the curation of diverse and representative data. This involves compiling multimodal medical data sets that encompass a wide array of demographics, languages, and cultures to ensure balanced representation and guide targeted model fine-tuning efforts [ 183 ]. Fine-tuning these models through transfer learning and bias reduction techniques, such as counterfactual data augmentation [ 184 ], can effectively minimize patterns of gender, racial, or cultural bias. Furthermore, deploying multiple methods and metrics for evaluation is crucial. These may include human, automatic, or hybrid evaluations alongside metrics such as accuracy, sentiment, and fairness, which provide feedback on bias in M-LLM outputs. Through such rigorous evaluation, biases can be detected and continuously addressed, improving the reliability of M-LLMs. Moreover, logic-aware mechanisms can be incorporated into medical M-LLMs by integrating clinical reasoning and decision-making processes into the models. This approach promotes the generation of more accurate and less biased outputs by applying medical reasoning to the relationships between data tokens. For instance, logic-aware M-LLMs can differentiate between correlational and causal relationships in patient data, recognize the significance of laboratory values within clinical contexts, and apply diagnostic criteria accurately across diverse patient populations. Ultimately, the goal is to reduce bias without compromising the performance of M-LLMs. It is a careful balance of debiasing and enhancing the models, requiring ongoing monitoring and adjustment to align with ethical standards, particularly in the sensitive domain of health care [ 185 ].
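As a small illustration of the evaluation pillar, the sketch below computes per-subgroup accuracy and reports the largest gap, one simple fairness signal among the many metrics mentioned above. The labels, predictions, and subgroup assignments are synthetic placeholders.

# Minimal sketch of subgroup performance auditing: compute accuracy per
# demographic subgroup and report the largest gap as a simple fairness signal.
from collections import defaultdict


def subgroup_accuracy(y_true, y_pred, groups):
    correct, total = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


if __name__ == "__main__":
    y_true = [1, 0, 1, 1, 0, 1, 0, 1]          # synthetic ground-truth labels
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]          # synthetic model predictions
    groups = ["A", "A", "B", "B", "A", "B", "A", "B"]
    acc = subgroup_accuracy(y_true, y_pred, groups)
    gap = max(acc.values()) - min(acc.values())
    print(acc, f"max accuracy gap: {gap:.2f}")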

Informed Consent

Obtaining informed consent in the context of M-LLMs presents unique challenges. In particular, it remains uncertain whether patient consent is necessary for training M-LLMs using their data if consent was previously obtained for research purposes in general or for AI development specifically [ 178 , 186 ]. Furthermore, given the complexity of M-LLMs, it might be difficult for patients to grasp what they are consenting to, especially in terms of how their data will be used, how these models operate, and the potential risks involved. This raises questions about the validity of consent and the level of detail required to adequately inform patients [ 177 , 178 ]. In addition, it can be argued that traditional institutional review boards (IRBs) and ethical oversight committees may be ill-equipped to deal with AI and M-LLM applications due to the lack of understanding of such novel technologies in the medical arena [ 187 ].

Health care providers and developers have a responsibility to empower patients to make informed decisions about the use of their data in developing M-LLMs. This requires providing them with clear, transparent, simplified explanations of how M-LLMs work, how their data will be used, the nature of the data they handle, the steps taken to protect privacy, and the potential risks of using their data (eg, algorithmic bias and privacy issues). These explanations may take various forms, including written text, visual aids, educational videos, or other materials tailored to different levels of understanding. Professional training should be provided to health care professionals on the capabilities, limitations, and ethical considerations of using M-LLMs in practice to effectively communicate these aspects to patients. To this point, it may be necessary for health care and academic medical institutions to adapt their IRBs for a more effective governance and use of AI, first through incorporating a sufficiently diverse set of expert members (eg, experts in machine learning, experts in data science, and experts in previous studies of marginalized or discriminated communities) and, second, through more targeted, ongoing training of board members. In doing so, IRBs are more likely to constructively navigate issues pertaining to informed consent, data privacy and security, and safety.

Data Privacy and Security

As mentioned previously, M-LLMs require a massive amount of patient data (eg, medical history, clinical notes, medical images, laboratory test results, and prescriptions) that are inherently sensitive. This, in turn, raises substantial privacy and security concerns: how will patient data be collected, stored, and used? Who will have access to them and for what purposes [ 175 - 177 ]? Researchers have demonstrated that bombarding an LLM with specific questions (ie, adversarial attacks) could force it to expose its training data, which contain verbatim personally identifiable information and chat conversations [ 188 ]. They have also concluded that larger models seem to be more susceptible to attacks than smaller models [ 188 ]. Other studies have shown that, even when sensitive patient data are anonymized, certain algorithms can still identify individual patients [ 189 - 191 ]. Unauthorized access or breaches can have severe consequences, including reputational damage, misuse of personal health information, and compromise of patient confidentiality.

It is crucial to implement stringent data protection measures to mitigate data privacy and security concerns when using patient data for developing M-LLMs. One of these measures is the implementation of federated learning techniques [ 153 , 155 , 156 ] to enable M-LLMs to be trained on decentralized data sources without the need to transfer sensitive or private information to a central location, thereby preserving data privacy and security. Furthermore, robust encryption protocols and anonymization techniques should be applied to the data before transferring or processing them. Secure storage infrastructure should be in place to safeguard patient information. It is also important to audit M-LLMs using data extraction attacks to understand how well M-LLMs resist unauthorized attempts to extract data and identify areas for improvement in terms of security and privacy. Health care providers and developers must establish strong data governance frameworks and policies and comply with relevant privacy regulations (eg, Health Insurance Portability and Accountability Act [HIPAA]). They also need to adopt a proactive approach to cybersecurity and regularly update security measures to counter emerging threats.
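As a small, concrete example of the encryption measures mentioned above, the sketch below encrypts a record before storage using symmetric (Fernet) encryption from the widely used cryptography package. Key management, transport security, access control, and regulatory compliance are of course far broader concerns than this snippet covers.

# Minimal sketch of encrypting a patient record before storage using
# symmetric (Fernet) encryption from the `cryptography` package.
# Real deployments also need key management, access control, and auditing.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice, kept in a key vault, not in code
cipher = Fernet(key)

record = b'{"patient_id": "12345", "note": "Chest x-ray shows no acute findings."}'
token = cipher.encrypt(record)       # ciphertext is safe to write to disk or object storage
restored = cipher.decrypt(token)

assert restored == record
print(token[:40], b"...")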

Safety and Alignment

Ensuring the safety and alignment of M-LLMs in health care is paramount. These models must not only be effective in processing and analyzing medical data but also align with human ethical standards, particularly those of health care professionals. Similar to text-based models, where fine-tuning, reinforcement learning from human feedback, and direct preference optimization (DPO) are used to minimize harm and align outputs with human preferences, M-LLMs could adopt analogous methodologies to ensure that their recommendations are in harmony with the preferences and ethical considerations of medical practitioners. The challenge lies in aligning M-LLMs with the complex, nuanced, and sometimes subjective decision-making processes of human physicians. This involves training models on a diverse array of scenarios, encompassing ethical dilemmas, treatment preferences, and patient-centered care principles. By integrating feedback loops in which health care professionals review and adjust model outputs alongside technical and other professionals, M-LLMs can learn to prioritize patient safety, privacy, and the nuances of human empathy and ethical considerations in their recommendations.

Developing a framework for continuous learning and adaptation is crucial. This could involve iterative cycles of feedback and adjustment in which M-LLMs are fine-tuned based on direct input from health care professionals regarding the appropriateness and ethical alignment of their outputs. Incorporating preference-based mechanisms such as DPO, in which models adjust their behavior as new clinician feedback becomes available, could further enhance alignment with human values. Moreover, simulating diverse clinical and ethical scenarios during training phases can prepare M-LLMs to handle real-world complexities.
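For concreteness, the sketch below shows the standard direct preference optimization objective computed from sequence-level log probabilities of a preferred and a rejected response under the policy and a frozen reference model. It is a generic illustration of the alignment loss, not a health care specific recipe, and the numeric values are synthetic.

# Minimal sketch of the direct preference optimization (DPO) objective: given
# log probabilities of a clinician-preferred response (chosen) and a
# less-preferred one (rejected) under the policy and a frozen reference model,
# the loss pushes the policy toward the preferred behavior.
import torch
import torch.nn.functional as F


def dpo_loss(policy_chosen_logp: torch.Tensor,
             policy_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor,
             ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()


if __name__ == "__main__":
    # Toy sequence-level log probabilities for a batch of 3 preference pairs.
    loss = dpo_loss(
        policy_chosen_logp=torch.tensor([-4.2, -3.8, -5.0]),
        policy_rejected_logp=torch.tensor([-4.0, -4.5, -4.9]),
        ref_chosen_logp=torch.tensor([-4.5, -4.0, -5.1]),
        ref_rejected_logp=torch.tensor([-4.1, -4.4, -5.0]),
    )
    print(loss.item())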

Future Outlook

In the evolving landscape of medical M-LLMs, anticipating future directions is crucial for advancing their application in health care. In this section, we outline prospective advancements and necessary adaptations that could enhance the functionality, efficacy, and ethical integration of M-LLMs in health care. Specifically, we explore the evolution in generating multimodal outputs, the critical need for establishing performance benchmarks, the shift in explainability paradigms toward comprehensive explainability, the role of M-LLMs in enhancing interoperability within hospital systems, the formulation of robust regulatory frameworks, and the essential role of multidisciplinary collaboration ( Figure 6 ). We envision that these areas collectively represent key future perspectives where M-LLMs are expected to transform both medical applications and patient care.


Generating Multimodal Outputs

While medical M-LLMs are rapidly evolving in processing multimodal inputs, the development of multimodal outputs is still trailing behind. The importance of multimodal outputs in medical contexts is significant. For example, when asking ChatGPT to explain complex medical concepts, such as interpreting radiological images or outlining surgical procedures, effective explanations should ideally blend textual descriptions with graphical representations, mathematical equations, audio narratives, or animations for enhanced comprehension. This highlights the need for medical M-LLMs capable of producing such varied outputs. A critical step toward this goal is the creation of a shared intermediate output by the model, which raises the following question: what form should this intermediate output take? A practical method is using text as the intermediate output, serving as a basis for generating additional modalities. For example, the causal masked multimodal (CMM) model [ 192 ] produces HTML markup that can be transformed into rich web pages with text, formatting, links, and images. Alternatively, using multimodal tokens where each token is tagged to represent different modalities such as text or image offers another route. Image tokens could feed into an image generation model such as a diffusion model to generate visual content, whereas text tokens are processed by a language model. This dual-token approach paves the way for more sophisticated and contextually appropriate multimodal outputs. Further exploration and development in this field could lead to models that seamlessly integrate a variety of output formats, revolutionizing the way in which medical information is conveyed and understood.
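The dual-token idea described above can be sketched as a simple routing step: output tokens carry a modality tag, text tokens go to a language decoder, and image tokens are batched for an image generation model. The tag format and the decoder stubs below are illustrative assumptions, not any specific published architecture.

# Minimal sketch of routing tagged output tokens to modality-specific decoders,
# following the dual-token approach described above.
from typing import Callable


def route_tokens(tokens: list[str],
                 text_decoder: Callable[[list[str]], str],
                 image_decoder: Callable[[list[str]], str]) -> dict:
    text_tokens = [t for t in tokens if not t.startswith("<img>")]
    image_tokens = [t for t in tokens if t.startswith("<img>")]
    output = {"text": text_decoder(text_tokens)}
    if image_tokens:
        output["image"] = image_decoder(image_tokens)
    return output


if __name__ == "__main__":
    # Stub decoders: in practice the image tokens would condition a diffusion model.
    decode_text = lambda toks: " ".join(toks)
    decode_image = lambda toks: f"<generated image from {len(toks)} tokens>"

    generated = ["The", "lesion", "is", "highlighted", "below:",
                 "<img>101", "<img>87", "<img>243"]
    print(route_tokens(generated, decode_text, decode_image))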

Establishing Benchmarks

Benchmarks are crucial in assessing the performance, accuracy, and effectiveness of generative AI, especially in the context of medical M-LLMs. The expansive scope and complex nature of health care and medicine necessitate continuous advancements in robust evaluation methods and frameworks. This is essential to ensure that medical M-LLMs are effectively aligned with the unique requirements of these domains. These benchmarks enable model comparisons, highlighting efficiencies and creative capabilities in specific tasks and data modalities both individually and collectively. They also play a critical role in detecting biases and limitations and in establishing industry standards for medical M-LLMs, ensuring their ethical and safe use in sensitive medical contexts. Recent initiatives in M-LLM benchmarks, such as AesBench [ 193 ], Mementos [ 194 ], MME [ 195 ], MM-BigBench [ 196 ], MLLM-Bench [ 197 ], and VLM-Eval [ 198 ], offer a foundational framework that could be adapted to medical M-LLMs. However, there is an urgent need for more comprehensive evaluation methods and frameworks as well as rigorous rubrics for human evaluation of M-LLM performance in real-world clinical workflows and scenarios.

Evolution of Explainability: From Snapshot to Temporal Explainability

Snapshot explainability refers to the ability of M-LLMs to provide explanations for decisions or predictions at a single, specific point in time. In contrast, temporal analysis offers a more comprehensive understanding by tracking and interpreting changes over time. Most current interpretability research on M-LLMs neglects training dynamics, focusing mainly on post hoc explanations of fully trained models [ 167 ]. This lack of developmental investigation into the training process can lead to biased explanations. Moreover, examining interpretability based on a single data modality fails to reflect interactions between modalities. Therefore, transitioning from static snapshot explainability to dynamic temporal analysis is essential for medical M-LLMs. This approach is particularly beneficial for using multimodal data in monitoring patient progress, understanding disease trajectories, and predicting outcomes. By leveraging temporal explainability, M-LLMs can better contextualize data, uncovering patterns and trends that might be overlooked in static analysis. This shift not only enhances the accuracy of diagnoses and treatment plans but also improves the personalization of patient care by taking advantage of rich multimodal data.

Interoperability in Hospital Systems

An M-LLM could act as a central hub in hospitals, integrating various unimodal AI systems such as radiology, insurance, and EHRs. Currently, each department uses different AI tools from various companies, and most of these systems do not intercommunicate, resulting in access being limited to only department-specific systems. For instance, radiologists use radiological AI, whereas cardiologists might not have access to this, and likewise for other specialties. The introduction of M-LLMs can change this landscape significantly. M-LLMs understand the language and format of all these disparate software applications, allowing for seamless interaction. This means that health care practitioners regardless of specialty could easily work with any AI tool in the hospital, breaking down the silos that currently exist. This potential is vital as it enables comprehensive, integrated care, which individual organizations cannot achieve alone due to proprietary restrictions on data.

Developing Regulatory Frameworks

The development of a regulatory framework for medical M-LLMs is essential to ensure their safe, effective, and ethical use. Regulatory bodies need to establish standards and guidelines that define acceptable accuracy for various M-LLM applications, ensuring that these tools are reliable and trustworthy in clinical settings. A critical aspect of this framework also includes algorithmic transparency; therefore, regulatory guidelines must clearly stipulate requirements for explainability. Furthermore, the protection of patient data privacy is essential given that M-LLMs process sensitive health information. Therefore, regulatory frameworks must enforce strict data protection standards and formulate strategies for ethically collecting and processing multimodal data sets. Moreover, regardless of whether regulations are sufficiently developed or comprehensive in any given jurisdiction, medical and research institutions have an obligation to upgrade the knowledge and diversity of their ethics approval boards.

Fostering Multidisciplinary Collaboration and Stakeholder Engagement

AI, and specifically M-LLMs, is so new and complex in the health care domain that the expertise and insights needed extend far beyond the capabilities of any one health care or academic medical organization. Thus, it is imperative for those implementing M-LLM solutions to draw upon the know-how of 4 major external stakeholders. First, because many AI projects are expected to pose ethical concerns, the relevant applicable regulatory bodies and local health authorities should be engaged on a regular basis to ensure compliance with regulations. Indeed, guidelines and laws are rapidly changing; at the time of writing, the European Union has endorsed a world-first AI Act [ 199 ]. Second, much of the M-LLM innovation is expected to stem from academic and research contexts, where scientists continually push the boundaries of evidence-based, validated AI projects commonly published and made available for public benefit. Collaborating and partnering with such institutions ensures that the latest approaches and technologies can be incorporated into a health care project. Third, the industry is often a forgotten collaborator due to perceived entry barriers (eg, intellectual property ownership, exclusivity, and so forth). However, large commercial companies have access to far wider resources and technical expertise, particularly in engineering development, than medical institutions and, when negotiated with a win-win perspective, can significantly accelerate AI project deployment in the health care context. The same may apply to vendors who are infrastructure and deployment experts and who may be able to contribute beyond the limited scope of a purchase agreement. Moreover, when applicable, industry partners may offer greater commercialization pathways for projects. Finally, the fourth external stakeholder is the patient advocacy organization. Such groups should be engaged early and continuously and can help ensure that patients’ critical perspectives are communicated and included within the requirements of an M-LLM project. This is especially the case in projects that directly impact the patients’ needs and preferences, for instance, an M-LLM that interacts by providing clinical insights and recommendations to the physician during a patient consultation. Such advocacy groups can also be an effective way for health care institutions to more naturally engage in awareness and trust building with their communities. Naturally, with external stakeholders, appropriate collaboration and data agreements should be sought to protect the health care institutions’ interests as well as those of their patients. In addition, regardless of whether projects require internal or external collaboration, best practices should be used to ensure that roles, responsibilities, and decision-making structures are clarified upfront.

Conclusions

In this paper, we explored the foundational principles, applications, challenges, and future perspectives of M-LLMs in health care practice. While this work suggests a promising direction for the application of M-LLMs in medicine, it also highlights the need for further evaluation and benchmarking of their capabilities and limitations in real-world medical settings. In addition, despite the momentum toward models capable of processing multimodal inputs, the progression toward sophisticated multimodal outputs remains comparatively slow. Furthermore, it is crucial to acknowledge that the emergence of M-LLMs does not render traditional LLMs obsolete. Instead, M-LLMs serve as an extension, building upon the foundational strengths and capabilities of LLMs to enhance health care delivery. This association underscores that the efficiency of M-LLMs is inherently tied to the robustness of the underlying LLMs. As we advance toward more general AI systems, M-LLMs offer a promising path to a comprehensive form of AI in health care practice. The journey has its challenges, but the potential rewards could significantly redefine our interaction with technology in the medical field.

Conflicts of Interest

AAA is an associate editor for JMIR Nursing . All other authors declare no conflicts of interest.

  • Tamkin A, Brundage M, Clark J, Ganguli D. Understanding the capabilities, limitations, and societal impact of large language models. arXiv. Preprint posted online on February 4, 2021. [ CrossRef ]
  • Chen R. [Prospects for the application of healthcare big data combined with large language models]. Sichuan Da Xue Xue Bao Yi Xue Ban. Sep 2023;54(5):855-856. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Clusmann J, Kolbinger FR, Muti HS, Carrero ZI, Eckardt JN, Laleh NG, et al. The future landscape of large language models in medicine. Commun Med (Lond). Oct 10, 2023;3(1):141. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Botelho F, Tshimula JM, Poenaru D. Leveraging ChatGPT to democratize and decolonize global surgery: large language models for small healthcare budgets. World J Surg. Nov 2023;47(11):2626-2627. [ CrossRef ] [ Medline ]
  • Praveen SV, Deepika R. Exploring the perspective of infection clinicians on the integration of large language models (LLMs) in clinical practice: a deep learning study in healthcare. J Infect. Oct 2023;87(4):e68-e69. [ CrossRef ] [ Medline ]
  • Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, et al. Large language models encode clinical knowledge. Nature. Aug 2023;620(7972):172-180. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Yu P, Xu H, Hu X, Deng C. Leveraging generative AI and large language models: a comprehensive roadmap for healthcare integration. Healthcare (Basel). Oct 20, 2023;11(20):2776. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Thirunavukarasu AJ. Large language models will not replace healthcare professionals: curbing popular fears and hype. J R Soc Med. May 2023;116(5):181-182. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Holzinger A. Explainable AI and multi-modal causability in medicine. I Com (Berl). Jan 26, 2021;19(3):171-179. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Liang J, Li ZW, Yue CT, Hu Z, Cheng H, Liu ZX, et al. Multi-modal optimization to identify personalized biomarkers for disease prediction of individual patients with cancer. Brief Bioinform. Sep 20, 2022;23(5):bbac254. [ CrossRef ] [ Medline ]
  • Zheng S, Zhu Z, Liu Z, Guo Z, Liu Y, Yang Y, et al. Multi-modal graph learning for disease prediction. IEEE Trans Med Imaging. Sep 2022;41(9):2207-2216. [ CrossRef ]
  • Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. Nov 15, 1997;9(8):1735-1780. [ CrossRef ] [ Medline ]
  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017. Presented at: NIPS'17; December 4-9, 2017; Long Beach, CA.
  • AlSaad R, Malluhi Q, Janahi I, Boughorbel S. Interpreting patient-specific risk prediction using contextual decomposition of BiLSTMs: application to children with asthma. BMC Med Inform Decis Mak. Nov 08, 2019;19(1):214. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv. Preprint posted online on October 11, 2018.
  • AlSaad R, Malluhi Q, Abd-Alrazaq A, Boughorbel S. Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation. Artif Intell Med. Mar 2024;149:102802. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Li Y, Mamouei M, Salimi-Khorshidi G, Rao S, Hassaine A, Canoy D, et al. Hi-BEHRT: hierarchical transformer-based model for accurate prediction of clinical events using multimodal longitudinal electronic health records. IEEE J Biomed Health Inform. Feb 2023;27(2):1106-1117. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Li Y, Rao S, Solares JR, Hassaine A, Ramakrishnan R, Canoy D, et al. BEHRT: transformer for electronic health records. Sci Rep. Apr 28, 2020;10(1):7155. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ho QT, Nguyen TT, Khanh Le NQ, Ou YY. FAD-BERT: improved prediction of FAD binding sites using pre-training of deep bidirectional transformers. Comput Biol Med. Apr 2021;131:104258. [ CrossRef ] [ Medline ]
  • Mulyar A, Uzuner O, McInnes B. MT-clinical BERT: scaling clinical information extraction with multitask learning. J Am Med Inform Assoc. Sep 18, 2021;28(10):2108-2115. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Introducing ChatGPT. OpenAI. Nov 30, 2022. URL: https://openai.com/index/chatgpt/ [accessed 2024-09-12]
  • Zhang Q, Liang Y. Comments on "ChatGPT and its role in the decision-making for the diagnosis and treatment of lumbar spinal stenosis: a comparative analysis and narrative review". Global Spine J. May 2024;14(4):1452. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Koga S, Martin NB, Dickson DW. Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders. Brain Pathol. May 08, 2024;34(3):e13207. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Horiuchi D, Tatekawa H, Shimono T, Walston SL, Takita H, Matsushita S, et al. Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases. Neuroradiology. Jan 23, 2024;66(1):73-79. [ CrossRef ] [ Medline ]
  • Sinha RK, Deb Roy A, Kumar N, Mondal H. Applicability of ChatGPT in assisting to solve higher order problems in pathology. Cureus. Feb 2023;15(2):e35237. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mese I, Taslicay CA, Sivrioglu AK. Improving radiology workflow using ChatGPT and artificial intelligence. Clin Imaging. Nov 2023;103:109993. [ CrossRef ] [ Medline ]
  • Mallio CA, Sertorio AC, Bernetti C, Beomonte Zobel B. Large language models for structured reporting in radiology: performance of GPT-4, ChatGPT-3.5, Perplexity and Bing. Radiol Med. Jul 29, 2023;128(7):808-812. [ CrossRef ] [ Medline ]
  • Perera Molligoda Arachchige AS. Empowering radiology: the transformative role of ChatGPT. Clin Radiol. Nov 2023;78(11):851-855. [ CrossRef ] [ Medline ]
  • Touvron H, Lavril T, Izacard G, Martinet X, Lachaux MA, Lacroix T, et al. LLaMA: open and efficient foundation language models. arXiv. Preprint posted online on February 27, 2023. [ FREE Full text ]
  • Chung HW, Hou L, Longpre S, Zoph B, Tay Y, Fedus W, et al. Scaling instruction-finetuned language models. arXiv. Preprint posted online on October 20, 2022. [ FREE Full text ]
  • Vicuna: an open-source chatbot impressing GPT-4 with 90%* ChatGPT quality. LMSYS. Mar 30, 2023. URL: https://lmsys.org/blog/2023-03-30-vicuna/ [accessed 2024-09-12]
  • Taori R, Gulrajani I, Zhang T, Dubois Y, Li X, Guestrin C, et al. Alpaca: a strong, replicable instruction-following model. Stanford Center for Research on Foundation Models. 2023. URL: https://crfm.stanford.edu/2023/03/13/alpaca.html [accessed 2024-09-12]
  • Gobira M, Nakayama LF, Moreira R, Andrade E, Regatieri CV, Belfort RJ. Performance of ChatGPT-4 in answering questions from the Brazilian National Examination for Medical Degree Revalidation. Rev Assoc Med Bras (1992). 2023;69(10):e20230848. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Lai UH, Wu KS, Hsu TY, Kan JK. Evaluating the performance of ChatGPT-4 on the United Kingdom Medical Licensing Assessment. Front Med (Lausanne). Sep 19, 2023;10:1240915. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huang Y, Gomaa A, Semrau S, Haderlein M, Lettmaier S, Weissmann T, et al. Benchmarking ChatGPT-4 on a radiation oncology in-training exam and Red Journal Gray Zone cases: potentials and challenges for AI-assisted medical education and decision making in radiation oncology. Front Oncol. Sep 14, 2023;13:1265024. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mihalache A, Huang RS, Popovic MM, Muni RH. ChatGPT-4: an assessment of an upgraded artificial intelligence chatbot in the United States Medical Licensing Examination. Med Teach. Oct 15, 2023;46(3):1-7. [ CrossRef ]
  • Kleebayoon A, Wiwanitkit V. ChatGPT-4, medical education, and clinical exposure challenges. Indian J Orthop. Nov 21, 2023;57(11):1912. [ CrossRef ] [ Medline ]
  • Meskó B. The impact of multimodal large language models on health care's future. J Med Internet Res. Nov 02, 2023;25:e52865. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huang H, Zheng O, Wang D, Yin J, Wang Z, Ding S, et al. ChatGPT for shaping the future of dentistry: the potential of multi-modal large language model. Int J Oral Sci. Jul 28, 2023;15(1):29. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kline A, Wang H, Li Y, Dennis S, Hutch M, Xu Z, et al. Multimodal machine learning in precision health: a scoping review. NPJ Digit Med. Nov 07, 2022;5(1):171. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huang W, Tu S, Xu L. PFB-Diff: progressive feature blending diffusion for text-driven image editing. arXiv. Preprint posted online on June 28, 2023. [ FREE Full text ]
  • Zhu D, Chen J, Shen X, Li X, Elhoseiny M. MiniGPT-4: enhancing vision-language understanding with advanced large language models. arXiv. Preprint posted online on April 20, 2023. [ FREE Full text ]
  • Su Y, Lan T, Liu Y, Liu F, Yogatama D, Wang Y, et al. Language models can see: plugging visual controls in text generation. arXiv. Preprint posted online on May 5, 2022. [ FREE Full text ]
  • Koh JY, Fried D, Salakhutdinov R. Generating images with multimodal language models. arXiv. Preprint posted online on May 26, 2023. [ FREE Full text ]
  • Alayrac JB, Donahue J, Luc P, Miech A, Barr I, Hasson Y, et al. Flamingo: a visual language model for few-shot learning. arXiv. Preprint posted online on April 29, 2022. [ FREE Full text ]
  • Liu H, Li C, Wu Q, Lee YJ. Visual instruction tuning. arXiv. Preprint posted online on April 17, 2023. [ CrossRef ]
  • Li K, He Y, Wang Y, Li Y, Wang W, Luo P, et al. VideoChat: chat-centric video understanding. arXiv. Preprint posted online on May 10, 2023. [ CrossRef ]
  • Zhang H, Li X, Bing L. Video-LLaMA: an instruction-tuned audio-visual language model for video understanding. arXiv. Preprint posted online on June 5, 2023. [ CrossRef ]
  • Zhang D, Li S, Zhang X, Zhang J, Wang P, Zhou Y, et al. SpeechGPT: empowering large language models with intrinsic cross-modal conversational abilities. arXiv. Preprint posted online on May 18, 2023. [ CrossRef ]
  • Su Y, Lan T, Li H, Xu J, Wang Y, Cai D. PandaGPT: one model to instruction-follow them all. arXiv. Preprint posted online on May 25, 2023. [ CrossRef ]
  • Girdhar R, El-Nouby A, Liu Z, Singh M, Alwala KV, Joulin A, et al. ImageBind: one embedding space to bind them all. arXiv. Preprint posted online on May 9, 2023. [ CrossRef ]
  • Fei N, Lu Z, Gao Y, Yang G, Huo Y, Wen J, et al. Towards artificial general intelligence via a multimodal foundation model. Nat Commun. Jun 02, 2022;13(1):3094. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Buttazzo G. Rise of artificial general intelligence: risks and opportunities. Front Artif Intell. Aug 25, 2023;6:1226990. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16x16 words: transformers for image recognition at scale. arXiv. Preprint posted online on October 22, 2020. [ CrossRef ]
  • Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv. Preprint posted online on September 4, 2014. [ CrossRef ]
  • He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. Presented at: CVPR 2016; June 27-30, 2016; Las Vegas, NV. [ CrossRef ]
  • Albaqami H, Hassan GM, Datta A. Automatic detection of abnormal EEG signals using WaveNet and LSTM. Sensors (Basel). Jun 27, 2023;23(13):5960. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hannun A, Case C, Casper J, Catanzaro B, Diamos G, Elsen E, et al. Deep speech: scaling up end-to-end speech recognition. arXiv. Preprint posted online on December 17, 2014. [ CrossRef ]
  • Zhu S, Zheng J, Ma Q. MR-Transformer: multiresolution transformer for multivariate time series prediction. IEEE Trans Neural Netw Learn Syst. Nov 06, 2023;PP. (forthcoming). [ CrossRef ] [ Medline ]
  • Baidya R, Jeong H. Anomaly detection in time series data using reversible instance normalized anomaly transformer. Sensors (Basel). Nov 19, 2023;23(22):9272. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Bai N, Wang X, Han R, Wang Q, Liu Z. PAFormer: anomaly detection of time series with parallel-attention transformer. IEEE Trans Neural Netw Learn Syst. Dec 11, 2023;PP. (forthcoming). [ CrossRef ] [ Medline ]
  • Poplin R, Chang PC, Alexander D, Schwartz S, Colthurst T, Ku A, et al. A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol. Nov 2018;36(10):983-987. [ CrossRef ] [ Medline ]
  • Kelley DR, Snoek J, Rinn JL. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. Jul 2016;26(7):990-999. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Angermueller C, Lee HJ, Reik W, Stegle O. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biol. Apr 11, 2017;18(1):67. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wang X, Jiang Y, Bach N, Wang T, Huang Z, Huang F, et al. Automated concatenation of embeddings for structured prediction. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 2021. Presented at: ACL/IJCNLP 2021; August 1-6, 2021; Virtual Event. [ CrossRef ]
  • Xi C, Lu G, Yan J. Multimodal sentiment analysis based on multi-head attention mechanism. In: Proceedings of the 4th International Conference on Machine Learning and Soft Computing. 2020. Presented at: ICMLSC '20; January 17-19, 2020; Haiphong City, Viet Nam. [ CrossRef ]
  • Kiela D, Bhooshan S, Firooz H, Perez E, Testuggine D. Supervised multimodal bitransformers for classifying images and text. arXiv. Preprint posted online on September 6, 2019. [ CrossRef ]
  • Zadeh A, Chen M, Poria S, Cambria E, Morency LP. Tensor fusion network for multimodal sentiment analysis. arXiv. Preprint posted online on July 23, 2017. [ CrossRef ]
  • Chen Z, Cano AH, Romanou A, Bonnet A, Matoba K, Salvi F, et al. MEDITRON-70B: scaling medical pretraining for large language models. arXiv. Preprint posted online on November 27, 2023. [ CrossRef ]
  • Akrout M, Cirone KD, Vender R. Evaluation of vision LLMs GTP-4V and LLaVA for the recognition of features characteristic of melanoma. J Cutan Med Surg. 2024;28(1):98-99. [ CrossRef ] [ Medline ]
  • Rasheed H, Maaz M, Mullappilly SS, Shaker A, Khan S, Cholakkal H, et al. GLaMM: pixel grounding large multimodal model. arXiv. Preprint posted online on November 6, 2023.
  • Bai J, Bai S, Yang S, Wang S, Tan S, Wang P, et al. Qwen-VL: a versatile vision-language model for understanding, localization, text reading, and beyond. arXiv. Preprint posted online on August 24, 2023. [ CrossRef ]
  • Chen B, Xu Z, Kirmani S, Ichter B, Driess D, Florence P, et al. SpatialVLM: endowing vision-language models with spatial reasoning capabilities. arXiv. Preprint posted online on January 22, 2024. [ CrossRef ]
  • Chen Z, Wu J, Wang W, Su W, Chen G, Xing S, et al. InternVL: scaling up vision foundation models and aligning for generic visual-linguistic tasks. arXiv. Preprint posted online on December 21, 2023. [ CrossRef ]
  • Yuan Y, Li W, Liu J, Tang D, Luo X, Qin C, et al. Osprey: pixel understanding with visual instruction tuning. arXiv. Preprint posted online on December 15, 2023. [ CrossRef ]
  • Wei H, Kong L, Chen J, Zhao L, Ge Z, Yang J, et al. Vary: scaling up the vision vocabulary for large vision-language models. arXiv. Preprint posted online on December 11, 2023. [ CrossRef ]
  • Chen L, Li J, Dong X, Zhang P, He C, Wang J, et al. ShareGPT4V: improving large multi-modal models with better captions. arXiv. Preprint posted online on November 21, 2023. [ CrossRef ]
  • Li B, Zhang P, Yang J, Zhang Y, Pu F, Liu Z. OtterHD: a high-resolution multi-modality model. arXiv. Preprint posted online on November 7, 2023. [ CrossRef ]
  • Chen G, Shen L, Shao R, Deng X, Nie L. LION: empowering multimodal large language model with dual-level visual knowledge. arXiv. Preprint posted online on November 20, 2023. [ CrossRef ]
  • Lin Z, Liu C, Zhang R, Gao P, Qiu L, Xiao H, et al. SPHINX: the joint mixing of weights, tasks, and visual embeddings for multi-modal large language models. arXiv. Preprint posted online on November 13, 2023. [ CrossRef ]
  • Hu W, Xu Y, Li Y, Li W, Chen Z, Tu Z. BLIVA: a simple multimodal LLM for better handling of text-rich visual questions. Proc AAAI Conf Artif Intell. Mar 24, 2024;38(3):2256-2264. [ CrossRef ]
  • Zhao B, Wu B, He M, Huang T. SVIT: scaling up visual instruction tuning. arXiv. Preprint posted online on July 9, 2023. [ CrossRef ]
  • Li J, Chen D, Hong Y, Chen Z, Chen P, Shen Y, et al. CoVLM: composing visual entities and relationships in large language models via communicative decoding. arXiv. Preprint posted online on November 6, 2023. [ CrossRef ]
  • Li C, Wong C, Zhang S, Usuyama N, Liu H, Yang J, et al. LLaVA-Med: training a large language-and-vision assistant for biomedicine in one day. arXiv. Preprint posted online on June 1, 2023. [ CrossRef ]
  • Zhang X, Wu C, Zhao Z, Lin W, Zhang Y, Wang Y, et al. PMC-VQA: visual instruction tuning for medical visual question answering. arXiv. Preprint posted online on May 17, 2023. [ CrossRef ]
  • Moor M, Huang Q, Wu S, Yasunaga M, Zakka C, Dalmia Y, et al. Med-Flamingo: a multimodal medical few-shot learner. arXiv. Preprint posted online on July 27, 2023. [ CrossRef ]
  • He J, Liu G, Li P, Zhao Z, Zhong S. PeFoMed: parameter efficient fine-tuning on multimodal large language models for medical visual question answering. arXiv. Preprint posted online on April 16, 2024. [ CrossRef ]
  • Han T, Adams LC, Nebelung S, Kather JN, Bressem KK, Truhn D. Multimodal large language models are generalist medical image interpreters. medRxiv. Preprint posted online on December 22, 2023. [ CrossRef ]
  • Yang X, Chen A, PourNejatian N, Shin HC, Smith KE, Parisien C, et al. A large language model for electronic health records. NPJ Digit Med. Dec 26, 2022;5(1):194. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Jin M, Wang S, Ma L, Chu Z, Zhang JY, Shi X, et al. Time-LLM: time series forecasting by reprogramming large language models. arXiv. Preprint posted online on October 3, 2023. [ CrossRef ]
  • Chang C, Wang WY, Peng WC, Chen TF. LLM4TS: aligning pre-trained LLMs as data-efficient time-series forecasters. arXiv. Preprint posted online on August 16, 2023. [ CrossRef ]
  • Cao D, Jia F, Arik SO, Pfister T, Zheng Y, Ye W, et al. TEMPO: prompt-based generative pre-trained transformer for time series forecasting. arXiv. Preprint posted online on October 8, 2023. [ CrossRef ]
  • Xue H, Salim FD. PromptCast: a new prompt-based learning paradigm for time series forecasting. IEEE Trans Knowl Data Eng. Dec 13, 2023:1-14. [ CrossRef ]
  • Jin M, Wen Q, Liang Y, Zhang C, Xue S, Wang X, et al. Large models for time series and spatio-temporal data: a survey and outlook. arXiv. Preprint posted online on October 16, 2023. [ CrossRef ]
  • Gruver N, Finzi M, Qiu S, Wilson AG. Large language models are zero-shot time series forecasters. arXiv. Preprint posted online on October 11, 2023. [ CrossRef ]
  • Liu X, McDuff D, Kovacs G, Galatzer-Levy I, Sunshine J, Zhan J, et al. Large language models are few-shot health learners. arXiv. Preprint posted online on May 24, 2023. [ CrossRef ]
  • Rubenstein PK, Asawaroengchai C, Nguyen DD, Bapna A, Borsos Z, de Chaumont Quitry F, et al. AudioPaLM: a large language model that can speak and listen. arXiv. Preprint posted online on June 22, 2023. [ CrossRef ]
  • Borsos Z, Marinier R, Vincent D, Kharitonov E, Pietquin O, Sharifi M, et al. AudioLM: a language modeling approach to audio generation. IEEE/ACM Trans Audio Speech Lang Process. Jun 21, 2023;31:2523-2533. [ CrossRef ]
  • Deshmukh S, Elizalde B, Singh R, Wang H. Pengi: an audio language model for audio tasks. arXiv. Preprint posted online on May 19, 2023. [ CrossRef ]
  • Huang R, Li M, Yang D, Shi J, Chang X, Ye Z, et al. AudioGPT: understanding and generating speech, music, sound, and talking head. Proc AAAI Conf Artif Intell. Mar 24, 2024;38(21):23802-23804. [ CrossRef ]
  • Wang T, Zhou L, Zhang Z, Wu Y, Liu S, Gaur Y, et al. VioLA: conditional language models for speech recognition, synthesis, and translation. IEEE/ACM Trans Audio Speech Lang Process. Jul 29, 2024;32:3709-3716. [ CrossRef ]
  • Tang C, Yu W, Sun G, Chen X, Tan T, Li W, et al. SALMONN: towards generic hearing abilities for large language models. arXiv. Preprint posted online on October 20, 2023. [ CrossRef ]
  • Latif S, Shoukat M, Shamshad F, Usama M, Ren Y, Cuayáhuitl H, et al. Sparks of large audio models: a survey and outlook. arXiv. Preprint posted online on August 24, 2023. [ CrossRef ]
  • Pieri S, Mullappilly SS, Khan FS, Anwer RM, Khan S, Baldwin T, et al. BiMediX: bilingual medical mixture of experts LLM. arXiv. Preprint posted online on February 20, 2024. [ CrossRef ]
  • Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. Feb 15, 2020;36(4):1234-1240. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Gu Y, Tinn R, Cheng H, Lucas M, Usuyama N, Liu X, et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthc. Oct 15, 2021;3(1):1-23. [ CrossRef ]
  • Maaz M, Rasheed H, Khan S, Khan FS. Video-ChatGPT: towards detailed video understanding via large vision and language models. arXiv. Preprint posted online on June 8, 2023. [ CrossRef ]
  • Li Y, Wang C, Jia J. LLaMA-VID: an image is worth 2 tokens in large language models. arXiv. Preprint posted online on November 28, 2023. [ CrossRef ]
  • Jiang B, Chen X, Liu W, Yu J, Yu G, Chen T. MotionGPT: human motion as a foreign language. arXiv. Preprint posted online on June 26, 2023. [ CrossRef ]
  • Li L, Gan Z, Lin K, Lin CC, Liu Z, Liu C, et al. LAVENDER: unifying video-language understanding as masked language modeling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. Presented at: CVPR 2023; June 17-24, 2023; Vancouver, BC. [ CrossRef ]
  • Song E, Chai W, Wang G, Zhang Y, Zhou H, Wu F, et al. MovieChat: from dense token to sparse memory for long video understanding. arXiv. Preprint posted online on July 31, 2023. [ CrossRef ]
  • Yang A, Nagrani A, Seo PH, Miech A, Pont-Tuset J, Laptev I, et al. Vid2Seq: large-scale pretraining of a visual language model for dense video captioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. Presented at: CVPR 2023; June 17-24, 2023; Vancouver, BC. [ CrossRef ]
  • Chen G, Zheng YD, Wang J, Xu J, Huang Y, Pan J, et al. VideoLLM: modeling video sequence with large language models. arXiv. Preprint posted online on May 22, 2023. [ CrossRef ]
  • Huang B, Wang X, Chen H, Song Z, Zhu W. VTimeLLM: empower LLM to grasp video moments. arXiv. Preprint posted online on November 30, 2023. [ CrossRef ]
  • Liu J, Yang M, Yu Y, Xu H, Li K, Zhou X. arge language models in bioinformatics: applications and perspectives. arXiv. Preprint posted online on January 8, 2024. [ FREE Full text ] [ CrossRef ]
  • Benegas G, Batra SS, Song YS. DNA language models are powerful predictors of genome-wide variant effects. Proc Natl Acad Sci U S A. Oct 31, 2023;120(44):e2311219120. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ji Y, Zhou Z, Liu H, Davuluri RV. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics. Aug 09, 2021;37(15):2112-2120. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhou Z, Ji Y, Li W, Dutta P, Davuluri R, Liu H. DNABERT-2: efficient foundation model and benchmark for multi-species genome. arXiv. Preprint posted online on June 26, 2023. [ CrossRef ]
  • Zhang D, Zhang W, Zhao Y, Zhang J, He B, Qin C, et al. DNAGPT: a generalized pre-trained tool for versatile DNA sequence analysis tasks. BioRxiv. Preprint posted online on January 04, 2024. [ CrossRef ]
  • Luo H, Shan W, Chen C, Ding P, Luo L. Improving language model of human genome for DNA-protein binding prediction based on task-specific pre-training. Interdiscip Sci. Mar 2023;15(1):32-43. [ CrossRef ] [ Medline ]
  • An W, Guo Y, Bian Y, Ma H, Yang J, Li C, et al. MoDNA: motif-oriented pre-training for DNA language model. In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. 2022. Presented at: BCB '22; August 7-10, 2022; Northbrook, Illinois. [ CrossRef ]
  • Chen K, Zhou Y, Ding M, Wang Y, Ren Z, Yang Y. Self-supervised learning on millions of pre-mRNA sequences improves sequence-based RNA splicing prediction. BioRxiv. Preprint posted online on February 3, 2023. [ CrossRef ]
  • Feng H, Wang S, Wang Y, Ni X, Yang Z, Hu X, et al. LncCat: an ORF attention model to identify LncRNA based on ensemble learning strategy and fused sequence information. Comput Struct Biotechnol J. Feb 08, 2023;21:1433-1447. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Yamada K, Hamada M. Prediction of RNA-protein interactions using a nucleotide language model. Bioinform Adv. Apr 07, 2022;2(1):vbac023. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhang L, Qin X, Liu M, Liu G, Ren Y. BERT-m7G: a transformer architecture based on BERT and stacking ensemble to identify RNA N7-methylguanosine sites from sequence information. Comput Math Methods Med. Aug 25, 2021;2021:7764764. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Babjac AN, Lu Z, Emrich SJ. CodonBERT: using BERT for sentiment analysis to better predict genes with low expression. In: Proceedings of the 14th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. 2023. Presented at: BCB '23; September 3-6, 2023; Houston, TX. [ CrossRef ]
  • Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans Pattern Anal Mach Intell. Oct 2022;44(10):7112-7127. [ CrossRef ] [ Medline ]
  • Madani A, Krause B, Greene ER, Subramanian S, Mohr BP, Holton JM, et al. Large language models generate functional protein sequences across diverse families. Nat Biotechnol. Aug 2023;41(8):1099-1106. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Xu M, Yuan X, Miret S, Tang J. ProtST: multi-modality learning of protein sequences and biomedical texts. arXiv. Preprint posted online on January 28, 2023. [ CrossRef ]
  • Brandes N, Ofer D, Peleg Y, Rappoport N, Linial M. ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics. Apr 12, 2022;38(8):2102-2110. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wang Z, Zhang Q, Yu H, Hu S, Jin X, Gong Z, et al. Multi-level protein structure pre-training with prompt learning. In: Proceedings of the Eleventh International Conference on Learning Representations. 2023. Presented at: ICLR 2023; May 1-5, 2023; Kigali, Rwanda.
  • Wang S, You R, Liu Y, Xiong Y, Zhu S. NetGO 3.0: protein language model improves large-scale functional annotations. Genomics Proteomics Bioinformatics. Apr 2023;21(2):349-358. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Flamholz ZN, Biller SJ, Kelly L. Large language models improve annotation of prokaryotic viral proteins. Nat Microbiol. Feb 2024;9(2):537-549. [ CrossRef ] [ Medline ]
  • Driess D, Xia F, Sajjadi MS, Lynch C, Chowdhery A, Ichter B, et al. PaLM-E: an embodied multimodal language model. arXiv. Preprint posted online on March 6, 2023. [ CrossRef ]
  • Moon S, Madotto A, Lin Z, Nagarajan T, Smith M, Jain S, et al. AnyMAL: an efficient and scalable any-modality augmented language model. arXiv. Preprint posted online on September 27, 2023. [ CrossRef ]
  • Wu S, Fei H, Qu L, Ji W, Chua TS. NExT-GPT: any-to-any multimodal LLM. arXiv. Preprint posted online on September 11, 2023. [ CrossRef ]
  • Lyu C, Wu M, Wang L, Huang X, Liu B, Du Z, et al. Macaw-LLM: multi-modal language modeling with image, audio, video, and text integration. arXiv. Preprint posted online on June 15, 2023. [ CrossRef ]
  • Han J, Gong K, Zhang Y, Wang J, Zhang K, Lin D, et al. OneLLM: one framework to align all modalities with language. arXiv. Preprint posted online on December 6, 2023. [ CrossRef ]
  • Ye Q, Xu H, Ye J, Yan M, Hu A, Liu H, et al. mPLUG-Owl2: revolutionizing multi-modal large language model with modality collaboration. arXiv. Preprint posted online on November 7, 2023. [ CrossRef ]
  • Gemini Team Google. Gemini: a family of highly capable multimodal models. arXiv. Preprint posted online on December 19, 2023. [ CrossRef ]
  • Lu MY, Chen B, Williamson DF, Chen RJ, Liang I, Ding T, et al. A visual-language foundation model for computational pathology. Nat Med. Mar 2024;30(3):863-874. [ CrossRef ] [ Medline ]
  • CONCH: a vision-language foundation model for computational pathology. GitHub. URL: https://github.com/mahmoodlab/CONCH [accessed 2024-08-02]
  • Li Y, Quan R, Zhu L, Yang Y. Efficient multimodal fusion via interactive prompting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. Presented at: CVPR 2023; June 17-24, 2023; Vancouver, BC. [ CrossRef ]
  • Ye J, Hai J, Song J, Wang Z. Multimodal data hybrid fusion and natural language processing for clinical prediction models. medRxiv. Preprint posted online on August 25, 2023. [ CrossRef ]
  • Quan Z, Sun T, Su M, Wei J. Multimodal sentiment analysis based on cross-modal attention and gated cyclic hierarchical fusion networks. Comput Intell Neurosci. Aug 9, 2022;2022:4767437. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Yuan H, Xu H. Deep multi-modal fusion network with gated unit for breast cancer survival prediction. Comput Methods Biomech Biomed Engin. May 2024;27(7):883-896. [ CrossRef ] [ Medline ]
  • Zhou H, Liu F, Gu B, Zou X, Huang J, Wu J, et al. A survey of large language models in medicine: progress, application, and challenge. arXiv. Preprint posted online on November 9, 2023
  • Johnson AE, Bulgarelli L, Shen L, Gayles A, Shammout A, Horng S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. Jan 03, 2023;10(1):1. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wu W, Wang Y, Liu Q, Wang G, Zhang J. Wavelet-improved score-based generative model for medical imaging. IEEE Trans Med Imaging. Mar 2024;43(3):966-979. [ CrossRef ] [ Medline ]
  • Li W, Yang J, Min X. Next-day medical activities recommendation model with double attention mechanism using generative adversarial network. J Healthc Eng. Nov 7, 2022;2022:6334435. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Barreto AG, de Oliveira JM, Gois FN, Cortez PC, de Albuquerque VH. A new generative model for textual descriptions of medical images using transformers enhanced with convolutional neural networks. Bioengineering (Basel). Sep 19, 2023;10(9):1098. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhao L, Huang J. A distribution information sharing federated learning approach for medical image data. Complex Intell Systems. Mar 29, 2023. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Yue G, Wei P, Zhou T, Song Y, Zhao C, Wang T, et al. Specificity-aware federated learning with dynamic feature fusion network for imbalanced medical image classification. IEEE J Biomed Health Inform. Sep 26, 2023;PP. (forthcoming). [ CrossRef ] [ Medline ]
  • Wang R, Lai J, Zhang Z, Li X, Vijayakumar P, Karuppiah M. Privacy-preserving federated learning for internet of medical things under edge computing. IEEE J Biomed Health Inform. Feb 2023;27(2):854-865. [ CrossRef ] [ Medline ]
  • Ma Y, Wang J, Yang J, Wang L. Model-heterogeneous semi-supervised federated learning for medical image segmentation. IEEE Trans Med Imaging. Jan 01, 2024;PP. (forthcoming). [ CrossRef ] [ Medline ]
  • Mantey EA, Zhou C, Anajemba JH, Arthur JK, Hamid Y, Chowhan A, et al. Federated learning approach for secured medical recommendation in internet of medical things using homomorphic encryption. IEEE J Biomed Health Inform. Jun 2024;28(6):3329-3340. [ CrossRef ] [ Medline ]
  • Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, et al. LoRA: low-rank adaptation of large language models. arXiv. Preprint posted online on June 17, 2021. [ CrossRef ]
  • Dettmers T, Pagnoni A, Holtzman A, Zettlemoyer L. QLoRA: efficient finetuning of quantized LLMs. arXiv. Preprint posted online on May 23, 2023. [ CrossRef ]
  • Han S, Mao H, Dally WJ. Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv. Preprint posted online on Oct 1, 2015. [ CrossRef ]
  • Lu Y, Li C, Lu H, Yang J, Gao J, Shen Y. An empirical study of scaling instruct-tuned large multimodal models. arXiv. Preprint posted online on September 18, 2023. [ CrossRef ]
  • Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv. Preprint posted online on March 9, 2015. [ CrossRef ]
  • Huang S, Dong L, Wang W, Hao Y, Singhal S, Ma S, et al. Language is not all you need: aligning perception with language models. arXiv. Preprint posted online on February 27, 2023. [ CrossRef ]
  • Chang H, Zhang H, Barber J, Maschinot A, Lezama J, Jiang L, et al. Muse: text-to-image generation via masked generative transformers. arXiv. Preprint posted online on January 2, 2023. [ CrossRef ]
  • Chen J, Yu J, Ge C, Yao L, Xie E, Wu Y, et al. PixArt-α: fast training of diffusion transformer for photorealistic text-to-image synthesis. arXiv. Preprint posted online on September 30, 2023. [ CrossRef ]
  • Cheng Y, Wang D, Zhou P, Zhang T. A survey of model compression and acceleration for deep neural networks. arXiv. Preprint posted online on October 23, 2017. [ CrossRef ]
  • Jouppi NP, Young C, Patil N, Patterson D, Agrawal G, Bajwa R, et al. In-datacenter performance analysis of a tensor processing unit. arXiv. Preprint posted online on April 16, 2017. [ FREE Full text ] [ CrossRef ]
  • Zhao H, Chen H, Yang F, Liu N, Deng H, Cai H, et al. Explainability for large language models: a survey. ACM Trans Intell Syst Technol. Feb 22, 2024;15(2):1-38. [ CrossRef ]
  • Hoover B, Strobelt H, Gehrmann S. exBERT: a visual analysis tool to explore learned representations in transformers models. arXiv. Preprint posted online on October 11, 2019. [ FREE Full text ] [ CrossRef ]
  • Wu T, Ribeiro MT, Heer J, Weld D. Polyjuice: generating counterfactuals for explaining, evaluating, and improving models. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 2021. Presented at: ACL-IJCNLP 2021; August 1-6, 2021; Online. [ CrossRef ]
  • Chen H, Covert IC, Lundberg SM, Lee SI. Algorithms to estimate shapley value feature attributions. Nat Mach Intell. May 22, 2023;5:590-601. [ CrossRef ]
  • Yordanov Y, Kocijan V, Lukasiewicz T, Camburu OM. Few-shot out-of-domain transfer learning of natural language explanations in a label-abundant setup. arXiv. Preprint posted online on December 12, 2021. [ FREE Full text ]
  • Luo S, Ivison H, Han SC, Poon J. Local interpretations for explainable natural language processing: a survey. ACM Comput Surv. Apr 25, 2024;56(9):1-36. [ CrossRef ]
  • Dalvi F, Durrani N, Sajjad H, Belinkov Y, Bau A, Glass J. What is one grain of sand in the desert? Analyzing individual neurons in deep NLP models. Proc AAAI Conf Artif Intell. Jul 17, 2019;33(01):6309-6317. [ CrossRef ]
  • Hewitt J, Manning CD. A structural probe for finding syntax in word representations. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019. Presented at: NAACL-HLT 2019; June 2-7, 2019; Minneapolis, MN.
  • Ong JC, Chang SY, William W, Butte AJ, Shah NH, Chew LS, et al. Ethical and regulatory challenges of large language models in medicine. Lancet Digit Health. Jun 2024;6(6):e428-e432. [ CrossRef ]
  • Price N. Problematic interactions between AI and health privacy. Utah Law Rev. 2021;2021(4):925-936. [ CrossRef ]
  • Gerke S, Minssen T, Cohen G. Chapter 12 - Ethical and legal challenges of artificial intelligence-driven healthcare. In: Bohr A, Memarzadeh K, editors. Artificial Intelligence in Healthcare. Cambridge, MA. Academic Press; 2020:295-336.
  • Becker J, Gerke S, Cohen IG. The development, implementation, and oversight of artificial intelligence in health care: legal and ethical issues. In: Valdés E, Lecaros JA, editors. Handbook of Bioethical Decisions. Volume I. Cham, Switzerland. Springer; 2023.
  • Ma W, Scheible H, Wang B, Veeramachaneni G, Chowdhary P, Sun A, et al. Deciphering stereotypes in pre-trained language models. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. Presented at: EMNLP 2023; December 6-10, 2023; Singapore, Singapore.
  • Christos K. Multimodal LLMs: fairness and transparency concerns. Media Analysis, Verification and Retrieval Group (MeVer). Nov 2, 2023. URL: https://mever.gr/post/fairness-and-transparency-concerns-in-multimodal-llms/ [accessed 2024-09-13]
  • He K, Mao R, Lin Q, Ruan Y, Lan X, Feng M, et al. A survey of large language models for healthcare: from data, technology, and applications to accountability and ethics. arXiv. Preprint posted online on October 9, 2023. [ FREE Full text ] [ CrossRef ]
  • Navigli R, Conia S, Ross B. Biases in large language models: origins, inventory, and discussion. ACM J Data Inf Qual. Jun 22, 2023;15(2):1-21. [ CrossRef ]
  • Lee N, Bang Y, Lovenia H, Cahyawijaya S, Dai W, Fung P. Survey of social bias in vision-language models. arXiv. Preprint posted online on September 24, 2023. [ FREE Full text ]
  • Reddy AG, Bachu S, Dash S, Sharma C, Sharma A, Balasubramanian VN. On counterfactual data augmentation under confounding. arXiv. Preprint posted online on May 29, 2023. [ FREE Full text ]
  • Chen RJ, Wang JJ, Williamson DF, Chen TY, Lipkova J, Lu MY, et al. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng. Jun 2023;7(6):719-742. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Cohen IG. Informed consent and medical artificial intelligence: what to tell the patient? Georgetown Law J. May 1, 2020;108:1425-1469. [ FREE Full text ] [ CrossRef ]
  • Friesen P, Douglas-Jones R, Marks M, Pierce R, Fletcher K, Mishra A, et al. Governing AI-driven health research: are IRBs up to the task? Ethics Hum Res. Mar 2021;43(2):35-42. [ CrossRef ] [ Medline ]
  • Carlini N, Tramer F, Wallace E, Jagielski M, Herbert-Voss A, Lee K, et al. Extracting training data from large language models. arXiv. Preprint posted online on December 14, 2020. [ FREE Full text ]
  • Erlich Y, Shor T, Pe'er I, Carmi S. Identity inference of genomic data using long-range familial searches. Science. Nov 09, 2018;362(6415):690-694. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Na L, Yang C, Lo CC, Zhao F, Fukuoka Y, Aswani A. Feasibility of reidentifying individuals in large national physical activity data sets from which protected health information has been removed with use of machine learning. JAMA Netw Open. Dec 07, 2018;1(8):e186040. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Gymrek M, McGuire AL, Golan D, Halperin E, Erlich Y. Identifying personal genomes by surname inference. Science. Jan 18, 2013;339(6117):321-324. [ CrossRef ] [ Medline ]
  • Aghajanyan A, Huang B, Ross C, Karpukhin V, Xu H, Goyal N, et al. CM3: a causal masked multimodal model of the internet. arXiv. Preprint posted online on January 19, 2022. [ FREE Full text ]
  • Huang Y, Yuan Q, Sheng X, Yang Z, Wu H, Chen P, et al. AesBench: an expert benchmark for multimodal large language models on image aesthetics perception. arXiv. Preprint posted online on January 16, 2024. [ FREE Full text ]
  • Wang X, Zhou Y, Liu X, Lu H, Xu Y, He F, et al. Mementos: a comprehensive benchmark for multimodal large language model reasoning over image sequences. arXiv. Preprint posted online on January 19, 2024. [ FREE Full text ]
  • Fu C, Chen P, Shen Y, Qin Y, Zhang M, Lin X, et al. MME: a comprehensive evaluation benchmark for multimodal large language models. arXiv. Preprint posted online on June 23, 2023. [ FREE Full text ]
  • Yang X, Wu W, Feng S, Wang M, Wang D, Li Y, et al. MM-BigBench: evaluating multimodal models on multimodal content comprehension tasks. arXiv. Preprint posted online on October 13, 2023. [ FREE Full text ]
  • Ge W, Chen S, Chen GH, Chen Z, Chen J, Yan S, et al. MLLM-Bench: evaluating multimodal LLMs with per-sample criteria. arXiv. Preprint posted online on November 23, 2023. [ FREE Full text ]
  • Li S, Zhang Y, Zhao Y, Wang Q, Jia F, Liu Y, et al. VLM-Eval: a general evaluation on video large language models. arXiv. Preprint posted online on November 20, 2023. [ FREE Full text ]
  • The act texts. EU Artificial Intelligence Act. URL: https://artificialintelligenceact.eu/the-act/ [accessed 2024-09-13]



Vowel Length in the Romance Languages

  • Michele Loporcaro, University of Zurich
  • https://doi.org/10.1093/acrefore/9780199384655.013.715
  • Published online: 18 September 2024

This article classifies all the Romance languages and dialects with regard to the phonological property of vowel length, phonemic and allophonic, considering its relationship to its correlate in phonetic substance, namely vocoid duration. It examines the rise and fall of vowel length in Romance, starting with a reference to Latin, where vowel length was contrastive, as was consonant length, while in the Latin-Romance transition the former became dependent on the latter as part of a syllable-structure conditioning. Consequently, Proto-Romance can be reconstructed as featuring an allophonic rule of open syllable lengthening (OSL) that lengthens stressed vowels in non-final open syllables, identical to the rule operating today in standard Italian, all Italo-Romance dialects south of the La Spezia-Rimini line, and Sardinian, which I will label type A languages. Due to a series of later changes, the remaining Romance languages and dialects lost this allophonic rule, giving rise to two further types: on the one hand, languages lacking both contrastive gemination and contrastive vowel length (type B, including Daco- and Ibero-Romance all along their documented history, as well as, today, most of Gallo-Romance); and, on the other hand, languages lacking contrastive gemination but displaying contrastively long versus short vowels (type C, including most of northern Italo-Romance as well as part of Raeto- and Gallo-Romance, but which arguably stretched from the Apennines to the North Sea in the Middle Ages).

This article examines all relevant sources of evidence, from Latin epigraphic inscriptions to experimental phonetic measurements, showing that they all chime perfectly with the picture just outlined. Needless to say, while the data from modern languages and dialects are observational, and those from older stages delivered by the written record need interpretation, reconstruction is, by definition, constructional: It can be supported by several sources of evidence but is in itself always provisional. Therefore, the story to be told here must be considered the best approximation to the historical truth that the present author deems reconstructible based on the available evidence.

  • syllable structure
  • consonant gemination
  • dialect variation



Conducting Multidisciplinary SHS Research in the Field of Cancer: Strengths and Barriers?

This paper summarizes the work presented at the Cancéropôle Ile-de-France's annual SHS research seminar on the theme "Pluridisciplinarity and methods for SHS research in the field of cancer." After clarifying the concepts of pluri-, inter-, and transdisciplinarity, it aimed to describe how this type of research is carried out in practice, addressing successively: the role of stakeholders and their respective positions, the need for a shared language, the various temporalities involved and the sharing of tasks, the interview and analysis methods, as well as the involvement of patient-researchers. It highlighted the personal qualities required to practice this type of research, such as psychological flexibility and adaptability, a strong desire for collaborative work, acceptance of risk, and a humble stance.



  • SEP 25, 2024

Best Accelerated Online Speech Pathology Degree Programs for 2024

by Imed Bouchrika, PhD

Co-Founder and Chief Data Scientist

Choosing the best accelerated online speech pathology degree is a critical decision for aspiring speech-language pathologists who want to fast-track their careers. With the growing demand for professionals in this field, finding a program that balances quality, flexibility, and speed can make a significant difference in career success.

This article provides the essential information to guide you through that process, highlighting top programs that combine comprehensive academic training with the flexibility of online learning, so you can make an informed decision and find the right path to your career goals in speech-language pathology.

Key things you should know about getting an accelerated online speech pathology degree

  • The job market for speech-language pathologists (SLPs) is strong and growing. The U.S. Bureau of Labor Statistics (BLS) projects an 18% growth rate for SLP positions from 2023 to 2033.
  • As of May 2023, the median annual wage for SLPs was approximately $89,000, according to the BLS. Salaries can vary based on location, experience, and work setting. SLPs in specialized medical roles or with extensive experience may earn higher wages.
  • SLPs work in various settings including schools, hospitals, rehabilitation centers, private practices, and through telepractice. This diversity offers a broad range of career opportunities.

What can I expect from an accelerated online speech pathology degree?

The accelerated format aims to get you into the workforce quickly while maintaining the quality and depth of education needed for professional practice.

The program will cover the same core material as traditional degrees but in a shorter time frame, often one to two years. Expect a rigorous and fast-paced learning experience with overlapping courses and frequent assessments.

Clinical hours will be supervised by licensed and certified speech-language pathologists (SLPs). You will need to document your hours and receive feedback on your performance.

Where can I work with an accelerated online speech pathology degree?

Schools are a common workplace, where you can provide speech and language therapy to students from preschool through high school, helping them overcome communication disorders that impact their academic performance and social interactions. Hospitals and rehabilitation centers also offer opportunities, where you will work with patients recovering from strokes, brain injuries, or surgeries, providing crucial therapy to help them regain their communication abilities.

Additionally, private practices and clinics allow for specialized and individualized care, enabling you to focus on specific disorders or populations, such as pediatric or adult speech therapy. Telepractice is another growing field, especially suited for the online nature of your degree, where you can offer remote therapy sessions to clients across different locations.

How much can I make with an accelerated online speech pathology degree?

With an accelerated online speech pathology degree, you can expect to earn a competitive salary in the field. The median annual wage for speech-language pathologists is approximately $89,000, though this can vary based on factors such as location, experience, and work setting.

SLPs in high-demand areas or specialized medical roles may earn more, with salaries potentially exceeding $100,000 in urban centers or specialized positions. Additionally, many roles offer benefits such as health insurance, retirement plans, and paid time off, enhancing the overall compensation package.

Table of Contents

  • Best accelerated online speech pathology degree programs
  • How long does it take to complete an accelerated online speech pathology degree program?
  • How does an accelerated online speech pathology degree compare to an on-campus program?
  • What is the average cost of an accelerated online speech pathology degree program?
  • What are the financial aid options for students enrolling in an accelerated online speech pathology degree program?
  • What are the prerequisites for enrolling in an accelerated online speech pathology degree program?
  • What courses are typically in an accelerated online speech pathology degree program?
  • What types of specializations are available in accelerated online speech pathology degree programs?
  • How do you choose the best accelerated online speech pathology degree program?
  • What career paths are available for graduates of accelerated online speech pathology degree programs?
  • What is the job market for graduates with an accelerated online speech pathology degree?
  • Other things you should know about an accelerated online speech pathology degree
  • How do we rank schools?

How do we rank schools?

We are aware that committing to an accelerated online speech pathology degree program is a big decision. You have to give the financial commitment serious thought. Our team of specialists at Research.com has ranked accelerated online speech pathology programs with the goal of empowering you with insights derived from data.

This ranking is built upon a comprehensive and transparent methodology. We leverage data from trusted sources such as the National Center for Education Statistics' Integrated Postsecondary Education Data System (IPEDS), Peterson's databases, including their Distance Learning Licensed Data Set, and the U.S. Department of Education's College Scorecard database. This multifaceted approach ensures we capture a holistic view of each program, allowing you to compare and contrast based on key factors relevant to your needs.

Best accelerated online speech pathology degree programs

1. University of Kansas

The University of Kansas's accelerated master's degree in speech-language pathology is designed to prepare highly skilled clinicians ready to enter the profession. The program allows qualified students to earn a bachelor's degree in speech, language, and hearing sciences and disorders, with the MA typically completed in the fifth year or later. The program meets the certification standards of the American Speech-Language-Hearing Association, allowing graduates to proceed to their clinical fellowship year. Upon successful completion of the clinical fellowship, graduates are awarded the Certificate of Clinical Competence in Speech-Language Pathology.

  • Program Length: 5 years
  • Tracks/concentrations: Speech-Language Pathology
  • Cost per Credit: $421.15
  • Required Credits to Graduate: 51 credit hours
  • Accreditation: Council on Academic Accreditation in Audiology and Speech-Language Pathology.

2. Baylor University

The Baylor University online speech-language pathology graduate program is designed for students who have earned or are in the process of earning a bachelor’s degree (BA or BS) in Communication Sciences and Disorders or have completed a leveling program. Students are required to complete 45 trimester hours, pass a comprehensive exam, complete 400 clinical hours, and take the National Praxis examination at the end of the program. Full-time students can complete the program in 5 trimesters (about 20 months), while part-time students typically finish in 7 trimesters (about 28 months). 

  • Cost per Credit: $2,000
  • Required Credits to Graduate: 45 credit hours
  • Accreditation: Council on Academic Accreditation in Audiology and Speech-Language Pathology

3. University of Central Florida

The Communication Sciences and Disorders (MA) – Accelerated BS to MA Track offered by the University of Central Florida is one semester shorter than the traditional speech pathology program. Students can complete up to 16 credits of graduate-level coursework while still earning their bachelor's degree. Internships can be arranged locally in coordination with the school. Online courses may be available, depending on the student's eligibility and course availability. Students have the option to pursue either a thesis or non-thesis track.

  • Cost per Credit: $369.65
  • Required Credits to Graduate: 72 credit hours

4. University of Rhode Island

The accelerated degree in speech-language pathology offers qualified students the opportunity to earn both a B.S. in Communicative Disorders and an M.S. in Speech-Language Pathology in just five years, instead of the typical six. Admission to this competitive program allows students to begin graduate coursework during their senior year, helping them save on tuition and enter the job market sooner. Course requirements align with those of the standard two-year master’s program.

  • Cost per Credit: $887
  • Required Credits to Graduate: 54 credit hours

5. University of Akron

The Accelerated Degree Pathway (ADP) program is a 5-year bridge between the undergraduate and graduate degrees in speech-language pathology (SLP) at the University of Akron. Selected students can complete 9 credits of graduate coursework during their third undergraduate year. Upon meeting all undergraduate degree and graduate admission requirements, students will seamlessly transition into the campus-based SLP graduate program. This pathway allows chosen students to earn both their undergraduate and graduate degrees from the University of Akron in just 5 years.

  • Cost per Credit: $782
  • Required Credits to Graduate: 58 credit hours

How long does it take to complete an accelerated online speech pathology degree program?

An accelerated online speech pathology degree program typically takes about two to three years to complete, depending on the program structure and the student's pace.

Before entering a master’s program, students must have a bachelor's degree, usually in communication sciences and disorders or a related field. This typically takes four years but can vary if the student completes an accelerated or prior degree.

A master's degree is essential. The standard time for a master’s degree in speech pathology is two years. However, accelerated programs may allow students to complete it in about 18 to 24 months.

Even in accelerated programs, students must complete the required supervised clinical hours, which are essential for certification and licensing.  Employment of speech-language pathologists is projected to grow 18 percent from 2023 to 2033.

The duration of accelerated SLP online programs also depends on whether students are studying full-time or part-time and their ability to meet clinical and academic requirements quickly.

[Chart: Employment outlook for speech pathologists through 2033]

How does an accelerated online speech pathology degree compare to an on-campus program?

An online accelerated speech pathology degree offers flexibility and convenience compared to traditional on-campus programs, but there are differences to consider in terms of structure, experience, and resources.

Flexibility

  • Online Accelerated Program: Students can often complete coursework at their own pace, allowing them to balance work or other responsibilities. Accelerated programs are designed to help students complete the degree faster, typically in about 18–24 months.
  • On-Campus Program: Typically follows a more structured schedule with set class times. On-campus programs may offer fewer accelerated options and generally take about 2 years to complete.

Course Delivery

  • Online Accelerated Program: These programs often allow students to work from any location, making them ideal for those with geographic or time constraints.
  • On-Campus Program: Classes are conducted in-person, which may offer a more immersive learning environment with direct access to professors and classmates for face-to-face interactions.

Clinical Experience

  • Online Accelerated Program: Students are still required to complete clinical practicum hours. These are often arranged at local facilities near the student’s home. Some online programs have partnerships with clinics nationwide to help with placements.
  • On-Campus Program: Clinical hours are typically more straightforward to arrange, as the program usually has established partnerships with local facilities.

Interaction with Faculty and Peers

  • Online Accelerated Program: Interaction with faculty and peers is usually via discussion boards, emails, or video calls, which may feel less personal compared to in-person meetings.
  • On-Campus Program: Direct in-person interactions allow for spontaneous discussions and networking opportunities with peers and faculty, fostering a more collaborative learning environment.

Program Length

  • Online Accelerated Program: Designed to be faster-paced, allowing students to complete their degree in a shorter time frame. This can be more intense, requiring strong time management skills.
  • On-Campus Program: While still rigorous, the pacing is generally more traditional, with a focus on a two-year completion timeframe.
Cost

  • Online Accelerated Program: This may be more affordable due to savings on commuting, housing, and other on-campus fees. However, some programs may charge similar tuition rates to on-campus offerings.
  • On-Campus Program: Typically more expensive when accounting for the full cost of living near campus, commuting, and university fees.

What is the average cost of an accelerated online speech pathology degree program?

The average cost of an accelerated online speech pathology degree program can vary significantly depending on the institution, but generally, tuition costs range from $30,000 to $60,000 for the full program.

In-state tuition for online programs at public universities can be more affordable, typically ranging from $30,000 to $50,000. Out-of-state students may pay more, though some online programs offer the same rate for all students. In comparison, tuition at private universities for accelerated online programs is often higher, ranging from $50,000 to $60,000 or more.

Many online programs charge technology or platform fees, which may add an additional $1,000 to $3,000 over the duration of the program. The cost of textbooks and course materials can vary but may range from $500 to $2,000 throughout the program.

While some programs may include the cost of clinical placements, others might charge additional fees for setting up and managing local practicum experiences. About 13,700 openings for speech-language pathologists are projected each year, on average, through 2033.

Overall, students can expect to pay between $30,000 and $60,000 for an accelerated online speech pathology program, with variations depending on factors like institution type, additional fees, and residency status.
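For readers who want to sanity-check these ranges against the per-credit figures listed for the ranked programs above, here is a minimal Python sketch that simply multiplies each program's published cost per credit by its required credits. The school names and numbers are copied from the program summaries in this guide for illustration only, and the totals exclude technology fees, books, and clinical placement costs, so treat them as tuition-only estimates.

```python
# Tuition-only estimate: cost per credit x required credits, using the figures
# quoted in the program summaries above (fees, books, and placements excluded).
programs = {
    "University of Kansas": (421.15, 51),
    "Baylor University": (2000.00, 45),
    "University of Central Florida": (369.65, 72),
    "University of Rhode Island": (887.00, 54),
    "University of Akron": (782.00, 58),
}

for school, (cost_per_credit, credits) in programs.items():
    total = cost_per_credit * credits
    print(f"{school}: ~${total:,.0f} ({credits} credits at ${cost_per_credit:,.2f}/credit)")
```

For example, at $421.15 per credit and 51 credit hours, the University of Kansas program works out to roughly $21,500 in tuition before fees.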

What are the financial aid options for students enrolling in an accelerated online speech pathology degree program?

Students enrolling in an accelerated online speech pathology degree program have several financial aid options to help cover tuition and other expenses.

Federal Financial Aid

FAFSA (Free Application for Federal Student Aid): Completing the FAFSA is the first step to accessing federal aid. 

  • Federal Direct Unsubsidized Loans: Graduate students are eligible for unsubsidized loans, which accrue interest from the time the loan is disbursed. The maximum loan amount per academic year is typically around $20,500.
  • Federal Grad PLUS Loans: These loans cover any remaining costs after other financial aid is applied. They have higher interest rates but no borrowing limit beyond the cost of attendance.
  • Pell Grants and Other Grants: Although Pell Grants are typically for undergraduates, some graduate-level students may qualify for other federal grants or funding based on need or merit.

Scholarships

  • University-Specific Scholarships: Many universities offer scholarships for students in speech pathology programs. These can be based on academic achievement, financial need, or demographic factors. Online students are often eligible for the same scholarships as on-campus students.
  • Professional Associations: Organizations like the American Speech-Language-Hearing Association (ASHA) and the National Student Speech Language Hearing Association (NSSLHA) offer scholarships specifically for students pursuing speech-language pathology.
  • Private Scholarships: There are numerous private organizations that provide scholarships for speech pathology students.
Grants

  • State and Institutional Grants: Some states and universities offer grants for graduate students. These are typically based on financial need and do not need to be repaid.
  • Professional Organization Grants: ASHA and other professional organizations sometimes offer grants or fellowships to graduate students pursuing speech pathology, often to support specific research or clinical interests.

Private Student Loans

  • Private Lenders: If federal aid and scholarships are insufficient, private student loans from lenders can help cover the remaining costs. Interest rates and repayment terms vary, so it is essential to compare lenders before borrowing.
  • Eligibility and Terms: These loans typically require a credit check and may have higher interest rates than federal loans. Some lenders offer specific loans for students in healthcare-related fields.
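To make the interplay between the federal loan options described earlier and total program cost concrete, the short Python sketch below applies the roughly $20,500-per-year Direct Unsubsidized cap cited above and treats Grad PLUS borrowing as covering whatever tuition remains. It is a simplified planning illustration, not financial advice: the $45,000 tuition figure and two-year timeline are placeholder assumptions, and it ignores scholarships, grants, and living costs.

```python
# Simplified sketch of how federal loans might cover an accelerated program's tuition.
# The cap below is the approximate per-year Direct Unsubsidized limit cited in this guide;
# all other numbers are illustrative placeholders.
UNSUBSIDIZED_ANNUAL_CAP = 20_500

def federal_loan_plan(total_tuition: float, years: float) -> dict:
    """Split tuition between Direct Unsubsidized loans and Grad PLUS borrowing."""
    unsubsidized = min(total_tuition, UNSUBSIDIZED_ANNUAL_CAP * years)
    grad_plus = max(0.0, total_tuition - unsubsidized)  # Grad PLUS can cover the remainder
    return {"unsubsidized": unsubsidized, "grad_plus": grad_plus}

plan = federal_loan_plan(total_tuition=45_000, years=2)
print(f"Unsubsidized: ${plan['unsubsidized']:,.0f}, Grad PLUS: ${plan['grad_plus']:,.0f}")
```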

[Chart: Job outlook for speech-language pathologists]

What are the prerequisites for enrolling in an accelerated online speech pathology degree program?

As with other college majors, enrolling in an accelerated online speech pathology degree program typically requires meeting several academic and professional prerequisites. These prerequisites ensure that students are prepared for the rigorous coursework and clinical training involved in the program. Here are the common requirements:

Bachelor’s Degree

Most programs require applicants to have a bachelor's degree in Communication Sciences and Disorders (CSD) or a related field. This ensures that students already have foundational knowledge in speech and language development, hearing science, and anatomy.

If your degree is in an unrelated field, many programs may still accept you, but you might need to complete prerequisite coursework in areas like phonetics, speech science, audiology, and language development before starting the master’s program. Some schools offer post-baccalaureate or leveling courses to help non-CSD students meet these requirements.

For students without a CSD degree, the following courses are typically required:

  • Introduction to Communication Disorders
  • Speech and Language Development
  • Anatomy and Physiology of Speech and Hearing Mechanisms
  • Audiology and Hearing Science
  • Neurological Bases of Communication
  • Speech Science

These prerequisites can sometimes be completed online before entering the full master’s program. If you've taken similar courses, you may need to submit syllabi for review to ensure they meet the program's standards.

Minimum GPA

Most programs require a minimum GPA of 3.0 on a 4.0 scale. Competitive programs may have higher GPA standards, particularly in prerequisite coursework.

GRE Scores (Optional for Some Programs)

Some speech pathology programs require Graduate Record Examination (GRE) scores as part of the application process, though many are moving away from this requirement, especially for online and accelerated programs.

If required, a competitive GRE score will strengthen your application, particularly if your GPA is below the program’s ideal range.

Letters of Recommendation

Applicants are usually asked to submit letters of recommendation from academic or professional references. These should come from individuals who can speak to your academic abilities, work ethic, and potential for success in the field of speech pathology.

Ideal references include professors, supervisors from clinical settings, or professionals in the communication sciences field.

Personal Statement or Statement of Purpose

A well-written personal statement or statement of purpose is a crucial part of the application. It should highlight your reasons for pursuing speech pathology, your academic and professional background, and your long-term career goals.

The personal statement is also an opportunity to explain why you’re interested in the accelerated and online format of the program.

Clinical Observation Hours

Some programs require applicants to have completed 25 hours of clinical observation under the supervision of a licensed speech-language pathologist (SLP). This is usually a requirement for speech pathology students during their undergraduate studies, but for those who did not complete it, certain programs may provide ways to fulfill this requirement.

What courses are typically in an accelerated online speech pathology degree program?

An accelerated online speech pathology degree program covers a range of courses designed to provide students with in-depth knowledge of communication disorders, assessment techniques, and therapeutic interventions. These programs, some of which overlap with affordable online master's in special education programs, focus on preparing students for clinical practice and certification as speech-language pathologists (SLPs).

Foundational Courses

These courses provide the necessary background in the science of communication and swallowing disorders:

  • Anatomy and Physiology of Speech and Hearing Mechanisms: Covers the anatomical and physiological structures involved in speech, hearing, and swallowing, including the respiratory, phonatory, and auditory systems.
  • Phonetics and Phonology: Explores the study of sounds in speech, including how they are produced and perceived, along with transcription skills using the International Phonetic Alphabet (IPA).
  • Speech Science: Introduces students to the acoustics, perception, and production of speech. Students learn about the physical properties of sound and how speech is generated and processed by the auditory system.
  • Neuroscience for Communication Disorders: Focuses on the neural bases of speech, language, and hearing, with an emphasis on understanding the brain's role in communication and related disorders.

Core Speech Pathology Courses

These courses address the identification, assessment, and treatment of various speech and language disorders:

  • Child Language Development and Disorders: Examines the normal development of language in children and common language disorders, including developmental language delays and specific language impairments.
  • Articulation and Phonological Disorders: Teaches students how to assess and treat articulation and phonological disorders in children and adults, focusing on speech sound disorders.
  • Fluency Disorders: Focuses on the nature, causes, and treatment of stuttering and other fluency disorders across the lifespan.
  • Voice Disorders: Covers the diagnosis and treatment of disorders affecting vocal quality, pitch, loudness, and resonance.
  • Motor Speech Disorders: Explores conditions like dysarthria and apraxia of speech, which affect motor planning and execution in speech production.
  • Speech Sound Disorders in Children: Focuses on assessing and treating speech sound disorders in children, including phonological processes and articulation errors.

Language and Cognitive Disorders

These courses deal with the diagnosis and intervention of language and cognitive issues, particularly in adults:

  • Aphasia and Related Disorders: Studies acquired language disorders resulting from brain injury, such as stroke, and emphasizes treatment approaches for patients with aphasia.
  • Cognitive-Communication Disorders: Covers disorders affecting cognitive processes related to communication, such as attention, memory, problem-solving, and executive function, often in patients with brain injuries or neurodegenerative diseases.
  • Right Hemisphere and Traumatic Brain Injury (TBI) Communication Disorders: Focuses on communication disorders associated with right hemisphere damage and traumatic brain injuries, including challenges with pragmatics, discourse, and social communication.
  • Dementia and Communication: Explores the impact of dementia and other neurodegenerative conditions on communication abilities, with emphasis on management and intervention strategies.

Swallowing and Dysphagia

  • Dysphagia: This course covers the evaluation and treatment of swallowing disorders (dysphagia) in children and adults, focusing on the anatomy and physiology of swallowing and therapeutic interventions.

Audiology and Hearing Disorders

  • Hearing Assessment and Intervention: Introduces students to audiological assessment and hearing disorders. Covers basic audiological testing and interventions like hearing aids and cochlear implants for people with hearing loss.
  • Aural Rehabilitation: Focuses on treatment approaches for individuals with hearing loss, including speech reading, auditory training, and the use of assistive technology.

What types of specializations are available in accelerated online speech pathology degree programs?

Accelerated online speech pathology degree programs offer a range of specializations that allow students to focus on specific populations, disorders, or treatment approaches within the field. (For context, 69% of SLPs worked full time in 2023.) Below are some common specializations available in speech pathology programs:

  • Pediatric Speech-Language Pathology: Focuses on diagnosing and treating speech, language, and communication disorders in infants, children, and adolescents.
  • Adult Speech-Language Pathology: Concentrates on disorders affecting adult populations, often related to neurological conditions, aging, or traumatic injury.
  • Swallowing and Dysphagia: Specializes in assessing and treating swallowing disorders across all age groups.
  • Bilingual Speech-Language Pathology: Focuses on treating clients who are bilingual or come from diverse linguistic and cultural backgrounds.
  • Autism Spectrum Disorders (ASD): Specializes in working with individuals with autism and related developmental disabilities.

How do you choose the best accelerated online speech pathology degree program?

Choosing the best accelerated online speech pathology degree program, just like choosing the best online communications master's, requires careful consideration of several factors that align with your career goals, learning preferences, and lifestyle. For context, the lowest 10% of speech-language pathologists earned less than $57,910 in 2023, while the highest 10% earned more than $129,930.

Here's a guide to your decision-making process:

Accreditation

Ensure the program is accredited by the Council on Academic Accreditation in Audiology and Speech-Language Pathology (CAA), which is part of the American Speech-Language-Hearing Association (ASHA). Accreditation ensures the program meets the educational standards required for licensure and certification.

Graduating from a CAA-accredited program is a requirement for obtaining your Certificate of Clinical Competence in Speech-Language Pathology (CCC-SLP), which is necessary for most jobs in the field.

Program Format and Flexibility

Determine if you can handle the demands of an accelerated program, which condenses the curriculum into a shorter time frame (often 1-2 years). These programs are intense, requiring more coursework per semester.

Some programs offer fully asynchronous courses, where you can complete the coursework at your own pace, while others require synchronous (live) sessions at set times.

Even in an online program, you will need to complete clinical practicum hours. Ensure the program supports clinical placements in your geographic area, and check if they help facilitate those placements.

Look for programs that offer robust student support, including access to online libraries, academic advising, technical assistance, and tutoring.

Curriculum and Specializations

Review the program’s curriculum to ensure it covers the areas of speech pathology you are most interested in. Check if the program offers specializations or electives in areas that align with your career goals, such as swallowing disorders, autism spectrum disorders, or telepractice.

Make sure the program includes key courses that are essential for certification, such as courses in articulation, fluency, voice disorders, and dysphagia.

Program Length

Confirm how long the program will take to complete. Accelerated programs, including those at more affordable online colleges, usually take one to two years, depending on the number of courses per semester and any prerequisite coursework you need to complete.

Some accelerated programs offer part-time options, which might be beneficial if you’re working or have other commitments. However, these will extend the overall time to completion.

Cost and Financial Aid

Compare the total tuition costs of each program, including any additional fees. Some programs may appear affordable but have hidden costs.

Research what financial aid is available, including scholarships, grants, and loans. Many online programs offer the same federal financial aid options as on-campus programs. Check if the school offers flexible payment plans or employer reimbursement options.

Career Paths and Salaries for Speech-Language Pathologists

Speech pathologists can find employment in a variety of sectors:

  • School-Based Speech-Language Pathologist: Works in K-12 schools to diagnose and treat speech and language disorders, developing Individualized Education Programs (IEPs) for students.
  • Medical Speech-Language Pathologist: Provides therapy in hospitals or rehabilitation centers for patients with speech, language, and swallowing disorders due to medical conditions like stroke or brain injuries.
  • Private Practice Speech-Language Pathologist: Owns or works in a private clinic, offering individualized speech therapy services to clients of all ages and specializations.
  • Pediatric Speech-Language Pathologist: Specializes in diagnosing and treating communication disorders in infants, toddlers, and children, including developmental delays and autism.
  • Adult Speech-Language Pathologist: Focuses on treating communication and swallowing disorders in adults, often related to neurological conditions such as stroke or dementia.
  • Voice Specialist Speech-Language Pathologist: Focuses on diagnosing and treating voice disorders, often working with professional voice users like singers, actors, or teachers.
  • Researcher or Academic Speech-Language Pathologist: Conducts research in communication sciences or teaches at universities, contributing to advancements in speech pathology.

The job market for graduates with an accelerated online speech pathology degree is generally robust and growing, reflecting the increasing demand for speech-language pathologists across various settings. 

According to the BLS, the median annual wage for SLPs was around $89,000 as of May 2023. Salaries vary with experience, location, and work setting; medical SLPs and those in specialized roles may command higher salaries than practitioners in other settings.

There is a high demand for SLPs due to an aging population, increased awareness of speech and language disorders, and a rise in diagnoses of conditions like autism and neurodegenerative diseases.

The U.S. Bureau of Labor Statistics (BLS) projects 18% employment growth for SLPs from 2023 to 2033, much faster than the average for all occupations.

Many SLPs work in educational settings, such as K-12 schools and early intervention programs. Some even pursue an online doctorate in education. Schools are a major source of employment, with a consistent demand for services to support students with communication disorders.

Hospitals, rehabilitation centers, and skilled nursing facilities offer opportunities for SLPs to work with patients recovering from injuries or managing chronic conditions. There are also opportunities for SLPs to start their own practices or work in established private clinics, providing flexibility and autonomy in their work.

The rise of telehealth has expanded job opportunities for remote SLP services, allowing practitioners to work with clients from various locations.

There is a significant need for SLPs specializing in pediatric care, particularly in schools and early intervention settings. Those with expertise in areas like swallowing disorders or neurogenic communication disorders are in demand in healthcare settings. Bilingual SLPs or those with expertise in working with diverse populations are increasingly sought after.

Here’s What Graduates Have to Say About Their Accelerated Online Speech Pathology Degree

Earning my speech pathology degree online through an accelerated program allowed me to balance my studies with work and family life. The flexibility of online learning let me progress at my own pace, and I still felt connected to my professors and peers through virtual discussions. Completing the program in less time was a game-changer for my career! - Sandra

The online accelerated speech pathology program was incredibly intense, but it gave me the skills and clinical experience I needed to start my career quickly. Being able to arrange my practicum locally made it convenient, and the online format offered me the flexibility to focus on my studies while managing my other commitments. I'm so glad I took this path! - Dylan

Studying speech pathology online was the perfect solution for me. The accelerated pace kept me motivated, and I loved how interactive the online platform was. It felt just as engaging as an on-campus program, but I had the added bonus of completing it faster and from the comfort of my home. - Cathy

Key Findings

  • Employment of speech-language pathologists is projected to grow 18 percent from 2023 to 2033 (a short arithmetic sketch follows this list).
  • About 13,700 openings for speech-language pathologists are projected each year, on average, from 2023 to 2033.
  • In 2023, 69% of SLPs worked full time, and 66% were primarily clinical service providers.
  • The lowest 10% of speech pathologists earned less than $57,910, and the highest 10% earned more than $129,930 in 2023.
  • A total of 6,577 speech-language pathology degrees were awarded in 2022.
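
To show how the projection figures above fit together, here is a minimal Python sketch (referenced in the first bullet) that applies the projected 18% ten-year growth rate to a hypothetical 2023 employment baseline and derives the implied compound annual growth rate. The baseline number is a made-up placeholder for illustration, not a BLS statistic.

    # Illustrative arithmetic only: BASELINE_2023 is a hypothetical placeholder,
    # not a BLS figure. The 18% growth rate and the 2023-2033 window come from
    # the projections cited above.
    BASELINE_2023 = 100_000          # hypothetical number of SLP jobs in 2023
    TEN_YEAR_GROWTH = 0.18           # projected growth, 2023 to 2033

    projected_2033 = BASELINE_2023 * (1 + TEN_YEAR_GROWTH)
    jobs_added = projected_2033 - BASELINE_2023
    annual_rate = (1 + TEN_YEAR_GROWTH) ** (1 / 10) - 1  # implied compound annual rate

    print(f"Projected 2033 employment: {projected_2033:,.0f}")
    print(f"Jobs added over the decade: {jobs_added:,.0f}")
    print(f"Implied compound annual growth: {annual_rate:.2%}")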

What degree is best for speech pathology?

The best degree for becoming a speech-language pathologist is a Master's in Speech-Language Pathology (MS or MA) from an accredited program. A master’s degree is the minimum requirement for certification and licensure in the field, as it provides the necessary advanced training in communication disorders, clinical techniques, and patient care. Most states and the American Speech-Language-Hearing Association (ASHA) require this degree for eligibility to obtain the Certificate of Clinical Competence in Speech-Language Pathology (CCC-SLP).

For students looking to enter the field, an undergraduate degree in communication sciences and disorders (CSD) or a related field is a common starting point. The graduate program includes both academic coursework and supervised clinical practicums, ensuring students are well-prepared to diagnose and treat speech and language disorders across various populations and settings.

How does accreditation affect licensure and certification in speech pathology?

To be eligible for certification by the American Speech-Language-Hearing Association (ASHA) through the Certificate of Clinical Competence in Speech-Language Pathology (CCC-SLP), graduates must complete their degree from an accredited program. Degrees from non-accredited programs may not meet the certification requirements, which can prevent graduates from obtaining the CCC-SLP.

Many employers prefer or require job candidates to have graduated from an accredited program, as it provides assurance of the candidate’s educational background and preparedness for the role. 

What are the typical practicum or internship requirements in an accelerated online speech pathology program?

In an accelerated online speech pathology program, students are typically required to complete a certain number of supervised clinical practicum hours, often ranging from 300 to 400 hours, to gain hands-on experience. Programs often allow students to arrange their clinical placements in approved local settings, such as schools, hospitals, or private practices, with remote supervision and support from faculty.

Additionally, the practicum requirements must meet the standards set by the American Speech-Language-Hearing Association (ASHA) for certification. Students may also need to complete internships or externships during the program, usually in the final stages, to work under the supervision of a licensed speech-language pathologist. 
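
As a purely illustrative planning aid, here is a minimal Python sketch that totals supervised practicum hours logged across placements and compares them with a program's required minimum. The 400-hour figure and the placement names are hypothetical examples, not the requirements of any particular program or of ASHA.

    # Illustrative only: the placement names and the 400-hour minimum are hypothetical.
    REQUIRED_HOURS = 400

    logged_hours = {
        "local elementary school": 120,
        "outpatient rehabilitation clinic": 150,
        "private practice": 95,
    }

    total = sum(logged_hours.values())
    remaining = max(REQUIRED_HOURS - total, 0)

    print(f"Total supervised hours logged: {total}")
    print(f"Hours still needed to reach {REQUIRED_HOURS}: {remaining}")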

How are clinical practicum hours arranged in an accelerated online speech pathology program?

In an accelerated online speech pathology program, clinical practicum hours are typically arranged by the student in collaboration with the program’s clinical coordinator. Students are responsible for finding approved local clinical sites, such as schools, hospitals, or rehabilitation centers, where they can complete their hours under the supervision of a licensed speech-language pathologist. 

Remote supervision is common in online programs, where faculty or assigned supervisors monitor progress through virtual check-ins, video submissions, and evaluation reports from on-site supervisors. This flexible approach allows students to complete their practicum in a location that fits their schedule and geographical area, while still meeting the clinical competency standards set by ASHA for certification.

References:

  • American Speech-Language-Hearing Association. (2023). 2023 SLP health care survey: Workforce.
  • American Speech-Language-Hearing Association. (n.d.). American Speech-Language-Hearing Association.
  • U.S. Bureau of Labor Statistics. (2023). Occupational employment and wages, May 2022: Speech-language pathologists (29-1127).
  • U.S. Bureau of Labor Statistics. (2023). Speech-language pathologists: Occupational outlook handbook.

Resubmission Applications

A resubmission is an unfunded application that has been modified following initial review and resubmitted for consideration.

  • A resubmission application can follow a competing new, renewal, or revision application (A0) that was not selected for funding (including applications "not discussed" in review).
  • Only a single resubmission (A1) of a competing new, renewal, or revision application (A0) will be accepted.
  • A resubmission has a suffix in its application identification number, e.g., A1. (Resubmissions were previously called “amended” applications, hence “A1”.)
  • Resubmission must be listed in the Application Types Allowed section of the funding opportunity in order to submit a resubmission application.
  • You may resubmit using a different PA, PAR, or PAS program announcement that accepts resubmissions, provided eligibility and other requirements are met.
  • You must submit a new application (not a resubmission) if switching between a program announcement and request for application (RFA) or if changing activity codes (see NOT-OD-18-197 for exceptions).
  • You may also submit an unfunded application again as a new application, without first submitting a resubmission.
  • Before a resubmission application can be submitted, the PD/PI must have received the summary statement from the previous review.
  • You must submit the resubmission application within 37 months of the new, renewal, or revision application it follows; after that window, the idea must be submitted as a new application (a short date-check sketch follows this list).
  • After an unsuccessful resubmission (A1), you may submit the idea as a new application.
  • After an unsuccessful submission and/or resubmission of a renewal application, your only option for a subsequent application is to submit as a new application. While you can submit a renewal resubmission application after an unsuccessful renewal application, you cannot submit a second renewal application following an unsuccessful renewal application.
  • Resubmission applications follow the same timeline as other applications (~9 months to award).
  • The NIH will not accept duplicate or highly overlapping applications under review at the same time, except in certain limited circumstances.
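
To make the 37-month window concrete, here is a minimal Python sketch (referenced in the bullet on the 37-month limit above) that checks whether a planned resubmission date falls within 37 calendar months of the original submission date. The dates and the month-counting convention are illustrative assumptions, not an official NIH calculation; consult the funding opportunity and eRA resources for authoritative guidance.

    from datetime import date

    def months_between(start: date, end: date) -> int:
        """Whole calendar months elapsed from start to end (illustrative convention)."""
        months = (end.year - start.year) * 12 + (end.month - start.month)
        if end.day < start.day:  # a partial final month does not count
            months -= 1
        return months

    def within_resubmission_window(a0_submitted: date, a1_planned: date,
                                   limit_months: int = 37) -> bool:
        """True if the planned A1 submission falls within the 37-month limit."""
        return months_between(a0_submitted, a1_planned) <= limit_months

    # Example with made-up dates:
    a0 = date(2022, 6, 5)   # hypothetical A0 submission date
    a1 = date(2025, 7, 1)   # hypothetical planned A1 submission date
    print(within_resubmission_window(a0, a1))  # True: roughly 36 months later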

Application Requirements for Resubmission Applications

  • Resubmission applications must be submitted through Grants.gov to NIH using ASSIST, Workspace, or an institutional system-to-system solution.
  • You may need to make significant changes to the resubmission, compared to the new application that it follows.
  • You may include a cover letter, though it is not required.
  • Select "Resubmission" in Type of Application field (box 8) on the SF424 R&R form.
  • Do not markup changes within application attachments (e.g., do not highlight, color, bold or italicize changes in Research Strategy).
  • Responds to the issues and criticism raised in the summary statement.
  • Is one page or less in length, unless specified otherwise in the funding opportunity or is specified differently on our table of page limits .
  • In a multi-project application, you must submit an introduction with the Overall component, but introductions within the other components are optional.
  • In a resubmission of a revision application, the introduction must, within the standard page limit, describe the nature and impact of the revision and summarize the changes made to the application since the last submission.
  • Career development and fellowship applicants must arrange for resubmission of the three reference letters required for those programs.

Policy Details

Related notices:

  • NIH/AHRQ Application Submission/Resubmission Policy
  • Overlap with another application pending appeal of initial peer review
  • Time limit of 37 months for resubmissions

History of NIH’s policy on resubmissions

We encourage applicants to discuss questions about resubmission with the NIH Institute/Center scientific contact associated with your grant application. Contacts for your grant can be found in your eRA Commons account.

General questions concerning this policy may be directed to the Division of Receipt and Referral at the Center for Scientific Review.

COMMENTS

  1. Research Methods in Language Teaching and Learning

    Research Methods in Sociolinguistics: A Practical Guide Edited by Janet Holmes and Kirk Hazen 6. Research Methods in Sign Language Studies: A Practical Guide Edited by Eleni Orfanidou, Bencie Woll, and Gary Morgan 7. Research Methods in Language Policy and Planning: A Practical Guide Edited by Francis Hult and David Cassels Johnson 8.

  2. PDF INTRODUCTION TO RESEARCH METHODOLOGIES IN LANGUAGE STUDIES

    Language research is an area of interest for many students and lecturers of Faculty of Letters. This article is an attempt to describe various research methodologies in language studies in a simple way. The research methodologies covered include experimental research, quasi experimental research, ethnography, and case study.

  3. PDF Research Methods in Linguistics

    2 Ethics in linguistic research Penelope Eckert 11 3 Judgment data Carson T. Schütze and Jon Sprouse 27 4 Fieldwork for language description Shobhana Chelliah 51 5 Population samples Isabelle Buchstaller and Ghada Khattab 74 6 Surveys and interviews Natalie Schilling 96 7 Experimental research design Rebekha Abbuhl, Susan Gass, and Alison ...

  4. PDF Research Methods in Language Acquisition: Principles, Procedures, and

    Introduction. The purpose of this manual is to introduce the concepts, principles, and procedures of a unique field of linguistic study, that of language acquisition. Our objective is to provide an overview of scientific methods for the study of language acquisition and to present a systematic, scientifically sound approach to this study.

  5. Introducing linguistic research

    Authors. Svenja Voelkel, Johannes Gutenberg Universität Mainz, Germany Svenja Völkel is senior researcher/lecturer in linguistics at the University of Mainz, Germany. She has long-standing research and teaching experience in a broad field of topics, including language typology, anthropological linguistics, language contact, and cognitive linguistics.

  6. Linguistics: Research Methods

    Research Methods in Sociolinguistics: A Practical Guide by Hazen & Holmes, eds. Publication Date: 2014. This single-volume guide equips students of sociolinguistics with a full set of methodological tools including data collection and analysis techniques, explained in clear and accessible terms by leading experts.

  7. (PDF) Research methods in linguistics: An overview

    Different methods have been developed to collect and analyze data, resulting in two research paradigms: qualitative and quantitative research. However, Brown (2001) argues that a more ...

  8. Diversity of research methods and strategies in language teaching

    The six articles published in this issue of Language Teaching Research present research that varies widely in terms of the type of research design used as well as methodology of data collection, time frame and research objectives. They range from highly controlled experimental and/or cross-sectional studies that attempt to explain relationships or differences between and among groups, to ...

  9. Research methods in applied linguistics and language education: current

    As the field advances, significant growth in the quality, quantity, and diversity in research perspectives is attested by the increasing number of publications in research methods in applied linguistics (e.g. Paltridge and Phakiti 2015; Riazi 2016) and second language studies (e.g. Mackey and Gass 2015). As suggested by McKinley and Rose in the ...

  10. Research Approaches in Applied Linguistics

    Her main areas of interest are language acquisition and language socialization, qualitative research methods, classroom discourse in a variety of educational contexts, including second/foreign language courses, mainstream and L2-immersion content-based courses, and the teaching, learning, and use of English and Chinese as international languages.

  11. Research Methods in Language Attitudes

    Attitudes towards spoken, signed, and written language are of significant interest to researchers in sociolinguistics, applied linguistics, communication studies, and social psychology. This is the first interdisciplinary guide to traditional and cutting-edge methods for the investigation of language attitudes.

  12. Applied linguistics: Research methods for language teaching

    The literature review process includes six steps: understanding, organizing, dialoguing/critiquing, synthesizing, reporting, and becoming (part of the literature). Once you have conducted your literature review you will have a clear sense of topics and themes in relevant fields. 3. Research questions.

  13. Exploring Research Methods in Language Learning-teaching Studies

    (Troudi & Nunan, 1995) In addition, in education and learning practice, there are 2 types of research of four research methods,, namely Research and Development (R & D) and Classroom Action ...

  14. Research Methods in Language Teaching and Learning: A Practical Guide

    A practical guide to the methodologies used in language teaching and learning research, providing expert advice and real-life examples from leading TESOL researchers Research Methods in Language Teaching and Learning provides practical guidance on the primary research methods used in second language teaching, learning, and education. Designed to support researchers and students in language ...

  15. 1

    The chapter introduces the three types of methods by means of which language attitudes can be investigated - that is, the analysis of the societal treatment of language, direct methods, and indirect methods - and the key overarching issues in language attitudes research which are covered in the book (i.e. regarding different community types ...

  16. Research Methods in Language Teaching and Learning

    Research Methods in Language Teaching and Learning: A Practical Guide. Editor(s): ... Online and Hybrid Research Using Case Study and Ethnographic Approaches: A Decision-Making Dialogue Between Two Researchers (Pages: 87-102) ...

  17. The Relevance of Language for Scientific Research

    The historical framework of the origin of the relevance of language for scientific research is the previous step for its philosophical analysis, which considers a number of aspects of special importance. (1) Language is one of the constitutive elements of science. It accompanies the other elements that configure science: the structure in which scientific theories are articulated, scientific ...

  18. PDF RESEARCH LANGUAGE

    research as a process, or series of integrated steps. Understanding this process requires familiarity with several terms, namely constructs, variables, and hypotheses. These basic concepts will be introduced with many concrete examples. They are part of the "language" of research. Understanding the research language is sometimes demanding ...

  19. PDF THE DIFFERENT LANGUAGES OF QUALITATIVE RESEARCH

    Qualitative research designs tend to work with a relatively small number of cases. Generally speaking, qualitative researchers are prepared to sacrifice scope for detail. Moreover, even what counts as detail tends to vary between qualitative and quantitative researchers. The latter typically seek detail in certain aspects of corre-

  20. Language Of Research

    Language Of Research. Learning about research is a lot like learning about anything else. To start, you need to learn the jargon people use, the big controversies they fight over, and the different factions that define the major players. We'll start by considering five really big multi-syllable words that researchers sometimes use to describe ...

  21. PDF THE LANGUAGE OF RESEARCH

    The role of the research question. To guide the direction of the study. To identify facts that are relevant and those that are not. To suggest which form of research design is likely to be most appropriate. To provide a framework for organizing and evaluating the conclusions that result.

  22. Investigating language acquisition in communication sciences and

    Language acquisition research has a long tradition of including individuals with disabilities as research subjects. Numerous early works had the goal of using what was "missing" in their development to inform theories of how the system "ought" to function. ... Consequently, speech-language therapists often rely on informal or ad-hoc ...

  23. PDF The Language of Research

    research creates the theory (research-then-theory) (Berg 2004) or inductive logic (see Box 2-1). In reality, the two types of logic are actually extensions of one another. Observation may lead to theory construction, which then leads to more observation in order to test the theory. Therefore, even research that is initially inductive ...

  24. What Is Qualitative Research?

    Revised on September 5, 2024. Qualitative research involves collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas for research. Qualitative research is the opposite of quantitative research, which ...

  25. Journal of Medical Internet Research

    In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews ...

  26. Vowel Length in the Romance Languages

    Due to a series of later changes, the remaining Romance languages and dialects lost this allophonic rule, which gave rise to either of the two further types: on the one hand, languages lacking contrastive gemination and contrastive vowel length (type B, including Daco- and Ibero-Romance all along their documented history, as well as, today ...

  27. Resource library

    Explore thousands of resources produced by FHI 360's experts that illustrate the impact of our work and underpin our data-driven solutions. Resources can be filtered by technical area, country, type, and language.

  28. Conducting Multidisciplinary SHS Research in the Field of Cancer

    This paper summarizes the work held at the Cancéropôle Ile-de-France's annual SHS research seminar on the theme: Pluridisciplinarity and methods for SHS research in the field of cancer. After clarifying the concepts of pluri-, inter-, and transdisciplinarity, it aimed to describe how this type of research is carried out in practice, addressing successively: the role of stakeholders and ...
