This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
What’s new in academic international medicine? Artificial intelligence and machine learning is here to stay, forcing rapid adoption and adaptation
4
Citations
3
Authors
2023
Year
Abstract
The terms “artificial intelligence,” “machine learning,” and “ChatGPT” have become increasingly popular over the past several years, corresponding to the emerging new era of nonhuman resources for education and research. The relationship between generative pre-trained transformers (GPTs, as in ChatGPT) and artificial intelligence (AI)/machine learning (ML) can be simplistically thought of as that of an “end-user product” (e.g., the ChatGPT interface), a “data interpretation tool” (e.g., AI software code), and a “data acquisition tool” (e.g., ML functionality).[1–3] Within the academic community, we are witnessing early implementations of “large language models” (LLMs), such as GPT-2 through GPT-4, Bidirectional Encoder Representations from Transformers (BERT), and others.[4–6] Engineers have advanced these sophisticated AI models to the point where they not only learn and analyze human language but also process queries and respond in a strikingly human-like way.[7,8] Because of these powerful capabilities, the use of LLMs has sparked significant debate and raised many questions regarding their appropriate versus inappropriate applications, their evolving functions and roles over time, and ways to minimize misuse of these emerging capabilities, especially in the field of medicine and medical research [Table 1].[9–11]

Table 1: Pros and cons of large language models

We first take a closer look at several areas in which LLMs may be acceptable and even beneficial.[12,13] For one, many see LLMs as useful in promoting health literacy, primarily by enabling one’s ability to take in health information, understand it, and ultimately act on it.[7] There is also the potential for medical text summarization, with the goal of condensing medical knowledge such as scientific articles, clinical notes, or reports into short summaries.
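Real medical summarization systems would use an LLM abstractively; as a minimal, self-contained illustration of the condensation step only, the toy extractive summarizer below (not an LLM, and using a purely hypothetical clinical note) scores sentences by word frequency and keeps the top few.

```python
from collections import Counter
import re

def extractive_summary(text, n_sentences=2):
    """Toy extractive summarizer: score each sentence by summed word
    frequency across the document, keep the top n in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w] for w in re.findall(r"[a-z]+", sentences[i].lower())),
    )
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)

# Hypothetical clinical note, for illustration only
note = ("Patient presented with acute chest pain. ECG showed ST elevation. "
        "Troponin was markedly elevated. Patient was taken for emergent "
        "catheterization. The weather that day was unremarkable.")
print(extractive_summary(note, 2))
```

The frequency heuristic is deliberately simple; it stands in for the far richer abstraction an LLM performs, but it conveys the same goal of surfacing the most salient sentences from a longer record.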
Such summaries help researchers and clinicians access information quickly so that appropriate decisions or plans can be made more efficiently.[14] Within the research realm is an LLM called GatorTron, from the University of Florida, which was trained on more than 90 billion words of text from electronic health records. This implementation has the potential to spearhead the next frontier in clinical research by utilizing natural language processing to ascertain protocol-specific information while ensuring that various regulatory requirements are met.[15,16] Appropriately deployed LLMs can also assist in medical and administrative decision-making, triage capacity, and other mission-critical aspects of health-care operations.[7,17] The ability to offload administrative burdens associated with modern clinical practice has the potential to improve patient and provider satisfaction and may contribute to a better and safer overall health-care system.[18]

It is also important to provide the public with safe and verified sources of medical information, and appropriately deployed LLMs could serve this role quite well. At the same time, any such implementation would require the utmost care to ensure that bias and misinformation are minimized, if not eliminated, in the model’s user-facing output.[14,19] In this context, LLMs have the potential to ultimately replace search engines such as Google as resources for medical triage or patient inquiries about symptoms, conditions, diagnoses, and treatments. Other realistic and more concrete applications of LLMs in medicine include facilitating clinical documentation.
This may include, but is not limited to, creating discharge summaries; generating clinic, operation, and procedure notes; obtaining insurance preauthorization; or summarizing research papers in the context of a specific health-care encounter.[20,21] In one emerging use case, LLMs may also assist physicians in diagnosing conditions based on medical records, images, and laboratory results, and in suggesting treatment options or plans. Systematic reviews have highlighted other potential benefits as well, such as improved scientific writing, enhanced research equity, streamlined health-care workflows, cost savings, and improved personalized learning in medical education.[22]

Employing various LLMs to attain the so-called “academic upscaling”[23,24] of a publication or another end product represents another acceptable use of this emerging technology. After all, LLMs were developed as “language models,” and their inherent strength lies in seamless mastery of human-like language, speech, and communication broadly considered.[11,25,26] Under this use-case paradigm, users would utilize LLMs to generate upgraded language/grammar versions of their existing written output. Of critical importance is the assumption that the content being processed is fully authored by the individuals/author(s)/research team(s) using this approach and not in any way “generated” by the LLM itself.[27]

Despite this positive light, LLMs and other natural language processors also have a number of potential pitfalls to consider. For one, LLMs are trained to gather information, but a “programmed model” may not be able to recognize information that is partial to opinion, subjective, biased, or “fake news” (e.g.,
intentionally factually incorrect content).[28,29] Thus, LLMs could be susceptible to passively spreading misinformation if a particular model processes false statements that it fails to detect as invalid, or if it reinforces biases relating to specific segments of the population.[30–32] Since ChatGPT’s inception, some reports have highlighted the appearance of racial bias in its use.[33,34] Consequently, LLMs will need to integrate dedicated “quality control” tools that assist with validating data and its overall quality. Such tools will need to be both objective and trustless, thus reducing the risk of “bias within the bias-fighting tools.”[30–32]

Other limitations include the costs and resources needed to maintain the computing power required for LLMs to function. The amount of computing power (fully accounting for the costs of equipment, maintenance, and electricity) required to run a powerful AI/ML model can be truly overwhelming.[35] In addition, LLMs require very large volumes of data to be trained effectively, which may necessitate data sharing between institutions to train algorithms.[11,27] Such data sharing presents a unique challenge in health care, as strict data privacy laws and institutional data protection agreements contribute to data silos.[4,12,13,28,36]

In terms of the truly powerful “generative” properties of LLMs, it is very important to ensure that clear attribution (inclusive of quantifiable effort/contribution) is disclosed when utilizing AI-generated content.[37] Not doing so would constitute, and should be strongly considered to be, a form of plagiarism. At the same time, the ability of current LLMs to generate “blended content” and various other nuanced creative outputs may be desirable when preparing visual aids and scientific presentations, as long as attribution to AI is transparently communicated.
Examples of such acceptable uses of AI in academics could include a “blended portrait” of a historical figure in a modern setting (e.g., an AI-generated picture of Sir Isaac Newton in an astronaut suit). The use of LLMs to “generate scientific data,” by contrast, is unacceptable and can lead to disastrous and dangerous outcomes. One of the most infamous examples of blatant disregard for scientific truth was the Surgisphere fraud, in which an “ML model” was utilized to “generate” a clearly made-up dataset.[38,39] Regardless of whether the reported data were simply fabricated or actually generated by a custom model, the authors of the now-retracted work managed to deceive prominent peer reviewers and editors working for one of the most prestigious medical journals on the planet.[40]

In terms of creative input and authorship attribution, it is essential to ensure that appropriate guidelines are established, followed, and periodically re-evaluated based on our collective experiences as an international scientific community.[41,42] The authors of this Editorial advocate that a standardized section regarding AI-based attribution (similar to the currently required Ethics and Conflict-of-Interest statements) be implemented across all high-quality scientific journals.

Among the most curious and potentially most destructive phenomena associated with LLMs are the abilities of such models to “hallucinate” and/or “confabulate” during the synthetic creative processes inherent to AI-based derivation of LLM output(s).[42,43] A better understanding of the intricate balance between “creative” and “objective” aspects of LLM operations will therefore be required. Perhaps “processing filters” between the “creative” and the “objective” will help enable the creation of LLMs with the most optimal blend of these still poorly understood aspects of AI/ML-based applications.
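In current LLMs, one concrete knob on this creative-versus-objective balance is the sampling temperature applied during decoding. The sketch below, using hypothetical next-token scores, shows how temperature reshapes a softmax distribution: low values make output nearly deterministic, while high values flatten the distribution and admit more varied output.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw model scores into sampling probabilities.
    Low temperature -> sharply peaked ("objective", near-deterministic);
    high temperature -> nearly flat ("creative", more varied)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token scores
cold = softmax_with_temperature(logits, 0.1)   # top token dominates
hot = softmax_with_temperature(logits, 10.0)   # nearly uniform
```

This is only one narrow, well-understood lever; the editorial's broader point stands that the interaction between such decoding parameters and factual reliability remains poorly characterized.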
The United States Food and Drug Administration (FDA) has been leading the global discussion on regulatory oversight and has been a prominent example in regulating emerging technologies such as AI-based medical tools.[22] The FDA began by regulating the “Software as a Medical Device” category, which refers to software solutions that perform medical functions and are used in the prevention, diagnosis, treatment, or monitoring of various diseases or conditions. As a continuation of that approach, the FDA has been adapting its regulatory framework to specifically address AI and ML technologies in medical devices.[44] The FDA released a discussion paper outlining a potential regulatory approach tailored to AI/ML technologies used in medical devices.[22] The proposed framework also emphasized the importance of transparency, real-world performance monitoring, and clear expectations for modifications and updates to AI/ML algorithms.[22]

Given the amount of controversy surrounding this rapidly emerging area, there are increasing calls for regulatory oversight of LLMs, especially in professional areas where bias-free data processing and interpretation are critical (e.g., health care, the legal profession, and engineering).[22,28,36,45,46] If nothing else, the scientific community should continue to familiarize itself with the potential pitfalls of LLMs, rather than embracing such models at face value amid collective excitement, before defining a further course of adoption for this potentially revolutionary technology. Looking at various historical paradigms, the optimal approach to LLMs may be to dynamically evaluate their risks, benefits, and alternatives, with periodic course readjustment(s). For now, it looks like LLMs are here to stay, and it is our collective responsibility to make the best of these powerful new tools in education, research, and clinical care!

Research quality and ethics statement

All research projects presented during the St.
Luke’s University Health Network Annual Research Symposium were verified to have either appropriate Institutional Review Board approvals or exemptions. For case reports, proof of appropriate patient consent documentation is required. In all instances, the appropriate EQUATOR guidelines (see https://www.equatornetwork.org/reportingguidelines/) for scientific reporting were followed.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,400 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,261 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,695 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,506 citations