Leveraging Large Language Models like ChatGPT for Real-World Evidence Generation

Large language models (LLMs), including ChatGPT, have gained significant traction since late 2022, sparking debate over whether they are mere hype or a genuine revolution. Either way, they have proven useful across personal and business activities, including proofreading and editing, language translation, coding and debugging, and content creation. At Polygon Health Analytics, we see several areas where LLMs can make a substantial positive impact on real-world evidence (RWE) generation and health economics and outcomes research (HEOR), as discussed below.

LLMs Enable RWE Generation from Novel Data Types

While most current RWE is derived from claims data or structured electronic health records (EHRs), a wealth of untapped potential lies in unstructured real-world data (RWD) sources. These include unstructured EHR content (e.g., clinicians’ notes), surveys, social media (including online patient forums), news articles, and scientific literature. Such free-text data are often overlooked because of missing entries, typos, specialized jargon, and context-dependent acronyms, but they can now be harnessed efficiently with LLMs. These models can extract clinical terms (e.g., diagnoses, symptoms, and medications), transform them into structured formats, fix misspellings, and interpret acronyms and jargon in context [1-4] (Figure 1). LLMs can thereby unlock previously overlooked RWE insights, driving the evolution of evidence-based medicine.

Figure 1. ChatGPT deciphers the clinical notes and extracts medical terms into a structured format (adapted from [4])
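
To make the extraction step more concrete, here is a minimal Python sketch of the plumbing around such a model: building an extraction prompt and turning the model's JSON reply into a structured record. It is an illustration under stated assumptions, not the method used in the cited studies. The prompt wording, the JSON keys, and the names `PROMPT_TEMPLATE`, `ExtractedNote`, `build_prompt`, and `parse_response` are all hypothetical, and the call to the LLM itself is deliberately left out.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field

# Hypothetical prompt; the wording and the JSON keys are illustrative assumptions.
PROMPT_TEMPLATE = (
    "Extract diagnoses, symptoms, and medications from the clinical note below. "
    "Expand acronyms, correct misspellings, and reply with JSON only, using the keys "
    '"diagnoses", "symptoms", and "medications" (each a list of strings).\n\n'
    "Note:\n{note}"
)


@dataclass
class ExtractedNote:
    """Structured view of one free-text clinical note."""
    diagnoses: list[str] = field(default_factory=list)
    symptoms: list[str] = field(default_factory=list)
    medications: list[str] = field(default_factory=list)


def build_prompt(note: str) -> str:
    """Fill the prompt template with a whitespace-trimmed note."""
    return PROMPT_TEMPLATE.format(note=note.strip())


def parse_response(raw: str) -> ExtractedNote:
    """Validate the model's JSON reply and normalize it into an ExtractedNote."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")

    def clean(key: str) -> list[str]:
        values = data.get(key, [])
        if not isinstance(values, list):
            raise ValueError(f"Expected a list under {key!r}")
        # Lowercase and trim each term, drop empties, and de-duplicate in order.
        terms = [str(v).strip().lower() for v in values]
        return list(dict.fromkeys(t for t in terms if t))

    return ExtractedNote(clean("diagnoses"), clean("symptoms"), clean("medications"))
```

In practice, the string returned by `build_prompt` would be sent to whichever LLM is in use, and its reply passed to `parse_response`. Validating the reply before it enters a dataset matters because model output is not guaranteed to follow the requested schema.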

LLMs Democratize RWD Exploration and Analysis

RWE generation often involves navigating vast datasets, which demands advanced technical skills. LLMs hold the promise of democratizing data analysis, making it accessible to a broader audience regardless of programming or statistical expertise. Users can interact with data naturally by posing questions in everyday language, and LLMs can translate these questions into structured queries for data retrieval [5,6] (Figure 2). Furthermore, LLMs can help users generate code for complex analyses and interpret and summarize findings presented in tables and figures, thereby lowering barriers to RWD analysis and facilitating data-driven decision-making [7,8].

Figure 2. An LLM converts free-text questions into SQL queries for retrieving data from a relational database storing RWD ([6]).
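
The question-to-query workflow can be sketched in Python as follows. This is a hedged illustration rather than the approach of the cited benchmarks: the `prescriptions` table, its rows, and the helper names `is_read_only` and `run_generated_sql` are hypothetical, and the SQL string is hand-written to stand in for what a model might return for the question shown.

```python
import sqlite3


def is_read_only(sql: str) -> bool:
    """Conservative check: accept only a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";")
    return statement.lower().startswith("select") and ";" not in statement


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Run model-generated SQL only if it passes the read-only check."""
    if not is_read_only(sql):
        raise ValueError("Only single SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Toy in-memory database with made-up rows (not real patient data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prescriptions (patient_id INTEGER, drug TEXT, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO prescriptions VALUES (?, ?, ?)",
    [
        (1, "metformin", "2023-01-05"),
        (2, "metformin", "2023-02-11"),
        (2, "insulin glargine", "2023-03-02"),
        (3, "atorvastatin", "2023-03-20"),
    ],
)

question = "How many distinct patients were prescribed metformin?"
# Stand-in for the LLM's output; a real system would generate this from `question`.
generated_sql = (
    "SELECT COUNT(DISTINCT patient_id) FROM prescriptions WHERE drug = 'metformin';"
)

rows = run_generated_sql(conn, generated_sql)
print(rows[0][0])  # prints 2
```

The read-only guard is intentionally simple and would reject some valid queries. The design point is that generated SQL should be checked, and ideally run with restricted database permissions, before it touches real data.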

LLMs Automate Scientific Literature Synthesis

A systematic literature review analyzes and synthesizes existing scientific literature to support evidence-based decision-making and to identify new research and development opportunities. Traditionally, manual reviews are time-consuming and expensive. LLMs can perform systematic literature review tasks, from defining search terms to summarizing and extracting information from articles [9]. Deploying multiple LLM-based agents can further streamline the review process, offering timely insights amid the growing body of literature [10].

In conclusion, integrating LLMs into RWE generation promises to advance HEOR, patient care, and health policy. LLMs are poised to reshape the landscape of RWD and RWE, and as these models continue to evolve, their impact on the future of medicine and patient care is likely to be profound. It remains crucial to approach these advancements critically, addressing challenges such as model customization, data privacy, bias and fairness, and regulatory compliance.

References:

  1. Jethani, Neil, et al. “Evaluating ChatGPT in Information Extraction: A Case Study of Extracting Cognitive Exam Dates and Scores.” medRxiv (2023): 2023-07.
  2. Huang, Jingwei, et al. “A Critical Assessment of Using ChatGPT for Extracting Structured Data from Clinical Notes.” Available at SSRN 4488945.
  3. Hu, Yan, et al. “Zero-Shot Clinical Entity Recognition Using ChatGPT.” arXiv preprint arXiv:2303.16416 (2023).
  4. “Large Language Models Help Decipher Clinical Notes.” MIT.edu (2022).
  5. Pan, Youcheng, et al. “A BERT-Based Generation Model to Transform Medical Texts to SQL Queries for Electronic Medical Records: Model Development and Validation.” JMIR Medical Informatics 9.12 (2021): e32698.
  6. Lee, Gyubok, et al. “EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records.” Advances in Neural Information Processing Systems 35 (2022): 15589-15601.
  7. “How to Use ChatGPT Code Interpreter.” Datacamp.com (2023).
  8. Maddigan, Paula, and Teo Susnjak. “Chat2VIS: Generating Data Visualisations via Natural Language Using ChatGPT, Codex and GPT-3 Large Language Models.” IEEE Access (2023).
  9. Alshami, Ahmad, et al. “Harnessing the Power of ChatGPT for Automating Systematic Review Process: Methodology, Case Study, Limitations, and Future Directions.” Systems 11.7 (2023): 351.
  10. Talebirad, Yashar, and Amirhossein Nadiri. “Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents.” arXiv preprint arXiv:2306.03314 (2023).
