Large Language Models (LLMs), including ChatGPT, have gained significant traction since late 2022, sparking debate about whether they are mere hype or a genuine revolution. Either way, they have proven useful across personal and business activities, including proofreading and editing, language translation, coding and debugging, and content creation. At Polygon Health Analytics, we see several areas where LLMs can make a substantial positive impact on real-world evidence (RWE) generation and Health Economics and Outcomes Research (HEOR), as discussed below.

LLMs Enable RWE Generation from Novel Data Types

While most current RWE is derived from claims data or structured Electronic Health Records (EHRs), a wealth of untapped potential lies in unstructured real-world data (RWD) sources. These include unstructured EHRs (e.g., clinicians’ notes), surveys, social media (including online patient forums), news articles, and scientific literature. Such free-text data are often overlooked because of missing entries, typos, specialized jargon, and context-dependent acronyms, but they can now be harnessed efficiently with LLMs. These models can extract clinical terms (e.g., diagnoses, symptoms, and medications), transform them into structured formats, fix misspellings, and contextualize acronyms and jargon [1-4] (Figure 1). LLMs can thereby unlock previously overlooked RWE insights, driving the evolution of evidence-based medicine.

Figure 1. ChatGPT deciphers clinical notes and extracts medical terms into a structured format (adapted from [4]).
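
To make the extraction step above more concrete, here is a minimal Python sketch of the two pieces that surround the model call: the prompt that requests structured output, and the validation of whatever comes back. The field names, the sample note, and the canned reply are all assumptions made for illustration; no real LLM API is called, and a production pipeline would need far more careful validation.

```python
import json

# Hypothetical schema: the three entity types named in the text above.
FIELDS = ("diagnoses", "symptoms", "medications")


def build_prompt(note: str) -> str:
    """Ask the model for JSON only, with typos fixed and acronyms expanded."""
    return (
        "Extract clinical terms from the note below. Correct misspellings and "
        "expand acronyms using the surrounding context. Reply with JSON only, "
        f"using exactly these keys: {', '.join(FIELDS)}. "
        "Each value is a list of strings.\n\n"
        f"Note: {note}"
    )


def parse_extraction(reply: str) -> dict:
    """Validate the model's reply before it reaches downstream analysis."""
    data = json.loads(reply)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    record = {}
    for name in FIELDS:
        values = data.get(name, [])
        if not isinstance(values, list) or not all(
            isinstance(v, str) for v in values
        ):
            raise ValueError(f"{name!r} must be a list of strings")
        record[name] = [v.strip() for v in values if v.strip()]
    return record


# Illustrative note with a typo ("pian") and acronyms (SOB, HTN).
note = "Pt c/o SOB and chest pian. Hx of HTN. Started metoprolol."
prompt = build_prompt(note)

# Canned reply standing in for a real model response (an assumption).
canned_reply = (
    '{"diagnoses": ["hypertension"], '
    '"symptoms": ["shortness of breath", "chest pain"], '
    '"medications": ["metoprolol"]}'
)
record = parse_extraction(canned_reply)
print(record)
```

Validating the reply against a fixed set of keys is one simple way to keep malformed model output out of a structured dataset; it does not check whether the extracted terms are clinically correct.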

LLMs Democratize RWD Exploration and Analysis

RWE generation often involves navigating vast datasets, which demands advanced technical skills. LLMs hold the promise of democratizing data analysis, making it accessible to a broader audience regardless of programming or statistical expertise. Users can interact with data naturally by posing questions in everyday language, and LLMs can translate these questions into structured queries for data retrieval [5,6] (Figure 2). Furthermore, LLMs can help users generate code for complex analyses and interpret and summarize findings presented in tables and figures, thereby lowering barriers to RWD analysis and facilitating data-driven decision-making [7,8].

Figure 2. An LLM converts free-text questions into SQL queries for retrieving data from a relational database storing RWD (from [6]).
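
As a rough illustration of the question-to-query flow described above, the following Python sketch runs a model-style SQL query against a toy in-memory SQLite database. The schema, the rows, and the `generated_sql` string are all hypothetical; the query is hard-coded where a real system would call an LLM, and the read-only check is a deliberately simple guard, not a complete safeguard against unsafe SQL.

```python
import sqlite3


def run_read_only(conn: sqlite3.Connection, sql: str) -> list:
    """Run generated SQL only if it is a single SELECT statement."""
    statement = sql.strip().rstrip(";")
    if ";" in statement or not statement.lower().startswith("select"):
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(statement).fetchall()


# Toy RWD tables (hypothetical schema and made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, birth_year INTEGER);
    CREATE TABLE diagnoses (patient_id INTEGER, icd10 TEXT, diagnosed_on TEXT);
    INSERT INTO patients VALUES (1, 1950), (2, 1985), (3, 1962);
    INSERT INTO diagnoses VALUES
        (1, 'E11.9', '2021-03-02'),
        (2, 'I10',   '2022-07-19'),
        (3, 'E11.9', '2023-01-11');
    """
)

question = "How many patients born before 1970 have a type 2 diabetes diagnosis?"

# Stand-in for the SQL an LLM might return for the question above.
generated_sql = """
    SELECT COUNT(DISTINCT p.patient_id)
    FROM patients AS p
    JOIN diagnoses AS d ON d.patient_id = p.patient_id
    WHERE p.birth_year < 1970 AND d.icd10 LIKE 'E11%'
"""

rows = run_read_only(conn, generated_sql)
print(rows[0][0])  # 2 (patients 1 and 3 in the toy data)
```

Restricting generated queries to a single SELECT is one cautious design choice: a user exploring the data in plain language should not be able to modify it through a mistranslated question.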

LLMs Automate Scientific Literature Synthesis

A systematic literature review analyzes and synthesizes existing scientific literature to support evidence-based decision-making and to identify new research and development opportunities. Manual reviews have traditionally been time-consuming and expensive. LLMs can perform systematic literature review tasks, from defining search terms to summarizing and extracting information from articles [9]. Deploying multiple LLM-based AI agents can streamline the review process, offering timely insights amid the growing body of literature [10].
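
One way to picture the multi-agent idea is as a chain of small, single-purpose steps. The Python sketch below wires three such steps together (search terms, abstract screening, summarization). It is an assumption-laden illustration rather than the method of any cited paper: the `Agent` interface, the prompts, the toy articles, and "drug X" are hypothetical, and simple rule-based stubs stand in for real LLM calls so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical agent interface: a prompt goes in, text comes out.
Agent = Callable[[str], str]


@dataclass
class Article:
    title: str
    abstract: str
    summary: str = ""


def run_review(
    question: str,
    articles: List[Article],
    search_agent: Agent,
    screen_agent: Agent,
    summary_agent: Agent,
) -> Tuple[str, List[Article]]:
    """Chain three agents: search terms, abstract screening, summarization."""
    terms = search_agent(f"Propose search terms for: {question}")
    included = []
    for article in articles:
        verdict = screen_agent(
            f"Question: {question}\nAbstract: {article.abstract}\n"
            "Reply 'include' or 'exclude'."
        )
        if verdict.strip().lower().startswith("include"):
            article.summary = summary_agent(f"Summarize: {article.abstract}")
            included.append(article)
    return terms, included


# Rule-based stand-ins so the sketch runs without any model access.
def stub_search(prompt: str) -> str:
    return '"drug X" AND "heart failure" AND hospitalization'


def stub_screen(prompt: str) -> str:
    return "include" if "cohort" in prompt.lower() else "exclude"


def stub_summary(prompt: str) -> str:
    # Keep only the first sentence of the abstract.
    return prompt.removeprefix("Summarize: ").split(".")[0] + "."


articles = [
    Article(
        "Study A",
        "Retrospective cohort study of drug X in heart failure. "
        "Hospitalizations were compared.",
    ),
    Article("Editorial B", "An editorial on the history of cardiology."),
]
terms, included = run_review(
    "Does drug X reduce hospitalization in heart failure?",
    articles,
    stub_search,
    stub_screen,
    stub_summary,
)
print(terms)
print([(a.title, a.summary) for a in included])
```

Keeping each agent behind the same prompt-in, text-out interface means any single step can be swapped for a real model, or checked by a human reviewer, without changing the rest of the pipeline.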

In conclusion, integrating LLMs into RWE generation promises to advance HEOR, patient care, and health policy. LLMs are poised to reshape the landscape of RWD and RWE, and as these models continue to evolve, their impact on the future of medicine and patient care is likely to be profound. It is crucial to approach these advancements critically, addressing challenges such as model customization, data privacy, bias and fairness, and regulatory compliance.


  1. Jethani, Neil, et al. “Evaluating ChatGPT in Information Extraction: A Case Study of Extracting Cognitive Exam Dates and Scores.” medRxiv (2023): 2023-07.
  2. Huang, Jingwei, et al. “A Critical Assessment of Using ChatGPT for Extracting Structured Data from Clinical Notes.” Available at SSRN 4488945.
  3. Hu, Yan, et al. “Zero-shot Clinical Entity Recognition Using ChatGPT.” arXiv preprint arXiv:2303.16416 (2023).
  4. “Large language models help decipher clinical notes.” (2022)
  5. Pan, Youcheng, et al. “A BERT-based generation model to transform medical texts to SQL queries for electronic medical records: model development and validation.” JMIR Medical Informatics 9.12 (2021): e32698.
  6. Lee, Gyubok, et al. “EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records.” Advances in Neural Information Processing Systems 35 (2022): 15589-15601.
  7. “How to Use ChatGPT Code Interpreter.” (2023)
  8. Maddigan, Paula, and Teo Susnjak. “Chat2VIS: Generating Data Visualisations via Natural Language Using ChatGPT, Codex and GPT-3 Large Language Models.” IEEE Access (2023).
  9. Alshami, Ahmad, et al. “Harnessing the Power of ChatGPT for Automating Systematic Review Process: Methodology, Case Study, Limitations, and Future Directions.” Systems 11.7 (2023): 351.
  10. Talebirad, Yashar, and Amirhossein Nadiri. “Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents.” arXiv preprint arXiv:2306.03314 (2023).

