
Notes on Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

This is a summary of an important research paper, made interactively by a human and several AIs. The goal is to curate good ideas and save time.


Link to paper: https://arxiv.org/abs/2307.06439

Paper published on: 2023-07-12

Paper's authors: Yu Gu, Sheng Zhang, Naoto Usuyama, Yonas Woldesenbet, Cliff Wong, Praneeth Sanapathi, Mu Wei, Naveen Valluri, Erika Strandberg, Tristan Naumann, Hoifung Poon

Let's embark on a journey to explore the intriguing world of large language models (LLMs) and their application in the healthcare sector, specifically focusing on adverse drug event (ADE) extraction. Think of LLMs as a vast library of information, and our task is to extract specific information from this library, akin to a librarian finding a particular book among thousands.

In this research, the authors have developed a novel way to optimize the process by using a method called knowledge distillation. Imagine if we were able to create a smaller, more specialized library (student model) from the larger one (teacher model) that contains only the books (or knowledge) needed for a particular task. This is what knowledge distillation does. The distilled model, in this case, PubMedBERT, is like a specialized librarian who knows exactly where to find information about adverse drug events.
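The distillation loop described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `toy_teacher` is a keyword stand-in for GPT-3.5, and the resulting pairs would in practice be fed to a transformer fine-tuning routine for the student (e.g., PubMedBERT).

```python
# Minimal self-supervised distillation sketch: the teacher turns an
# unlabeled corpus into (text, label) training pairs for the student.
# All names here are illustrative assumptions, not the paper's code.

def distill(unlabeled_texts, teacher):
    """Use the teacher model to pseudo-label an unlabeled corpus."""
    return [(text, teacher(text)) for text in unlabeled_texts]

# Toy teacher: flags sentences mentioning a known adverse-event keyword.
ADE_KEYWORDS = {"nausea", "rash", "bleeding"}

def toy_teacher(text):
    return "ADE" if any(k in text.lower() for k in ADE_KEYWORDS) else "NO_ADE"

corpus = [
    "Patient developed nausea after dosing.",
    "Vitals stable, no complaints.",
]
train_pairs = distill(corpus, toy_teacher)
# train_pairs is now a labeled dataset for supervised fine-tuning
# of the smaller student model.
```

The key property is that no human annotation is needed: the teacher's outputs become the student's supervision.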

The novelty here lies in the application of this method to healthcare, specifically to ADE extraction. The authors have proposed a unique architecture that simplifies ADE extraction into a unified task, reducing the computational requirement from O(NM) to O(M). This is akin to changing how we search for a book from checking every shelf (O(NM)) to directly going to the right section (O(M)), making the process more efficient and accurate.
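To make the O(NM) versus O(M) contrast concrete, here is a toy comparison of model invocations. The classifiers are placeholder lambdas, not the paper's architecture: a pairwise pipeline scores every (drug, candidate-span) pair, while a unified tagger makes a single pass over the tokens.

```python
# Toy contrast of model-call counts: pairwise relation classification
# versus a single unified tagging pass. Stand-in functions only.

def pairwise_extract(drugs, candidate_spans, classify_pair):
    """Classic pipeline: one classifier call per (drug, span) pair -> O(N*M) calls."""
    calls = 0
    relations = []
    for drug in drugs:
        for span in candidate_spans:
            calls += 1
            if classify_pair(drug, span):
                relations.append((drug, span))
    return relations, calls

def unified_extract(drugs, tokens, tag_sequence):
    """Unified task: a single tagging pass over the M tokens -> O(M) model work."""
    tags = tag_sequence(tokens)  # one model invocation
    relations = [(drug, tok)
                 for drug in drugs
                 for tok, tag in zip(tokens, tags) if tag == "ADE"]
    return relations, 1

# Demo: 3 drugs x 10 candidate spans.
drugs = ["aspirin", "warfarin", "ibuprofen"]
spans = [f"span{i}" for i in range(10)]
_, pairwise_calls = pairwise_extract(drugs, spans, lambda d, s: False)

tokens = ["nausea", "and", "bleeding"]
relations, unified_calls = unified_extract(
    drugs, tokens, lambda ts: ["ADE", "O", "ADE"])
```

Here `pairwise_calls` is 30 while `unified_calls` is 1; the savings compound as documents and entity counts grow.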

In this distillation process, GPT-3.5, a large language model, serves as the teacher. Two student models, PubMedBERT and BioGPT, are then fine-tuned on the input-output pairs generated by the teacher, treating them as labeled examples. The performance of these models is evaluated using the lenient F1 score as the primary metric. This is essentially a test of how well our specialized librarians can find the right books.
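A lenient span-level F1 can be computed as below. Note the matching criterion is an assumption here: "lenient" is taken to mean that any overlap with a gold span counts as a hit, rather than requiring exact boundary agreement.

```python
# Lenient span-level F1 sketch. Spans are (start, end) character offsets.
# Crediting any overlap as a match is an assumed reading of "lenient".

def spans_overlap(a, b):
    """True if two half-open (start, end) spans overlap at all."""
    return a[0] < b[1] and b[0] < a[1]

def lenient_f1(predicted, gold):
    """F1 where a predicted span is correct if it overlaps any gold span."""
    tp = sum(1 for p in predicted if any(spans_overlap(p, g) for g in gold))
    precision = tp / len(predicted) if predicted else 0.0
    matched = sum(1 for g in gold if any(spans_overlap(g, p) for p in predicted))
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, with predictions [(0, 5), (10, 15)] against gold [(2, 7), (20, 25)], only the first prediction overlaps a gold span, giving precision 0.5, recall 0.5, and F1 0.5.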

The distilled PubMedBERT model performs remarkably well, achieving accuracy comparable to the state-of-the-art model despite being significantly smaller. It outperforms its teacher model (GPT-3.5) and even GPT-4 in terms of F1 score. This is like a specialized librarian finding books more accurately and efficiently than a general librarian.

The authors didn't stop at ADE extraction; they applied this approach to other biomedical knowledge extraction tasks such as gene-disease associations and protected health information. This demonstrates the versatility of the approach, like a librarian being able to specialize in multiple fields.

The study also acknowledges certain limitations, such as not using GPT-4 as the teacher model and the assumption of gold drug entities during evaluation. The authors propose future work to include additional domain-specific knowledge sources, expansion of the training corpus for other clinical tasks, and evaluation on a broader range of clinical tasks and datasets.

One important point to note is the annotation inconsistencies in the ADE corpus. This is like having inaccuracies in the book catalogues, which can affect the performance of the models trained on this dataset. The authors provide examples of these inconsistencies, including ambiguous boundaries and discrepancies in the inclusion of similar words.

In conclusion, this research presents a novel approach to using large language models for biomedical knowledge extraction, specifically adverse drug event extraction. The technique of distilling a large language model into task-specific student models through self-supervised learning has shown significant gains in terms of accuracy, cost, efficiency, and white-box model access. The case study on ADE extraction using a distilled PubMedBERT model has demonstrated the potential of this approach and its applicability to other healthcare-related tasks.