Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage

Shijing Si, Rui Wang, Jedrek Wosik, Hao Zhang, David Dov, Guoyin Wang, Ricardo Henao, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

16 Scopus citations

Abstract

Small and imbalanced datasets commonly seen in healthcare represent a challenge when training classifiers based on deep learning models. So motivated, we propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining). Specifically, (i) we introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) by distilling LESA-BERT to smaller variants, we aim to reduce overfitting and model size when working on small datasets. As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent. Experiments demonstrate that our approach can outperform several strong baseline classifiers by a significant margin of 4.3% in terms of macro F1 score.
Original language: English (US)
Title of host publication: Proceedings of Machine Learning Research
Publisher: ML Research Press
Pages: 436-456
Number of pages: 21
State: Published - Jan 1 2020
Externally published: Yes
