TY - JOUR
T1 - Effects of Negation and Uncertainty Stratification on Text-Derived Patient Profile Similarity.
AU - Slater, Luke T
AU - Karwath, Andreas
AU - Hoehndorf, Robert
AU - Gkoutos, Georgios
N1 - KAUST Repository Item: Exported on 2022-01-25
Acknowledged KAUST grant number(s): OSR, URF/1/3790-01-01.
Acknowledgements: GG and LS acknowledge support from the NIHR Birmingham ECMC, NIHR Birmingham SRMRC, Nanocommons H2020-EU (731032) and the NIHR Birmingham Biomedical Research Centre and the MRC HDR UK (HDRUK/CFC/01), an initiative funded by UK Research and Innovation, Department of Health and Social Care (England) and the devolved administrations, and leading medical research charities. RH and GG were supported by funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3790-01-01. AK was supported by the Medical Research Council (MR/S003991/1) and the MRC HDR UK (HDRUK/CFC/01).
PY - 2021/12/23
Y1 - 2021/12/23
N2 - Semantic similarity is a useful approach for comparing patient phenotypes, and holds potential as an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how inclusion and exclusion of negated and uncertain mentions of concepts from text-derived phenotypes affect the similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and find a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.
AB - Semantic similarity is a useful approach for comparing patient phenotypes, and holds potential as an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how inclusion and exclusion of negated and uncertain mentions of concepts from text-derived phenotypes affect the similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and find a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.
UR - http://hdl.handle.net/10754/675119
UR - https://www.frontiersin.org/articles/10.3389/fdgth.2021.781227/full
U2 - 10.3389/fdgth.2021.781227
DO - 10.3389/fdgth.2021.781227
M3 - Article
C2 - 34939069
SN - 2673-253X
VL - 3
JO - Frontiers in Digital Health
JF - Frontiers in Digital Health
ER -