Improving Downstream Task Performance by Treating Numbers as Entities

Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Liyan Xu, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Scopus citations

Abstract

Numbers are essential components of text, like any other word tokens, from which natural language processing (NLP) models are built and deployed. Though numbers are typically not accounted for distinctly in most NLP tasks, there is still an underlying amount of numeracy already exhibited by NLP models. For instance, in named entity recognition (NER), numbers are not treated as an entity with distinct tags. In this work, we attempt to tap the potential of state-of-the-art language models and transfer their ability to boost performance in related downstream tasks dealing with numbers. Our proposed classification of numbers into entities helps NLP models perform well on several tasks, including a handcrafted Fill-In-The-Blank (FITB) task and on question answering, using joint embeddings, outperforming the BERT and RoBERTa baseline classification.

Original languageEnglish (US)
Title of host publicationCIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PublisherAssociation for Computing Machinery
Pages4535-4539
Number of pages5
ISBN (Electronic)9781450392365
DOIs
StatePublished - Oct 17 2022
Event31st ACM International Conference on Information and Knowledge Management, CIKM 2022 - Atlanta, United States
Duration: Oct 17 2022Oct 21 2022

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Conference

Conference31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Country/TerritoryUnited States
CityAtlanta
Period10/17/2210/21/22

Keywords

  • named entity
  • numeracy
  • text tagging

ASJC Scopus subject areas

  • General Business, Management and Accounting
  • General Decision Sciences

Fingerprint

Dive into the research topics of 'Improving Downstream Task Performance by Treating Numbers as Entities'. Together they form a unique fingerprint.

Cite this