Unsupervised Paraphrasing Consistency Training for Low Resource Named Entity Recognition

Rui Wang, Ricardo Henao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations


Unsupervised consistency training is a semi-supervised learning approach that encourages consistent model predictions between the original and augmented data. For Named Entity Recognition (NER), existing approaches augment the input sequence via token replacement, assuming the annotations at the replaced positions remain unchanged. In this paper, we explore the use of paraphrasing as a more principled data augmentation scheme for unsupervised consistency training in NER. Specifically, we convert the Conditional Random Field (CRF) into a multi-label classification module and encourage consistency in entity appearance between the original and paraphrased sequences. Experiments show that our method is especially effective when annotations are limited.
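The abstract's core idea can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: since a paraphrase may reorder or reword tokens, token-level labels cannot be matched directly, but the *set* of entity types appearing in the sentence should be invariant. The sketch aggregates per-token entity-type marginals into sequence-level "entity type appears somewhere" probabilities (a multi-label view) and penalizes divergence between the original and paraphrased sequences; the noisy-OR aggregation and Bernoulli-KL consistency loss are this sketch's assumptions, not choices confirmed by the abstract.

```python
import numpy as np

def entity_appearance_probs(token_probs):
    """Aggregate per-token entity-type marginals (seq_len, num_types)
    into sequence-level appearance probabilities via a noisy-OR:
    P(type t appears) = 1 - prod_i (1 - p[i, t])."""
    return 1.0 - np.prod(1.0 - token_probs, axis=0)

def consistency_loss(probs_orig, probs_para, eps=1e-8):
    """Mean Bernoulli KL divergence between the two sequences'
    entity-appearance probabilities; zero when they agree exactly.
    (Hypothetical choice of divergence for illustration.)"""
    p = np.clip(probs_orig, eps, 1.0 - eps)
    q = np.clip(probs_para, eps, 1.0 - eps)
    kl = p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))
    return float(np.mean(kl))

# Toy example: 2 entity types; the paraphrase has a different length,
# so token-level labels cannot be aligned, but appearance can.
orig = np.array([[0.9, 0.1],
                 [0.2, 0.1],
                 [0.1, 0.8]])            # original sentence (3 tokens)
para = np.array([[0.8, 0.2],
                 [0.1, 0.7],
                 [0.1, 0.1],
                 [0.1, 0.1]])            # paraphrase (4 tokens)
loss = consistency_loss(entity_appearance_probs(orig),
                        entity_appearance_probs(para))
```

Because the loss compares sequence-level appearance probabilities rather than per-token labels, it is well defined even when the paraphrase changes length or word order.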
Original language: English (US)
Title of host publication: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 6
ISBN (Print): 9781955917094
State: Published - Jan 1 2021
Externally published: Yes


