WASSERSTEIN CROSS-LINGUAL ALIGNMENT FOR NAMED ENTITY RECOGNITION

Rui Wang, Ricardo Henao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Supervised training of Named Entity Recognition (NER) models generally requires large amounts of annotated data, which are rarely available for less widely used (low-resource) languages, e.g., Armenian and Dutch. It is therefore desirable to leverage knowledge extracted from a high-resource (source) language, e.g., English, so that NER models for low-resource (target) languages can be trained more efficiently and at lower annotation cost. In this paper, we study cross-lingual alignment for NER, an approach that transfers knowledge from high- to low-resource languages by aligning token embeddings across languages. Specifically, we propose to align by minimizing the Wasserstein distance between the contextualized token embeddings of the source and target languages. Experimental results show that our method improves on existing cross-lingual alignment approaches for NER.
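
As a rough illustration of the alignment objective described in the abstract, the sketch below computes an entropy-regularized (Sinkhorn) approximation of the Wasserstein distance between two sets of token embeddings. It is not the authors' implementation: the abstract does not say how the distance is computed, so the function name `sinkhorn_distance`, the squared Euclidean cost, the uniform token weights, and the `eps` and `n_iters` settings are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_distance(src, tgt, eps=0.1, n_iters=200):
    """Approximate the optimal-transport cost between two embedding sets.

    src: (n, d) array of source-language token embeddings (hypothetical input).
    tgt: (m, d) array of target-language token embeddings (hypothetical input).
    Returns (cost, plan), where plan is the (n, m) transport plan.
    Assumptions: uniform token weights, squared Euclidean ground cost.
    """
    n, m = src.shape[0], tgt.shape[0]

    # Pairwise squared Euclidean distances between all source/target tokens.
    cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=-1)

    # Normalize the cost so that exp(-cost / eps) stays in a safe range.
    scale = cost.max()
    if scale == 0.0:
        # All points coincide: nothing needs to be transported.
        return 0.0, np.full((n, m), 1.0 / (n * m))
    cost_n = cost / scale

    kernel = np.exp(-cost_n / eps)
    a = np.full(n, 1.0 / n)  # uniform mass on source tokens
    b = np.full(m, 1.0 / m)  # uniform mass on target tokens

    # Sinkhorn iterations: alternately rescale rows and columns so the
    # plan's marginals match a and b.
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)

    plan = u[:, None] * kernel * v[None, :]
    # Report the transport cost in the original (unnormalized) units.
    return float(scale * (plan * cost_n).sum()), plan
```

In an alignment setting, a differentiable version of this quantity would typically be used as a training loss on the embedding model, so that source and target embeddings are pulled toward each other. A plain NumPy version like this one is only suitable for inspecting how far apart two embedding sets are.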
Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8342-8346
Number of pages: 5
ISBN (Print): 9781665405409
State: Published - Jan 1 2022
Externally published: Yes
