Federated Domain Adaptation for Named Entity Recognition via Distilling with Heterogeneous Tag Sets

Rui Wang, Tong Yu*, Junda Wu, Handong Zhao, Sungchul Kim, Ruiyi Zhang, Subrata Mitra, Ricardo Henao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Federated learning involves collaborative training with private data from multiple platforms, while not violating data privacy. We study the problem of federated domain adaptation for Named Entity Recognition (NER), where we seek to transfer knowledge across different platforms with data of multiple domains. In addition, we consider a practical and challenging scenario, where NER datasets of different platforms of federated learning are annotated with heterogeneous tag sets, i.e., different sets of entity types. The goal is to train a global model with federated learning, such that it can predict with a complete tag set, i.e., with all the occurring entity types for data across all platforms. To cope with the heterogeneous tag sets in a multi-domain setting, we propose a distillation approach along with a mechanism of instance weighting to facilitate knowledge transfer across platforms. Besides, we release two re-annotated clinic NER datasets, for testing the proposed method in the clinic domain. Our method shows superior empirical performance for clinic NER with federated learning.

Original languageEnglish (US)
Title of host publicationFindings of the Association for Computational Linguistics, ACL 2023
PublisherAssociation for Computational Linguistics (ACL)
Pages7449-7463
Number of pages15
ISBN (Electronic)9781959429623
StatePublished - 2023
Event61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: Jul 9 2023Jul 14 2023

Publication series

NameProceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print)0736-587X

Conference

Conference61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/TerritoryCanada
CityToronto
Period07/9/2307/14/23

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Fingerprint

Dive into the research topics of 'Federated Domain Adaptation for Named Entity Recognition via Distilling with Heterogeneous Tag Sets'. Together they form a unique fingerprint.

Cite this