Open world classification is a task in natural language processing with key practical relevance and impact. Because open (unknown) category data appears only in the inference phase, it is challenging to find a model whose decision boundary both identifies the known classes and discriminates the open category. The performance of existing models is limited by the lack of effective open category data during the training stage or the lack of a good mechanism to learn appropriate decision boundaries. We propose an approach based on adaptive negative samples (ANS) designed to generate effective synthetic open category samples in the training stage without requiring any prior knowledge or external datasets. Empirically, we find a significant advantage in using auxiliary one-versus-rest binary classifiers, which effectively utilize the generated negative samples and avoid the complex threshold-seeking stage of previous works. Extensive experiments on three benchmark datasets show that ANS achieves significant improvements over state-of-the-art methods.
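The one-versus-rest idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `predict_open_world`, the `open_label` default, and the fixed 0.5 cutoff are assumptions made for illustration. Each known class is assumed to have its own binary classifier that outputs a raw score (logit), and an input that no classifier accepts is assigned to the open category.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_open_world(
    logits: List[float],
    class_names: List[str],
    open_label: str = "open",  # hypothetical label for the open category
) -> str:
    """Combine independent one-versus-rest scores into one prediction.

    `logits[i]` is assumed to come from a binary classifier trained to
    separate class i from everything else. If no classifier assigns a
    probability of at least 0.5, the input is treated as open category.
    """
    if len(logits) != len(class_names):
        raise ValueError("expected one logit per known class")
    probs = [sigmoid(z) for z in logits]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < 0.5:  # assumed fixed cutoff: every classifier says "rest"
        return open_label
    return class_names[best]
```

In this sketch the decision uses the natural 0.5 boundary of each binary classifier, so no separate search for a dataset-specific confidence threshold is needed. How the binary classifiers are trained, including how negative samples are generated, is outside the scope of the sketch.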
|Original language||English (US)|
|Title of host publication||2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022|
|Publisher||Association for Computational Linguistics (ACL)|
|Number of pages||15|
|State||Published - 2022|