Tristan Luiggi, Laure Soulier, Vincent Guigue, Aurélien Baelde, Siwar Jendoubi, “Dynamic Named Entity Recognition”, submitted to SAC 2023. The paper is under review.
Abstract of Paper:
Named Entity Recognition (NER) is a challenging and widely studied task that involves detecting and typing entities in text. Recent advances in Natural Language Processing have improved NER performance, with models such as LSTM-CRF or transformers. So far, however, NER still treats entity typing as classification into universal classes (e.g., persons or locations), which are not context-dependent. Moreover, architectures such as transformers and LSTMs are complex and capable of memorizing entity labels, which may induce overfitting and thus underuse of context. Our work focuses precisely on situations where the type of an entity depends on the context. We call this new task Contextualized Named Entity Recognition (CNER). This task should allow us to better evaluate the ability of algorithms to extract entities by exploiting the context, while limiting overfitting. We therefore designed a CNER benchmark based on two datasets, CNER-RotoWire and CNER-ImDB, which respectively aim at detecting winning/losing players in basketball match summaries and predicting actors’ credit order within movie synopses. We evaluate baseline models and present experiments reflecting the issues and research directions related to this novel task.