
Rule-based technique for coreference resolution

Nikhil Verma
3 min read · Jun 22, 2022

A set of sentences that together convey an understanding of a text is referred to as a discourse. Within these sentences, mentions that can be grouped because they refer to the same entity are said to corefer, and finding those groups is called coreference resolution. Building on earlier evidence that coreference resolution based on syntactic and semantic rules can outperform ML-based techniques, the authors present a two-layer model built from hand-written rules called sieves. The model draws on linguistic features of the text and is entity-centric, i.e. each coreference decision is globally informed: features are shared across the mentions of an entity and are available to every sieve in the framework.

The mention detection stage, which is tuned for high recall, sits at the top of the stack because a missed mention is guaranteed to hurt the final score. The featurised text is then passed through the coreference resolution sieves, which focus on increasing the precision of the system. Here the text is processed through 10 sieves, ordered from highest to lowest precision, so the system first links high-confidence mention pairs and only later considers lower-confidence sources of information [1].
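To make the precision ordering concrete, here is a minimal sketch of a sieve pipeline in Python. It is not the authors' implementation. The function names (`exact_match`, `last_word_match`, `resolve`) and the two toy rules are assumptions made for illustration, and mentions are assumed to be non-empty strings that an earlier detection stage has already produced. Each sieve compares whole clusters rather than single mentions, so a later, looser sieve can use everything an earlier, stricter sieve has already linked.

```python
from typing import Callable, List

# A sieve decides whether two clusters (lists of mention strings) corefer.
Sieve = Callable[[List[str], List[str]], bool]


def exact_match(a: List[str], b: List[str]) -> bool:
    """Stricter toy rule: some mention in each cluster has identical text."""
    return any(x.lower() == y.lower() for x in a for y in b)


def last_word_match(a: List[str], b: List[str]) -> bool:
    """Looser toy rule (hypothetical): some mentions share their final word."""
    return any(
        x.lower().split()[-1] == y.lower().split()[-1] for x in a for y in b
    )


def resolve(mentions: List[str], sieves: List[Sieve]) -> List[List[str]]:
    """Apply sieves in the given order, merging clusters after each pass."""
    clusters = [[m] for m in mentions]  # every mention starts as its own entity
    for sieve in sieves:  # earlier sieves are assumed to be more precise
        merged: List[List[str]] = []
        for cluster in clusters:
            for target in merged:
                if sieve(target, cluster):
                    target.extend(cluster)  # join the first matching entity
                    break
            else:
                merged.append(cluster)  # no earlier entity matched
        clusters = merged
    return clusters


mentions = ["Barack Obama", "the president", "Obama", "barack obama", "the senate"]
print(resolve(mentions, [exact_match, last_word_match]))
# [['Barack Obama', 'barack obama', 'Obama'], ['the president'], ['the senate']]
```

In this toy run, the stricter rule first links the two spellings of "Barack Obama", and only then does the looser rule attach "Obama" to that entity. Putting the looser rule first would let a less reliable link shape every later decision, which is the failure the precision ordering is meant to avoid.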

The notable sieves, which according to the paper contribute around 97% of the model's improvement, are:

  • Exact string match: Two mentions are linked if they have the exact same string. Notable is…
