Review Comment:
This paper claims four contributions:
1. Textual Genre Independence
2. Language Independence
3. Knowledge Base Independence
4. Entity Type Independence
Along with these four claims, the authors suggest that they have a highly modular system. However, the evaluation section, in my opinion, fails to convincingly demonstrate or back up these claims.
Let's start with the modular system claim. The authors cite AIDA, TagME, and Babelfy as examples of static, non-adaptable, non-modular systems. For instance, for AIDA one would need to change the code to include a new KB. I see that as an engineering change and not an algorithmic one, so I don’t necessarily buy the argument that it is not knowledge base independent. Similarly, for Babelfy the authors claim that a linear formula can't be added. Why would one want to do that?
From an engineering, system architecture, and design point of view, yes, ADEL is modular. I want to compliment the authors for the effort and the design. But I see that as a system contribution and not as a research contribution in the context of this journal.
If I apply the authors' own criterion, that a system requiring code changes is somewhat static, even ADEL is not completely modular. The Overlap Resolution module uses a manual mapping between different types (LOC, Place, etc.) across different NER systems. If one were to add a new NER system or a new type, I see a "code change" being required in ADEL. The index construction also seems to be a somewhat manual process, with the end user or developer required to construct the SPARQL queries that generate a new index for a new KB (including the PageRank computation).
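To make the point about the type mapping concrete, here is a minimal sketch under my own assumptions of how such a mapping might look. The extractor names, the labels, and the `normalize_type` helper are hypothetical and are not taken from ADEL.

```python
# Hypothetical illustration only: the extractor names, labels, and helper
# below are assumptions made for this review, not ADEL's actual code.
TYPE_MAPPING = {
    "stanford": {
        "LOCATION": "Place",
        "PERSON": "Person",
        "ORGANIZATION": "Organization",
    },
    "other_ner": {
        "LOC": "Place",
        "PER": "Person",
        "ORG": "Organization",
    },
}


def normalize_type(extractor: str, label: str) -> str:
    """Map an extractor-specific label to a shared type, or fail if unmapped."""
    try:
        return TYPE_MAPPING[extractor][label]
    except KeyError:
        # An unknown extractor or label cannot be resolved automatically:
        # someone has to edit TYPE_MAPPING by hand.
        raise KeyError(
            f"no mapping for {extractor!r}/{label!r}; the table must be edited"
        ) from None
```

With a table like this, labels from known extractors resolve to a shared type, but plugging in a new extractor or a new label fails until a developer extends the table, which is exactly the kind of code change the authors hold against the other systems.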
Let's consider the textual independence claim. The authors did evaluate the system across several benchmarks that cover different genres of data, from news-like articles to tweets. However, I attribute the textual independence to the system architecture rather than to the algorithm. The architecture allows one to put together a different pipeline for each genre of data, e.g., choosing an appropriate POS/NER model for tweets and another one for news data. The system is well designed for a "bring your own recognition system plus linking system" approach. I don’t see much of a novel research contribution here. The evaluation section clearly shows that ADEL uses Stanford CoreNLP models all the way.
On the subject of language independence, the authors do admit that they haven't achieved it and point to the fact that NER/POS models for different languages can be added to the system (which again is a nice architecture contribution …). I recommend that the authors look into work on cross-lingual and multilingual embeddings and their applications in building truly language-agnostic models (including NER models).
Knowledge base and entity type independence are more a matter of system design. In theory, most of the related work cited in this paper can be engineered to work with different KBs and entity types.
The evaluation section should focus on backing up the claims. For instance, there is too much emphasis on "recognition". I see those results more as a reflection of the performance of the CoreNLP models (and their different combinations) than as evidence for the contributions. ADEL, being an Entity Linking system, has yet to achieve satisfactory results in the linking process, as the authors themselves admit (Tables 9, 10, 11, 14 to cite a few examples). The ADEL linker formula needs more work, and the authors have pointed to promising next steps.
I reviewed the authors' responses to the previous reviews and I agree with Reviewer 3. I would consider this paper a good systems paper and not a full research paper. I feel the novel contributions (model combination, the overlap resolution module, and indexing) are not strong enough to warrant a full research paper. It is also unclear to me what is new in this paper and what was previously published.
I definitely recommend this paper to be resubmitted as a systems paper.
(A small gripe: in Section 6.2, please reference the appropriate table numbers in the subsections; this will make it easier to read.)