Review Comment:
Summary:
The core work of this article is identifying trustworthy information on social media, a task complicated by several problems, such as target topics containing numerous conflicting claims. The authors present a multi-task learning framework for stance detection and veracity prediction, the Argumentation-based Truth Discovery Model, to discover multiple truths from conflicting sources. Experimental results on Emergent and RumourEval-2019 Task A+B demonstrate the performance of the proposed model.
(1) Originality:
To the best of my knowledge, applying multi-task learning to stance detection and veracity prediction is not a novel idea; many similar works already exist, such as:
https://aclanthology.org/D19-6603.pdf
https://arxiv.org/pdf/2007.07803v2.pdf
https://aclanthology.org/D19-1485/
https://aclanthology.org/C18-1288/
Moreover, the article's main contributions to the knowledge of the SWJ community do not appear significant.
(2) Significance of the results:
The results on two public datasets (Emergent, RumourEval-2019 Task A+B) demonstrate the effectiveness of the proposed methods, and the authors derive nine observations from them. Nevertheless, it is hard to see a significant contribution to the SWJ community, not least because of the limited novelty.
(3) Quality of writing:
This article is not easy to follow, and the quality of the writing is low. In addition to typos (e.g., "target'=" in Section 3.3, line 63) and non-standard mathematical notation, there are many ungrammatical sentences (e.g., Section 1, paragraph 5, lines 1-3, or Section 3.3, paragraph 1, lines 16-18). The article is also not concise in describing the core work.
The article does not provide any publicly available resources (e.g., source code, demonstrations) for replicating the experiments, even though public datasets (Emergent, RumourEval-2019 Task A+B) for stance detection and veracity prediction were used.
The article is lengthy, especially in describing the architectures of the different components of the proposed model. The descriptions and explanations are excessive, for the following reasons:
(1) The descriptions could be replaced with clear architecture diagrams of the respective components, e.g., the clause selection component in paragraph 3 of Section 3.4, the article (relevant clauses), and the claim encoder and decoder in Sections 3.4.1 and 3.4.2.
(2) Parts shared by several components, such as the GRU, should not be described more than once; see paragraph 3 of Section 3.4 and Section 3.4.1.
(3) It is suggested that all of the architecture diagrams in the article be redrawn, since they do not give readers any detailed information about the proposed model and its components in a direct way.
(4) In paragraph 4 of Section 3.4, the authors repeatedly explain the principle of the attention mechanism; similarly, in paragraph 5 of the same section, the explanation of the softmax layer is repeated.
In paragraph 2 of Section 3.2, the authors mention that this work employs a pointer-generator architecture with attention and copy mechanisms to create a claim-target-topic-based generator.
What is a pointer generator with attention, and what is its architecture? A single green box in Fig. 2 and a few lines of text are not sufficient to explain it.
Which copy mechanisms are used? I cannot find any formal description of them.
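For reference, this is the kind of formal description I would expect. In the standard pointer-generator formulation (See et al., 2017), the output distribution mixes generating from the vocabulary with copying from the source via attention; whether the authors use exactly this variant is my assumption:

    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum_{i: w_i = w} a_i^t

Here p_gen in [0, 1] is a learned generation switch computed from the decoder state, P_vocab is the vocabulary distribution, and a^t is the attention distribution over source tokens at decoding step t. A comparable equation should appear in the paper.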
What is "JSP" in Fig. 3? Is it JSD (Jensen-Shannon divergence)?
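If JSD is indeed intended, its standard definition should be stated:

    JSD(P || Q) = 1/2 * KL(P || M) + 1/2 * KL(Q || M), with M = (P + Q) / 2,

where KL denotes the Kullback-Leibler divergence.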
The mathematical notation in this article is neither uniform nor standard, and it is often unclear. For example, in Section 3.1: [h1,…,ht] (paragraph 7); g, j, k (paragraph 8); l, F, j, Fl (paragraph 9); q(k) and alpha(k) (paragraph 10, which do not match equation (3)). This issue persists throughout the article.
The article lacks the most important architecture diagram: that of the multi-task learning setup and the soft-parameter-sharing network. It is suggested that this diagram be added, together with the corresponding formulas.
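As an illustration of the kind of formula expected, soft parameter sharing is commonly formalized (e.g., following Duong et al., 2015) by regularizing the distance between the two task networks' parameters; whether this matches the authors' formulation is an assumption on my part:

    L = L_stance(theta_1) + L_veracity(theta_2) + lambda * sum_l || W_1^(l) - W_2^(l) ||_F^2

where theta_1 and theta_2 are the parameters of the stance and veracity networks, W^(l) denotes the layer-l weight matrices, and lambda controls how strongly the parameters are tied.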
In paragraphs 5 and 6 of Section 3.4, the authors mention the loss function but do not provide its formal definition; please add it. Moreover, the authors state that the model is trained with cross-entropy, yet the loss is computed as the cosine similarity between the target-topic embedding and the hidden state of the t-th clause. How exactly is the model trained? It is suggested that more details be provided.
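To make the ambiguity concrete, these are two different quantities (standard definitions; the authors' exact formulation is unknown to me):

    L_CE = - sum_c y_c * log p_c        cos(u, v) = (u . v) / (||u|| * ||v||)

where y is the one-hot label vector, p the predicted class distribution, and u, v would be the target-topic embedding and the t-th clause's hidden state. The paper should state which of these, or what combination of them, defines the training objective.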
In paragraph 2 of Section 4.5, what is a "target topic aware target-specific based claim"?
The reviewer believes that the paper is not related to the topics of the Semantic Web Journal. This work is out of the journal's scope, since it uses neither existing knowledge graphs (KGs) nor any of the authors' own.