Automatic Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods

Tracking #: 1083-2295

Heiko Paulheim

Responsible editor: 
Philipp Cimiano

Submission type: 
Survey Article
In recent years, different web knowledge graphs, both free and commercial, have been created, with DBpedia, YAGO, and Freebase being among the most prominent ones. Those graphs are often constructed from semi-structured knowledge, such as Wikipedia, or harvested from the web with a combination of statistical and NLP methods. The result is a set of large-scale knowledge graphs that try to make a good trade-off between completeness and correctness. In order to further increase the utility of knowledge graphs, various refinement methods have been proposed, which try to infer and add missing knowledge to the graph, or identify erroneous pieces of information. In this article, we provide a survey of such knowledge graph refinement approaches, with a dual look at both the methods being proposed as well as the evaluation methodologies used.
Major Revision

Solicited Reviews:
Review #1
By Natasha Noy submitted on 02/Jun/2015
Major Revision
Review Comment:

This manuscript was submitted as 'Survey Article' and should be reviewed along the following dimensions: (1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic. (2) How comprehensive and how balanced is the presentation and coverage. (3) Readability and clarity of the presentation. (4) Importance of the covered material to the broader Semantic Web community.


The paper presents a survey of automatic approaches to refining knowledge graphs. I think this paper has a potential to be exactly what SWJ expects from survey papers but it will require some major revisions for that to happen. Given that there are hardly any survey papers on knowledge graphs, and given that this survey can be quite broad, I sincerely hope that the author takes the time to address the revisions.

I think the discussion section is particularly strong and it is rare to see a survey paper that really focuses on analyzing the results and not just presenting them.

There are a couple of major concerns that I have with the current version.

First, I think that the focus on only automatic means is somewhat artificial and diminishes the value of the survey. There are not that many non-automatic means (anything beyond crowdsourcing?), so why not include all types of refinement for a really comprehensive survey? I would understand limiting the scope if expanding beyond the automatic means would add a lot more works to consider and would really dilute the focus. But I really don’t think this is the case here.

Second, the author makes a big emphasis on the fact that the paper is about refinement vs construction. I have to admit that I have trouble seeing where one ends and the other begins. In particular, the core part of refinement that the author considers at length, mainly completion, comes very close to construction. I am not suggesting expanding the survey to the construction methods as well, but I would like to see a more clear delineation.

Third, the paper never really defines what a Knowledge Graph is. I appreciate that there may not be a very straightforward definition, yet, it is hard to argue that the survey is comprehensive without defining what it operates on. I think this problem is compounded by the fact that some of the works included in the survey operate on all kinds of artifacts (e.g., “several small ones” for [58], WordNet for [76], [54], etc.). Please define what a knowledge graph is -- and what it is not. And then only include the works that operate on the artifacts that satisfy this definition.

Finally, the paper never mentions several large knowledge graphs -- the ones on which, admittedly, there has been much less research; yet, a survey on knowledge graphs is incomplete without their mention: Facebook, Microsoft, and Yahoo! all have their own knowledge graphs and have published at least a bit on them.

Roi Blanco, B. Barla Cambazoglu, Peter Mika, Nicolas Torzec, Entity Recommendations in Web Search, ISWC 2013
Nilesh Dalvi, Ravi Kumar, Bo Pang, Raghu Ramakrishnan, Andrew Tomkins, Philip Bohannon, Sathiya Keerthi, and Srujana Merugu, A Web of Concepts, ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2009
Thomas Lin, Patrick Pantel, Michael Gamon, Anitha Kannan, and Ariel Fuxman, Active Objects: Actions for Entity-Centric Search, WWW 2012

Now, a few more detailed comments.

There is a comment at the very top of page 2 that decoupling knowledge graph construction and refinement allows for developing methods to refine arbitrary knowledge graphs. I would soften that statement as it has clearly not been the case so far, and, frankly, I am not sure this is really a fact.

There is a point towards the middle of the second column on p.2 that says that Google Knowledge Graph uses knowledge harvested from Google+. Please provide a citation (I don’t believe this is true, actually).

On page 3, there is a reference to a “genuine semantic web knowledge graph”. Can you explain what you mean by this?

Table 1:

* Please describe what the columns mean; there is no reference in the text to what “relationship types” are, for example; it is certainly not the same as “entity types”, and I realized only later that it is the number of properties.
* I was trying to understand where the data came from, and I think the footnotes on p.3 don’t account for it. For instance, it is unclear where the number of entity types and relation types for the Google Knowledge Graph comes from.
* The table introduces Knowledge Vault, which is hardly discussed or mentioned elsewhere.

Section 3, “Targeted kind of information” -- it took me a while to understand what this means. Even a simple re-phrasing to “Type of structures targeted” might help.

Knowledge graph as a silver standard: I could not quite understand the explanation. What is being compared at the end, in particular if the graph is not being broken up into testing and training set? Maybe giving an example of the way the errors are identified could help?

Maybe use “Retrospective evaluation” instead of less commonly used “Ex post evaluation”?

I found the organization of Section 5 a bit odd. While you introduced the dimensions of comparison in Section 4, you used yet another dimension -- computational approach -- to organize your survey. At the very least, it seems like this dimension should also be included. But I also wonder if it would be useful to choose another dimension, such as the type of information targeted for organizing this section.

Section 6.1.4: Neither of the two references mentioned deal with knowledge graphs -- at least in a sense that the authors seem to imply the definition of a knowledge graph. Unless there are more relevant works that do, I think this should be dropped.

Review #2
Anonymous submitted on 08/Jul/2015
Minor Revision
Review Comment:

(1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic.

The article reviews methods for the automated refinement of knowledge graphs.
However, the article does not actually have a tutorial style. For instance, it does not clarify the difference between a knowledge graph and a DL ontology, although some notions from the DL literature are used within the text. Only readers already acquainted with the basics of knowledge representation for the Semantic Web can grasp these references. So, the article can only partially be considered an introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic.

(2) How comprehensive and how balanced is the presentation and coverage.

The article covers refinement methods aimed at completion or at error detection in knowledge graphs, the two cases being complementary, as the author emphasizes. For each of these cases, the presentation distinguishes between internal and external methods. The survey also covers currently popular graph-based datasets and evaluation methodologies for knowledge graph refinement. Overall, the presentation and coverage are well-balanced. However, the work on DL learning is relevant to the covered topic and should be at least mentioned.

(3) Readability and clarity of the presentation.

The presentation is readable and clear. Only a few typos and linguistic flaws occur in the text (including the bibliography). I recommend that the author carefully proofread the article or have it proofread by a native English speaker.

(4) Importance of the covered material to the broader Semantic Web community.

The covered material is definitely important to the broader Semantic Web community due to the increasing interest in graph-based datasets such as Linked Open Data sources.

(5) Further comments.

The sentence starting Section 6.1.2 is unfortunate and contains an arguable claim.

Automated reasoning was not invented by Semantic Web researchers (!). I recommend replacing the bibliographic reference [45] with a more appropriate one.

Review #3
Anonymous submitted on 14/Jul/2015
Minor Revision
Review Comment:

This paper presents and analyses knowledge graph refinement approaches, in particular the methods being used as well as methodologies which are used to evaluate those refinement methods.

(1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic.

The paper is interesting and well written. Nevertheless, I think that some improvements and clarifications are necessary.

The survey is about knowledge graphs, but from what I know, there is no real definition of what a knowledge graph is; moreover, the term usually refers to Google's knowledge base. Based on that point, the relationship to ontologies should be explained.
Moreover, if we think about large ontologies when talking about knowledge graphs, there are already some surveys about ontology learning, so it might be interesting to highlight some complementary aspects and/or differences.

While everybody knows about gold standards, the term "silver standard" is not a common one, so maybe it should be briefly explained.

(2) How comprehensive and how balanced is the presentation and coverage.

I think that the topic is covered extensively. I'd only add some additional work, such as:

Pattern Based Knowledge Base Enrichment by Lorenz Bühmann and Jens Lehmann @ ISWC 2013

Universal OWL Axiom Enrichment for Large Knowledge Bases by Lorenz Bühmann and Jens Lehmann @ EKAW 2012

Bootstrapping the Linked Data Web by Daniel Gerber and Axel-Cyrille Ngonga Ngomo in 1st Workshop on Web Scale Knowledge Extraction @ ISWC 2011

Real-time RDF extraction from unstructured data streams by Daniel Gerber, Axel-Cyrille Ngonga Ngomo, Sebastian Hellmann, Tommaso Soru, Lorenz Bühmann, and Ricardo Usbeck @ ISWC 2013

(3) Readability and clarity of the presentation.

The paper is clear and well written.

Some minor issues:

* p4 "...for a corrections tasks," -> "...for corrections tasks," or "...for a corrections task,"
* p5 "Completion of knowledge graphs aims at increasing the coverage is the goal of knowledge graph completion." -> last part ("is the goal...") should be removed
* p5 "The train a sensor..." -> "They train a sensor..."
* p6 "the use of association rule mining to find property chains" -> needs to be rephrased
* p6 "...with different different..." -> duplicate
* p7 "For the building Knowledge Vault, ..." -> "For building the Knowledge Vault, ..."
* p9 "Although not using established any standard outlier detection algorithm,..."
* p9 "...why whey were..." -> "...why they were..."

* reference 36 is broken

(4) Importance of the covered material to the broader Semantic Web community.

The material covered in this paper is quite important for the Semantic Web community, in particular for the Machine Learning sub-community.