Review Comment:
This article, “Approaches, methods, metrics, measures, and subjectivity in ontology evaluation: A survey“, provides a survey of the literature on the topic of ontology evaluation. The paper attempts to survey approaches, metrics and subjectivity in ontology evaluation. Unfortunately, in its current state, this paper does not read like a survey at all but rather like a collection of a few papers randomly put together, and it thus lacks both breadth and depth. I provide my critique on each of the four criteria on which this paper was reviewed.
(1) Suitability as introductory text, targeted at researchers, PhD students, or practitioners, to get started on the covered topic.
This paper is not at all suitable as an introductory text to get started on the covered topic. There is no clear overview of the concepts regarding ontology evaluation; several concepts are introduced (such as ontology evaluation, ontology reuse, ontology integration) but none are studied in detail. First of all, I think a brief description of the basic concepts of ontologies should be provided, such as what an ontology is, what ontology evaluation is, etc. This would help anyone new to the field gain at least enough background knowledge to follow the rest of the paper. Then a consolidated view of the concepts should be provided, since currently there is a lot of overlap, which causes confusion. For example, sometimes the authors write “metrics”, other times “measures”, “criteria”, or “term”. Is there any difference between them? Another example is the use of the words “method” and “approach” - what is the difference? This also applies to the title of this paper. I would keep the terminology consistent.
More importantly, I do not regard this paper as a “survey” at all, because a survey is usually conducted for several reasons [1], such as: (i) the summarization and comparison, in terms of advantages and disadvantages, of various approaches in a field; (ii) the identification of open problems; (iii) the contribution of a joint conceptualization comprising the various approaches developed in a field; or (iv) the synthesis of a new idea to cover the emphasized problems. This paper neither summarizes nor compares the various approaches - they are just mentioned throughout the paper or very superficially referenced and clumped together. Although the authors attempt to identify “limitations” (or “open problems”, so to speak) of the current approaches, these are also not clearly explained and are scattered throughout the paper, which makes them difficult to identify. Table 3 does attempt to provide an overview of the approaches, but only one paper is cited, which is itself a survey of ontology evaluation techniques from 2005. What about all the other papers/approaches newly proposed in the literature? There is no real attempt to unify or summarize these findings. Only in the “limitations” section are there some more references, which should actually appear in the main part of the paper. Even there, it is only mentioned how many metrics exist, rather than discussing in detail which metric is applicable in which situation and how it is measured. Thus, I do not see much scientific contribution in this paper. Right now, it reads like the related work sections of several papers loosely put together.
I provide a detailed list of my critique for specific parts:
Title and Conclusion
- The title is “Approaches, methods, metrics, measures, and subjectivity in ontology evaluation”. As I already mentioned, what is the difference between “approaches” and “methods” and between “metrics” and “measures”?
- Then in the conclusion, you only mention “approach, methods and metrics”. It is again confusing that you do not mention subjectivity here; it is only loosely mentioned in the last paragraph. Is this a contribution or not?
- Moreover, there is a lot of repetition in the conclusion section - you mention ontology quality and correctness, as well as the focus of this paper, twice.
- Also, why provide an explanation of what “correctness” is only in the conclusion?
- Please just be consistent with what are the contributions of this paper.
Abstract
- What do you mean by gap analysis? Did you do this? Right now it is quite unclear.
Section 1
- In general, the importance of ontology evaluation is rather weakly motivated. I suggest providing a stronger argument.
Section 2
- “An interesting definition of ontology evaluation has be given by Gómez-Pérez et al. [17] and later echoed by Vrandecic et al. [40].” -> It would be helpful to actually provide the definition here rather than the reader having to look up the references for it.
- I would strongly suggest even providing the definitions/explanations for each of the concepts here, such as what an ontology is, what ontology evaluation is, what ontology reuse is, etc., so as to make this paper stand by itself as an introductory text for anyone new to this field. Provide examples and/or references for each as well.
- After ontology verification in Section 2.1, you suddenly talk about ontology integration and merging - what is the connection? Then you introduce ontology reuse. I find it hard to follow the structure of this section.
- Is the difference between ontology integration and merging introduced by the authors? If not, please add a reference to where you got it. Also, it is not entirely clear to me what the difference is. Maybe adding an example (perhaps a visual one) would help. To me, ontology integration sounds more like ontology expansion.
Section 3
- Again, either clarify the difference between metrics and measures or use a single term throughout the paper.
- All the papers are mixed up here - adding bullets and an overview table would clearly help.
- Also, the Section 3.1 heading mentions “State of the art” - isn’t this whole paper supposed to be a “state of the art”? Furthermore, details of only a couple of papers are provided. Are all of them covered? This point is also covered in my critique under (2).
- Even the three types of metrics of [14] should be further clarified or explained with examples.
- In Section 3.2, why does your work “perceive” ontology evaluation from these perspectives? Isn’t this supposed to be a survey of existing literature? Aren’t there any more perspectives already defined in the literature? Please provide references.
- For ontology quality, is there only one paper that discusses it?
- Provide examples for internal and external attributes.
- Figure 1 seems rather vague as a depiction of “ontology evaluation”; it probably only tries to show what an ontology is. I would remove it or significantly improve it so that it clearly depicts the detailed steps of ontology evaluation.
- This section leads me to wonder what exactly the difference is between “quality” and “correctness”. I would think “quality” is the general overall concept and “correctness” is one of the data quality dimensions.
- Also, it is mentioned “These scenarios necessitate the need for separation of these concerns and advocates for separate determination of each.” I would expect the authors to do this sort of analysis in this paper!
- In Table 1, I do not clearly see the four layers of the suite; I only see two evaluation perspectives and seven metrics. Even within those, there is overlap among the measures, such as “coverage”. Also, is there no measure for “Conciseness”?
- First you mention “This metric suite is reminiscent of Button-Jones et al. [6]’ metric suite for ontology auditing.” and later you say “This metric suite has been largely based on Vrandecic [41]’s evaluation criteria and Button-Jones [6]’s metric suite.” Which one is it? Then you mention the eight criteria proposed by [41]. Then you say it is not an exhaustive list, so I am not sure what exactly your contribution is here. Why don’t you provide a list of all those that you have actually found in those papers? If you then add more of your own, clearly mark which ones come from previous literature and which ones are newly introduced.
- I would put Table 2 before Table 1 - first provide the definitions of the metrics and then how they can be measured. It would also be helpful to map the terms/metrics in Table 2 to the “evaluation perspective” (from Table 1).
- Also, the definitions in Table 2 do not seem very clear; for example, for Accuracy, is the second sentence part of the definition? The same applies to Cohesion.
- The terms/metrics provided in Table 2 also seem to overlap - such as “coupling” and “cohesion”, or “coverage” and “completeness”. Either merge them or show how they are inter-related.
- Also, only a few of the definitions have references; are the others introduced by the authors?
- Ideally, each of the terms/metrics in Table 2 should also be provided with a measure. For example, how is Completeness measured under an open-world assumption (OWA) versus a closed-world assumption (CWA)?
Section 4
- The caption of Table 3 is “An overview of approaches to ontology evaluation as related to the aspects (layers) of an ontology.”, but in the text it is mentioned that the table is only a comparison between two classifications, and even that is taken from one paper. In fact, the table is a replica of the table in [2], so I do not see the reason to reproduce it here.
- Provide more details about the classifications - add examples, add more references - name/highlight the types.
- In Sections 4.1 to 4.3, why are only the different types of the third classification discussed? What about the other classifications?
- I appreciate that you try to identify limitations of each of the classifications discussed in the last paragraph of each section, but then there are limitations in Section 5 too. I would provide a single separate section at the end instead.
- Doesn’t user-based evaluation also involve crowdsourcing-based tasks/evaluation? I provide some references later on which are missing and should also be considered as part of this survey.
Section 5
- Isn’t subjectivity an important core contribution of your paper? Why does it appear only in the “Limitations” section? Is “subjectivity” the only limitation in ontology evaluation? I would make “Subjectivity in ontology evaluation” a section by itself and then gather all the limitations (including the ones from Section 5) in another separate section called “Open Problems”.
- I did not entirely understand 5.2 Subjectivity in thresholds. It reads rather vague; providing an example might help.
- What about those tools which automatically assess the quality of ontologies? Where does “subjectivity” come in the picture then?
[1] Barbara Kitchenham. Procedures for Performing Systematic Reviews. Department of Computer Science, Keele University, 2004.
(2) How comprehensive and how balanced is the presentation and coverage.
Even though the authors cite several papers that propose a methodology for ontology evaluation, it is unclear how many are actually surveyed. After reading this paper, I was left confused as to which and how many ontology evaluation methods are actually surveyed, and which and how many methodologies are actually available for ontology evaluation. Thus, it is difficult for me to judge whether the paper is comprehensive enough unless I look up these papers and compare the references individually myself. At times it seems that this survey is based on only two works - Burton-Jones et al. and Denny’s thesis, which are repeatedly mentioned throughout most of the paper, whereas the others are only mentioned in the limitations section. In this regard, the paper is rather imbalanced. On that note: having written a survey paper myself, I know that citing a thesis is often strongly criticized, so I would recommend referring to the papers that the thesis itself refers to rather than only the thesis, since a thesis is not peer-reviewed work. In any case, I strongly recommend that the authors dive deep into each of the papers they cite and provide a summarization and comparison of the various approaches. In its current state, the paper reads like random musings on papers the authors seem to have chanced upon, as opposed to an actual survey.
I am aware of Denny’s thesis, published in 2010, which already covers a number of frameworks for ontology evaluation. The authors also cite another survey on the same topic from 2005. I also found the book chapter [2], published in 2010, which compares different ontology evaluation techniques and also provides a tool. This raises several questions: Did you include all of the papers that they cited? How many more papers have been published since the publication of this thesis? Was it a significant number of papers? Did you include all of those too? What methodology did you follow to search for the papers? What inclusion/exclusion criteria did you use to select the papers that you mention? Please provide a table listing all papers related to ontology evaluation that are included in this survey. Such a table would make it much easier for a reader - a researcher, PhD student, or practitioner - to get started on the covered topic.
I am also surprised that the authors do not mention any ontology evaluation tools, which should be an important part of this survey considering there are several tools available to perform this assessment. Please provide an overview of the tools and a comparison of them. I have already identified some tools for ontology evaluation and provide references later in my review, but the authors should make an effort to look for all the ones currently available if they plan to include them in their survey.
[2] Samir Tartir, I. Budak Arpinar, Amit P. Sheth. Ontological Evaluation and Validation. Theory and Applications of Ontology: Computer Applications 2010, pp 115-130
(3) Readability and clarity of the presentation.
The paper is poorly structured and quite difficult to follow. The sentences are unnecessarily complex and repetitive. Also, several statements are unaccompanied by references, which leaves the reader wondering whether they are really true or are speculations by the authors. I have provided an incomplete list of some of the formal errors that I encountered at the end of my review. However, I strongly recommend the authors to proofread the paper and have a native speaker or a third person also read it to ensure clarity in the presentation.
It is extremely important for a survey paper to provide a general overview of the existing approaches (be it in the form of a figure/tables) so that the reader can easily refer to the paper to choose any one for their purpose. As of now, it is all very chaotically put together into one paper. I think the authors should really ensure that the text is clearer and tighter in the next iteration.
(4) Importance of the covered material to the broader Semantic Web community.
The topic of ontology quality/ontology evaluation is indeed an important topic for the Semantic Web community. However, this paper, in its current state, is far from providing an overview of all the existing approaches proposed so far in the literature. I think the authors should make an effort to investigate the existing approaches further. They do already reference several approaches in the paper, but these are scattered all over the place, so I would suggest that they look into each one in detail and clearly show the reader which paper to refer to in order to evaluate which specific metric(s).
Here are my recommendations to the authors:
- Unify the concepts regarding ontology evaluation - definitions, formulae, etc.
- At a higher level, provide an explanation of the importance of ontologies, ontology quality, ontology evaluation, etc.
- Clearly identify which research questions you aim to answer with this survey (set the boundary) - ontology evaluation, ontology integration/matching, ontology reuse. For each of these, then there are several papers that tackle each area.
- Use consistent terminology
- Provide a quantitative and qualitative overview
- Provide an overview table showing which approach covers which category, which metrics it uses, and which tools it provides
- Extract the (common) steps and metrics involved in each of the approaches and/or compare the different approaches qualitatively
- Look at each criterion/metric in detail, discuss how it can be measured, and provide references to which papers measure which metric
- Provide an overview of the tools - even perhaps compare the tools based on certain criteria (e.g. https://en.wikipedia.org/wiki/Non-functional_requirement) or actually apply them to an example ontology and evaluate it
- I think some important and interesting aspects of ontologies also need to be explored, such as “evolution of ontologies and their evaluation”, “domain specific ontology evaluation”, and “crowdsourcing ontology verification” (I provide a few references related to these)
- It is great that you provide limitations, but I would rather see them framed as challenges, presented only as bullet points or clear paragraphs without any references, so that they pave the way for new research on this topic. Right now they are all clumped together with the existing literature, so it is difficult to identify these challenges clearly. This could be a significant contribution.
I think even the addition of a few of these recommendations can significantly help improve the quality of this paper and its contributions.
Here is a short list of papers that should be looked into either to gather additional references or to consider adding the papers themselves:
- Aruna, T.; Saranya, K.; Bhandari, C., "A Survey on Ontology Evaluation Tools," Process Automation, Control and Computing (PACC), 2011 International Conference on, pp. 1-5, 20-22 July 2011
- Sara García-Ramos, Abraham Otero, Mariano Fernández-López. OntologyTest: A Tool to Evaluate Ontologies through Tests Defined by the User. Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living. Lecture Notes in Computer Science Volume 5518, 2009, pp 91-98
- Peroni, Silvio, David Shotton and Fabio Vitali. "Tools for the Automatic Generation of Ontology Documentation: A Task-Based Evaluation." IJSWIS 9.1 (2013): 21-44. Web. 1 Aug. 2014. doi:10.4018/jswis.2013010102
- Fernandez, Miriam; Cantador, Iván and Castells, Pablo (2006). CORE: a tool for collaborative ontology reuse and evaluation. In: 4th International Workshop on Evaluation of Ontologies for the Web (EON 2006), 23-26 May 2006, Edinburgh, UK.
- Jonathan M. Mortensen, Paul R. Alexander, Mark A. Musen, and Natalya F. Noy. Crowdsourcing Ontology Verification. The Semantic Web – ISWC 2013. Lecture Notes in Computer Science Volume 8219, 2013, pp. 448-455
- Cristina Sarasua, Elena Simperl, and Natalya F. Noy. CROWDMAP: Crowdsourcing Ontology Alignment with Microtasks. ISWC 2012
- Xi Deng, Volker Haarslev, and Nematollaah Shiri. Measuring Inconsistencies in Ontologies. ESWC 2007
- Samir Tartir, I. Budak Arpinar, Michael Moore, Amit P. Sheth, Boanerges Aleman-Meza. OntoQA: Metric-Based Ontology Quality Analysis. IEEE Workshop on Knowledge Acquisition from Distributed, Autonomous, Semantically Heterogeneous Data and Knowledge Sources 2005
- Liping Zhou. Dealing with Inconsistencies in DL-Lite Ontologies. The Semantic Web: Research and Applications. Lecture Notes in Computer Science Volume 5554, 2009, pp. 954-958
- Peter Plessers, Olga De Troyer. Resolving Inconsistencies in Evolving Ontologies. ESWC 2006
- Survey on vocabulary and ontology tools. A Semicolon project deliverable. Version 1.0. http://www.semicolon.no/wp-content/uploads/2013/09/Semicolon_Vocabulary-...
- D1.2.3 Methods for ontology evaluation. Knowledge Web. http://www.starlab.vub.ac.be/research/projects/knowledgeweb/KWeb-Del-1.2...
As a side note, I would strongly recommend the authors to look at other accepted surveys in the same journal or even elsewhere to get an idea of what a survey should essentially contain.
Here is an incomplete list of formal errors that I encountered:
Abstract
- ascertain -> ascertaining
- web at large has been has been -> web at large has been
- deciding one the suitability -> rephrase
1 Introduction
- is a emerging field -> is an emerging field
- “It first gives a context to ontology evaluation by defining the notion of ontology evaluation (Section 2.1) and discussing ontology evaluation in the context of ontology reuse as an example scenario for the role of ontology evaluation (Section 2.2).” -> too much use of “ontology evaluation” which makes this sentence difficult to interpret
- Why is Section 4 mentioned before Section 3 in the last paragraph?
- You forgot to mention Section 5
2.1 A definition
- I would rename this section
- determining which in a collection of ontologies would -> determining which, in a collection of ontologies, would
3.1 Ontology evaluation metrics: State of the art
- ( are also -> (are also
3.2 Ontology evaluation measures: Perspective, criteria, metrics
- Ontology quality perspective -> Ontology quality perspective.
- Ontology correctness perspective -> Ontology correctness perspective.
- ( the model) -> (the model)
- and get a high score -> and gets a high score
Table 1
- precision -> Precision
Table 2
- determining is the asserted -> determining if the asserted
- if its its classes -> if its classes
3.3
- Verendicic -> Vrandecic
4.1 Gold standard-based evaluation
- ( the target -> (the target
4.2 Application or task-based evaluation
- set of concept, -> set of concepts,
4.3 User-based evaluation
- two type of -> two types of
- will be give a weighted value depend on -> will be given a weighted value depending on
4.4 Data-driven evaluation
- how appropriate -> how appropriately
5.1 Subjectivity in the criteria for evaluation
- Keep the heading consistent with that mentioned in the previous paragraph - (i) subjectivity in the selection of the criteria for evaluation
- Ontology evaluation can be regarded over several different decision criteria. -> rephrase
5.1.1 Inductive approach to criteria selection
- inductive in that a -> inductive, in that, a
- (i) Number of Root Classes (NoR) - Number of root classes explicitly defined in an ontology and (ii) Number of Leaf Classes (NoL) - Number of Leaf classes explicitly defined in an ontology - remove the repetition
- Burton-Jones et al. [6]’s -> I find this way of citation strange. Either say “In [6], the authors...” or “Burton-Jones et al. [6] propose…”. Also keep it consistent, sometimes it is “The work of [17]...” or “Button-Jones et al. [6]’ metric” - use one form of citation. This applies to all such references.
- Also, there are several sentences which are not backed by any reference and/or need more details, such as:
-- While this is attractive, it presents a challenge in deciding which ontology to reuse (as they are reusable knowledge artefacts) and hence the topic of ontology evaluation. -> the sentence structure is very weak and needs references.
-- For example, given an algorithms ontology, one may introduce another class or type of algorithm… -> add reference
-- A considerable number of ontologies have been created. -> add some examples
-- The relevance of ontologies as engineering artifacts has been justified in literature. -> add reference
-- Most research on ontology evaluation considers ontology evaluation from the ontology quality perspective. -> add reference
-- In Table 2, only some of the definitions have a reference and others do not. Are the ones without a reference contributed by the authors?
- Also check the capitalization of certain words in the references e.g. oops! -> OOPS!