Review Comment:
The paper presents a new ontology for describing greenhouse gas emission data, bridging a number of existing formats in the area. It addresses an interesting and pressing problem, which is very real for the many companies struggling to meet reporting standards and assess their environmental impact under current regulations and guidelines. While this is a practical contribution, I currently do not see a sufficiently clear research contribution in the paper; both the discussion of the state of the art and the evaluation of the proposed artefact lack depth. Alternatively, this could be converted into an ontology paper, shifting the focus from scientific contributions to a practical, reusable resource. However, even with that in mind, both the related work and evaluation sections would need considerable improvement.
More specifically, regarding the journal's evaluation criterion (1), originality, I do not see the original research contribution, i.e. the new knowledge that this paper contributes to the community. I think that the ontology architecture presented, i.e. "bridge" ontologies as mapping tools for expressing connections and transformations between standards and formats, is slightly novel - not as a proposal (many research efforts claim they will do this), but as an actually completed project, where three different formats have been aligned through such a bridge ontology. However, the paper does not take advantage of this novelty, and simply focuses on the resource itself rather than on what we can learn from this effort.
This brings us to review criterion (2), significance of the results, which is unfortunately also low in the current form. Following from the discussion above, the actual scientific results are minimal (most results are practical, i.e. an artefact); hence, the new knowledge gained is not significant, and the paper will not substantially help other researchers build on it in the future.
Regarding criterion (3), quality of writing, the paper is reasonably well written in terms of language and clarity, but it is missing several crucial parts of the "storyline", as already mentioned.
More specifically, starting with the related work section: it covers a broad range of topics, from datasets for sustainability to data models for LCA. For each such topic, however, the discussion is brief, and the account of state-of-the-art work does not seem complete. If a selection has been made, it is not clear why particular work was included and other work left out. For instance, knowledge graphs for sustainability have been a topic for several years, and many datasets have been published, including efforts from Google and other large organisations, linked governmental data, etc. There is also the whole area of linked energy data, which is not even mentioned. Either this section has to be considerably extended, or perhaps the title should be changed - is it really relevant to survey and contrast against all kinds of sustainability efforts? On the other hand, some parts seem incomplete, such as the fact that the PACT formats are not even mentioned in this section, although this work is acknowledged in the introduction as the main (if still emerging) alternative, i.e. standardisation instead of mapping between many other formats.
Next, from sections 3 and 4 it is quite unclear what WISER is supposed to be. The title of section 3 suggests that it is the KG that is called WISER, while the title of section 4 reads as if that section will present the ontology of the KG. However, from the content of the sections it seems that the ontology is already described in section 3 (subsection 3.3), which raises the question of what section 4 is actually about. Or are they different ontologies? Does section 4 present the development methodology and further details of the same ontology described in section 3.3, or are these actually different things? If they are indeed the same, then I would suggest starting with section 4, describing the methodology and the details of the ontology, and only after that showing how it can be used to represent data and map between the standards.
There are also many unclear points in the development process: What is actually the scope of the bridge ontology? And of WISER? Does it only bridge the geographical aspects of these formats, or are the geographical requirements and mappings just an example? Several of the figures illustrating examples are not sufficiently explained, neither in terms of notation (e.g. what do the arrows between classes and properties indicate in Figure 3? Domain and range restrictions? And what about the dashed "mapping" lines in Figure 4?) nor in terms of their content. It is also not clear why the notation differs between Figures 1 and 3 on the one hand and Figures 4 and 5 on the other.
In section 3.2, the generation of an RDF KG from XML documents is discussed. It is a bit unclear how this fits in with the rest of the paper, which is about the ontology. Does this bring any novelty and scientific contribution, or is it rather part of the use case, i.e. how the ontology can be used? Additionally, the approach is poorly motivated. Why is a translation via Java classes used? Why were declarative mappings, e.g. using RML, ruled out? And what about other kinds of transformation approaches, such as OTTR or OPPL? If this approach is part of the research contribution, it should be backed by related work, its novelty and generalisability should be discussed, choices such as the one mentioned above should be better motivated, and the results should be evaluated and discussed. Figure 2 is also not very clear - what do the two arrows mean? The database sends a database to a generic Java class??
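To make this point concrete: the kind of declarative alternative I have in mind is an RML mapping, where the XML-to-RDF transformation is expressed as data rather than hard-coded in Java. A minimal sketch, assuming a hypothetical XML report format (all file, class, and property names below are invented for illustration, not taken from the paper):

```turtle
@prefix rr:  <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql:  <http://semweb.mmlab.be/ns/ql#> .
@prefix ex:  <http://example.org/wiser#> .

# Hypothetical mapping: one ex:EmissionRecord per <record> element
<#EmissionMapping>
  rml:logicalSource [
    rml:source "emissions.xml" ;
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/report/record"
  ] ;
  rr:subjectMap [
    rr:template "http://example.org/emission/{@id}" ;
    rr:class ex:EmissionRecord
  ] ;
  rr:predicateObjectMap [
    rr:predicate ex:co2eAmount ;
    rr:objectMap [ rml:reference "co2e" ]
  ] .
```

If there is a reason such a mapping cannot capture the transformations needed (e.g. complex restructuring across formats), that would itself be a valuable learning to report in the paper.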
Further, sections 3 and 4 need to focus more on the learnings from this work - what are the challenges in creating bridge ontologies? How were they addressed? What are the cases that could not be covered? Why? What other things can we learn from this?
Finally, the main emphasis of the paper should be on the evaluation - this is where we can really learn something, and where the scientific contribution should be grounded. However, for this to be possible, the evaluation should be described in much more detail. The title of the evaluation section, 4.2.7 "Set of queries", does not really match its content. I would suggest making this its own section, called "Evaluation", with several subsections, e.g. "Evaluation setup", "Evaluation results", "Analysis", etc. While assessing the query performance over the integrated data is an interesting evaluation, actually applying the ontology in its intended use case should also be an essential part. The web application briefly mentioned could be a part of this, but on the one hand the description is far too brief, and on the other hand it is not clear whether it has even been used, e.g. in a real use case, by actual users, etc. And what can we actually learn from using the ontology in this application? How does it make reporting or LCA assessment better? What are the gains? In fact, from the paper it is not even clear what the focus of the application is - what does "data gathering" mean? Entering new data into the system, or accessing and gathering data from different datasets/databases?
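As an illustration of what an "Evaluation setup" subsection could contain: for each competency question, report the corresponding SPARQL query together with its result cardinality and runtime over the integrated KG. A hypothetical example of the shape I have in mind (class and property names invented for illustration):

```sparql
PREFIX ex: <http://example.org/wiser#>

# Hypothetical competency question: total CO2e per country,
# aggregated over records integrated from all three source formats
SELECT ?country (SUM(?amount) AS ?totalCo2e)
WHERE {
  ?record a ex:EmissionRecord ;
          ex:co2eAmount ?amount ;
          ex:reportedIn ?country .
}
GROUP BY ?country
ORDER BY DESC(?totalCo2e)
```

Pairing each such query with the source format(s) it draws on would also show which integration scenarios the bridge ontology actually enables.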
The paper completely lacks an analysis of the results, and a discussion of limitations and implications of the research.
The supplemental material is comprehensive, but not well documented. For instance, the ontologies lack a documentation page for human consumption, and some even lack documentation annotations (e.g. labels and comments) in the OWL files. This makes it difficult to assess the quality of the artefacts themselves.
Minor issues:
- Page 3: "databases that based on a new" -> "that are based on a new"?
- Page 4: "analizing" -> "analyzing"
- Figure 3 - why is one class more orange than the rest? It is also not entirely clear where all the lines go - do they branch out or cross each other?
- What do you mean by TBox-data in section 4.2? Normally, data constitutes the ABox.
- Footnotes are missing on page 9.
- Figure 4: What do the different colors mean? Why are some arrows dashed and some not? What does "mapping" mean technically?
- Don't break Listing 1 across two sides of a figure and across two separate pages.
- Page 11: "grater" -> "greater"