Focused Categorization Power of Ontologies: General Framework and Study on Simple Existential Concept Expressions

Tracking #: 2406-3620

Vojtěch Svátek
Ondřej Zamazal
Miroslav Vacura
Jiří Ivánek

Responsible editor: 
Axel Polleres

Submission type: 
Full Paper

Abstract:
When reusing existing ontologies for publishing a dataset in RDF or developing a new ontology, preference may be given to those providing extensive subcategorization for the classes deemed important in the new dataset schema or ontology (focus classes). The reused set of categories may consist not only of named classes but also of compound concept expressions viewed as meaningful categories by the knowledge engineer, possibly later transformed into named classes in a local setting. We define the general notion of focused categorization power of a given ontology, with respect to a focus class and a concept expression language, as the (estimated) weighted count of the categories that can be built from the ontology’s signature, conform to the language, and are subsumed by the focus class. For the sake of tractable experiments we then formulate a restricted concept expression language based on existential restrictions, and heuristically map it to syntactic patterns over ontology axioms. The characteristics of the chosen concept expression language and the associated patterns are investigated using three different empirical sources derived from ontology collections: first, the frequency of concept expression types in class definitions; second, the occurrence of the heuristic patterns (mapped to the expression types) in the TBoxes of ontologies; and last, the ‘meaningfulness’ of two different samples of concept expressions generated from the TBoxes of ontologies (through the heuristic patterns), as assessed by different groups of users, yielding a ‘quality ordering’ of the concept expression types. The different types of complementary analyses and experiments are then compared and summarized. Aside from the various quantitative findings, we also come up with qualitative insights into the meaning of both explicit and implicit compound concept expressions appearing in the Semantic Web realm.

Solicited Reviews:
Review #1
By Dörthe Arndt submitted on 22/Apr/2020
Major Revision
Review Comment:

The paper presents a method to measure the adequacy of a given ontology for re-use. In that context, the idea of a focus class, a class covering the main interests of the potential user, is introduced together with a formula to quantify the categorization power for the sub-concepts and properties of that class. This quantification also relies on the chosen ontology language, which should be powerful enough to provide meaningful insights but limited enough to allow easy calculations. The remainder of the paper investigates to what extent Simple Concept Expressions, a language supporting only existential quantification, is suited to fulfil that role. The authors tackle that question by investigating the expressions used in actual ontologies and by performing tests with users who were asked to vote on the meaningfulness of the concepts they were provided with. The user tests are then also used to make a first concrete suggestion of how the formula for the categorization power, which relies on language-dependent weights, can be instantiated for Simple Concept Expressions.

While I really like the overall topic of the paper, namely how users can know whether or not an ontology fits their needs, and I also like the idea of having focus classes, I see several shortcomings:

The paper seems to make some assumptions about how an ontology will be used which are never explained but are crucial for understanding it.
The paper is not well-structured. New concepts are not sufficiently explained when they are introduced. The overall structure of the paper is only explained at the end of the paper, not at its beginning.
The authors introduce the concept of “focussed category patterns” which according to them correspond to concept expressions in OWL. I do not see that correspondence (a domain declaration using rdfs:domain is for example not the same as an existential restriction) and it is also not explained in the paper (which could have convinced me). This part needs clarification.
Definitions and descriptions of experiments are not always clear in the paper.

Concrete recommendations to the authors:
Explain how you think an ontology will be used (maybe in an extra section). Do you focus on reasoning or only on querying the results? How complex are the queries you expect?
Restructure the paper so that the parts are more connected and the concepts you use are explained when you introduce them, not later. I hope that this will also shorten the paper a little, since you spend a lot of space explaining the structure.
Approximative patterns: I doubt that the patterns you identified actually express the concept expression types you claim they express, but I also do not have all the information (for example, I do not know the SPARQL queries you use). I would like this part to be discussed in more detail. How do you map, and why do you map that way?
Please be more careful with your formal definitions (for example Definition 3) and explain the notations you use more carefully (for example RHS, LHS).
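To make the recommendation about approximative patterns concrete: the following sketch shows the kind of pattern matching I imagine is behind the domain/range-based p's, on an invented toy TBox (the triples, the names, and the reading of the pattern are my own guesses, not the authors' actual queries, which presumably run as SPARQL over RDF):

```python
# Invented toy TBox as plain (subject, predicate, object) triples.
tbox = [
    ("hasAccident", "rdf:type", "owl:ObjectProperty"),
    ("hasAccident", "rdfs:domain", "Car"),
    ("hasAccident", "rdfs:range", "Accident"),
    ("FatalAccident", "rdfs:subClassOf", "Accident"),
]

def domain_range_pattern(tbox, fc):
    """One possible reading of a domain/range-based pattern: for the
    focus class fc, find properties p declared with rdfs:domain fc and
    read off their range c, yielding the candidate category
    'fc and (p some c)'. Note that the axioms only state
    'exists p.Top subClassOf fc' (domain) and
    'Top subClassOf forall p.c' (range); neither is the existential
    restriction itself, which is exactly why the mapping is only
    approximative and needs justification."""
    domains = {s: o for s, pred, o in tbox if pred == "rdfs:domain"}
    ranges = {s: o for s, pred, o in tbox if pred == "rdfs:range"}
    return [(fc, p, ranges[p]) for p, d in domains.items()
            if d == fc and p in ranges]

for fc, p, c in domain_range_pattern(tbox, "Car"):
    print(f"candidate category: {fc} and ({p} some {c})")
# prints: candidate category: Car and (hasAccident some Accident)
```

If the authors' queries differ from this reading, that is precisely the information the paper should spell out.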

More detailed comments:
Introduction (end): at that point it is not clear what the difference between “concept expression types in ontology axioms” and “syntactic patterns in the TBox of ontologies” is. Please explain it already at that point.
Many of the very useful explanations you give in Section 8 come far too late. I for example only understood the difference between what you do in Section 5 and what you do in Section 6 after having read that section. It would help to better differentiate the different approaches early on and move as many of the explanations of Section 8 as possible already to Section 2.
Related to the previous issue: please clarify the difference between t_1, …, t_4 and p_1, …, p_4. Since the different DL concepts you use to define the t’s can also be represented in RDF syntax, I first thought that you meant these RDF-OWL representations when you spoke of your patterns, and that made it difficult to distinguish between the t’s and the p’s. It would help if you emphasized the difference.
Likewise, I still don’t know whether the students were confronted with the t’s or the p’s; please clarify.
Definition 3: which role does the restriction play in the definition? Do you only replace the placeholder variables by elements of the signature, or do you also allow recursive applications? From the definition I understood that no recursion is allowed, but then you mention on page 5 the “possibility of recursively composing property restrictions”. Could you please clarify?
You spend quite some space explaining the structure of your paper (for example at the end of page 6). I think that is very necessary, but I would either change the structure or make the explanations clearer. For example: why does the explanation at the end of page 6 start with section 4 (which is not the next section) and with the source “syntactic patterns” (which is the second item on your “source list”, not the first)? It would be easier for the reader if you either explained the remainder of the paper in the order in which the sections appear, or followed the order of your source list. The way you currently do it causes unnecessary confusion. (Page 6 was only an example; this also happens at other places.)
page 5, definition of FCP: why don’t you put the FCP in a definition? How will you make sure that equivalent concepts are only counted once? To me that seems to be rather difficult and I would like to know how you do that or plan to do that in practice.
end of page 5: D \subset FC -> please mention that you mean a proper subset (only because notations differ).
end of page 5: you say that w could be “accidentally” nonzero? I don’t get the example. Why is it a problem if the classes teacher and student are not disjoint, can you explain?
Please explain the concepts you use already when you introduce them (can be a short explanation). As an example see page 6, example 2, item 2: It is very frustrating to read about “meaningful syntactic patterns” and learn that you will only enlighten us what this concept means in Section 4. This makes your paper hard to understand.
Page 7: weights based on RDF datasets using an ontology -> Do you look before or after applying reasoning? Some classes are very useful for reasoning but will never be instantiated directly.
Page 7 onwards: I don’t always get your use of RHS and LHS, especially because you also use it for equivalences and the equivalence relation is symmetric. If there is a certain way how you expect DL axioms to be written (for example that a named class is always on the RHS when it is equivalent to an unnamed class), then please mention it. Maybe concrete examples could also help here.
Section 2.6. As stated above: the whole idea of FCP patterns needs more explanation. Later it becomes clearer, but here the reader does not really understand in what sense your patterns are approximate. Additionally, I have the feeling that you assume a specific way of using the ontology, which is fine, but you should explain what that use is. Later, you briefly mention SPARQL queries. Do you want to use the ontology for querying? I would also bring a very short example of a pattern already here instead of simply referring to section 4.
page 8 equation 3: “some specific conclusion” -> I would expect that to be better specified in the formula/definition itself instead of referring to the text above.
page 8, your remark about the universal restriction and the open world assumption: I understand that universal restrictions cannot easily be validated, but it is new to me that we want to validate at all. Reading that part makes me think that you should spend a section on how you expect an ontology to be used. Apparently you want to do complex SPARQL querying on top, and you want to do validation (whether we should even do validation with OWL, which is rather made for reasoning, is a separate discussion, but if that is what you want and you clearly state it, I am fine with it).
page 9, beginning: in addition, such instances have their class already defined… -> I don’t understand your comment, please clarify.
page 9, table 1 and text: Please provide the SPARQL queries you use; without them it is rather difficult to see the relation between your patterns (p’s) and the concept expressions (t’s). Please also consider adding more explanations to the caption of the table.
page 10, patterns: please keep in mind that the patterns you understand as restrictions are also used for reasoning. Assume for example that everything which barks is considered a dog. Your tests seem to suggest that it would then be meaningless to declare that the domain of “barks” is “dog”, since we can assume that every dog barks (at least that is how I understand your examples?). From a reasoning point of view that declaration makes sense: I do not need to declare that something is a dog if it barks, because the reasoner will derive that. I only write down that example because I assume that your whole approach presupposes a specific use of ontologies, and I think you really need to share these assumptions with us. Currently it reads as if you were simply changing the semantics of RDFS, and I guess that was not your intention.
page 12, p3: please elaborate why exactly the range should only be asserted. Does your pattern also include cases in which C is directly stated as the domain of P or do you exclude these cases? C is a subclass of itself but I don’t know whether the reasoner will produce a triple for that.
page 13, comment about being mutually exclusive: I would say that every instance which fulfils pattern 3 also fulfils pattern 2. So what exactly do you mean by your comment?
page 14: “Note that the subsequent use … similar ...“ -> similar to what?
page 14: please clarify RHS and LHS with some example.
page 14: please spend some time to describe the test set-up, to me it is not clear what you tested.
page 15/16, example: I don’t understand the fatal accident example. Why is it a problem if you include cars with no accident? It again looks like you want to change the meaning of existing constructs. This can be solved if you clearly describe the intended use of ontologies here.
page 16, n.0.1: please do not call it n since that makes me expect a natural number.
page 20, formula: for the value “no judgement” wouldn’t it make sense to simply not use it in your formula at all? More concrete: why don’t you reduce k by 1 for every instance of “no judgement”? Would that not help with the “comprehension bottleneck” you mention on page 21?
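To illustrate the suggestion with made-up numbers (the vote values and my reading of the current formula are my own assumptions, not taken from the paper):

```python
# Toy meaningfulness votes of k = 4 judges for one concept expression;
# None stands for "no judgement" (all values invented for illustration).
votes = [1.0, 0.5, None, 1.0]

# As I read the current formula: k counts all judges, so a
# "no judgement" drags the average down as if it were a 0 vote.
k = len(votes)
mean_all = sum(v if v is not None else 0.0 for v in votes) / k

# My suggestion: reduce k by 1 for every "no judgement", i.e. average
# only over the judges who actually expressed an opinion.
expressed = [v for v in votes if v is not None]
mean_expressed = sum(expressed) / len(expressed)

print(mean_all)        # 0.625
print(mean_expressed)  # 0.8333...
```

The second variant would keep judges who simply did not understand an expression from silently lowering its score, which is why I think it addresses the comprehension bottleneck.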

Review #2
Anonymous submitted on 20/Aug/2020
Review Comment:

This submission introduces the notion of categorization power of an ontology, discusses how it can be computed, and performs an empirical evaluation that involves both automated computation in ontology repositories and cognitive experiments with humans.

To compute the focused categorization power (FCP) in an ontology O for a concept FC, one, roughly speaking, counts the number of “interesting” subconcepts of FC that one can build. The counting is weighted depending on how “interesting” individual subconcepts are. The authors provide some hints on what “interesting” could mean, e.g., the subconcepts should not be equivalent to FC.
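As I understand it, the definition boils down to a weighted count, roughly as in the following sketch (the categories and the weight values are entirely invented; the paper deliberately leaves the weights open):

```python
# Invented subcategories of a focus class FC = Person, with invented
# weights in [0, 1] expressing how "interesting" each one is.
weights = {
    "Teacher": 1.0,                            # named subclass
    "Person and (teaches some Course)": 0.8,   # compound expression
    "Person": 0.0,                             # equivalent to FC: weight 0
}

# FCP(O, FC): the weighted count over all generated categories.
fcp = sum(weights.values())
print(fcp)  # 1.8
```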

Intuitively, FCP can be used to measure how much knowledge about the concept FC is contained in the ontology O. This can be used to select, from a collection of ontologies, the ontology that is most suitable for some context (specified using FC).

The submission touches on an interesting problem, but unfortunately I believe the paper and results are not of sufficient quality and depth to be accepted at a journal.

Let me just point out some of the problems.

1. The paper would be inaccessible to the general audience. This is supposed to be a journal publication, but I don’t think that an ordinary PhD student or a young PostDoc would be able to learn much from this paper. After reading the introduction, as a more senior researcher, I could only get a vague impression of the motivation and, especially, the results and insights of the paper. In fact, the authors don’t make a serious attempt to provide an overview of the scientific contributions of the paper. Some bits are presented in the comparison with the previous conference paper, some bits are presented when discussing the structure of the paper. Please provide a clear, substantial, and complete discussion of the contributions of the paper.

2. The presentation of technical details is too vague. The paper deals with a technical problem that is related to the automated generation of concept expressions, but the authors do not provide sufficient background details to make the discussion precise. In the end, the proposal on how to compute FCP is made informally by presenting some design suggestions in Sections 2.3–2.5. But this is a key part: without something concrete regarding weights, I don’t see much value in the proposal, because at the current abstract level it is simply trivial. I think that the authors should at least come up with a concrete proposal on weight computation, i.e. a concrete instantiation of what is now just an idea/framework. I find the examples of the paper not helpful because they are also very vague. Perhaps one could make them more precise by taking a concrete pair of ontologies and then comparing them according to FCP in some concrete setup.

3. Definitions 1-3 introduce some machinery for constructing and manipulating DL concept expressions. As a person familiar with DL literature, I simply cannot understand why these definitions deviate so much from the standard notions in DL literature (why “restrictions”, why “place holder variables”, why "concept expression types”, why “substitutions”?). DL literature offers well-established notions, notation, and nomenclature; I think one can and should employ them directly in this paper.

4. As mentioned, the paper deals with the task of generating DL concept expressions. There is a vast literature on this in the area of DLs. This is often called “non-standard reasoning tasks”, among which the tasks of computing "most specific concepts" or "least common subsumers” are probably most well known. Another task is "learning concept expressions”, which also has received significant attention. The challenges that one is facing there are similar to the ones of this paper (e.g., the infinite search space in general). A notion that is related to L-categories is that of “downward refinement operators” (e.g., in works of Lehmann & Hitzler). As a motivation, the authors write “A large part of the use cases of ontologies on the web consists in assigning data objects to certain categories (…). Furthermore, prior to the assignment, the objects are already known to be instances of some (more general) class, to which we will refer as the focus class (FC).” The above mentioned task of computing "most specific concepts” is specifically geared towards supporting such an assignment of objects to categories.

It seems that the authors are not aware of these works in the DL literature. I am not saying that specifically the notion of FCP has been considered already, but tasks with similar underlying technical challenges have surely been studied, resulting in what I believe are more sophisticated approaches than described in the submission.

5. The current shape of Section 4.3 is just unacceptable for a journal publication. One needs to make the algorithms more precise.

6. In Section 5.3 the authors write “From the point of view of focused categorization, logical conjunctions are actually not very interesting, since the conjunction can be simply achieved by applying multiple categories on the categorized individual.” To me this is a strong indication that the proposal has fundamental problems. E.g., specifically using conjunctions one will usually create different complex concepts that best describe a given collection of objects. If conjunctions are not interesting, I don’t see how the proposed framework can potentially be interesting. This, e.g., goes against the basic ideas in the area of learning concept expressions from data.

7. In Section 5.3 the authors write “In all, the analysis suggested that the L_SE types play a significant role in the family of all anonymous expressions commonly used in OWL [A], and that the design of an FCP formula restricted to this simple CEL is thus meaningful [B].” I don’t understand how the authors can conclude [B] from [A]. Intuitively, the shape of concept expressions for measuring FCP should be closely related to the kind of queries that users can pose. Users will not be interested only in simple atomic queries; they can pose more complex queries, e.g., conjunctive queries or full-fledged SPARQL queries. This suggests that it is imperative to also consider CELs where expressions can be significantly more complex than the expressions commonly found in ontologies. This is related to Point 6 above; I don’t see how one can have a meaningful approach without integrating conjunction.

Review #3
By Luigi Asprino submitted on 26/Aug/2020
Major Revision
Review Comment:

The work presented in the paper aims at extending an existing framework (introduced at EKAW 2016) for analysing the (sub)categorization power of ontologies with respect to a (focus) class. The authors target the scenario where an ontology engineer or a LOD practitioner wants to reuse an existing ontology for developing a new ontology or publishing a new dataset, and preference may be given to the ontology providing an extensive subcategorization for the classes deemed important in the resource to be released. Therefore, the ultimate goal of the framework is to provide a tool for helping Semantic Web practitioners in the choice of the ontology to reuse.

Structure and overview of the content of the paper
The authors intuitively introduce the targeted problem in Section 1 which also provides a brief overview of the paper and a motivating example.
The general framework is introduced in Section 2. The framework mainly relies on:
1) a concept expression language (CEL), which imposes the rules that tell how to form the (sub)categories of the focus class (subcategories may be either named classes or expressions);
2) a weight function that aims at estimating the capability of the generated categories to subcategorize the focus class more finely.
Section 3 introduces a CEL (called, simple existential) which acts as a running example for the paper.
Section 4 introduces the focused category patterns for the simple existential CEL. A focused category pattern is a graph pattern which (as far as I understand) is used to identify, in an input ontology, potential subcategories complying with the CEL.
Section 5 presents an empirical analysis of the most common concept expressions used in the ontologies available on the web.
Section 6 presents a study of the occurrence of the focused category pattern in the ontologies.
Section 7 presents an experiment which is meant to evaluate the meaningfulness of the categories generated using the simple existential CEL.
Section 8 discusses the results of the analyses, Section 9 provides an overview of the related work and Section 10 concludes the paper by outlining the ongoing and future work.

General comment
Providing a framework for supporting the choice of which ontology to reuse is clearly valuable, and the direction of measuring the categorization power with respect to a target class is worth further investigation.
This is a very interesting work and, in general, the text is well-written and easy to read.
The work is sound and also well motivated, contextualized and, of course, its contribution is within the topics of the journal.
Most of the relevant work is cited and clearly positioned with respect to the authors' contribution; however, a reader might benefit from a reference to a recent empirical analysis [1] of the overall modelling style, which complements the analysis presented in Section 5, and to a set of guidelines [2] for implementing ontology reuse in the Linked Open Data context.

Although the paper presents an extension of an existing work, the extension, which mainly regards the formalization of the framework, substantially motivates the paper. However, I would have liked to see a different running example in the extended version (the simple existential CEL has been already presented at EKAW 2016).

My major criticism comes from: 1) vagueness of the guidelines for computing weight function; 2) role of the FC patterns; 3) Criteria for estimating the quality of the categories.
Vagueness of the guidelines for computing the weight function. The focused categorization power of an ontology is measured by summing the weights of all the generated categories with respect to the FC. The weight is a function expressing the quality of a generated category. The authors provide a set of informal constraints that this function has to respect, without letting the reader understand how to practically compute its value. Even in the running example the function is never completely calculated. In order to make this framework a practical tool for ontology and LOD engineers, the authors have to provide clear guidelines for computing the weight of the generated categories.
Role of the FC patterns. As far as I understand, the FC patterns are needed in order to select from the input ontology the concept expressions that subcategorize the focus class. If this interpretation is correct, I invite the authors to put this in more explicit words. The FC patterns have to conform to the chosen CEL, but I couldn’t understand how they are obtained from the CEL. Consider for example the Lnam CEL: why is the pattern (C rdfs:subClassOf FC) instead of (C rdf:type owl:Class), which seems to me more appropriate with respect to the CEL? By the way, I consider the notion of FC pattern part of the framework, and in my opinion it should be presented in Section 2.
Criteria for estimating the quality of the generated categories. Regarding the criteria specifically, I would ask the authors to provide better motivation for them. In particular, regarding the size of the potential category, I couldn’t understand how this can practically be calculated if the dataset (for which the engineer is looking for an ontology to reuse) is not aligned with the candidate-for-reuse ontology (it seemed a contradiction to me). Moreover, I couldn’t understand why the authors consider the relative size of the generated categories important, and I would like to see a better motivation for that. Finally, the criteria seem to consider the generated categories only individually, but I consider the hierarchy that they form to be an indicator of their quality as well. This seems never to be assessed by the framework.
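To make the first criticism concrete: even a simple, explicitly provisional instantiation of the weight function would help. The following sketch (entirely my own guess, built only on the relative-size criterion) shows what such a guideline could look like:

```python
def weight(category_size, fc_size):
    """A guessed weight function meeting the informal constraints as I
    understand them: an empty category, or one covering the whole focus
    class, discriminates nothing and gets weight 0; categories that
    split FC more evenly get a higher weight. (My illustration, not the
    authors' proposal.)"""
    if fc_size == 0 or category_size <= 0 or category_size >= fc_size:
        return 0.0
    share = category_size / fc_size
    # 4 * share * (1 - share) peaks at 1.0 when the category covers
    # exactly half of the focus class.
    return 4.0 * share * (1.0 - share)

print(weight(50, 100))   # 1.0  (balanced split of FC)
print(weight(1, 100))    # close to 0 (nearly empty category)
print(weight(100, 100))  # 0.0  (coextensive with FC)
```

Something at this level of concreteness, with the estimation of the sizes spelled out, is what I would expect as a guideline.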

Other minor comments:
A reference is needed to substantiate the claim “A large part of the use cases of ontologies on the web consists in assigning data objects to certain categories (with some consequences following from this assignment).”
In the sentence “Intuitively, in a well-designed ontology …” at page 5 the term “well-designed” seems to contradict the example below.
There are some parts of the paper that I think that can be rephrased for improving the readability:
The point 2) in Section 2.4
The sentence “To avoid any mismatch of the presented ‘weight sources’ list with the ‘weight sources’ list from Section 2.4, note that the sources from Section 2.4 are applied ‘deductively’, to estimate the weight of a particular category, while the sources in this section serve for ‘inductive’ derivation of the (mean) weight pertaining to a whole CE type.”
A reference is needed to substantiate the sentence “It has been observed that ontologies are often huge either in terms of classes or in terms of properties but rarely in terms of both.”

[1] L. Asprino, W. Beek, P. Ciancarini, F. van Harmelen and V. Presutti. Observing LOD using Equivalent Set Graphs: it is mostly flat and sparsely linked. In ISWC 2019
[2] V. Presutti, G. Lodi, A. G. Nuzzolese, A. Gangemi, S. Peroni and L. Asprino. The role of Ontology Design Patterns in Linked Data projects. In ER 2016