# Model Outlines: a Visual Language for DL Concept Descriptions

**Submission to: RR2010 special issue, editors Diego Calvanese and Thomas Lukasiewicz.**

**Revised manuscript, now accepted for publication. Previous version was "accept pending minor revisions", reviews below.**

**Solicited review by Domenico Lembo:**

All my comments on the first version of the paper have been taken into account by the author, and proper clarifications have been introduced in the article where needed. In particular, one of my main concerns was due to my incorrect understanding of the use of labels for clusters in model outlines. This aspect has been clarified in the author's rebuttals, and some further explanations have also been added to the paper. I also particularly appreciate the presence of an appendix presenting proofs of correctness of the algorithms. I believe that the paper can now be accepted for publication, and I do not have further comments on it.

**First round reviews:**

**Solicited review by Natalya Keberle:**

The paper summarizes the research on and implementation of a visual language for DL concept descriptions.

This language is called "model outline"; it allows concepts from $\mathcal{ALCN}$ to be presented.

It is the paper previously published at RR-2010, completed with the results of the paper from DL-2008, and extended with the discussion of applicability of the approach of model outlines.

Two main algorithms are presented, for translation between an $\mathcal{ALCN}$ concept and a model outline. However, the paper contains no proofs of completeness and soundness (even if they seem simple) for either algorithm.

Compared to the previous work (DL-2008 and RR-2010), the algorithms are presented completely, with all the details.

The paper does not introduce novel theoretical results or extensions to more expressive logics, e.g. with nominals, inverses, or qualified number restrictions. It does not report on a more rigorous evaluation of the applicability of model outlines to practical tasks (the results are already published in RR-2010). It does not report on a novel implementation of the model outlines.

The author sketches the applicability of the algorithms and of the proposed visual notation: as a basis for a visual learning and testing environment for teaching Foundations of Description Logics, and as a visual notation to formulate class expressions (or queries, following the terminology of the paper).

The discussion of applications of model outlines mentions: understanding concept definitions by visualizing them, visual concept construction by creating and editing a model outline, and knowledge base model exploration.

My suggestions are:

1) to add the last paper by the author on the topic (RR-2010) to the list of references;

2) to clearly articulate what the contribution of the submitted paper is;

3) to avoid the term "query" unless there is a clear definition of what type of query the author uses: conjunctive, instance checking, or simply the concept descriptions from the DL Query tab of Protege 4;

4) to provide proofs of correctness of the translation algorithms.

**Solicited review by Gergely Lukácsy:**

The paper introduces "Model Outlines", a visual way of representing Description Logic concept descriptions.

I find the motivation and the topic of the paper convincing, especially the proof explanation part; this is very important both to the knowledge engineers verifying or debugging their own ontologies and to end users trying to understand what is going on.

I have reviewed an earlier version of this paper and I am still happy to report that the paper is well written and easy to read.

It is a good example of how to explain something first at a high level without discussing all the details (see Section 3 for example). The figures in the paper illustrate the main concepts and aid understanding.

I still find the model outlines themselves convincing and easy to follow. Specifically, I liked the way forall is handled and how the sets of instances are visualised.

Earlier I raised a concern over the fact that the proposed visual language is only applicable to ALCN and that the paper contains no discussion of how to extend it further. Unfortunately this is still the case with the present version of the paper. As somebody who has implemented reasoning algorithms, I personally know that changing a little bit in what we allow in the DL language might have a serious impact on the algorithms themselves. Similarly, I can imagine that adding certain constructs, such as inverses or property chains, will simply not fit into the current intuitive way model outlines are constructed. The paper mentions several times that this is ongoing work, but I would really like to see at least a very high-level discussion of how to go forward. As a matter of fact, some of these things seem easy to fit in, like nominals.

Specific comments:

- Page 1, Section 1: description logic should be in capitals (Description Logic)

- Page 1, Section 1: "... building a query may include writing modified concept descriptions that contains free variables..."

We are talking about instance retrieval and (I assume) conjunctive query answering. I would suggest to name these things with a corresponding reference accordingly.

- Page 1, Section 1, Figure 1: I really think that this example should be explained here and not half page later.

- Page 2, Section 1: You talk about "DL symbols", but "exists", "forall", etc. are not "DL symbols", but DL operators (or simply operators).

- Page 2, Section 1: first occurrence of "OWL" without any reference, etc. And it is not very precise anyway; I guess we want to say OWL DL here.

- Page 2, Section 1: "... consists of diagrams characterizing the class of models of a given..." - the term "model" appears here for the first time in the paper in the mathematical sense and it is not explained. At least we need a reference.

- Page 2, Section 1: ".., after applying a carefully defined set of simplification rules ..." - I think we need a "cf. Section XX" here.

- Page 5, Section 3: "... all individuals attending the graduate courses in question must belong to class "Enrolled"" - it is easy to miss "ALL Enrolled" in Figure 4. I am not sure if we have a way of making the reader's life easier.

- Page 12, Section 5: Following [...], we defined our main goal as: ==> Following [...], our main goal is to show that Model outlines....

- Page 12, Section 5: The fact that the test participants were given a Portuguese translation of the Manchester OWL keywords raised an alarm for me: translations of English keywords might not necessarily make as much sense as their original counterparts, because of the way the given language expresses things. "R SOME C" might be easy to read for a native English speaker, but "R something C" might not make any intuitive sense in a different language even if "something" is the literal translation of SOME.

- Page 12: What is the difference between a Logician and a Mathematician?

- Page 13: "Question 14 elicited 4 errors..." - is there a link pointing to the detailed case study? Can I see the exact questions asked, etc?

- Page 13: "... and 1 said both were equally bad" :)

**Solicited review by anonymous reviewer:**

The paper presents a visual language (called model outline) for representing complex concepts specified in the Description Logic ALCN. Both graphical and abstract (LISP-based) syntax of the language are presented. Various algorithms are provided for translation from the visual representation (in its abstract syntax) to ALCN syntax, and from the ALCN syntax to the (abstract) visual syntax. Some usability tests are presented which compare model outlines with Manchester syntax, and show usefulness of the visual language presented in this paper.

The paper has various points in favor:

- its contributions are connected to the ontology visualization problem, which is an important and hot topic;

- it describes algorithms for translation from ALCN syntax to model outlines syntax and vice-versa, thus it provides complete means for such translations;

- it presents a promising usability evaluation;

- it discusses possible applications for model outlines.

There are however also some weaknesses:

- the impression is that the major gain in using model outlines is with very complex concepts. How frequent these are in practice is not clear;

- part of the concept expressions is represented in the visual model through box and cluster labels, i.e., it remains in textual form. I could not completely understand what the impact of this choice is (see also the specific comments below), but it seems to me that this weakens the approach;

- the treatment is basically clear, but various technical aspects remain difficult to grasp: in particular, the formal syntax for ALCN model outlines (Fig. 6) is only briefly commented on (an example, even a simple one, showing both the graphical representation and the LISP-based syntax might help the reader); algorithm 1 is not commented on at all, which makes it difficult to follow;

I recommend therefore that the author makes an effort to consider the above points and comments on them in the paper. He should also fix the following specific problems for the paper to be accepted for publication.

After equation (1) in the introduction, please add some words to explain the meaning of the concept described by the equation.

page 2, beginning of the second column: make clear early on that dashed lines denote optional aspects.

I do not understand the use of the name "cluster", which does not recall the graphical symbol used to represent it.

Figure 4 (and in general regarding the syntax). The syntax of the labels seems quite free to me: is there a precise syntax for these (e.g., a precise way to specify cardinalities)? Otherwise the designer can put here whatever she wants. For example, what does ALL Enrolled mean?

Figure 4: Not clear to me (even after reading the entire article), why there is a dashed cluster in the box connected to the arrow "supervises".

Make it clear as soon as possible in Section 3 that there cannot be two (or more) arrows with the same role name starting from the same cluster.

In the description of the second (more general) transformation, D can be a complex concept. Thus D can become a label of a target box. I believe that this is wrong, otherwise the power of the graphical representation is strongly limited. This should be commented on and fixed.

In section 4, before commenting on Algorithm 2, it might be useful to make clear that situations like the following do not require transformation, but generate various cases:

$\forall R.[(C_1 \sqcap D_1) \sqcup (C_2 \sqcap D_2)] \sqcap \exists R.C_1 \sqcap \exists R.C_2$
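
One way to see why the concept above splits into cases is to evaluate it on small interpretations. The following is a minimal sketch under the standard semantics of the constructors involved; the tuple encoding and the helper `holds` are assumptions made for illustration and are not taken from the paper or its algorithms.

```python
# Hypothetical tuple encoding of concepts (not the paper's abstract syntax):
#   ("atom", name), ("and", c, ...), ("or", c, ...),
#   ("some", role, c), ("all", role, c)

def holds(concept, x, atoms, roles):
    """Return True if individual x is in the extension of concept.

    atoms maps concept names to sets of individuals;
    roles maps role names to sets of (source, target) pairs.
    """
    tag = concept[0]
    if tag == "atom":
        return x in atoms.get(concept[1], set())
    if tag == "and":
        return all(holds(c, x, atoms, roles) for c in concept[1:])
    if tag == "or":
        return any(holds(c, x, atoms, roles) for c in concept[1:])
    if tag in ("some", "all"):
        successors = {b for (a, b) in roles.get(concept[1], set()) if a == x}
        check = any if tag == "some" else all
        return check(holds(concept[2], y, atoms, roles) for y in successors)
    raise ValueError(f"unknown constructor: {tag}")

C1, C2, D1, D2 = (("atom", n) for n in ("C1", "C2", "D1", "D2"))
body = ("or", ("and", C1, D1), ("and", C2, D2))
concept = ("and", ("all", "R", body), ("some", "R", C1), ("some", "R", C2))

# Case 1: two R-successors, one per disjunct.
two_successors = holds(
    concept, "x",
    {"C1": {"y1"}, "D1": {"y1"}, "C2": {"y2"}, "D2": {"y2"}},
    {"R": {("x", "y1"), ("x", "y2")}},
)

# Case 2: a single R-successor in C1, C2 and D2 (but not in D1).
one_successor = holds(
    concept, "x",
    {"C1": {"y"}, "C2": {"y"}, "D2": {"y"}},
    {"R": {("x", "y")}},
)

# Counterexample: y1 is in C1 only, so it satisfies neither disjunct.
violating = holds(
    concept, "x",
    {"C1": {"y1"}, "C2": {"y2"}, "D2": {"y2"}},
    {"R": {("x", "y1"), ("x", "y2")}},
)

print(two_successors, one_successor, violating)
```

The first two interpretations satisfy the concept in structurally different ways, while the third does not satisfy it; this is the kind of case distinction the comment refers to.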

--

Typos

page 8, column 2, row 1: (2b) -> (2c)

page 8, column 2, row -14: C_n -> C_{n+1}

page 9, column 1: points 3 and 4 might be better merged into a single item.

page 9, column 1, row -11: "with all of the L_i literals, and all of the D_i nonliterals" -> "where each L_i is a literal and each D_i is a nonliteral"

page 9, column 2, row 12: "each each" -> "each"

Algorithm 4: BUILDCLUSTERCASES is invoked with only one parameter, but it requires two parameters.

page 12, column 1, last paragraph: it seems that the experiment carried out with notation A has used a different domain than the experiment done with notation B.

page 14, column 2, row 20: "belongs in" -> "belongs to"