Data Sharing in Agricultural Supply Chains: Using semantics to enable sustainable food systems

Tracking #: 2987-4201

Christopher Brewster
Nikos Kalatzis
Barry Nouwt
Han Kruiger
Jack Verhoosel

Responsible editor: 
Guest Editors Global Food System 2021

Submission type: 
Full Paper
The agrifood system faces a great many economic, social and environmental challenges. One of the biggest practical challenges has been to achieve greater data sharing throughout the agrifood system and the supply chain, both to inform other stakeholders about a product and to incentivise greater environmental sustainability. In this paper, a data sharing architecture is described, built on three principles: a) reuse of existing semantic standards; b) integration with legacy systems; and c) a distributed architecture where stakeholders control access to their own data. The system has been developed based on the requirements of commercial users and is designed to allow queries across a federated network of agrifood stakeholders. The Ploutos semantic model is built on an integration of existing ontologies, and the Ploutos architecture is built on a discovery directory and interoperability enablers which use graph query patterns to traverse the network and collect the requisite data to be shared. The system is exemplified in the context of a pilot involving commercial stakeholders in the processed fruit sector. The data sharing approach is highly extensible, with considerable potential for capturing sustainability-related data.

Major Revision

Solicited Reviews:
Review #1
By Michael Brückner submitted on 29/Jan/2022
Major Revision
Review Comment:

Paper: Data Sharing in Agricultural Supply Chains: Using semantics to enable sustainable food systems
The paper presents a practical approach to data sharing involving various data sources. As an example, data sharing relating to parts of the fruit sector is presented.
A. Feedback (Journal review criteria)
Importance: (High). Agriculture and its supply chain mechanisms are of increasing importance worldwide, given the growing population and expanding markets. Digital technologies, and semantic data sharing, may be supportive. At least, they should be considered for more thorough investigations. The paper addresses an important topic in the realm of semantic technologies.
Usefulness: The paper gives enough details for reproducing results in a general way. More information on building the individual PIEs would be necessary, though.
Relevance: The topic is highly relevant.
Stability: -
Impact: EU based agricultural institutions, agencies and companies may benefit from the full implementation of PLOUTOS and affiliated components.
Overall impression (0-100): 55 (language usage and paper organization impair the score, see below)
Recommendation: Major revision
B. Specific remarks
Title: Authors should consider shortening the title, as I was wondering what "sustainable food systems" might be. "Using Semantic Technologies for Data Sharing in Agricultural Supply Chains" should suffice.
Abstract: Results of the pilot test should be given in the abstract.
Introduction: Appropriate
Previous work: Appropriate
Methods: The methods adapted from the IoT SSN as stated in Section 2 could be stated in more detail together with the PCSM.
Section 6.2 gives an explanation of "Knowledge orchestration", which needs some clarification: "For example, given a specification of the knowledge item that is requested a PIE through the PRDD can figure out the appropriate knowledge base to access it." So, the PRDD decides which knowledge base to approach for a given item. How does the PRDD handle the case where the item can be retrieved from more than one of the KBs?
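To make the question concrete, here is a minimal Python sketch of a directory lookup. It is not the Ploutos implementation: the class `DiscoveryDirectory`, its methods, the item type names and the example URLs are all hypothetical stand-ins for the PRDD. It only shows that a lookup keyed on a knowledge item specification can return several candidate knowledge bases, at which point some selection or merging policy is needed.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryDirectory:
    """Hypothetical stand-in for the PRDD: maps a knowledge item type
    to the knowledge bases that have registered as able to provide it."""

    registry: dict[str, list[str]] = field(default_factory=dict)

    def register(self, item_type: str, kb_url: str) -> None:
        # Several knowledge bases may register for the same item type.
        self.registry.setdefault(item_type, []).append(kb_url)

    def lookup(self, item_type: str) -> list[str]:
        # Return every candidate, in registration order; the caller must
        # decide whether to query one, query all, or merge the answers.
        return list(self.registry.get(item_type, []))


directory = DiscoveryDirectory()
directory.register("PesticideApplication", "https://kb-farm.example/ke")
directory.register("PesticideApplication", "https://kb-coop.example/ke")
directory.register("LotTransport", "https://kb-logistics.example/ke")

candidates = directory.lookup("PesticideApplication")
if len(candidates) > 1:
    # This is the case the question above asks about.
    print(f"{len(candidates)} knowledge bases can answer; a policy is needed")
```

Whether the answers should be merged, ranked, or taken from the first responder is exactly what the manuscript should state.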
Results and discussion: All data providers in the supply chain need to be equipped with a PIE. How much effort does this mean overall?
The list at the top of p. 24 is repetitive (see Section 6.1) and could be removed.
Section 7.2 indicates in the first paragraph some ideas of future work regarding scalability, which is only mentioned as a keyword in Section 8. Authors may consider differentiating "scalability" a bit more in Section 8.
Originality: Low.
Article organization: Section 3 uses "Use cases" as the heading, but there are none. It starts with an outline of the approach to testing. The authors should consider placing the competency questions (start at Line 22, p. 5) at the beginning to resemble use scenarios.
Figures and tables: Figures should be placed after their description, e.g., Fig. 1 (p. 7 top) and description on p. 8. The same holds for Table 2.
Fig. 1 is hard to read, especially where arrows interfere with annotated text.
Language usage: The text is generally easy to follow. Unfortunately, there are several incorrect formulations, which must be revised. Some examples are the verb forms on Lines 41-42 (p. 1), the use of "e.g." without commas on Lines 13 and 15 (p. 2), Line 18 (p. 2): "There are, of course, many other purposes for data sharing in the agrifood sector, both currently identified, …", Line 51 (p. 2), Line 26 (p. 4), and Lines 45-46 (p. 4): missing preposition, and the area specification should be "10 ha". Lines 16-17 (p. 5) do not make sense to me. Lines 20-21 (p. 3) contain a fragment. "lot-id" (Line 28, p. 19) and "LOT-ID" (Line 21, p. 21) are the same, I assume.
A thorough revision by a native speaker of English is strongly recommended.
Please also define acronyms and abbreviations at the first use. Some examples: CAP (Line 10, p. 2), GSI (Line 21, p. 2).

Review #2
Anonymous submitted on 30/Jan/2022
Minor Revision
Review Comment:

This paper proposes an ontology and a framework for data sharing and querying across agrifood supply chains. This is an important and challenging problem, since food supply chains involve various actors with varying needs and heterogeneous software infrastructure. An ontology can significantly improve supply chain traceability through data harmonization and integration.
This is an interesting paper and the authors have demonstrated a deep understanding of the domain. The requirements for the ontology and its related data sharing technology are defined realistically and accurately. The literature study is reasonable and comprehensive.
The authors are well aware of the existing ontologies and vocabularies in the agri-food industry (and other relevant ontologies in the field) and have reused them in the proposed ontology appropriately. The proposed software architecture is also sound and is developed based on a pragmatic approach.

Comments (mainly pertaining to the ontology):
• It is recommended to avoid using the term ‘concept’ to refer to ontology classes. Concept is a vague term and causes confusion. It makes sense to use this term in the development of SKOS thesauri, but in an OWL ontology, use the term ‘class’ instead to refer to ontological entities (universals or types).
• It would be useful if you can provide ‘natural language’ definitions for the core classes (such as parcel, observation, location, feature, etc.) in the ontology. These definitions might come from an imported or reused ontology (which is perfectly fine) but it helps the reader better understand the semantics of those notions in the context of this work. It also avoids misinterpretation.
• The value of using the part-observation-property pattern is not clear. In particular, it is not obvious what benefit we will gain from categorizing various classes under the part ‘pillar’. You only need the ‘hasPart’ property to specify the whole-part relationships. An OWL ontology has only one taxonomy (class structure), which is built on the ‘is-A’ (class/sub-class) relationship. Creating another taxonomy for parthood is unnecessary and technically not correct.
• ‘Feature’ is a polysemous term and should be avoided to the extent possible. The fact that both Building and Parcel are sub-classes of Feature demonstrates the vague nature of this term.
• It is not clear why classes such as ProductOperation and ParcelOperation are needed. They are clearly defined classes (ProductOperation is an operation that has product as a participant). Are these two classes disjoint? Is it possible that an instance of Operation (such as a cleaning operation) can be classified under both ProductOperation and ParcelOperation? Do you allow multiple inheritance? Please provide justification for why you need these classes.
• For the Operation class, you have used a restriction (responsibleAgent Max 1). What if an operation has more than one responsible agent?
• The authors are encouraged to distinguish between asserted (primitive) classes and defined classes. Classes such as Soil or Building are considered to be primitives as they can exist on their own, but classes such as ParcelOperation are defined classes since they are built based on other classes and properties (using the equivalence axiom).
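
The questions in the comments above about ProductOperation, ParcelOperation and the responsibleAgent restriction can be illustrated with a small Python sketch. It does not use the Ploutos OWL file or any reasoner: the `Operation` data class, the participant labels and the helper functions are hypothetical, and the cardinality check is a closed-world simplification (an OWL reasoner under the open-world assumption would instead infer that two agents are the same individual unless they are asserted to be different).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """Hypothetical instance of the primitive (asserted) class Operation."""

    name: str
    participants: frozenset[str]          # kinds of participating entities
    responsible_agents: tuple[str, ...] = ()


def defined_classes(op: Operation) -> set[str]:
    """Mimic classification under defined classes: membership follows
    from the participants, not from an explicit assertion."""
    classes = {"Operation"}
    if "Product" in op.participants:
        classes.add("ProductOperation")
    if "Parcel" in op.participants:
        classes.add("ParcelOperation")
    return classes


def exceeds_max_one_agent(op: Operation) -> bool:
    # Closed-world reading of the restriction (responsibleAgent max 1).
    return len(op.responsible_agents) > 1


# A cleaning operation that involves both a product and a parcel,
# carried out jointly by two agents.
cleaning = Operation(
    name="cleaning",
    participants=frozenset({"Product", "Parcel"}),
    responsible_agents=("agent-a", "agent-b"),
)

print(sorted(defined_classes(cleaning)))
print(exceeds_max_one_agent(cleaning))
```

In this sketch the cleaning operation falls under both defined classes, so declaring them disjoint would make the instance inconsistent, and the two responsible agents conflict with the max 1 restriction. The manuscript should say which of these outcomes is intended.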

Review #3
By Emma Griffiths submitted on 06/Jul/2022
Major Revision
Review Comment:

The manuscript entitled “Data Sharing in Agricultural Supply Chains: Using semantics to enable sustainable food systems” describes work performed as part of the Ploutos Project to create and implement a semantic framework to enable greater interoperability between agricultural and other food system platforms and databases. The Project has resulted in the development of a new application ontology that provides new ontological classes/terms and integrates a number of existing agriculture/environment/sensor domain ontologies based on RDF and OWL, as well as tools based on the ontology framework to enable translation between service/platform-specific languages so that users can query information across a spectrum of (until now) disparate data collection and integration platforms. This work was performed to enable traceability of food products and information about their production, and has potential in a wide number of areas such as sustainable development.

I think that the work described in this manuscript is of interest as it brings to the forefront the real challenges of integrating different types of data that are often ignored and assumed to be a minor problem/fact of life in industry, and also helps to showcase the power of ontologies. The authors state in the paper that until now food/Ag related ontologies have largely been used to annotate research data (which is not strictly true as ontologies like FoodOn are being used to standardize and enhance pathogen surveillance data in real world public health and food safety surveillance networks and investigations but that is a minor point) and have not been used to integrate data in more powerful ways. This is very true. There are reasons for why this has largely been the case, but the fact remains that in the semantics world, particularly in the food semantics world, we lack good examples of the types of transformative things you can do with ontologies that can entice uptake in the community where it is sorely needed (they just don’t know it yet because we lack good “before” and “after” implementation stories).

Significance of the results
This paper could be of great interest to a range of audiences, including the agricultural, semantics, and the software development communities. Because of that, the story should be told in a direct way that clearly describes and separates what was done from what is needed/what is the vision. A lot of time and writing real estate is currently spent setting up the vision and what is needed, and that confounds the expectations of the reader. Also, starting by laying out the semantic framework bogs the reader down in heavy details so that the implementation near the end is likely to be skimmed over, which is a great disservice to your work. It’s like telling me how a magic trick is going to be performed before you do it, so that when you actually show me the trick, I’ve lost interest because the magic is gone.

Quality of Writing
The authors are clearly skilled in writing, but I feel that the story could be better structured to engage the reader. To this end, I have a number of suggestions to improve the flow and readability.
1. The “magic” of Ploutos is in its ability to transform and integrate data across multiple disparate platforms, and to enable users to query it. Start with that idea. Start with what you’ve actually built (a framework for translating data between different existing databases and platforms) which has been applied in a proof-of-concept study but that will be expanded to achieve a wider vision in the future. Then describe the needs assessment that you did (where you have the other use cases), then describe the platform architecture in more detail, then the semantic framework to achieve this transformational capacity. The other stuff about the principles, access controls and the data governance (users have control over their data), is that actually implemented yet? It was not clear to me from the peaches example that that is in place. So that should probably go into future work in the discussion where you can expand on what is needed. If I’m wrong, you should work that into the implementation description in the beginning e.g. what information can different users differentially access regarding the peaches?
2. You give a list of questions that highlight user needs. It would be helpful to show and explain why those questions can’t be directly answered right now (which is why you needed to build the ontological framework). Provide a diagram based on the peaches example that highlights the points where the data gets stuck. Figures 13 and 14 almost serve this purpose. Can you walk us through one of the questions and show us how Ploutos (via the transformation enablers and the knowledge mappers) enables users to get answers where previously it was difficult or it couldn’t be done? You do that in a complicated way in the text and provide conceptual diagrams in Fig 13 and 14. Could you put the query and the answers (in simple terms) on Figs 13 and 14? I think it’s just a matter of more clearly spelling out for the reader where the blockages were before, and where Ploutos provides bridges to get the user from the question to the answer using a specific example.
3. You did not create Alterra and GaiaSense, correct? It would help if you briefly described what these things are and what they visualize/analyze, and clearly stated that they are currently separate platforms/databases, so a user can’t seamlessly connect their data/services. The GaiaSense dashboard in Fig 11 is generated by that platform, right, not Ploutos? But Ploutos offers GaiaSense dashboards?
4. The manuscript is quite long. Can sections be cut down/removed e.g. Table 4? It seems highly general and not directly pertinent to what was actually built.

Regarding the Ploutos ontological framework:
1. It’s not clear to me how big the Ploutos ontological framework is. Can you provide a number of the classes and relationships?
2. You provide process diagrams for different data integration/transformation examples, but could you provide a list or diagram spelling out what types of vocabulary are currently available (distinct from what could be included to address the other use cases)? For example, you mention things like agricultural/supply chain certifications, is there vocabulary in Ploutos right now for structuring that information or is that on the to-do list?
3. You mention that EnvO richly describes their terminology (as that is part of the OBO Foundry best practices - to provide definitions and unique IDs for all the classes, subclasses, and instances). Can you describe if you provide definitions for all the “concepts” contained in your OWL file and how people can go about requesting the integration of new services into the Ploutos framework?
4. Do you have plans to integrate parts of any other OBO Foundry ontologies that might align with Ploutos use cases e.g. you mention FoodOn, AgrO, and there could be others like Uberon (for anatomical parts of animals) etc.? Can you mention this in the text?
5. OBO Foundry ontologies share a common upper level ontology (everything is basically classified as a thing, a process or a quality/property). Does this not cause logical errors with your upper level ontology (Part, Observation, Property)?

Accompanying Data File
The ontological framework does not appear to have a README or Wiki which makes it difficult for a user to navigate and assess. There is a traceability demo that has a very basic README that could benefit from more details. A docker image of the software is provided. Little else is provided regarding actual code. Because of this, I can’t say if the provided resources are complete or could be used to replicate the peaches implementation. I don’t see information about the API for example, or the Registry. GitLab is sufficient for long-term discoverability.

Taken together, I think this is an interesting project deserving of discussion within the community, and of publication, provided that what has been completed versus the potential of the Ploutos framework is more clearly articulated and more supporting documentation is provided in GitLab.