A metadata schema for documenting material samples from multiple domains

Tracking #: 3785-4999

This paper is currently under review
Authors: 
Steve Richard
Dave Vieglais
Andrea Thomer
Sarah Hyunju Song
Neil Davies
John Deck
Quan Gan
Eric C. Kansa
Sarah Kansa
John Kunze
Kerstin Lehnert
Danny Mandel
Chris Meyer
Rebecca Snyder
Ramona Walls

Responsible editor: 
Cogan Shimizu

Submission type: 
Application Report
Abstract: 
This paper documents a metadata schema, implementation, and associated vocabularies developed for the Internet of Samples (iSamples) project to integrate geoscience, archaeology/anthropology, biology, and genomics sample descriptions in a single cross-domain catalog. To develop a sample description scheme that supports sample discovery across these disparate domains, we reviewed the metadata schema and example metadata from each project partner, as well as other existing schemes. Top-level classes in the schema include MaterialSampleRecord, Curation, SamplingEvent, SamplingSite, and Agent. By factoring sample type classification into material type, material sample object type, and sampled feature type, it has been possible to classify the approximately 6,000,000 samples in the combined corpus. Category vocabularies for these classifications were developed based on unique value summaries from related fields in the source sample metadata, and were tested using a card sorting exercise and by developing code for automated mapping from source metadata. Each vocabulary has on the order of 20 categories with some hierarchy; the category concepts are intended to provide complete coverage, but they may overlap. These vocabularies are implemented in SKOS and published with the ARDC Research Vocabularies Australia (RVA) vocabulary service. The metadata schema is defined using a LinkML YAML file and implemented as a JSON schema used to validate instance documents. To support interoperability, mappings from the iSamples metadata schema to several other schemes are provided in the project GitHub repository.
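As a rough illustration of the structure described in the abstract, the following Python sketch models the five top-level classes and the three classification facets as plain dataclasses. Only the class names come from the abstract; every field name, the helper `unclassified_facets`, and the example category values are hypothetical placeholders, not the actual iSamples schema properties or vocabulary terms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    name: str  # hypothetical field


@dataclass
class SamplingSite:
    label: str  # hypothetical field


@dataclass
class SamplingEvent:
    site: Optional[SamplingSite] = None                      # hypothetical field
    collectors: List[Agent] = field(default_factory=list)    # hypothetical field


@dataclass
class Curation:
    curators: List[Agent] = field(default_factory=list)      # hypothetical field


@dataclass
class MaterialSampleRecord:
    identifier: str  # hypothetical field
    # The three classification facets named in the abstract; each holds
    # zero or more category labels because categories may overlap.
    material_type: List[str] = field(default_factory=list)
    material_sample_object_type: List[str] = field(default_factory=list)
    sampled_feature_type: List[str] = field(default_factory=list)
    sampling_event: Optional[SamplingEvent] = None           # hypothetical field
    curation: Optional[Curation] = None                      # hypothetical field


def unclassified_facets(record: MaterialSampleRecord) -> List[str]:
    """Return the names of classification facets with no category assigned."""
    facets = {
        "material_type": record.material_type,
        "material_sample_object_type": record.material_sample_object_type,
        "sampled_feature_type": record.sampled_feature_type,
    }
    return [name for name, values in facets.items() if not values]


# Placeholder values for illustration only, not real vocabulary terms.
example = MaterialSampleRecord(
    identifier="example:0001",
    material_type=["rock"],
    material_sample_object_type=["core"],
    sampling_event=SamplingEvent(site=SamplingSite(label="Example site")),
)
print(unclassified_facets(example))  # prints ['sampled_feature_type']
```

Keeping the three facets as separate lists mirrors the factoring described above: a record can be classified along each facet independently, and a check like `unclassified_facets` shows which facets a source-metadata mapping has not yet filled.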
Tags: 
Under Review