...

  • We need to identify the context of our research data in our target scientific domain. By context, we mean the related machines, samples, lab protocols, and generally everything about the experiment that is not reflected in the data itself.
    • Example: John Doe performs an experiment in the lab, generates some data, and uploads it to the data repository in CKAN. However, we have no idea how this data was generated. As a result, the data is not understandable.
  • The next step is to semantically describe the contextual data identified in the previous step. Here we model our data and annotate it so that it is also machine-actionable.
    • Why annotate? Humans are not the only data users. As a matter of fact, machines (AI, for instance) are increasingly expected to perform actions on data. Without annotation, a machine's precision and comprehension weaken.
    • How to annotate? Ontologies and vocabularies are rich sources of domain-specific annotations.
    • Where to find these annotations? Terminology Services. TIB already has one: https://terminology.tib.eu/ts
    • What if I cannot find a proper annotation? Contribute. You can develop your own vocabulary for your specific domain. The benefit? People in your domain will use it.
  • RDM has to serve Linked Data. The last step is to link your contextual metadata and your data. This means you need an actual link that connects your SMW graph to the CKAN graph, for example a link from a Sample page in SMW to the related dataset in CKAN.
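The linking step above can be sketched as a single RDF statement. This is only a minimal illustration, not how SMW or CKAN produce their graphs: the two URLs are hypothetical placeholders, `link_triple` is a helper invented for this example, and `dcterms:relation` is just one possible predicate for such a link.

```python
# Dublin Core "relation": a generic predicate for "is related to".
DCTERMS_RELATION = "http://purl.org/dc/terms/relation"


def link_triple(subject_uri: str, predicate_uri: str, object_uri: str) -> str:
    """Return one N-Triples line that links subject_uri to object_uri."""
    for uri in (subject_uri, predicate_uri, object_uri):
        # Very rough sanity check; a real setup would use an RDF library.
        if not uri.startswith(("http://", "https://")) or any(c in uri for c in ' <>"'):
            raise ValueError(f"not a usable URI: {uri!r}")
    return f"<{subject_uri}> <{predicate_uri}> <{object_uri}> ."


# Hypothetical addresses of a Sample page in SMW and its dataset in CKAN.
sample_page = "https://smw.example.org/wiki/Sample_42"
dataset = "https://ckan.example.org/dataset/tensile-test-42"

triple = link_triple(sample_page, DCTERMS_RELATION, dataset)
print(triple)
```

The link here goes from SMW to CKAN. Swapping subject and object gives the opposite direction.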


CKAN:

  • CKAN is a ready-to-use data repository. You can just install it and manage your datasets. However, this is not enough for RDM. The general extra steps to be taken are:
    • We need to link our data to the contextual concepts that we implemented in SMW before, for example by linking a dataset to the machine/device that generated it. Example extension developed by Lab LSK: ckanext-Semantic-Media-Wiki
      • Note: you do not have to make a link from CKAN to SMW and vice versa at the same time. A one-way link is enough. The direction is your choice.
    • Identify other contextual metadata for your dataset in CKAN. We have already linked our data to the context in SMW, but some contextual metadata is not in SMW (and should not be).
      • For example, what is the publication related to this dataset? For instance, Lab LSK developed a plugin for linking publication(s) to your dataset: ckanext-Dataset-Reference
      • You can also define custom metadata for your dataset in CKAN by extending the CKAN schema. Example from Lab LSK (plugin crc1153_dcat_ap): https://github.com/TIBHannover/ckanext-crc1153
        • Scenario: let's say your dataset has a domain-specific metadata field named 'Temperature'. CKAN does not support this and, as a result, you need to extend the CKAN schema.
    • Here we also need to annotate our data. CKAN provides the possibility to export your dataset metadata in RDF format based on DCAT: https://github.com/ckan/ckanext-dcat
    • However, just exporting in DCAT format is not enough!
      • DCAT only describes your dataset in a general context. What about the contexts that you added in the previous steps?
      • Solution: Extend DCAT to add your domain metadata (your custom annotation). An example from Lab LSK (plugin crc1153_dcat_ap): https://github.com/TIBHannover/ckanext-crc1153
      • Where to find these annotations? Terminology Services. TIB already has one: https://terminology.tib.eu/ts
      • What if I cannot find a proper annotation? Contribute. You can develop your own vocabulary for your specific domain. The benefit? People in your domain will use it.
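The 'Temperature' scenario and the idea of adding domain metadata to the RDF export can be sketched as follows. This is not the crc1153_dcat_ap plugin and not the ckanext-dcat API. It uses CKAN's free-form `extras` list (part of the dataset dict accepted by the `package_create` action) as a lightweight stand-in for a real schema extension, and the vocabulary URI, dataset URL, and both helper functions are hypothetical.

```python
import json

DCT_TITLE = "http://purl.org/dc/terms/title"
# Hypothetical vocabulary term; in practice it would come from a terminology
# service or from a vocabulary you publish for your domain.
TEMPERATURE = "https://example.org/vocab#temperature"


def dataset_payload(name: str, title: str, temperature_celsius: float) -> dict:
    """Build a dataset dict with the domain field stored as a CKAN extra."""
    return {
        "name": name,
        "title": title,
        "extras": [{"key": "temperature", "value": str(temperature_celsius)}],
    }


def to_ntriples(dataset_uri: str, payload: dict) -> list:
    """Emit the general title plus the domain-specific field as N-Triples."""
    # json.dumps quotes and escapes the string; it is used here as a simple
    # way to write a plain literal.
    lines = [f'<{dataset_uri}> <{DCT_TITLE}> {json.dumps(payload["title"], ensure_ascii=False)} .']
    for extra in payload["extras"]:
        if extra["key"] == "temperature":
            value = json.dumps(extra["value"], ensure_ascii=False)
            lines.append(f"<{dataset_uri}> <{TEMPERATURE}> {value} .")
    return lines


payload = dataset_payload("tensile-test-42", "Tensile test 42", 21.5)
lines = to_ntriples("https://ckan.example.org/dataset/tensile-test-42", payload)
print("\n".join(lines))
```

The first statement is what a general-purpose export already covers. The second carries the domain context that would otherwise be lost.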

...