Transfer package specifications

TIB uses several levels of transfer packages, which are described here. The graphic Package structure provides a general overview of the transfer packages used:

Specifications for transfer information packages are relevant for the delivery of objects.

The graphic Transforming transfer information packages to SIPs and AIPs describes how transfer information packages are transformed to pre-ingest SIPs, post-ingest SIPs and AIPs.

Transfer information packages

Different transfer information structures are used for different scenarios:

  1. SIPs with a simple structure and one representation
    SIPs with a simple structure can be submitted as ZIP files (example: the University Publications team) or as folders (example: the German Research Report team).
  2. METS deposit
    1. Objects with multiple representations or complex data structures
    2. Objects with multiple representations or complex data structures and an externally created METS file
  3. Connection to the repository via OAI or another interface

The transfer information packages are described in the form of normalised tables. The table below explains the structure of the normalised tables.

Specification parameter

Implementation

Naming convention

Naming convention according to which the package must be named

Package structure

Structure in which the package must be available

Content data

Description of the minimum and maximum number of files expected

Permissible file formats

Description of the permissible file formats, where relevant

Representations

The number and type of representations allowed

Quality of data

Describes whether only valid and well-formed files are accepted

Metadata

Describes whether or not the object must be indexed in the Gemeinsamer Verbundkatalog (Union Catalogue, GVK)

Identifier

Identifier that uniquely identifies the object and links it to descriptive metadata, such as a PPN (identification number), an EKI (identification number given by the first cataloguing institution) or a handle

Legal metadata

Describes whether the object belongs to a collection in which several licence texts, licence text versions and access rights can be allocated. If this is the case, this must be mapped in a superordinate directory structure.


Objects with a simple structure and one representation

Example: the University Publications team – legacy data transfer


With this transfer information structure, there must be exactly one ZIP file, which may contain several files. All files in the ZIP file belong to the MASTER representation.

Specification parameter

Implementation

Naming convention

The ZIP file is named as follows: PPN__name of author

Package structure

One ZIP file per SIP, which contains all data belonging to the doctoral thesis

Content data

At least one PDF file

Permissible file formats

At least one PDF file is expected

Representations

MASTER

Quality of data

Only valid and well-formed PDF files are accepted.

Metadata

The object must be indexed in the catalogue.

Identifier

An identifier that refers to a catalogue record is expected to be found in the name of the ZIP file.

Legal metadata

No special identification necessary; all objects are subject to the same licence text and have the same access rights.
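The naming and content rules in the table above can be checked before delivery. The following is a minimal sketch, not TIB's submission application: the function names are hypothetical, the PPN format itself is not checked because this page does not define it, and validity and well-formedness of the PDF files would need a dedicated format validator.

```python
import zipfile
from pathlib import Path


def split_package_name(zip_path):
    """Split 'PPN__name of author.zip' into (ppn, author)."""
    stem = Path(zip_path).stem
    ppn, sep, author = stem.partition("__")
    if not sep or not ppn or not author:
        raise ValueError(f"name does not follow PPN__author: {stem!r}")
    return ppn, author


def pdf_members(zip_path):
    """Return the names of all PDF files contained in the ZIP."""
    with zipfile.ZipFile(zip_path) as archive:
        return [n for n in archive.namelist() if n.lower().endswith(".pdf")]


def check_package(zip_path):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    try:
        split_package_name(zip_path)
    except ValueError as exc:
        problems.append(str(exc))
    if not pdf_members(zip_path):
        problems.append("no PDF file found in the ZIP")
    return problems
```

Returning a list of problems rather than raising on the first one lets a delivering team see every deviation from the specification in a single run.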


Example: the German Research Report team

With this transfer information structure, there must be exactly one file that belongs to the MASTER representation.

Specification parameter

Implementation

Naming convention

The file is named using the PPN.

Package structure

One PDF file per SIP

Content data

Exactly one PDF file

Permissible file formats

Exactly one PDF file

Representations

MASTER

Quality of data

Non-valid and non-well-formed files are also accepted.

Metadata

The object must be indexed in the catalogue.

Identifier

An identifier that refers to a catalogue record is expected to be found in the name of the PDF file.

Legal metadata

Objects are sorted by licence agreement and access rights before ingest.


Objects with multiple representations or complex data structures

In the transfer structure for complex objects, different representations may be ingested with 1-n files each.

Specification parameter

Implementation

Naming convention

The package is named at the top directory level using the EKI (identification number given by the first cataloguing institution).

Package structure

For each SIP, there is a directory named using the EKI. It contains a directory for each representation; the directory is named based on the name vocabulary defined in the archive. The content data are available in the representation folders.

IDENTIFIER
|--MASTER (mandatory)
|	|--File1
|	|--File n
|		|-- Folder 0-n
|			|--File 0-m
|--MODIFIED_MASTER (optional)
|	|--File1
|	|--File n
|	|-- Folder 0-n
|		|--File 0-m
|--DERIVATIVE_COPY (optional)
|	|--File1
|	|--File n
|	|-- Folder 0-n
|		|--File 0-m

Content data

At least one file per representation

Permissible file formats

No limitation

Representations

MASTER, MODIFIED_MASTER, DERIVATIVE_COPY

Quality of data

Non-valid and non-well-formed files are also accepted.

Metadata

The object must be indexed in the catalogue.

Identifier

EKI

Legal metadata

The acquisition team uses a superordinate directory structure to assign the objects to the collection groups, publications types, applicable licence texts (in different versions, where applicable) and access rights.
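The directory layout for complex objects can be verified with a short script. This is a sketch under assumptions: the function name is hypothetical, and the set of allowed representation names is taken only from the table above, not from the full name vocabulary defined in the archive.

```python
from pathlib import Path

MANDATORY = {"MASTER"}
OPTIONAL = {"MODIFIED_MASTER", "DERIVATIVE_COPY"}


def check_sip_directory(sip_dir):
    """Return a list of problems found in one SIP directory."""
    sip_dir = Path(sip_dir)
    problems = []
    found = {p.name for p in sip_dir.iterdir() if p.is_dir()}

    for name in sorted(MANDATORY - found):
        problems.append(f"missing mandatory representation: {name}")
    for name in sorted(found - MANDATORY - OPTIONAL):
        problems.append(f"unknown representation directory: {name}")

    for name in sorted(found & (MANDATORY | OPTIONAL)):
        # Files may sit directly in the representation or in subfolders.
        has_file = any(p.is_file() for p in (sip_dir / name).rglob("*"))
        if not has_file:
            problems.append(f"representation {name} contains no files")
    return problems
```

The directory name itself (the EKI) is not validated here, since this page does not state its format.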

Objects with multiple representations or complex data structures and an externally created METS file

In the transfer structure for complex objects with an externally created METS file, the path identifier/content/streams may contain different representations with 1-n files each. A METS file that validates against the Rosetta XSD (https://developers.exlibrisgroup.com/rosetta/integrations/mets-dnx/) must be handed over.

Specification parameter

Implementation

Naming convention

The package is named at the top directory level using a unique identifier.

Package structure

For each SIP, there is a directory named with a unique identifier. It contains one dc.xml file with Dublin Core metadata and a directory named content. The content directory contains a METS file, ie1.xml, and a directory named streams, which contains the actual data. If multiple representations are handed over, the relevant files must be allocated to the corresponding representation in the METS file.

Representation names other than MASTER, PRE-INGEST_MODIFIED_MASTER and DERIVATIVE_COPY must be coordinated with TIB's digital preservation team.

IDENTIFIER
|--dc.xml
|--content
|	|--ie1.xml
|	|--streams
|		|--File1
|		|--File n
|		|--Folder 0-n
|			|--File 0-m

Content data

At least one file per representation.

Permissible file formats

No limitation

Representations

MASTER (mandatory), PRE-INGEST_MODIFIED_MASTER (optional), DERIVATIVE_COPY (optional), further representations according to prior agreement

Quality of data

Non-valid and non-well-formed files are also accepted.

Metadata

The object does not have to be indexed in the catalogue.

Identifier

Unique identifier

Legal metadata

Legal metadata are captured as follows:

1) The access right to the object as assigned by the depositing institution shall be documented as dcterms:accessRights (e.g. private/public or another controlled vocabulary).

2) The depositing institution's right to preserve the object shall be documented as dc:rights.

3) The submission agreement concluded between TIB's digital preservation team and the depositing institution shall be documented as dcterms:license.
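The three legal metadata fields listed above might be written into dc.xml along the following lines. This is a sketch under assumptions: the root element name dc:record and the function name are not taken from this page, and the argument values are placeholders. Only the element names dcterms:accessRights, dc:rights and dcterms:license and the two Dublin Core namespaces come from the specification and the DCMI vocabulary.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)


def build_dc(title, access_rights, rights, license_text):
    """Serialise a small Dublin Core record carrying the legal metadata."""
    # Assumption: the root element name is not specified on this page.
    record = ET.Element(f"{{{DC}}}record")
    ET.SubElement(record, f"{{{DC}}}title").text = title
    # 1) access right as assigned by the depositing institution
    ET.SubElement(record, f"{{{DCTERMS}}}accessRights").text = access_rights
    # 2) the depositing institution's right to preserve the object
    ET.SubElement(record, f"{{{DC}}}rights").text = rights
    # 3) the submission agreement
    ET.SubElement(record, f"{{{DCTERMS}}}license").text = license_text
    return ET.tostring(record, encoding="unicode")
```

Calling build_dc("Example title", "public", "Right to preserve granted", "Submission agreement") returns a string that can be written to dc.xml in the identifier directory.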

Connection to the repository via an OAI interface using the example of Leibniz Universität Hannover Institutional Repository

With this transfer information package, records are ingested via the OAI interface of Leibniz Universität Hannover Institutional Repository; the records must contain the metadata, at least a title and an identifier, and 1-n files. All objects within a record belong to the MASTER representation.

Specification parameter

Implementation

Naming convention

The original file name of the file is kept. No specification check is performed.

Package structure

At least one file per SIP

Content data

At least one file

At least one metadata format must include direct links to the files and supplements belonging to the record.

Permissible file formats

No limitation

Representations

MASTER

Quality of data

Non-valid and non-well-formed files are also accepted.

Metadata

The repository must contain at least the object’s title and identifier metadata. There may also be additional metadata.

Identifier

A repository-internal handle

Legal metadata

The applicable licence terms must be stated.
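The minimum metadata requirement (title and identifier) can be tested on a harvested record before ingest. The sketch below assumes that the record was delivered in the oai_dc metadata format of OAI-PMH, which this page does not confirm; the function name is hypothetical, and the check for direct file links is left out because the metadata format carrying them is not named here.

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core elements used inside oai_dc.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def missing_minimum_metadata(record_xml):
    """Return the required fields that are absent from one OAI-PMH record."""
    record = ET.fromstring(record_xml)
    missing = []
    for field in ("title", "identifier"):
        values = [
            element.text
            for element in record.iterfind(f".//dc:{field}", NS)
            if element.text and element.text.strip()
        ]
        if not values:
            missing.append(field)
    return missing
```

An empty return value means that both mandatory fields are present with non-empty content.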

Objects with metadata from source systems or complex dependencies between data packages

Specification parameter

Implementation

Naming convention

The package is named at the top directory level using a unique identifier.

Package structure

For each SIP, there is a directory named with a unique identifier. It contains a dc.xml file with Dublin Core metadata. Optionally, it may also contain a harvest.xml file with information on harvesting from a data source and a collection.xml file with information on assigning an object to a collection.

The representation directories contain the content files.

MD5 checksums can optionally be supplied in one of two ways: as a single checksum file for all files in all delivered identifier directories, placed at the level of the identifier directories, or as one MD5 checksum per file in the representation directories.

Representation names other than MASTER, PRE_INGEST_MODIFIED_MASTER and DERIVATIVE_COPY must be coordinated with TIB.

root
|--[checksums].[md5] (optional: one checksum file for all files in all delivered identifier directories)
|--IDENTIFIER
|	|--dc.xml (mandatory)
|	|--harvest.xml (optional)
|	|--collection.xml (optional)
|	|--MASTER (mandatory)
|		|--1-n Files
|		|--0-n Files.[md5] (optional: one checksum file per file in the representation)
|		|--0-n Folders
|			|-- ...
|	|--PRE_INGEST_MODIFIED_MASTER (optional)
|		|--1-n Files
|		|--0-n Files.[md5] (optional: one checksum file per file in the representation)
|		|--0-n Folders
|			|-- ...
|	|--DERIVATIVE_COPY (optional)
|		|--1-n Files
|		|--0-n Files.[md5] (optional: one checksum file per file in the representation)
|		|--0-n Folders
|			|-- ...
|	|--SOURCE_MD (optional)
|		|--1-n .[xml]
|		
|	|--further representations (optional)
|--IDENTIFIER

Content data

At least one file per representation

Permissible file formats

No limitation

Representations

MASTER (mandatory), PRE_INGEST_MODIFIED_MASTER (optional), DERIVATIVE_COPY (optional), further representations according to prior agreement

Quality of data

Non-valid and non-well-formed files are also accepted.

Metadata

The object does not have to be indexed in the catalogue. A dc.xml file must be present.

Identifier

Unique identifier

Legal metadata

Legal metadata are captured in the CSV file as follows:

1) The access right to the document, as granted for use by the depositing institution, shall be documented as dcterms:accessRights (e.g. private/public or another controlled vocabulary).
2) The depositing institution's right to preserve the object shall be documented as dc:rights.
3) The submission agreement concluded between TIB's digital preservation team and the depositing institution shall be documented as dcterms:license.
4) Access rights to the object in Rosetta are recorded as Access Right.
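The optional single MD5 checksum file described under Package structure above could be produced roughly as follows. This is a sketch under assumptions: the file name checksums.md5, the md5sum-style line layout (hash, two spaces, relative path) and the function names are illustrative, since this page does not prescribe the internal format of the checksum file.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum_file(root, target_name="checksums.md5"):
    """Write one checksum line per file below root and return the file path."""
    root = Path(root)
    target = root / target_name
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Skip the checksum file itself and any per-file .md5 companions.
        if path == target or path.suffix == ".md5":
            continue
        lines.append(f"{md5_of(path)}  {path.relative_to(root).as_posix()}")
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target
```

Paths are written relative to the root directory so that the file stays valid after the whole delivery has been moved.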

Pre-ingest SIPs

A submission application creates Rosetta-compliant pre-ingest SIPs from the various transfer information packages and, in a second step, transfers them to Rosetta.

Post-ingest SIPs

After deposit, the pre-ingest SIPs become post-ingest SIPs, which are enriched with additional metadata by the system. The transformation process is complete when a package has been transferred to permanent archival storage and successfully deposited there.

During further processing in Rosetta, the post-ingest SIP is transformed to an AIP, and is automatically enriched with additional metadata. A post-ingest SIP becomes an AIP once it has been transferred to permanent archival storage and saved there successfully.
