iRODS zone

The Integrated Rule-Oriented Data System (iRODS) abstracts physical data storage systems into datasets and collections, which behave much like ordinary files and folders. It also implements a custom binary data transfer protocol that uses multiple parallel TCP streams to transfer data between instances. Each zone uses an SQL database to store the hierarchy of datasets and collections together with their metadata. Data transfers are encrypted using TLS.
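To illustrate the idea behind parallel streams, the sketch below divides a transfer of a given size into contiguous byte ranges, one per stream. It is not the iRODS wire protocol; the function name `split_into_ranges` and the even-split strategy are assumptions made purely for illustration.

```python
def split_into_ranges(total_bytes: int, streams: int) -> list[tuple[int, int]]:
    """Split [0, total_bytes) into up to `streams` contiguous (offset, length) chunks.

    Hypothetical helper: shows how a payload could be divided among
    parallel streams, not how iRODS actually does it.
    """
    if total_bytes < 0 or streams < 1:
        raise ValueError("total_bytes must be >= 0 and streams must be >= 1")

    base, extra = divmod(total_bytes, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # fewer bytes than streams: leave the rest idle
        ranges.append((offset, length))
        offset += length
    return ranges
```

Each stream would then read and send only its own range, so the ranges can be transferred concurrently and reassembled by offset on the receiving side.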

The basic unit of an iRODS deployment is a so-called zone, which has its own set of users and storage assigned on a local storage array. Each zone maintains its own tree of datasets and collections along with their metadata, and enforces its own access rules. Multiple zones can form a federation and use the same parallel transfer protocol to move data between them. It is therefore assumed that each location providing storage resources to the platform deploys its own iRODS zone and federates with the others as needed.
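In iRODS, logical paths begin with the zone name, so the zone that owns an object can be read from its path. The following minimal sketch uses this to decide whether a path belongs to the local zone or to a federated one. The zone names `zoneA` and `zoneB` and the helper names are hypothetical.

```python
from pathlib import PurePosixPath


def zone_of(logical_path: str) -> str:
    """Return the zone name, i.e. the first component of an absolute logical path."""
    parts = PurePosixPath(logical_path).parts
    if len(parts) < 2 or parts[0] != "/":
        raise ValueError(f"not an absolute logical path with a zone: {logical_path!r}")
    return parts[1]


def is_federated(logical_path: str, local_zone: str) -> bool:
    """True if the path belongs to a zone other than the local one."""
    return zone_of(logical_path) != local_zone
```

For example, with a local zone named `zoneA`, the path `/zoneB/home/alice/data.bin` would be treated as belonging to a federated zone.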


Integration to LEXIS

The LEXIS Platform provides a set of APIs that use iRODS to store and transfer data and its metadata. To some extent this builds on work by EUDAT, mainly the B2SAFE, B2HANDLE and B2STAGE services. From EUDAT, the platform mainly uses the ability to obtain a PID for a dataset through B2HANDLE, and traceable replication between remote locations.

Persistent unique identifiers (PIDs)

PIDs play a crucial role in the FAIR data management principles and in any sustainable data management plan. Within EUDAT systems, data can be directly addressed (and then, for example, retrieved) via B2HANDLE PIDs, and a few “key metadata” are stored directly in each PID entry. In LEXIS, the B2HANDLE client is deployed on the iCAT servers as a Python library. PIDs can thus be assigned to any object or collection within iRODS, which helps to make the public results of LEXIS Workflows “FAIR”.

User and project management through the iRODS-Keycloak synchronization mechanism

One current limitation of iRODS is that its API cannot directly accept OpenID/JWT tokens as a means of authentication. In LEXIS, token-based authentication is instead realised via a token broker and user metadata stored in the iRODS user account. Each user account in an iRODS zone corresponds to a user of the LEXIS Platform and contains the subject ID (SID) of that user as stored in the LEXIS Platform Keycloak. Using this mechanism, calls to the iRODS API can be authenticated with tokens issued by the LEXIS AAI service.
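The lookup step of such a broker can be sketched as follows: read the `sub` claim from a JWT and map it to the iRODS account that carries the same SID. This is a minimal illustration under stated assumptions, not the LEXIS implementation. The function names and the in-memory `sid_to_user` mapping are hypothetical, and the sketch deliberately skips signature, issuer and expiry checks, which a real broker must perform before trusting any claim.

```python
import base64
import json


def jwt_subject(token: str) -> str:
    """Return the `sub` claim of a JWT.

    WARNING: no signature or expiry verification is done here; this only
    decodes the payload for illustration.
    """
    try:
        _header, payload_b64, _signature = token.split(".")
    except ValueError:
        raise ValueError("expected a three-part JWT") from None

    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims["sub"]


def resolve_irods_user(token: str, sid_to_user: dict[str, str]) -> str:
    """Map the token's subject ID to an iRODS user name (hypothetical lookup table)."""
    sid = jwt_subject(token)
    try:
        return sid_to_user[sid]
    except KeyError:
        raise PermissionError(f"no iRODS account carries SID {sid!r}") from None
```

In a deployment, the mapping would presumably come from the SID metadata stored on each iRODS user account rather than from a Python dictionary.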

The implementation of this mechanism relies on periodic synchronization of LEXIS projects and their associated users to each zone, according to the LEXIS projects the zone is assigned to. A synchronization script queries the Keycloak API and synchronizes the projects and users. For projects, it maintains and enforces a structure of collections (folders) in each iRODS zone. For users, it keeps the user accounts and their SIDs synchronized in each associated zone. This makes it possible to maintain access rights across the federation according to each user's role in each project.
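The user side of such a synchronization pass amounts to comparing the desired state from Keycloak with the current state of a zone. The sketch below computes that difference without contacting either system. The `SyncPlan` type, the function name and the user-name-to-SID dictionaries are hypothetical, and whether the actual script removes stale accounts is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SyncPlan:
    create: dict[str, str]      # user name -> SID, accounts missing in the zone
    update_sid: dict[str, str]  # user name -> SID, accounts whose SID differs
    remove: set[str]            # accounts in the zone that Keycloak no longer lists


def plan_user_sync(keycloak_users: dict[str, str],
                   zone_users: dict[str, str]) -> SyncPlan:
    """Compute the changes needed to make a zone's users match Keycloak.

    Both arguments map user name -> SID (assumed shape, for illustration).
    """
    create = {name: sid for name, sid in keycloak_users.items()
              if name not in zone_users}
    update_sid = {name: sid for name, sid in keycloak_users.items()
                  if name in zone_users and zone_users[name] != sid}
    remove = set(zone_users) - set(keycloak_users)
    return SyncPlan(create=create, update_sid=update_sid, remove=remove)
```

Computing a plan first and applying it afterwards keeps the comparison logic testable and makes each periodic run idempotent: a zone that already matches Keycloak yields an empty plan.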