==========
iRODS zone
==========

Currently, iRODS zones are the main system for storing data
and its metadata in the platform.

The Integrated Rule-Oriented Data System (iRODS)
abstracts physical data storage systems into datasets
and collections, which resemble ordinary computer files
and folders. It also implements a custom binary data
transfer protocol that uses multiple parallel TCP streams
to move data between instances. Each zone uses an SQL
database to store the hierarchy of datasets/collections
and their metadata. Data transfers are encrypted using TLS.

The basic unit of an iRODS deployment is a so-called zone,
which has its own set of users and storage
assigned on a local storage array. Each zone
maintains its own tree of datasets/collections
and their metadata, and enforces its own access rules.
Multiple zones can set up a federation and use
the same parallel transfer protocol to move data
between them. Each location providing storage resources
to the platform is therefore expected to deploy its own
iRODS zone and federate with others as needed.

.. image:: irods_zones.png
   :target: ../../_images/irods_zones.png


--------------------------
Reasons for choosing iRODS
--------------------------

When designing the data management system
for the **LEXIS Platform** and choosing the right
data back-end technology, several requirements
had to be fulfilled. These included:

- Unified access to LEXIS data with file-system-like semantics
- Reliability and redundancy
- Support for diverse storage back-end systems
- Support for the LEXIS AAI
- Support for storage policies, for example selective data mirroring
- Support for metadata and persistent identifiers in the system
- Support for system access via REST APIs

Excellent open systems in this sector include,
for example, iRODS, Onedata, Rucio and dCache.
However, the best-fitting system for the **LEXIS Platform**
was iRODS. It stands out for its intuitive file-system-like
view on all data, the flexibility of its storage and
mirroring policies and of the metadata it stores,
its high-availability setup, its support for various
storage back-ends, the range of available iRODS clients
and, most of all, its integration
in feature-rich European projects.

----------------------------------------------------------------------
Managing users and projects through iRODS-Keycloak syncing mechanism
----------------------------------------------------------------------

In the previous version, one limitation
of the iRODS system was its inability to use
OpenID/JWT tokens directly as a means of
authentication to its API. Authentication was
instead realised via a token broker and user
metadata stored in the user account.

In the latest version of the DDI, the
`iRODS HTTP API <https://github.com/irods/irods_client_http_api>`_
service is integrated, which supports OpenID/JWT
authentication natively. This allows every call
to the API to be authenticated directly,
without the need for an additional broker service.
The HTTP API provides a unified RESTful interface
to the iRODS system that can be used by various
clients, such as the DDI, the portal, and others.
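
As an illustration of token-based access, the following minimal sketch
builds (but does not send) a request that lists a collection through the
HTTP API, passing an OpenID/JWT access token as a bearer token.
The host name, the version segment of the URL, the zone name and the
collection path are hypothetical placeholders, not values used by the
LEXIS Platform. The ``collections`` endpoint with ``op=list`` and ``lpath``
follows the upstream iRODS HTTP API documentation as we understand it and
should be checked against the deployed version.

.. code-block:: python

   import urllib.parse
   import urllib.request

   # Hypothetical placeholder: host and API version depend on the deployment.
   BASE_URL = "https://irods.example.org/irods-http-api/0.5.0"


   def build_list_collection_request(access_token, logical_path):
       """Build a GET request that lists the given iRODS collection.

       The request is only constructed here; sending it is left to the caller.
       """
       query = urllib.parse.urlencode({"op": "list", "lpath": logical_path})
       return urllib.request.Request(
           f"{BASE_URL}/collections?{query}",
           # The OpenID/JWT access token is passed directly as a bearer token.
           headers={"Authorization": f"Bearer {access_token}"},
           method="GET",
       )


   # "/exampleZone/home/alice" is a made-up logical path for illustration.
   request = build_list_collection_request("<access-token>", "/exampleZone/home/alice")
   print(request.get_method(), request.full_url)

A client would pass the resulting object to ``urllib.request.urlopen``
once a valid access token has been obtained from the identity provider.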

However, iRODS still requires a mechanism
that periodically synchronises the LEXIS projects
and their associated users across zones, based
on the users' assignment to specific LEXIS projects.
This is done by a script that queries the UserOrg
service and synchronises the projects and users accordingly.

For data resources, the script maintains and enforces
a structure of collections (folders) in each iRODS zone.
For users, it keeps the user accounts synchronised
in each associated zone. For projects, it maintains
iRODS groups to which the users are assigned.
A project is assigned to a data resource via group
ACL permissions on the collections. This approach makes
it possible to maintain access rights across the federation
according to the users' roles in each project.
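
The group part of such a synchronisation can be sketched as a comparison
between the desired state (project membership as reported by the UserOrg
service) and the current state (iRODS groups in a zone). The example below
is a minimal sketch under stated assumptions, not the actual LEXIS script:
the function name, the data layout and the sample project and user names
are hypothetical, and applying the plan to iRODS is left out.

.. code-block:: python

   def plan_group_sync(desired, current):
       """Compute the changes needed to align iRODS groups with projects.

       Both arguments map a project (group) name to a set of user names.
       Groups that exist in iRODS but are absent from ``desired`` are left
       untouched in this sketch.
       """
       plan = {"create_groups": set(), "add_members": {}, "remove_members": {}}
       for project, wanted in desired.items():
           existing = current.get(project)
           if existing is None:
               # The project has no group in this zone yet.
               plan["create_groups"].add(project)
               existing = set()
           to_add = wanted - existing
           to_remove = existing - wanted
           if to_add:
               plan["add_members"][project] = to_add
           if to_remove:
               plan["remove_members"][project] = to_remove
       return plan


   # Hypothetical sample data standing in for UserOrg and iRODS query results.
   desired = {"proj_a": {"alice", "bob"}, "proj_b": {"carol"}}
   current = {"proj_a": {"bob", "dave"}}
   print(plan_group_sync(desired, current))

Computing a plan first and applying it afterwards keeps the comparison
logic independent of the iRODS client used, and allows the planned changes
to be logged or reviewed before they take effect.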

------------------------------------------------------
Tracking data resources across iRODS zones
------------------------------------------------------

In version 2.4.0 of the DDI, a new iRODS zone structure
was introduced. Each data resource, representing storage
allocated to a computational project, is mapped to its
own iRODS collection.

This structure enables tracking of individual data resources,
allowing us to notify users when a resource reaches its capacity limit.
In addition, because LEXIS projects are represented as iRODS groups,
data resources can be easily shared between LEXIS projects when needed.

Overall, this mechanism reflects real-world computational allocations
and storage resources within the iRODS zone.
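
Because each data resource has its own collection, a capacity check
reduces to comparing the size of that collection with the allocated quota.
The following minimal sketch assumes that the usage and quota figures
have already been collected per data resource; the function name,
the ``threshold`` parameter and the sample resource names are hypothetical
and are not part of the DDI.

.. code-block:: python

   def resources_at_capacity(usage_bytes, quota_bytes, threshold=1.0):
       """Return the names of data resources that reached their limit.

       ``usage_bytes`` and ``quota_bytes`` map a data resource name to a
       size in bytes. A resource is reported once its usage is at least
       ``threshold`` times its quota. Resources without a positive quota
       are skipped.
       """
       flagged = []
       for name, quota in quota_bytes.items():
           used = usage_bytes.get(name, 0)
           if quota > 0 and used >= threshold * quota:
               flagged.append(name)
       return sorted(flagged)


   # Hypothetical sample figures for three data resources.
   usage = {"res_a": 100, "res_b": 40}
   quota = {"res_a": 100, "res_b": 50, "res_c": 10}
   print(resources_at_capacity(usage, quota))
   print(resources_at_capacity(usage, quota, threshold=0.8))

A threshold below ``1.0`` could be used to warn users before
the limit is actually reached.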
