Distributed Data Infrastructure
This page describes the data management services in the LEXIS Platform. The component that ties all of these services together is called the Distributed Data Infrastructure (DDI). The DDI is responsible for managing data and metadata in the LEXIS Platform. It provides a set of APIs for data upload, download, staging, transfer, and metadata management.
The DDI handles all data in the form of datasets. A dataset can contain an entire tree of files and folders. Each dataset has a set of metadata values, which are indexed in an OpenSearch instance and stored as iRODS metadata.
This section describes the services that implement these APIs and how they are deployed at the locations.
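To make the dataset concept concrete, here is a minimal Python sketch of a dataset as a set of metadata values plus a tree of files and folders. The class name `Dataset`, its fields, and the `title` metadata key are assumptions made for illustration only; they are not the actual LEXIS metadata schema or API.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Dataset:
    """Hypothetical in-memory view of a dataset (not the real LEXIS schema)."""

    # Metadata values; in the DDI these are indexed in OpenSearch
    # and stored as iRODS metadata. The keys used here are invented.
    metadata: dict[str, str] = field(default_factory=dict)
    # File paths relative to the dataset root.
    files: list[str] = field(default_factory=list)

    def folders(self) -> set[str]:
        """Return every folder implied by the file paths."""
        result: set[str] = set()
        for path in self.files:
            for parent in PurePosixPath(path).parents:
                if str(parent) != ".":  # skip the dataset root itself
                    result.add(str(parent))
        return result


example = Dataset(
    metadata={"title": "Example run"},
    files=["input/mesh.dat", "output/step1/result.h5", "README.txt"],
)
print(sorted(example.folders()))
```

The sketch only models the structure described above: one dataset, one metadata record, and an arbitrary nested tree of files.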
Services in Distributed Data Infrastructure
Here is a list of services and APIs in the DDI. They are divided into two groups:
- Services deployed in the LEXIS Platform Core
  - Metadata API
  - Transfer API including Transfer Worker
  - Staging API
  - Synchronisation script
  - OpenSearch + Additional services (e.g. Redis, PostgreSQL)
- Services deployed on the target sites
  - iRODS zone
  - Staging worker
The DDI is based on locations. Each location represents a system connected to the platform. The platform supports multiple types of locations:
- HPC cluster over SSH (SFTP)
- NFS / local POSIX
- iRODS zones
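The location types listed above can be pictured as a small enumeration attached to each registered location. The following Python sketch is illustrative only: `LocationType`, `Location`, `locations_of_type`, and the example location names are hypothetical and do not correspond to identifiers in the LEXIS APIs.

```python
from dataclasses import dataclass
from enum import Enum


class LocationType(Enum):
    """The three location types named in this section."""

    HPC_SSH = "HPC cluster over SSH (SFTP)"
    POSIX = "NFS / local POSIX"
    IRODS = "iRODS zone"


@dataclass(frozen=True)
class Location:
    """Hypothetical record for one system connected to the platform."""

    name: str
    kind: LocationType


def locations_of_type(locations: list[Location], kind: LocationType) -> list[str]:
    """Return the names of all locations of the given type."""
    return [loc.name for loc in locations if loc.kind is kind]


# Invented example locations, for illustration only.
registry = [
    Location("cluster-a", LocationType.HPC_SSH),
    Location("scratch-nfs", LocationType.POSIX),
    Location("zone-1", LocationType.IRODS),
    Location("cluster-b", LocationType.HPC_SSH),
]
print(locations_of_type(registry, LocationType.HPC_SSH))
```

Modelling the type as an enumeration keeps the set of supported location kinds explicit, so code that stages or transfers data can branch on the kind of system it is talking to.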