Orchestration Service
Computational workflows (Directed Acyclic Graphs, or DAGs for short) within the LEXIS Platform are managed using Apache Airflow. According to its official documentation, Apache Airflow is an open-source platform developed by the community to programmatically author, schedule, and monitor workflows. The platform is designed to be scalable, dynamic, and easily extensible. Workflows in Airflow are defined in Python, using built-in operators (which represent individual workflow components) or custom operators as needed.
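To make the DAG idea concrete, the following is a minimal sketch that uses only the Python standard library; it is not Airflow code. It models a workflow as tasks with dependencies and derives an execution order from them. The task names (`stage_in`, `compute`, `stage_out`) and the helper functions are hypothetical, chosen for illustration only.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each key is a task, each value is the set of
# tasks that must finish before it can start.
workflow = {
    "stage_in": set(),
    "compute": {"stage_in"},
    "stage_out": {"compute"},
}


def execution_order(dag):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def is_valid_dag(dag):
    """Return False if the dependency graph contains a cycle."""
    try:
        TopologicalSorter(dag).prepare()
    except CycleError:
        return False
    return True


order = execution_order(workflow)
print(order)
```

A scheduler such as Airflow performs this kind of dependency resolution itself; the sketch only shows why the graph must be acyclic, since a cycle leaves no valid order in which to run the tasks.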
The integration of Airflow into the LEXIS Platform is shown in the figure below.
The Airflow integration in the LEXIS Platform
The left side of the figure illustrates the possible ways to create a workflow. The blue elements indicate approaches that use Jinja2 templating. The LEXIS Workflow Definition (LWD) is based on a YAML specification, allowing users to define custom workflows in YAML files. Alternatively, a custom workflow DAG can be provided directly as a Python script. However, such scripts must undergo administrative review to ensure compliance with security requirements.
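The templating approach can be sketched as follows, under clearly stated assumptions. The sketch uses the standard-library `string.Template` as a stand-in for Jinja2, and a plain dictionary in place of a parsed YAML file. The field names (`id`, `tasks`, `command`), the template text, and the `render_dag` helper are all hypothetical and do not reflect the actual LWD schema or the LEXIS templates. The rendered text assumes Airflow 2 import paths.

```python
from string import Template

# string.Template stands in for Jinja2 so the sketch needs only the
# standard library. The template text and field names are hypothetical.
DAG_TEMPLATE = Template(
    "from datetime import datetime\n"
    "from airflow import DAG\n"
    "from airflow.operators.bash import BashOperator\n"
    "\n"
    'with DAG(dag_id="$dag_id", start_date=datetime(2024, 1, 1), schedule=None):\n'
    "$tasks"
)
TASK_TEMPLATE = Template(
    '    BashOperator(task_id="$task_id", bash_command="$command")\n'
)

# What a parsed YAML workflow definition might look like
# (illustrative only, not the real LWD schema).
definition = {
    "id": "demo_workflow",
    "tasks": [
        {"id": "prepare", "command": "echo prepare"},
        {"id": "run", "command": "echo run"},
    ],
}


def render_dag(defn):
    """Render Python DAG source text from a workflow definition dict."""
    tasks = "".join(
        TASK_TEMPLATE.substitute(task_id=t["id"], command=t["command"])
        for t in defn["tasks"]
    )
    return DAG_TEMPLATE.substitute(dag_id=defn["id"], tasks=tasks)


source = render_dag(definition)
# Syntax check only: the rendered text is compiled, not imported or executed.
compile(source, "<rendered_dag>", "exec")
print(source)
```

The sketch does not escape quotes or validate values; a real generator would have to do both. Generating the DAG from a fixed template limits what a user-supplied definition can express, which is one plausible reason a templated workflow would need less scrutiny than a hand-written Python script.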
The LEXIS Provider, depicted on the right side of the figure, extends Apache Airflow with custom LEXIS Operators that manage interactions with external services such as the Distributed Data Infrastructure (DDI), the HEAppE Middleware, the Authentication and Authorization Infrastructure (AAI, see Keycloak & Zero Trust), and UserOrg (see Backend Services). Additionally, the LEXIS API Plugin enhances Airflow with a custom security manager, which manages user permissions based on OpenID tokens, and with custom REST API endpoints, which allow workflows to be created, updated, or deleted programmatically. Workflows created through the LEXIS API Plugin can include containerized applications, computational jobs defined by HEAppE command templates, or custom job scripts. Furthermore, LWD endpoints are provided to enable workflow creation using the LEXIS YAML specification.
For full control over the Airflow system used by LEXIS, it is also possible to deploy a dedicated Airflow instance and extend it with additional operators or endpoints tailored to specific requirements.
More information can be found in the following subsections: