.. _user-dataset-creation:

===============================
Data Management :updated:`flag`
===============================

Supercomputers can make your life easier,
but they need to know how. You usually need to feed
them some data and provide a set of instructions
to follow. If you are wondering how, you are
in the right section. All the available methods
of transporting your data and instructions
from you to us are described here.

--------------------------
Datasets section
--------------------------

This is the first subsection of *Data Management*,
where you can upload and manage your data. You
can find it in the main menu.

.. figure:: img/howto/dataset_wayto.png
   :width: 768
   :align: center
   :alt: Datasets this way

The section is divided by the projects you are part of.
For every project, you can see all the datasets present there.
Also, if you want to quickly find the datasets
you personally uploaded, use the *Show Only My Datasets* switch.
Whenever you toggle the switch, a small icon appears
that lets you save this setting across
the whole **LEXIS Platform**.

--------------------------
Uploading a dataset
--------------------------

Some projects might require an input :ref:`dict-users-dataset`
to start the computation. In that case, you first need to
upload the dataset to the **LEXIS Platform**.
To start the upload, look for the *Upload Dataset* button.

.. figure:: img/howto/upload_dataset_button_wayto_.png
   :width: 768
   :align: center
   :alt: Wheres da button

Click this button to open the upload form.

.. important::

   You can only upload a dataset if a LEXIS Computation
   Resource is assigned to your LEXIS Computational
   Project. The :ref:`requesting-to-LEXIS-resources`
   section explains how to do this.

.. figure:: img/howto/upload_dataset_form_.png
   :width: 768
   :align: center
   :alt: Fill the form

A window will open where you choose the file to upload.
You can upload a single file or an archive
in *zip* or *tar.gz* format. For archives,
you can check the option to
unpack them at the destination.
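
If your data consists of many files, packing them into a single *tar.gz*
archive before the upload is one option. The following is a minimal sketch
using standard ``tar``. The directory and file names are hypothetical
placeholders, and the sample file is created only so that the sketch runs on its own.

.. code-block:: bash

   # Hypothetical directory holding the files of one dataset.
   DATASET_DIR="my_dataset"
   ARCHIVE="my_dataset.tar.gz"

   # Create a small sample file so the sketch is self-contained.
   mkdir -p "$DATASET_DIR"
   echo "sample,data" > "$DATASET_DIR/sample.csv"

   # Pack the directory contents; -C keeps the paths relative to the dataset root.
   tar -czf "$ARCHIVE" -C "$DATASET_DIR" .

   # List the archive contents to see what would be unpacked at the destination.
   tar -tzf "$ARCHIVE"

Listing the archive before the upload is a cheap way to catch a wrong
directory level or a missing file.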

.. tip::

   The maximum file size for an upload over the TUS protocol is *128 GB*.
   Larger files can be uploaded directly via **iRODS**
   with the *Py4Lexis* client. The client and
   examples of how to use it can be found in the
   `Py4Lexis repository <https://opencode.it4i.eu/lexis-platform/clients/py4lexis#lexis-irods-manager>`_.
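
To find out in advance whether a file fits under this limit, you could
check its size locally. The sketch below is only an illustration:
the function name ``check_upload_size`` is hypothetical, and it assumes
that *128 GB* means 128 * 1024^3 bytes, which the platform may interpret differently.

.. code-block:: bash

   # Assumption: the 128 GB limit is interpreted as 128 * 1024^3 bytes.
   MAX_BYTES=$((128 * 1024 * 1024 * 1024))

   # Print "ok" if the file fits under the limit, "too large" otherwise.
   check_upload_size() {
       local file="$1"
       local size
       # Arithmetic expansion strips any padding that wc may print.
       size=$(( $(wc -c < "$file") ))
       if [ "$size" -le "$MAX_BYTES" ]; then
           echo "ok"
       else
           echo "too large"
       fi
   }

   # Usage with a tiny hypothetical file.
   printf 'hello' > small_file.bin
   check_upload_size small_file.bin   # prints "ok"

A file reported as too large is a candidate for the **iRODS** route mentioned in the tip.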

Next, give the dataset a descriptive name so you
know its content when you revisit it later.
Do not forget to assign it to the correct project.
Lastly, choose who can see your data.
**User** is for the dataset owner only, **Project** reveals
it to all members of the project, and you can also
make the dataset **Public**, if you so desire.

Now check that the computation resource is assigned
correctly. Once you have chosen your project from
the *Project Short Name* dropdown,
the *Target System* and *Target Resource* fields
are filled in automatically. If these two fields are empty,
double-check the resources assigned to your project.

You can move on with the *Continue* button.
You are automatically set as the creator.
The remaining fields are pre-filled as well,
but you can modify them.
Once you are satisfied, click *Continue* again.

Here, you can finally upload the dataset.
This last summary lets you make a final check.
If everything is in order, clicking on the green *Upload*
button will send your data to LEXIS.

.. figure:: img/howto/upload_dataset_metadata_.png
   :width: 768
   :align: center
   :alt: Fill the data about data

.. figure:: img/howto/upload_dataset_summarization_.png
   :width: 768
   :align: center
   :alt: All in one place info

And since we did everything correctly, we will get this
confirmation screen!

.. figure:: img/howto/upload_dataset_success.png
   :width: 768
   :align: center
   :alt: Upload GOOD JOB

You can safely close this window even if the upload
has not finished yet. You can always check the status
of the upload in the *Dashboard*. Just visit the
:ref:`data_operations` to learn more.

.. _modify-users-dataset:

--------------------------------
Modifying a dataset
--------------------------------

In the *Data Management/Data Sets* section, you can find
all the projects you have access to as well as the datasets
assigned to these projects. There is a button in the
*Action* column that allows you to review the details
of the specific dataset.

.. figure:: img/howto/actions_datasets_.png
   :width: 768
   :align: center
   :alt: Actions on thee dataset

You can perform various actions on your dataset:
*View Files*, *Update Metadata*, *Edit Access*, *Download* and *Delete*.

.. figure:: img/howto/detail_of_dataset.png
   :width: 768
   :align: center
   :alt: My dataset is so detailed

.. _using-container-applications:

---------------------------------------
Containers section
---------------------------------------

This is the second subsection of *Data Management*,
where you can upload and manage your own containers
as well as other containers uploaded to your projects.
You can find it in the main menu.

.. figure:: img/howto/container_wayto.png
   :width: 768
   :align: center
   :alt: Where did I put those containers

Containers are one of the supported methods for delivering instructions
to a workflow in the **LEXIS Project**.
You first need to prepare your own :ref:`dict-hpc-container-application`
and upload it to your project, from where it can be run.

Containers on the **LEXIS Platform** are executed
using `Apptainer <https://apptainer.org>`_.
For detailed information on building containers,
refer to the `Apptainer Documentation <https://apptainer.org/docs/user/latest/>`_.
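
As a rough illustration of the build step, the sketch below writes a minimal
definition file and builds an image from it with ``apptainer build``.
The definition file name, the base image, and the runscript are hypothetical
placeholders. The output is named ``container.sif`` to match the file name
required by the upload instructions on this page.

.. code-block:: bash

   # Hypothetical definition file; base image and runscript are placeholders.
   cat > hello.def <<'EOF'
   Bootstrap: docker
   From: ubuntu:22.04

   %runscript
       echo "Hello from the container"
   EOF

   # Build only where Apptainer is installed.
   if command -v apptainer >/dev/null 2>&1; then
       apptainer build container.sif hello.def
   else
       echo "apptainer not found; skipping build" >&2
   fi

Depending on your local setup, the build may need additional privileges
or network access to pull the base image.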

---------------------------------------
Uploading a container
---------------------------------------

You can add a Container in the *Data Management/Containers* menu.
Click on the *Create Container* button.

.. figure:: img/howto/create_container_wayto_.png
   :width: 768
   :align: center
   :alt: You hiding from me container birther

In this window, choose the container you wish to upload.
Fill in the name of your new container.

.. important::

   The file **must be named** ``container.sif``
   for LEXIS to recognize it as a valid container.

In the dropdown menu, choose the project to assign the container to.
You can also set the range of users who can see the container.
**User** is for container owner only, **Project** reveals it
to all members in the project,
and you can also make the container **Public**.

.. figure:: img/howto/create_new_conteiner_form_add_.png
   :width: 768
   :align: center
   :alt: Fill me

.. figure:: img/howto/create_new_conteiner_form_params_.png
   :width: 768
   :align: center
   :alt: Fill me everything

At the end, a summary is displayed for review.
Confirm it to complete the upload of your new container.

.. figure:: img/howto/create_new_conteiner_form_result_.png
   :width: 768
   :align: center
   :alt: I feel filled

---------------------------------------
Modifying a container
---------------------------------------

In the *Data Management/Containers* section, you can find
all the projects you have access to as well as the containers
assigned to these projects. There is a button in the
*Action* column that allows you to review the details
of the specific container.

.. figure:: img/howto/actions_container_.png
   :width: 768
   :align: center
   :alt: Actions on thee container

You can perform various actions on your container:
*Update Metadata*, *Edit Access*, *Download* and *Delete*.

.. figure:: img/howto/detail_of_container.png
   :width: 768
   :align: center
   :alt: My container is so detailed

.. _about-custom-hpc-jobs:

---------------------------
Job Scripts section
---------------------------

This is the third subsection of *Data Management*,
where you can upload and manage your own job scripts
as well as other job scripts uploaded to your projects.
You can find it in the main menu.

.. figure:: img/howto/job_script_wayto.png
   :width: 768
   :align: center
   :alt: I forgot my job scripts

Sometimes you just want to test something
smaller in scope, and that is what job scripts are for.
Scripts are written in Bash.

---------------------------------------
Uploading a job script
---------------------------------------

To create your own HPC job script,
navigate to *Data Management/Job Scripts*.
Locate the blue *Create Jobscript* button and click on it.

.. figure:: img/howto/create_jobscript_button_wayto_.png
   :width: 768
   :align: center
   :alt: Looking for a job

A new job script upload form will appear. Enter the name
for your job script and assign it to the correct project.
You can also set the range of users who can see the job script.
**User** is for the job script owner only, **Project** reveals it
to all members of the project, and you can also make
the job script **Public**. If the resources are correctly
assigned to your project, the system is filled in automatically.

Most importantly, do not forget to
paste your code into the console. If you just want
to try this functionality, you can use the following code as an
:ref:`example job script <example-custom-job-script>` and continue.

.. _example-custom-job-script:

.. code-block:: bash

   # Example job script for a LEXIS workflow
   # Initialize the EESSI software environment
   source /cvmfs/software.eessi.io/versions/2023.06/init/bash

   ls ./input  # the staged input dataset should appear in this directory
   cat ./input/vinice-geojson.json  # print the content of a file uploaded to the dataset
   echo "I am running!!"

.. figure:: img/howto/custom_job_script.png
   :width: 768
   :align: center
   :alt: custom_job_script
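
The example above assumes that the input dataset was staged to ``./input``.
A slightly more defensive variant, sketched here under the same assumption,
checks for the directory first. The function name ``list_staged_input`` is
hypothetical, and the EESSI initialization line is left out for brevity.

.. code-block:: bash

   # Sketch only: list_staged_input is a hypothetical helper, not part of LEXIS.
   # Print the staged files, or return 1 with a message if the directory is missing.
   list_staged_input() {
       local input_dir="${1:-./input}"
       if [ ! -d "$input_dir" ]; then
           echo "Input directory $input_dir not found" >&2
           return 1
       fi
       ls "$input_dir"
   }

   if list_staged_input ./input; then
       echo "I am running!!"
   else
       # A real job would likely stop here with a non-zero exit code.
       echo "Nothing to process" >&2
   fi

Failing with a readable message makes a missing input dataset easier
to spot in the job output than a bare ``ls`` error.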

Fill in the information on the next page.

.. figure:: img/howto/custom_job_script_wf_metadata_.png
   :width: 768
   :align: center
   :alt: custom_job_metadata

Finally, take a last look to check that everything is correct.

.. figure:: img/howto/custom_job_script_wf_summary_.png
   :width: 768
   :align: center
   :alt: custom_job_script_wf_summary

And now your job script is added to the specified project.

---------------------------------------
Modifying a job script
---------------------------------------

In the *Data Management/Job Scripts* section, you can find
all the projects you have access to as well as the job scripts
saved under these projects. There is a little down arrow
that allows you to review the details of the specific job script.

.. figure:: img/howto/actions_jobscripts_.png
   :width: 768
   :align: center
   :alt: Actions on thee JOB

You cannot change the content of an existing job script.
However, you can create a new version from the chosen
job script and make the appropriate changes there.
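
Because every fix needs a new version, it may be worth checking a script
for syntax errors locally before you paste it into the form. A minimal
sketch using ``bash -n``, which parses a script without executing it,
could look like this. The file name ``job_script.sh`` is a hypothetical local copy.

.. code-block:: bash

   # Hypothetical local copy of the job script you plan to paste into LEXIS.
   cat > job_script.sh <<'EOF'
   ls ./input
   echo "I am running!!"
   EOF

   # bash -n reads and parses the script but runs none of its commands.
   if bash -n job_script.sh; then
       echo "syntax ok"
   else
       echo "syntax error" >&2
   fi

This only catches syntax problems. Whether the paths and software used
by the script exist on the target system still shows up only at run time.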
