Functions

LABDRIVE Functions let you run code in the platform, in response to certain events or triggers, without needing any client-side script. They are similar to AWS Lambda functions.

Functions are useful when you want the platform to behave in a specific way in response to external events, or when you want to add your own code to be executed on demand.

With LABDRIVE Functions, you simply upload your code and define the triggers that should execute it.

Users are able to define data container-level functions (LABDRIVE functions) that are executed on certain events:

  • CRUD Functions for files and metadata: When files are created, read, updated or deleted.

    E.g.: Every time you upload an astrophysics-related file to a certain data container:

    • Extract date, telescope, run, subrun, datatype, source, position, angle, etc. from its file name or embedded metadata into the metadata catalogue, so it is searchable using the web interface or the API,

    • Calculate each file's integrity, and

    • Tag the files whose file name parameters do not match the embedded ones (PIC Magic Telescope)
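As a sketch of how such a CRUD function might begin, the following parses metadata fields out of a file name. The naming convention (`<date>_<telescope>_<run>_<subrun>_<datatype>.fits`) and the field names are assumptions for illustration, not LABDRIVE's actual convention:

```shell
# Hypothetical naming convention: <date>_<telescope>_<run>_<subrun>_<datatype>.fits
# (an assumption for illustration only)
filename="20210314_M1_run05_sub02_calibrated.fits"

base="${filename%.fits}"   # strip the extension
IFS='_' read -r date telescope run subrun datatype <<< "$base"

# These fields would then be pushed to the metadata catalogue via the API
echo "date=$date telescope=$telescope run=$run subrun=$subrun datatype=$datatype"
```

A real function would also tag the file when these values disagree with the embedded metadata.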

  • Periodic functions: Every minute, hour and day.

    E.g.: Webhooks to other systems

  • Executed-by-demand functions: When the user selects files using the GUI, or when launched using the API

    E.g.: For fetch.txt files containing a manifest of files to download, make LABDRIVE download them and place them inside the container.

This guide focuses on the executed-by-demand functions (the ones users manually trigger using the Management Interface or the API). See Create lambda functions to learn how to create your own functions.

Real world example for a LABDRIVE Function

Let's say you have the following use case: you would like to upload your bagits. Then, you would like to perform an integrity verification to maintain the custody/integrity chain (to detect upload errors) and, finally, you would like to assign metadata to them.

There are multiple ways to achieve this behaviour in LABDRIVE, but this use case is a perfect example of combining Workflows and Functions. LABDRIVE Functions can be triggered by a container's workflow changes, so it would be relatively easy to implement a six-step workflow:

Upload content: You would upload content to the container. Then, when your upload has finished, you would make a single API call (or use the Management Interface) to advance the container to the next status of the workflow. To do that, you can use the following API call:

$ curl --request POST \
      --url "$your_labdrive_url/api/container/{container_id}/step/next" \
      --header "Authorization: Bearer $your_labdrive_api_key" \
      --data ''

You upload your metadata as a file that is part of your package (as an Excel spreadsheet, for instance).

Waiting for ingestion to finish: A function, triggered by the platform itself, will wait for all files to be ingested. When this is completed, the function will move the container to the next status.

To know if a container is still ingesting content that you have uploaded, check the files_pending_ingestion property of the container. If true, LABDRIVE is still processing content.
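As a sketch, the check could look like this. The JSON shape shown is a simplified assumption of what the container endpoint returns; the real schema is in the API documentation:

```shell
# In a real script, $response would come from something like:
#   curl --request GET --url "$your_labdrive_url/api/container/{container_id}" \
#        --header "authorization: Bearer $your_labdrive_api_key"
# Here we use a sample response (an assumed, simplified shape):
response='{"id":171,"files_pending_ingestion":true}'

# Extract the boolean flag without external tools
pending=$(echo "$response" | sed -n 's/.*"files_pending_ingestion":\([a-z]*\).*/\1/p')

if [ "$pending" = "true" ]; then
  echo "LABDRIVE is still processing content"
else
  echo "ingestion complete"
fi
```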

Integrity verification: A function, triggered by the platform itself, will launch a bagit verification process. If the process is positive (your bags are fine), it will move the container to the “Assign metadata” step. If not, to the “Validation errors detected” step.

Validation errors detected: A function will send you an email, telling you the bags are not fine.

Assign metadata: A function, triggered by the platform itself, will assign all your metadata to your bags. Or, alternatively, it will wait for you to update it.

Archive data: A function, triggered by the platform itself, will launch the process to move all data to cold storage.

You could achieve the same results using API calls client-side and managing this process in your code, but Functions deliver a more integrated approach, and allow other users to simply upload data without needing to perform scripting. As an additional benefit, you have integrated logging for the whole process, and your code performs better (as it is executed server-side).

With this approach, you would be uploading the content and making a single API call at the end of the upload.

Launch functions using the Management Interface

1. Locate the data container you would like to work with using the Containers menu section or by searching. This guide assumes metadata is properly configured for the data container. See Configuration\Metadata for more details or Working with data containers to see how to create them.

2. Select Check-in if you are not already checked in to the container and check-in/out is enabled for the data container.

3. In the data container page, choose Explore content:

4. Select the file you would like to execute the function over and select the function you would like to launch in the sidebar:

It is possible to select multiple items by using your mouse to click-and-drag a box around the files or folders that you want to select. You can also use Ctrl and Shift, or use the Select all, Select none or Invert selection options in the file browser top bar.

You can also filter the files in the folder you are looking at by file name, for instance, to select all XML or JPG files:

and then launch your function from the sidebar.

5. Some functions execute immediately, while others may require several hours. To track the function's progress, you can use the link provided in the confirmation window that LABDRIVE shows when you launch the function:

You can also go to the Container and select Functions, to see the ones in execution and their outcome:

Some functions will process and change your content, while others may create new files in the container. When a function produces a new report or file, it is shown as an Asset when you open the function execution details page:

Launch functions using the API

API examples here are just illustrative. Check the LABDRIVE API documentation for additional information and all available methods.

1. Sign in to the LABDRIVE Management Interface

2. Obtain your LABDRIVE API key by selecting your name and then Access Methods:

Launch the function

3. To execute a function, you need to know the function ID and the container or file IDs you want to execute it over (or apply it to). You can get all functions that are loaded in the LABDRIVE platform using the following method:

$ curl --request GET \
      --url "$your_labdrive_url/api/functions" \
      --header "authorization: Bearer $your_labdrive_api_key"
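To pick out the ID of a specific function from that response, you could filter it by name. The response shape below is an assumption for illustration (the real schema is in the API documentation), and jq would be a more robust choice than sed for real JSON:

```shell
# Sample response (assumed, simplified shape)
functions='[{"id":32,"name":"Verify bagit"},{"id":40,"name":"Compress to ZIP"}]'

# Grab the id of the function named "Verify bagit"
function_id=$(echo "$functions" | sed -n 's/.*{"id":\([0-9]*\),"name":"Verify bagit"}.*/\1/p')
echo "function_id=$function_id"
```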

4. Some functions receive parameters. To call them, use the following method:

$ curl --request POST \
      --url "$your_labdrive_url/api/container/{your container id}/file/0/function/{your function id}" \
      --header "authorization: Bearer $your_labdrive_api_key" \
      --data '{your parameters here}'

For instance, this function requests the file IDs and the path as parameters:

$ curl --request POST \
      --url "$your_labdrive_url/api/container/171/file/0/event/32" \
      --header "authorization: Bearer $your_labdrive_api_key" \
      --data '{"extra":{"filename":"","ids":["985248"],"path":"/"}}'
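Rather than hard-coding the payload, you can compose it from shell variables. The field names simply mirror the example above:

```shell
# Values you would normally obtain from earlier API calls
file_id="985248"
path="/"

# Compose the same payload as in the example above
payload=$(printf '{"extra":{"filename":"","ids":["%s"],"path":"%s"}}' "$file_id" "$path")
echo "$payload"
```

The resulting string can be passed directly to curl's --data option.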

For small functions that complete immediately, LABDRIVE will answer with a success/error code, but for more complex functions, LABDRIVE will create a job to execute them, so you can track the execution progress.

When the function is launched, LABDRIVE will provide the job id in the response:
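A sketch of extracting that job id, assuming the response carries it in a job_id field (the actual field name may differ; check the API documentation):

```shell
# Sample launch response (assumed shape)
response='{"job_id":12345,"status":"queued"}'

job_id=$(echo "$response" | sed -n 's/.*"job_id":\([0-9]*\).*/\1/p')
echo "job_id=$job_id"
```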

Then, you can do three things:

Monitor its execution

Some functions may take hours to complete. You can use the /job API endpoint with the job_id returned in the previous method to monitor its progress:

$ curl --request GET \
      --url "$your_labdrive_url/api/job/{your job id}" \
      --header "authorization: Bearer $your_labdrive_api_key" \
      --data '{}'
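A polling loop around that endpoint might look like the sketch below. Here fetch_job is a mock standing in for the real curl call, and the "status"/"completed" field and value are assumptions about the response shape:

```shell
# Mock of: curl --request GET --url "$your_labdrive_url/api/job/$1" \
#               --header "authorization: Bearer $your_labdrive_api_key"
fetch_job() {
  echo '{"job_id":12345,"status":"completed"}'
}

status=""
until [ "$status" = "completed" ]; do
  status=$(fetch_job 12345 | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p')
  [ "$status" = "completed" ] || sleep 5   # poll every few seconds
done
echo "job finished"
```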

See function output log

For each job, you can get its log using the /job/{job id}/messages method:

$ curl --request GET \
        --url "$your_labdrive_url/api/job/{job id}/messages" \
        --header "authorization: Bearer $your_labdrive_api_key" \
        --data '{}'

Review created assets/files

Finally, some functions can create new assets (files). For example, a function could produce a report as a PDF file, or a function that compresses data will create a ZIP file.

You can list the assets a job has created with the /job/{job id}/assets method:

$ curl --request GET \
        --url "$your_labdrive_url/api/job/{job id}/assets" \
        --header "authorization: Bearer $your_labdrive_api_key" \
        --data '{}'
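To feed an asset into the download call that follows, you would pull its file_id out of the assets response. The response shape here is an assumption for illustration:

```shell
# Sample assets response (assumed, simplified shape)
assets='[{"file_id":985300,"name":"my_execution_report.html"}]'

file_id=$(echo "$assets" | sed -n 's/.*"file_id":\([0-9]*\).*/\1/p')
echo "file_id=$file_id"
```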

And download the asset like any other file with its file_id, using the API or any other available download method (remember to include the "-L" flag in your call, so curl follows redirects):

$ curl --request GET \
       --url "$your_labdrive_url/api/file/{your file id}/download" \
       --header "Content-Type: application/json" \
       --header "authorization: Bearer $your_labdrive_api_key" \
       --data '{}' -L --output my_execution_report.html

Create a LABDRIVE Function

Please see the Functions documentation in the Developer's guide.
