Advanced API File Search


Last updated 3 years ago


The Search section gives an overview of how to search for files and folders in a container, but some cases call for more advanced searches. For these cases, LABDRIVE provides a wide range of search options.

How Advanced File Search works

You can use the following properties for searching for files (remember that you can also search for user-defined metadata):

  • id: To get files/folders matching a certain file id

  • container_id: To get files/folders that are in a certain data container.

  • parent: To get files that are in a certain folder (parent folder)

  • filename: To get files matching the file name (e.g.: mydoc.txt)

  • fullpath: To get files matching the full path to the file (e.g.: /myfolder/mysubfolder/mydoc.txt)

  • deleted: 0 or 1, to filter on whether the file is flagged as deleted.

  • size: In bytes, to get files larger or smaller than a certain size.

  • type: FILE or FOLDER, to get only files or folders in your query.

  • structure: Whether the file is considered structured or unstructured content.

  • format: The PRONOM format identifier for the file (e.g.: fmt/19).

  • mime: The MIME type of the file.

  • date_update: Last update datetime for the file (E.g.: "2021-06-08 11:01:00.089657")

  • date_create: File creation date

  • storage_class_id: The storage class id associated with the file. See Storage.

And you can use the following operators:

  • like: As in the SQL LIKE syntax; only the % wildcard is supported.

  • starts_with: The value starts with

  • ends_with: The value ends with

  • eq: Equal

  • !eq: Not equal

  • in: Value for the file/folder is one of the provided values

  • not_in: Not one of the provided values

  • gt: Greater than

  • lt: Less than

  • gte: Greater than or equal to

  • lte: Less than or equal to
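
In the requests on this page, an operator is applied to a property through an object with "operator" and "value" keys. As a minimal sketch, the hypothetical helper `condition` below (it is not part of LABDRIVE) prints one such object for a string value, so a condition can be assembled without hand-writing the JSON. It does not escape quotes or backslashes in its arguments, so it only suits simple values.

```shell
# condition PROPERTY OPERATOR VALUE
# Hypothetical helper: prints one condition object for a string value.
# Quotes and backslashes inside the arguments are not escaped.
condition() {
  printf '{"%s":{"operator":"%s","value":"%s"}}\n' "$1" "$2" "$3"
}

# Example: a condition for files whose full path starts with /myfolder/
condition fullpath starts_with "/myfolder/"
```

The printed object can be placed inside the "conditions" array of a request body.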

You can combine several conditions in one request. For instance, to get all the files in container 185 with a size larger than 702 bytes (that is, 703 bytes or more), you can use:

curl --request GET  --url "$your_labdrive_url/api/file" \
       --header "Content-Type: application/json" \
       --header "authorization: Bearer $your_labdrive_api_key" \
       --data '{
          "conditions": [
              {
                  "container_id": 185
              },
              {
                  "size": {
                      "operator": "gt",
                      "value": 702
                  }
              },
              {
                  "type": "FILE"
              }
          ],
          "limit": 100,
          "offset": 0
      }'

To refine the search further and list only the PDF 1.5 files (PRONOM format fmt/19), you can use:

curl --request GET  --url "$your_labdrive_url/api/file" \
       --header "Content-Type: application/json" \
       --header "authorization: Bearer $your_labdrive_api_key" \
       --data '{
          "conditions": [
              {
                  "container_id": 185
              },
              {
                  "size": {
                      "operator": "gt",
                      "value": 702
                  }
              },
              {
                  "type": "FILE"
              },
              {
                  "format": "fmt\/19"
              }
          ],
          "limit": 100,
          "offset": 0
      }'
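
Every request on this page carries "limit" and "offset". The source does not describe them, so the sketch below assumes they behave like conventional paging parameters, with "offset" counting the results to skip. The helper names `page_body` and `fetch_page` are hypothetical; the URL and headers are the ones used in the requests above.

```shell
# page_body CONTAINER_ID LIMIT PAGE
# Hypothetical helper: prints a request body for a zero-based page number,
# assuming "offset" is the number of results to skip.
page_body() {
  offset=$(( $2 * $3 ))
  printf '{"conditions":[{"container_id":%d},{"type":"FILE"}],"limit":%d,"offset":%d}\n' \
    "$1" "$2" "$offset"
}

# fetch_page CONTAINER_ID LIMIT PAGE
# Sends the body to the endpoint used in the examples on this page.
fetch_page() {
  curl --silent --request GET --url "$your_labdrive_url/api/file" \
       --header "Content-Type: application/json" \
       --header "authorization: Bearer $your_labdrive_api_key" \
       --data "$(page_body "$1" "$2" "$3")"
}

# Print (without sending) the body for the third page of 100 files in container 185
page_body 185 100 2
```

Under that assumption, calling `fetch_page 185 100 0`, then `fetch_page 185 100 1`, and so on would walk through the results 100 at a time.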

Examples

Finding by file extension

If you want to search by file extension, you can use the ends_with operator. For instance:

curl --request GET  --url "$your_labdrive_url/api/file" \
     --header "Content-Type: application/json" \
     --header "authorization: Bearer $your_labdrive_api_key" \
     --data '{
        "conditions": [
            {
                "container_id": 40
            },
            {
                "fullpath": {
                    "operator": "ends_with",
                    "value": ".jpg"
                }
            }
        ],
        "limit": 100,
        "offset": 0
    }'

You can achieve the same result with the like operator. For instance:

curl --request GET  --url "$your_labdrive_url/api/file" \
     --header "Content-Type: application/json" \
     --header "authorization: Bearer $your_labdrive_api_key" \
     --data '{
        "conditions": [
            {
                "container_id": 40
            },
            {
                "fullpath": {
                    "operator": "like",
                    "value": "%.jpg"
                }
            }
        ],
        "limit": 100,
        "offset": 0
    }'

Finding files with a certain string in the file name

If you are looking for files whose full path contains "validator", you could use:

curl --request GET  --url "$your_labdrive_url/api/file" \
     --header "Content-Type: application/json" \
     --header "authorization: Bearer $your_labdrive_api_key" \
     --data '{
        "conditions": [
            {
                "container_id": 40
            },
            {
                "fullpath": {
                    "operator": "like",
                    "value": "%validator%"
                }
            }
        ],
        "limit": 100,
        "offset": 0
    }'

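Because the match is made against the full path, this finds files whose name or any parent folder name contains the string, for example: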

  • /validator/myfile.txt

  • /validator.txt

  • /my-validator-results/myfile.txt
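
To see why all three paths match, the %-only pattern can be approximated locally with shell pattern matching. This is a sketch, not how LABDRIVE evaluates the query: `like_match` is a hypothetical helper, `/reports/summary.txt` is a made-up path added as a counterexample, and case sensitivity on the server may differ.

```shell
# like_match PATTERN VALUE
# Local approximation of a %-only LIKE match: % becomes the shell wildcard *.
# Other shell pattern characters (?, [, *) in PATTERN are not neutralised,
# and matching here is case-sensitive.
like_match() {
  glob=$(printf '%s' "$1" | sed 's/%/*/g')
  case "$2" in
    $glob) return 0 ;;
    *)     return 1 ;;
  esac
}

# Check the paths listed above, plus one made-up path that should not match
for path in /validator/myfile.txt /validator.txt \
            /my-validator-results/myfile.txt /reports/summary.txt; do
  if like_match '%validator%' "$path"; then
    echo "match:    $path"
  else
    echo "no match: $path"
  fi
done
```

The first three paths are reported as matches and the last one is not, since "validator" appears nowhere in it.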

Finding files created after a certain date

If you are looking for the files created after a given date, you can use:

curl --request GET  --url "$your_labdrive_url/api/file" \
       --header "Content-Type: application/json" \
       --header "authorization: Bearer $your_labdrive_api_key" \
       --data '{
          "conditions": [
              {
                  "container_id": 185
              },
              {
                  "date_create": {
                      "operator": "gt",
                      "value": "2021-07-05"
                  }
              },
              {
                  "type": "FILE"
              }
          ],
          "limit": 100,
          "offset": 0
      }'

Finding files created before a certain date

If you are looking for the files created before a given date, you can use:

curl --request GET  --url "$your_labdrive_url/api/file" \
       --header "Content-Type: application/json" \
       --header "authorization: Bearer $your_labdrive_api_key" \
       --data '{
          "conditions": [
              {
                  "container_id": 185
              },
              {
                  "date_create": {
                      "operator": "lt",
                      "value": "2021-07-05"
                  }
              },
              {
                  "type": "FILE"
              }
          ],
          "limit": 100,
          "offset": 0
      }'
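
The two date examples can in principle be combined to select a date range. The sketch below assumes that all conditions must hold at once (as the refinement examples on this page suggest) and that the same property may appear in more than one condition; neither point is confirmed by the source. `created_between_body` is a hypothetical helper and the end date is an arbitrary example value.

```shell
# created_between_body CONTAINER_ID AFTER BEFORE
# Hypothetical helper: prints a body asking for files created after AFTER
# and before BEFORE, assuming conditions are combined with AND and that a
# property may appear in more than one condition.
created_between_body() {
  printf '{"conditions":[{"container_id":%d},' "$1"
  printf '{"date_create":{"operator":"gt","value":"%s"}},' "$2"
  printf '{"date_create":{"operator":"lt","value":"%s"}},' "$3"
  printf '{"type":"FILE"}],"limit":100,"offset":0}\n'
}

# Print (without sending) the body for files created between two dates
created_between_body 185 "2021-07-05" "2021-08-01"
```

The printed body can be passed to curl with --data "$(created_between_body 185 2021-07-05 2021-08-01)", using the same URL and headers as the requests above.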
