Tips for faster uploads

Last updated 3 years ago

LABDRIVE scales to very large volumes for both file uploads and downloads. Downloads scale without limitation and need no particular approach, but if you plan to upload more than 10 TB, some recommended techniques will help you upload content faster.

Three main elements contribute to faster uploads. Combine them all to obtain the best performance:

  • Upload tool in use

  • Parallelization of uploads

  • Upload prefixes and containers

Upload tool in use

You can use any S3-compatible tool to upload content to your Data Containers, but to get the best results, we recommend using the most recent version of the AWS CLI. Unlike other tools, it has been optimized for parallelization.

Some Linux distributions install older versions of the client by default. Make sure you are always using the latest version.

See the AWS CLI with LABDRIVE guide for examples of how to use this tool.

Parallelization of uploads

If you are planning to transfer a large number of relatively small files, you can benefit a lot from running multiple upload processes in parallel.

The AWS CLI tool already has some built-in parallelization, but to achieve better results, you can launch multiple processes in parallel.
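As a rough illustration of launching several upload processes at once, the Python sketch below starts one `aws s3 sync` process per subdirectory of a local folder. The helper names, the local path, the bucket name, and the worker count are all assumptions made for the example, not LABDRIVE settings; files sitting directly in the top-level folder are not covered by this sketch.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_sync_commands(local_root, bucket, prefix=""):
    """Return one `aws s3 sync` command per immediate subdirectory of local_root.

    Each subdirectory is sent to its own key prefix under the bucket.
    """
    commands = []
    for subdir in sorted(p for p in Path(local_root).iterdir() if p.is_dir()):
        key_prefix = "/".join(
            part for part in (prefix.strip("/"), subdir.name) if part
        )
        commands.append(
            ["aws", "s3", "sync", str(subdir), f"s3://{bucket}/{key_prefix}/"]
        )
    return commands


def run_parallel(commands, max_workers=4):
    """Run each command as a separate process, at most max_workers at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))


# Example usage (hypothetical folder and bucket name, replace with your own):
# commands = build_sync_commands("/data/to-upload", "my-labdrive-bucket")
# exit_codes = run_parallel(commands, max_workers=4)
```

A non-zero exit code in the returned list marks a subdirectory whose sync should be run again. The concurrency inside each CLI process can also be tuned through the AWS CLI's `max_concurrent_requests` S3 setting, so the total number of simultaneous requests is roughly that value multiplied by the number of processes.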

Upload prefixes and containers

Amazon S3 supports a request rate of 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket. The resources for this request rate aren't automatically assigned when a prefix is created. Instead, as the request rate for a prefix increases gradually, Amazon S3 automatically scales to handle the increased request rate.

If there is a fast spike in the request rate for objects in a prefix, Amazon S3 might return 503 Slow Down errors while it scales in the background to handle the increased request rate. To avoid these errors, you can:

  • Configure your application to gradually increase the request rate and retry failed requests using an exponential backoff algorithm, and/or

  • Distribute objects and requests across multiple folders or containers, as the limit stated above applies per folder (prefix).
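For custom upload code, retrying with exponential backoff could look roughly like the Python sketch below. It is a generic helper under stated assumptions: `retry_with_backoff`, its parameters, and the default delays are made up for the example, and the caller decides (through `is_retryable`) which errors count as a 503 Slow Down response. The AWS CLI and the AWS SDKs already retry some errors on their own, so this only matters when you write the upload loop yourself.

```python
import random
import time


def retry_with_backoff(operation, is_retryable, max_attempts=6,
                       base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call operation() and retry it on retryable errors.

    Before retry number n (starting at 0) the helper sleeps for a random
    time between 0 and min(max_delay, base_delay * 2**n) seconds, so the
    upper bound on the wait doubles after every failure.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt or on errors that are not retryable.
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps parallel workers from retrying in lockstep.
            sleep(random.uniform(0, delay))


# Example usage (upload_one_file and SlowDownError are hypothetical names):
# retry_with_backoff(
#     lambda: upload_one_file("report.csv"),
#     is_retryable=lambda exc: isinstance(exc, SlowDownError),
# )
```

The random jitter is a design choice rather than a requirement: when many upload processes hit the limit at the same moment, it spreads their retries out so they do not all return at once.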