Changelog and Release Notes

New features, improvements, and fixes for the platforms Flexible Intake, LABDRIVE, LIBSAFE Go and LIBSAFE Advanced.

Version 2024.04.0

New and Improved

  • Functions: an extra function "parameters" has been added to the MD5 Validate function. The code can be obtained from the repository.

  • User Log: Added registration of the user who launches jobs during integrations.

  • Load optimization: Loading has been optimized when browsing through collections. The loading of OpenAccess collections from the OA_COLLECTION metadata has been optimized.

  • Improved search: A new filter "type" has been added to improve the filtering of searches.

  • Permissions: The order of the write content metadata permission has been changed. A permission has been added to edit metadata without having write permission on the container.

  • ArchivesSpace Integration: A new Source Field to Incoming Metadata Crosswalk has been added to the ArchivesSpace integration.

  • Interface: A Category column has been added to the tables of both object and container descriptors.

  • New parameter in Object Metadata Schema: A new property has been created in the Object Metadata Schema so that metadata can be assigned without being visible in the web interface. If the option is selected, the metadata can only be displayed through the API.

  • Improved advanced search: The advanced search has been updated. Dates are now included, and Boolean descriptors display a selector with YES or NO.

  • OA Integration Discussion: New integration to control the transfer connector component from LIBSAFE. The user can now configure the metadata fields as well as the viewers, adapting and customizing the tool to their workstream.

  • Permission management for running integration: With the new permissions, the integration can be accessed correctly and jobs can be launched.

  • Grid mode for containers by Archival Structure: A grid mode has been developed for this view. This mode is useful when the hierarchy is too large.

  • New pylib method to add events to objects: A method has been added to the pylib library to insert events into the PREMIS events table.

Fixed

  • Container template auto check: On a new container, once a template was selected, check-in was disabled even when the template was set to free choice. This has been fixed.

  • Grammar review: Some spelling and grammar changes were made across the platform:

    • The home section menu translation key has been fixed, along with a typing error in the Object metadata/Categories menu.

    • Window error when restoring a file. Objects were deleted and restored. Window appears.

  • Wrong data in the inventory: Container contents were not being updated. Contents are now updated after 24 hours.

  • Bulk Metadata Editor:

    • When using the RECURSIVE button to uncheck all directories, the user had to go page by page. This has been fixed.

    • Fixed error in displaying changes to be made in Bulk Metadata Editor.

  • Problem with tags in function code: HTML inserted into a function's code was stored the first time, but the tags no longer appeared after a refresh. This has been fixed and the changes are now saved.

  • MD5 function:

    • The function has a TRIGGER defined for each algorithm. When MD5 was selected the correct name appeared, but when SHA1 was selected the MD5 name appeared. This has been corrected and the trigger name is now listed correctly for SHA1.

    • The function now works correctly.

  • Problem saving a function trigger: When a function pulled from a repository and had triggers, the repository password was saved as false. This has been checked and the function now compiles correctly.

  • Explore Content screen shrinking: The Explore Content screen shrank over time with repeated use of the Bulk Metadata Editor function. This has been fixed.

  • File/folder count: The Elasticsearch query that counts files and folders has been fixed.

  • Simple search sort selector: In the simple search, the user will be able to sort by the following fields:

    • Date of ingestion (field: date_create)

    • Size (field: size)

    • Filename (field: filename)

    • Sorting can be ascending (A-Z) or descending (Z-A).

  • Error when purging a file: When the file is purged, the job is created correctly, the status is verified as completed and the file disappears.

  • Transfer Connector: Changes and updates to the transfer connector:

    • Parameters

    • Default language separator

    • Language of the metadata

    • Profiles

    • LIBSAFE Integration

    • Fallback Folder

  • Copy/Paste Function: The function was not working but it has been fixed.

  • Data container template: A data container template can now be selected and its changes are saved.

  • XML previewing: XML files are correctly displayed now.

  • Fixing File count bug: This has been solved.

  • OAIS AIP packages configuration: We are improving the platform capabilities related to the upcoming OAIS updated version release. By using the new “OAIS AIP packages configuration” section, organizations are able to add semantics, full Representation Information and accurate Preservation Description Information to their exported AIP packages. Additionally, a function that creates them whenever they are needed and a multi-platform resolving tool are provided.

  • Active Content Analysis Update: Active Content Analysis works correctly every 6 hours.

  • Modified endpoint to obtain files or a single file: When queried through the API, the endpoint now also returns the tags related to the file.

  • Node permissions: The nodes are hidden correctly if the user does not have the required permissions.

  • Problem when launching an integration job: Users with EXECUTION permissions report that when they click on the job button it keeps spinning continuously and does not work. This has been fixed.
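
The simple search sort selector listed above can be pictured with a small sketch. The field names (date_create, size, filename) come from the release note; the record layout and the sort_results helper are hypothetical and are not the platform's implementation.

```python
from operator import itemgetter

# Field names taken from the release note; everything else is illustrative.
SORTABLE_FIELDS = {"date_create", "size", "filename"}


def sort_results(results, field, descending=False):
    """Return a new list of result dicts sorted by one of the allowed fields."""
    if field not in SORTABLE_FIELDS:
        raise ValueError(f"Unsupported sort field: {field}")
    return sorted(results, key=itemgetter(field), reverse=descending)


# Hypothetical search results; ISO dates sort correctly as plain strings.
results = [
    {"filename": "b.tif", "size": 300, "date_create": "2024-02-01"},
    {"filename": "a.pdf", "size": 120, "date_create": "2024-03-15"},
    {"filename": "c.xml", "size": 45, "date_create": "2024-01-20"},
]

by_name = sort_results(results, "filename")                 # ascending (A-Z)
by_size_desc = sort_results(results, "size", descending=True)
```

Restricting the field to a fixed set keeps unexpected sort keys out of the query, which is one plausible reason to expose the sort as a selector rather than free text.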

Version 2023.11.0

New and Improved

  • Advanced search validation: the interface has been optimised so if a search is missing elements to be performed, the user gets a validation error warning message.

  • Code visibility button: an option has been implemented in the Containers by Archive Structure interface to hide the nodes and sub-nodes code. This functionality does not affect the order of the structure, which remains alphabetical by the given codes.

  • Function replication validation: a field has been added to validate the maximum and minimum replications for a given function.

  • JP2 platform preview: this format is now available through the previewer tool.

  • Maximum characters warning: when inputting long names in the archival structure code, the system would prompt a generic 500 error. This has been updated giving more context on the problem to the user.

  • Notifications: a new feature using an endpoint to send notifications to users has been made available. This can be set up through functions.

  • Session duration: the login time has been increased to four hours.

  • Unstructured files: the "structure" filter has been removed from the faceted search as it's no longer in use.

Fixed

  • Advanced search drop-down: selecting a metadata field from a list did not work for all the searches. This has been fixed.

  • Advanced search error: when using file or folder ID as parameters, the platform was returning an error. This has been fixed.

  • Bulk metadata malfunction: the function editor was only working for the first ten actions. This has been fixed.

  • Date value mismatch: this metadata value would display the selected date minus one day, following the browser date instead of UTC standards. This has been solved.

  • Download reports error: fixed the report notification link sent by email as it was not downloading content.

  • Check-in container issue: error collecting ownership properties is now solved.

  • Elasticsearch endpoint: an endpoint regarding file size was returning inaccurate information in the reports. This has been adjusted.

  • Grammar review: some spelling and grammar changes were made across the platform.

  • Integration accessibility: the permissions did not match the assigned groups, allowing all users to run jobs. This has been controlled through the different platform options.

  • Log sorting issues: jobs with the same timestamp were being sorted incorrectly. This has been fixed so the system always uses the ID.

  • Missing events: some weblogs were not being recorded by the platform. All insertions are now visible.

  • Move/Copy function list: the drop-down was limited and neither displayed all the available folders nor ordered them by ascending size. The file ID search has also been fixed and is now available.

  • Nodes overload: the number of calls made to load the node structure has been significantly reduced, allowing for faster navigation on the archival structure panel.

  • Parent nodes conflict: when editing nodes, it was possible to set two nodes as each other's parents, which caused a display conflict. A limitation has been included to prevent the issue from happening.

  • RegEx value: functions with a conditional RegEx value were not displayed when such RegEx was selected. This has been fixed.

  • Renaming issue: correction in wrong renaming of objects using the Object Explorer.

  • Secret parameter: this parameter, available in functions, was not sending the information as expected. It has now been fixed.

  • Stuck scroll option: the vertical scroll in node/container permissions that was affecting some browsers has been fixed. Users can now scroll all the way down after selecting any users or groups.

  • Submission areas bugs: the user was unable to select a data container template for the submissions template. The submission would also not redirect properly to a container creation window when there was a conflict on forced templates. Additionally, submission IDs and external links were being cross-referenced and redirected to the wrong archival structure. All issues have been fixed.

  • TAG datatype: when this datatype was being used in functions, it sent the wrong information to the interface. This has been fixed.

  • Unicode Characters: The system is now able to save functions where UNICODE characters are present.

  • Web page file preview: opening .htm and .html files with the Previewer option now works.
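
The parent nodes conflict fixed in this release is a cycle in the node hierarchy. As a minimal sketch of the kind of check that prevents it, the function below walks up from the proposed parent and rejects the change if it reaches the node being edited. The parents mapping and the would_create_cycle name are assumptions made for illustration; the platform's actual validation is not documented here.

```python
def would_create_cycle(parents, node, new_parent):
    """Return True if making new_parent the parent of node would create a loop.

    parents maps each node to its current parent (None for root nodes).
    """
    seen = set()
    current = new_parent
    while current is not None:
        if current == node:
            return True          # node is already an ancestor of new_parent
        if current in seen:
            return True          # existing data already contains a loop
        seen.add(current)
        current = parents.get(current)
    return False


# Hypothetical structure: A is the root, B is under A, C is under B.
parents = {"A": None, "B": "A", "C": "B"}

blocked = would_create_cycle(parents, "A", "C")   # A under C would loop
allowed = would_create_cycle(parents, "C", "A")   # moving C under A is fine
```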

Version 2023.08.0

New and Improved

  • Advanced search: a new search option has been added to the platform, allowing the user to look for items with a more advanced approach. Queries can be saved and shared with other users to speed up results. This new advanced search also offers the user the possibility of downloading the items from the search interface, as well as opening them on the stored container.

  • API search: this feature has been improved on the conditional parameters. They have been modified to allow more flexibility in term searches.

  • Creator user name: the container details were only listing the creator user ID. The table now shows the name of the user at the time of the container creation.

  • Customisable logos: the interface template has been modified to include custom content on both the login page and the platform's footer and headers. This change has to be requested and implemented by the LIBNOVA team for now.

  • Functions default order: the list has been sorted so the most recently run functions appear first.

  • Health Index information: this has been added to the PRONOM information in the item's details. Now its health can be easily identified through the colour-coded bar next to the format information.

  • Link metadata fields: these types of fields have been improved, allowing the user to follow them to the linked destination. A new action button will open the link in a new window (either on the Web or somewhere in the platform). Additionally, a magnifier glass button allows the user to search within the platform and add a link to a file, folder, or container within the platform.

  • Move/Copy function improvement: now the container ID shows in the drop-down menu, and it’s searchable by the user. This improves the use of the function, making it easier and faster to target the destination container.

  • New Containers Summary Report: this platform-level report returns the details of all active containers in the platform, so information, like assigned workflow and object metadata schema, can be seen without accessing the containers individually.

  • Nomenclature review: the platform is being reviewed, and some of the terms and descriptions are being reworded for clarity. Names and titles are also being standardised and adapted to widely used industry terms.

  • OpenAccess integration: new integration to control the transfer connector component from LIBSAFE. The user can now configure the metadata fields as well as the viewers, adapting and customising the tool to their workstream. New metadata types and viewers have been added to this integration to enhance the transfer of files.

  • Paste metadata: a new option has been added to the object metadata edition. Instead of importing the fields through a function, the data can be pasted for each individual item just by copy-pasting it from a spreadsheet.

  • PRONOM reports: the health index has been recalculated and applied to new formats after the latest PRONOM update.

  • Report organisation: the list of reports, both at the container and platform level, has been sorted alphabetically for easier access.

  • Tag drop-down: a new parameter for tags has been added to the platform. When this tool is used in function configuration, the system will create a drop-down of the available tags instead of requiring a text input.

  • Thumbnail generation: the maximum size of the file for allowing thumbnail autogeneration has been set to 200 MB. This aims to smoothen navigation in large containers.

  • TXT viewer: the previewer component has been improved so these files can be opened with no issues.

Fixed

  • Copy/Move function: warning messages have been added when the origin of a file/folder is the same as the destination, not allowing this action altogether. This avoids redundancy and unnecessary versioning.

  • Date metadata bug: the delete button didn't show up when more than one value was input. This was affecting the date-type metadata fields.

  • HTML files: the previewer was not able to load these files correctly. This has been fixed.

  • Hover text adjustment: when hovering on the home screen charts, some of the texts would appear off-screen. This has been adjusted for all resolutions.

  • Folder summary visibility: this information would disappear once the metadata was edited and saved, having to reload the folder to obtain it again on the right-hand side column. This issue did not affect files.

  • Forced template issue: there was a logical issue that was preventing the user from selecting schemas freely on the Data container templates, as the options were stuck on a forced template. This issue was also requesting a template when generating new containers, even though it's not needed. This has been fixed.

  • PREMIS event storage: One metadata-related PREMIS event was not being stored properly. The issue has been addressed and fixed.

  • Scheduled reports access: this functionality expired shortly after a report was launched and the notification was sent. The access has been increased to 31 days after email notification.

  • Security Audit Report: this very extensive report was having problems showing all the entries and loading the pages. The results have been limited to 1000 at a time, fixing the page overload when going back and forth.

  • Two-Factor Authentication bug: the input text was the same colour as the background, so it would appear invisible. This has been changed so now there is a contrast in the text.

Version 2023.05.0

New and Improved

  • PRONOM: the PRONOM list has been updated to the latest version (v111).

  • Skipped logs: unnecessary logs have been omitted from the web interface to allow smoother and faster navigation.

  • Terms normalisation: vocabulary regarding metadata has been reviewed and changed across the platform to standardise them.

  • User Interface improvements: several visual changes have been implemented to improve the user experience. These include visualisation design, colours, removed autocomplete settings, "scroll to top" options, and navigation bars.

Fixed

  • Bulk edition of boolean values: the bulk metadata editor function was not detecting the boolean values for a given field. This has been fixed so now the options are pre-loaded upon selection.

  • Change storage class access: browser settings were preventing the user from reading a file after changing its storage class from cold to hot. This has been fixed so the user can access the file.

  • Copy/Move function permissions review: this function has been updated so the user cannot send any files or folders to containers where permissions are controlled.

  • Empty content file uploads: the API was unable to upload a file without content. This has been fixed.

  • Extension change on download: files with CSV, JSON, XML, and XLSX extensions would transform upon download. This has been fixed so they remain in their original format.

  • "Forgot password?" bug: the text for the password recovery in the login screen was displayed in the same colour as the background, rendering it "invisible". The page has been edited so now the user can see the characters input in the text box.

  • Unrecognised extensions preview: the object explorer was unable to display HTML, HTM, TGA, and RAW formats within the platform. This has been fixed.

  • Javascript update: the library has been updated to the latest version.

  • LIBSAFE transfer connector: there was an issue affecting the bulk action of this transfer connector. This has been fixed.

  • Upload multipart: the system can now comfortably support bigger upload requests through the browser. However, this upload tool is still not recommended for the main bulk ingestions of the repository.

Version 2023.03.0

New and Improved

  • Container details: a new information box has been added to the Description and Overview section of a container, detailing the creator ID, date created, workflow, items metadata schema, and description.

  • Database search: there were inconsistencies between the search engine and the database, which would leave a container in a loop looking for un-hashed files. This has been updated.

  • Disabled report preview message: when a report is too big to be displayed from the web interface, a message will pop up informing the user why the screen is not loading and to download the report instead.

  • Embedded metadata extraction: a new tool has been implemented to obtain the technical metadata during file ingestion. The ExifTool is now combined with Apache Tika for this metadata extraction and indexing.

  • Extension search performance: the search engine has been optimised regarding file formats search, adding more accuracy to the queries.

  • New LOG datatype: a new metadata type has been created to store log information. This metadata field can only be overwritten from the API.

  • New local Object Explorer: the object explorer has been optimised for a more secure and stable experience. Some of the new features include allowing uploads of over 2GB through the browser, optimisation of file deletion, and rewiring all browser actions to core functions (like copy/move) for better performance.

  • Password reset: the login interface has been updated to allow password recovery. Please note that this option is only available for local users and not those accessing through single sign-on.

  • S3 protocol warning messages: potential data loss messages have been added to the platform, warning of the use of S3 protocols using tools outside the platform.

  • Recursive directory information: an information box has been added to any folder's properties showing a recursive summary of its contents (size, folders, and files within the directory).

  • Title display improvement: node names have been adapted so that if they are too long to be displayed completely, they are fully readable through a tooltip.

  • Unstable uploads warning: disclaimer information has been added to the upload options, warning the users of the different issues they might encounter when the browser is used for larger uploads.

  • Virus security messages: when a user attempts to open or download a file that has been flagged as infected, a warning message will display on the interface. This forces the user to confirm a possible infected download before the action is launched.

  • Visible descriptors: a new feature has been added to the Submission Areas which allows leaving metadata descriptors hidden from the donors and using them internally only.
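
The recursive directory information described above (size, folders, and files within a directory) amounts to a recursive walk over the folder tree. The sketch below shows the idea on an in-memory structure, where a folder is a dict and a file is its size in bytes. That representation and the summarize helper are hypothetical and only illustrate the calculation.

```python
def summarize(folder):
    """Return (total_size, folder_count, file_count) for everything under folder."""
    size = folders = files = 0
    for entry in folder.values():
        if isinstance(entry, dict):
            # Sub-folder: count it, then add the totals of its own contents.
            sub_size, sub_folders, sub_files = summarize(entry)
            size += sub_size
            folders += 1 + sub_folders
            files += sub_files
        else:
            # File: entry is its size in bytes.
            size += entry
            files += 1
    return size, folders, files


# Hypothetical folder: one file at the top level and two nested sub-folders.
tree = {
    "a.txt": 10,
    "sub": {
        "b.txt": 5,
        "deep": {"c.txt": 1},
    },
}

summary = summarize(tree)
```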

Fixed

  • Boolean parameters: their behaviour was not as expected as boolean parameters were returning only one condition regardless of the input. This has been fixed.

  • Controlled Copy/Move options: this function is not available if no files or folders are selected, preventing moving all the objects in a container by mistake.

  • Data representation accuracy: now the charts across the platform don't display any coloured sections if the unit they're representing does not contain any data.

  • File restoration: an error was allowing users to mark files in cold storage as "restorable" when they cannot actually be recovered. This has been fixed, as only deleted files in standard storage can be restored.

  • Folder-rename function result: the link provided when running this function was showing a 404 error message instead of redirecting to the function summary. This has been fixed.

  • Endpoint search: core functions could not be filtered by conditions. The endpoint has been fixed to allow this kind of search.

  • Page title display: the tab was not displaying the label of the page but the full URL instead. This has been fixed.

  • Purged files log: permanently deleting files from a container resulted in a logline that wrongly indicated that the file was not marked for deletion. This has been corrected.

  • Special characters display: some characters were not being formatted appropriately in reports. This has been fixed so the text shows cohesively.

  • User database: there were some inconsistencies in the user configuration panel (like showing more or fewer users than the filter would indicate). This web interface issue has been fixed.

  • User edition: when a pre-created user was edited, it was not possible to remove them from all the groups (at least one had to remain). That has been fixed.

Version 2023.01.1

Fixed

  • Report data export: data was not extracted and output when the report was exported to CSV, producing an empty document just with the headers. This has been fixed for the reports of Used Space by Archival Structure, Storage Use per Archival Node, and Storage Use per Container.

Version 2023.01.0

New and Improved

  • ArchivesSpace integration: new integration that allows synchronisation between an ArchivesSpace instance and LIBSAFE through a set of rules and custom configuration.

  • OpenAccess integration: the connection between the preservation and the discovery area OpenAccess can now be custom-configured from an integration available in the platform. This allows more flexibility when configuring the transfer connector metadata, profiles, and more.

  • New OpenAccess report: this report shows all the files published in OpenAccess through the Transfer Connector at platform level, as well as the link to the record.

  • New Files with Viruses report: a list of infected objects can now be retrieved from a report that analyses the platform globally.

  • Configuration variables: a flexible configuration has been added to adapt the navigation within the web interface to custom preferences (e.g. navigate back to Archival Structure instead of Containers by Workflow by default).

  • Metadata data type call: new API method that returns the data type for metadata fields.

  • Security headers: updated AWS security headers in the platform responses to keep downloads from the cloud secure.

  • Thumbnail display: the Object Explorer in submission areas and shared containers is now able to display thumbnails for several formats (e.g. tiff, ogg, webm).

  • User experience improvement: when creating or editing metadata fields, new status messages (like "loading") have been added to inform the user of internal processes that might be running while interacting with the platform.

Fixed

  • Health Index report: an issue was preventing the health index from showing the traffic light colours associated with the format's status, appearing in greyscale instead. This has been fixed.

  • Function description field: the user was not able to leave this field empty when they were editing a function. This has been changed so it's not a mandatory value anymore.

  • ID column display: the container ID value was set as a digit and not returned properly in some web interface tables. This has been fixed by setting it to a string, so it's always visible and accurate.

  • Error on container status: the system would get stuck when trying to display the container status in Archival or Workflow views, and was unable to calculate it. This has been fixed.

Version 2022.12.0

New and Improved

  • New Details Container Inventory report: a new report at the platform level has been made available to monitor the status of all the containers ever created in the platform, including those which have been permanently deleted.

  • Export embedded metadata: a new download button has been added to the file embedded metadata view so it can be exported.

  • External users containers: improvements have been applied to the submission area and shared containers, restricting and hiding elements the external users could not interact with (such as zip extraction).

  • Submission areas template: the templates have been modified so variables (like the donor's name) are included in the confirmation email.

Fixed

  • Directory hierarchy issue: when many nested folders were uploaded into the platform, the system was not keeping the imported structure and placing them all at the root level. This has been fixed and a structure is ingested respecting the imported hierarchy.

  • Empty value list: the system was allowing the creation of value list metadata fields with no values, which was creating further issues when trying to access the object schema. This has been fixed and now the platform prevents the user from creating an empty value list.

  • Hidden values: long value lists were not accessible from the object metadata panel. The user can now scroll down and navigate through all the available values.

  • User filter: the filter for groups did not load properly on the users' panel. This has been fixed.

Version 2022.11.0

New and Improved

  • Source code on function creation: a new option has been implemented in the function section that allows the user to select a code from the GIT repository as a source. This prevents the code from getting obsolete when there is any change, although a manual redeployment is still required.

  • ID complementary information: a column with the ID number of the respective element has been added to the interface to ease their identification when using methods like the API. This affects elements like users, groups, and metadata fields (both container and descriptive).

  • Recovery tasks stopper: a recovery task can now be manually stopped when recovering a file or a version (as long as the task has not yet been completed).

Fixed

  • Content analysis duplicates: a variable was preventing the primary content analysis from showing the duplicates on the container details page. This has been fixed.

  • Content template: this option was not working properly and the content from the selected container was not being duplicated on the destination container. This has been fixed for both platform and submission area containers.

  • Purge files: a fix has been implemented so the root folder in a container cannot be deleted by mistake. It can only be deleted if its container is deleted as a whole.

  • Search by full path: the system was automatically applying IDs to deleted files as names, making it difficult for the user to identify the deleted file. The full archival path and name can now be used to find a deleted item in the Storage and Integrity tab.

  • Shared elements message: the task confirmation message was not showing up when sharing an element through the container function. This has been fixed.

  • Shared folders link: the anonymous link to a shared folder was not appearing on the folder's properties panel. This has been fixed.

Version 2022.10.0

New and Improved

  • New user profile endpoint: this new GET method allows the user to obtain information about their profile through the API.

  • New report endpoint: using this method on the API provides all the reports available on the platform.

  • New storage endpoint: this API method can be used to know, through a true/false boolean response, if a file currently exists within the storage.

  • New sorting parameter: this endpoint allows sorting from API/file/elastic.

  • PREMIS events log: the event now records the full file path, including the container. The element type has also been changed to small caps for consistency in presentation.

  • Metadata copy/move action: the file metadata is forced to be migrated to the destination file when a move/copy action is performed on containers with identical schemas. This prevents metadata loss.

  • Sort by ID: any lists and search endpoints are sorted by ID by default. This feature is intended for bringing consistency across the platform.

Fixed

  • Job queuing: if a function fails, it is not sent to the function queue and a job is not created. Before the fix, a job would be created after a failed function, being stuck in 'pending' status until manual intervention removed it.

  • Special character search: the use of special characters in the function search bar was causing issues and not generating any results. This has been fixed and the search engine is able to produce an output despite special characters.

  • Node deletion: an error was not allowing the user to erase a node whose containers were permanently deleted. This has now been fixed.

  • Decreased animation time: the animation for the used space graphics was displaying slowly. This has been optimized.

  • Elasticsearch query fix: the graphs and charts created by the use of this tool were not accurate. This has been fixed.

  • Recalculate values fix: if a container was left empty, the platform was unable to return null values regarding storage, and would be stuck in a recalculating loop. This has been fixed and it is now able to deliver values of zero for empty containers.

  • Total size adjustment: deleted files were taken into account when calculating the total size of containers. This has been adjusted to current files only.

  • Summary folder: the charts in the folder property were reflecting both files and directories. This has been fixed so only the files contained in the directory are accurately represented in the chart.

  • Function names: the label applied to the functions has been modified so it is displayed as it was input at creation.

  • Function controller message: when functions are created or edited and this causes an error, a more comprehensive message identifying the source of the issue is output.

Version 2022.09.0

New and Improved

  • Submission areas improvement: anonymous users can now upload bigger files in a submission container without being limited by the browser (previously restricted to 2GB per file). However, bandwidth and connection quality will still affect the speed of the upload.

  • Public container: it is now possible to share a container as a unit from the Shared tab through a unique link created by the system. This link can be regenerated and the sharing can be stopped on demand.

  • File deletion: the user can now perform a permanent deletion on the selected files without LIBNOVA's support. A PREMIS event will be recorded for each deleted file, registering information such as the timestamp, event type, and user.

  • Container deletion: the user can now perform a permanent deletion on a container without LIBNOVA's support. A PREMIS event will be recorded for each deleted container, registering information such as the timestamp, event type, and user.

  • PREMIS events visualizations: two charts have been added to the Events tab (top 5 by type and creation time timeline) for a better first-glance analysis of the PREMIS events taking place in the container.

  • New recorded PREMIS events: additional information has been added to the PREMIS log in the file details for new events. These are ingestion, versioning, renaming, and deletion.

  • Navigation improvement in massive selection: when selecting many files at once, the system would take a long time to process each file as individual requests. This update cancels the previous "summary" requests until the user finishes selecting all the files and processes them as one single request.

  • Object metadata schema information: the type of descriptive metadata applied to the objects in a container could not be identified anywhere after a container was created. This information has now been added to the Edit Container option in the Details tab, remaining as an unmodifiable setting.

  • JSON health check: when a JSON is uploaded for file metadata ingestion, the system performs a syntax check to verify the structure is correct.

  • API Key instructions: a message has been added to the API key generation in the user creation panel, which informs of the need to generate one to both connect to the platform via API and use any custom functions through the web interface.

  • Data loss warning: a message has been added to the move/copy function panel to warn the user of possible data loss if an object is moved/copied to a container with a different descriptive metadata schema, as any populated metadata fields will not be copied with the object.

  • Hash information improvement: the hashes box in the file properties has been rearranged so the type of hash is easier to identify.
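
As an illustration of the JSON health check described above, a syntax check of this kind can be sketched with Python's standard `json` module. This is not the platform's implementation; the function name and the shape of the returned message are assumptions.

```python
import json


def check_json_syntax(text: str) -> tuple[bool, str]:
    """Return (True, "ok") if the text parses as JSON, else (False, reason)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno, colno and msg are attributes of json.JSONDecodeError.
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return True, "ok"
```

A check like this only verifies that the structure is well-formed; it does not validate the values against a metadata schema.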

Fixed

  • Change Class Storage function: this function was not recognizing the files inside directories when applying a change of storage class to a folder or group of folders. This has been fixed.

  • Identifier malfunction (LABDRIVE): the processor was having issues completing a correct file characterization. This has been solved for medium to large archives.

Version 2022.08.1

New and Improved

  • PRONOM update: the file formats and their health index have been updated according to the latest PRONOM standards (v107).

  • Database characters: it is now possible to input Unicode characters such as emojis in metadata fields, as well as retrieve this information via API.

  • User interface User Experience (UX) improvements: the platform interface elements have been improved to provide a better user experience. The LIBNOVA UX team has been working with several platform users to improve the interface's visual style and design elements in order to make the interface cleaner, more pleasant, and easier to use. This change includes a package of 32 UX improvements.

Fixed

  • Sort-by in reports: some of the sort-by options in reports were showing in fields where the information could not be sorted, causing the report to go blank if used. This has been fixed for the problematic columns.

  • Permissions in archived nodes and containers: the system would still keep the permissions assigned to groups and users as current even if these structures were archived (or deleted). This prevented the users from deleting groups, and it's now fixed.

Version 2022.08.0

New and Improved

  • New Publicly Shared Containers report: this report returns a list of all the containers whose content has been shared in its entirety. It can be found under the Data Analytics/Reports menu.

  • File properties improvements: more details have been added to the properties panel for a file, such as creation and update dates, size, and format among others. Details on virus detection have also been added to the detail panel.

  • Hard-delete of file versioning: an option has been added under the Versions Information property tab to permanently delete any previous versions of an ingested file.

  • Metadata Extraction event: this event is generated every time a Tika tool for metadata extraction is used, adding the information to the file properties following the PREMIS guidelines.

  • Format Identification event: this event is triggered every time a file is characterized, creating a detailed report according to PREMIS guidelines.

  • Conditional fields in Function Parameters: this feature allows presenting options that are related to one particular condition when building custom functions.

  • Two-factor authentication: a TOTP option has been added to the user's profile, increasing login security. A new column has been added to the Data Analytics' User report to track this feature per user.

  • User Account Creation warnings: further information and instructions have been added for when a user tries to log in but their ID does not match any account created on the platform.

  • Grammatical changes: information boxes in the metadata reports have been reworded for clarity.

  • Style changes: navigator and tab components have been added to the interface to allow smoother navigation.
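
For readers unfamiliar with the TOTP option mentioned above, the sketch below shows how a time-based one-time password is derived following RFC 6238. The release note does not state which parameters the platform uses, so the HMAC-SHA1 algorithm, 30-second step, and 6-digit length here are assumptions (they are the RFC defaults).

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, timestamp: float, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (RFC 6238) for a shared secret at a given Unix time."""
    # Number of time steps elapsed since the Unix epoch.
    counter = int(timestamp) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte is the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

The server and the authenticator app compute the same code from the shared secret and the current time, so a stolen password alone is not enough to log in.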

Fixed

  • Bulk Metadata Edit correction: the recursive bulk edition preview was not showing when a folder was selected. Information boxes have also been reworded for clarity.

  • Collapsible nodes: this option was not available straight away when containers were explored by archive structure. It has now been fixed.

  • Diagnosis of file indexing: new tools have been implemented to detect and minimize the number of files that are partially indexed or not indexed correctly.

  • Space Use calculation: the data showing at container level on space use has been fixed for more accuracy.

Version 2022.07.1

New and Improved

  • Storage in Use: a new tab has been added to the Data Analytics section which includes a breakdown of the storage used over time by class and type through filterable timelines and charts. License usage against storage type has also been added to this new content analysis.

  • Container details dashboard: more data analysis elements and visualizations have been added to the container data tab, giving an overview of the type of storage used (hot and cold) and the versioning stored (deleted, current and old versions).

  • PREMIS Event tracker: each file holds a summary of the preservation and provenance events (PREMIS) in compliance with the highest conformance level. This new detailed list of events can be found in the properties of a file, on the Events tab.

  • Event logging improvement: information on hash generation, change of descriptive metadata, display of objects, and processing (per file) has been added to the container logs to improve the monitoring of events (previously named Action Log).

  • Sort by date: a sort by date filter has been enabled in the Events container tab to make it easier to find actions performed on the platform.

  • Search on container views: a search box has been added to the Containers by Workflow and Containers by Archive Structure views to make it easier to locate containers by string.

  • File Formats at Risk report improvement (container level): the report now shows only the items whose format is at risk, instead of all the formats identified by the platform. Thus, no null values will be displayed in the Total files column.

  • Used Space by Containers report improvement: more details have been added to this report at platform level, specifying the size and count of files by availability, versioning, and soft-deleted ones; as well as detailing the storage class (hot or cold) per type of item.

  • Dissemination event: a new log has been added to the container's events section when a file is downloaded using the file browser.

  • Virus event: details on virus scanning have been added to the container's event log.

  • Metadata import update: metadata files that were not formatted correctly were not showing in the import options per file. All files with JSON, XML, and CSV extensions are now considered fit for metadata upload regardless of the format.

  • Container permanent delete: a feature has been added to the platform and the API so the user - with the right permissions - can permanently delete a container, removing it as well from the S3 storage. Before this, containers could only be removed by soft-deleting them from the main view.

Fixed

  • Format Discrepancies report: this report has been modified to show only the expected formats for the files that show discrepancies. The unknown and uncharacterized files have been removed from the report for a more accurate analysis.

  • Container restoration: users with no admin permissions were not able to restore a deleted container even if they were allowed to do so. Now any users with 'delete container' permissions can restore a soft-deleted container.

  • API upload bug: when the command "...\file" was used to upload files through the API, it would lead to an incorrect logging or security problem message. This has been amended. Reported by Jacek (DESY).

Version 2022.07.0

New and Improved

  • New Data Analytics report - Container Metadata: this report features a list of all the containers in the platform and the metadata information assigned at container level.

  • New container report - Item (File/Folders) Metadata: new report at container level. It lists all the items in the selected container, as well as a collapsible section showing information on the populated metadata fields per item.

  • Functions' classes: the selectable parameter type (general and normalization) and automatic parameter owner (defined at the time of creation) have been added when creating custom functions. This feature allows the grouping and better organization of the functions available in the platform.

  • Permissions’ endpoint: added a parameter to the API endpoint to identify users’ IDs and return an array of allowed users. More information here.

  • Sandbox improvement: all test instances have been labeled with a bright red information box to inform the user they are not using a production environment.

Fixed

  • Calendar in date fields: when using the provided calendar to select a date range for filtering events, the dates would get stuck and would reflect the last selected instead of a different start and end. The selection option has been fixed.

  • Multiselect error: when using a function and selecting multiple directories, the regex would only understand one. This has been fixed so multiple options are now processed.

  • 'Add' button in permissions hidden: even though the option would not work if the user was not authorized to make changes, the 'add' button has been hidden from the interface when a user not allowed to modify permissions in a container is on the 'permissions' tab.

  • User API key conflict: when a user does not have an API key assigned, they cannot run functions. Instead of failing, the function would remain on 'waiting' status indefinitely. This has been changed to a failed status and an error message pointing at the API key absence.

  • Message clarity: the warning information in the Move/Copy function has been reworded for clarity.

  • Language label correction: the language label for the over-quota permissions option has been corrected.

Version 2022.06.0

New and Improved

  • Sort by ID: changes have been implemented so the elastic endpoint sorts items by ID.

  • Storage Class confirmation message: a confirmation message has been added when the Change Storage Class panel is closed.

  • Vulnerability package update: implementation of the latest versions of jQuery UI (v1.13) and Lodash (v4.17.21), as well as the addition of response headers (STS) to increase the protection and security of the platform.

Fixed

  • PRONOM Search: it was not possible to filter file formats by name, as the platform would return no results. This has been fixed. Reported by Ross Spencer (Ravensburger AG).

  • Report rearrangement: displayed available reports in alphabetical order, as well as standardized nomenclature.

  • Reworded pop-up text: the information and warning messages for the Folder Rename function have been reworded for clarity.

  • Inactive users in Users report: these were shown as active even if the user was deactivated. The report has been fixed, showing the correct marker (grey icon) when the user is inactive.

  • API elastic search: the 'must_not' parameter had no effect when using API elastic search. This has been fixed.

  • Container path verification: when using the API, some files would not reach the destination container and were uploaded outside the path, resulting in both data loss and residual storage. Now the path is verified against the destination container so no files are uploaded outside that path.

  • SMTP server port settings: there was a problem defining the port, so emails were not being sent when the alert was triggered. This has been fixed.
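
The container path verification fix above can be illustrated with a minimal sketch of a containment check. The function name and the example paths are hypothetical; the platform's actual check is not described in these notes.

```python
import posixpath


def is_inside_container(container_root: str, requested_path: str) -> bool:
    """Return True only if requested_path resolves inside container_root."""
    root = posixpath.normpath(container_root)
    # join() discards root when requested_path is absolute, and normpath()
    # collapses ".." segments, so escapes show up as a different prefix.
    target = posixpath.normpath(posixpath.join(root, requested_path))
    return target == root or target.startswith(root + "/")
```

Rejecting an upload when this check fails prevents files from landing outside the destination container, which is the situation that previously caused residual storage.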

Version 2022.05.0

New and Improved

  • New report - Storage Policy per Container: a new report has been included in the platform giving an insight into all the containers and their specific storage solution, bucket, and class among other data.

  • Container Quota information: changes have been implemented to indicate if a container has an active quota limit, as well as allow getting that information through the API. Additionally, the report Containers over Quota has also been improved for clearer readability.

  • Containers Over Quota Policy: new section in the configuration panel that controls permissions on containers over the specified quota. This functionality, when active, controls which users or groups can interact with a container once it has reached the quota.

  • Report - Used Space by Container: redesigned the report to show more meaningful fields (like if the container has been soft deleted or not) and removed some redundant ones.

  • Item information fields: added more information to the file's properties panel such as the type of file, the date updated, and viruses (if any).

  • Container search flexibility: now it is possible to find containers by a specific metadata descriptor.

  • Hide versioning: sections related to file versioning are hidden when the type of storage does not allow this information to display.

  • Loading indicators: the interface has been adapted to show loading messages or indicators when a change of storage class is being handled in the background.

  • Performance improvement: the endpoint has been changed so container search is more efficient.

  • File upload controls: controls have been added to better handle errors produced by file uploads.

Fixed

  • Function's log visualization: blank spaces in value fields were being displayed as dashes instead of the actual spaces. The logs now show the proper value between quotation marks as it was input by the user.

  • Duplicated numeric key: when some metadata fields were rearranged, the numeric key would get stuck (i.e. the sequence would read 1,2,3,4,4,6, etc.). This has been fixed so each field displays the accurate numeric order in the displayed list.

  • Search by IECode: correction on the search of containers using the IECode.

  • Undefined variables: fix for the use of undefined variables in file searches.

  • Container details refresh: the container metadata section was not being updated when the details tab was selected, not showing the most recent change. This has been fixed.

  • Upload multipart to S3: fixed a bug that occurred when multipart files were being uploaded to S3.

  • Background reports failing: if a report failed, its label would remain as 'running'. This made the system keep running the same report in each execution, which would fail every time. Issue corrected.

  • Broken links: links to the API documentation were broken, redirecting to 404 error pages. This has been fixed to the right URL addresses. Reported by Ross Spencer (Ravensburger AG).

  • Submission Area link field: when a link-type metadata field was selected for the submission area container template, the box would not display - not allowing the user to input any value. This has been fixed. Reported by Ross Spencer (Ravensburger AG).

  • Ghost deletion of files: the platform was tagging some files as deleted (while they were not actually deleted). The Integrity Audit process detected the problem in the instance, so it caused no data loss, but the information given by the platform was not accurate. The issue has been identified and fixed. Reported by the University of Calgary.

  • General disruption of services: due to a certificate issue, the objects were not being indexed and functions were not available when accessing the platform. The connection has been fixed and stability improvements have been implemented.

Version 2022.04.0

New and Improved

  • New Performance Diagnostics section: the platform has now, under the Data Analytics tab, a section to visually represent the processes and performance of the different functions and events taking place. The Home tab has also been redesigned and the platform processing and performance chart updated.

  • File Format Health Index: a 0 to 10 scoring has been added to every single format recorded by the platform. This index is based on PRONOM standards and has been implemented in new and pre-existing reports which measure and list file formats.

  • Permissions' interface view: the different options in the permission list were not being displayed correctly, and the view refreshed whenever an option was selected. This has now changed to a static and more pleasant view when interacting with groups and users' permissions, applying the changes through a pop-up window. Reported by Steven NG from TEMASEK.

  • Submission Areas: the description input in the metadata field (in the container metadata schema) is now displayed when such a schema is used in a submission area. This allows the submission panel to give more information to the donor and aid them when filling in the provided descriptors.

  • Metadata Download: this option is now available (per file) in formats JSON and CSV, as well as the XML option which was available before.

  • File metadata formats: when a user was importing metadata into a file, it would not specify the accepted formats for the metadata file. It now displays the message "File path: Only files with .csv, .json or .xml files shown here."

  • Link metadata field: this type of metadata value has been redesigned. A menu has been added to show the options for the different types of links supported, which are to a container, to another folder or file, or to an external page.

  • Archival Structure search: the platform now allows the search of archival structures for both name and code separately.

  • Executed functions filter: a date filter has been added to the functions tab in a container view to show a selected interval. It displays the last week's results by default.

  • Performance improvement: the code has been changed and improved so the container list (either by Workflow, by Archival Structure, or by Checked-in views) loads faster.

  • Internal logs dated: a date field has been added to the internal logs produced when a function is run. Now each iteration is tracked more accurately and errors can be targeted and identified easily.

Fixed

  • Functions' permissions: any user was able to launch the bulk metadata function and apply changes, regardless of permissions. This function is now disabled if the user does not have the right permissions for metadata edition.

  • Container's permission policy corruption: users lacking access to restricted nodes and subnodes were able to move containers to such archival structures. Furthermore, that limited user was able to change permissions that were inherited from the structure above. Now, individual users' permissions are taken into consideration when manipulating any archival structure or container with restricted access.

  • API functions permissions restriction: it was possible for a user without permission to use or view functions to retrieve the source code of such functions. This has now been fixed and made available according to the appropriate permissions the user holds.

  • Adjustment in the hashes reports: some changes have been implemented so large-width columns can be handled and visualized more easily.

  • Container deletion fix: when a container was deleted, the interface would refresh the view to "Containers by Workflow". The interface now redirects to the Home tab.

  • Container ID report fix: some reports weren't taking the container ID into consideration when they were launched. This has now been fixed.

  • Metadata import error: the system would automatically show a 500 error if a metadata file was not selected. It now shows a "Please select a file" message instead.

  • Regex trigger response: the on-demand functions were not responding to specific triggers when the user was interacting through the API. This has been resolved, and the warning texts that appear when using the function have been corrected and updated.

  • JQUERY 3.5.1 update issue: since the previous update, the steps wizard plugin had stopped working properly. The plugin Smart Wizard has been implemented correctly and is now working as expected.

  • Grammatical changes: some warning and platform messages have been reworded to improve clarity and correct typos.

Version 2022.03.0

New and Improved

  • New Delete Group button: instead of restricting permissions to a group by unticking all the boxes, it can now be completely removed through a delete button.

  • New Rename Folder function: renaming a folder requires all elements inside the folder to be renamed one by one by the platform, which is a time-consuming task. A function has been developed so the platform will perform this action in the background, avoiding interruptions.

  • Boolean metadata field: this type of metadata has been added to the platform as a descriptor option.

  • Language file update: the language labels have been refreshed and updated.

  • Workflows management: a workflow now can only be deleted if it has at least one step registered.

  • Platform's navigation improvement: when interacting with very large archival structures, the page would auto-scroll up whenever a node or subnode's menu was selected. This has now been disabled for easier navigation.

  • Function Build Log label: the arrow to access the build log in the Edit Function panel was not easily found. A label has been added to give more visibility to the log access.

  • Submission Area notifications: the notifications sent via email when a submission area is created have been improved and the code re-structured.

  • Specific error messages: when a metadata value not matching the field type was input, a generic error message would display. The error message has been modified so the field for the incorrect value or values is highlighted in red.

Fixed

  • Administrator permissions: the root administrator (LIBNOVA only user) has been removed from the permissions list as it was clashing with the admin group.

  • File browser issues: the web browser file manager was getting stuck in the loading screen, forcing the user to refresh the browser to access it. This was because an event was being launched before the user could interact with the plugin, blocking their access to it. The fix allows for the event to run in the background without blocking the interaction by the user.

  • Hidden password: passwords have now been hidden from logs when a change is requested.

  • Bulk Metadata function: the metadata descriptor type number was missing and has now been added to the function.

  • Move/Copy function issues: the function was leaving residual empty folders when the Move option was selected. This has been reviewed and fixed.

  • Hashes report queue: when several hash reports were run one after the other, the previous one would stay in pending status without completion. This has now been fixed so a queue is generated.

  • Warning message on empty metadata schemas: the absence of descriptors in a metadata schema was indicated to the user through a warning message. This has now been changed to an "info" type of message.

  • Link metadata field saving: if, while editing, a link-type metadata field was left empty (with the --No Config-- value) and then saved, the system would restore the previous input values. This has been corrected so no values are saved if the user deletes them.

Version 2022.02.0

New and Improved

  • New Effective Permissions Audit report: permissions can now be tracked container by container by running a new report. This report displays information regarding the container id, container name, groups and users who have access to them, and the type of permissions (write and read permissions included).

  • New Quota Container report: this new report will list all containers that have exceeded their assigned quota.

  • New Security Audit report: this new report will let the user select two dates and obtain the events happening between them. Those events include user login, user changing node/container permissions, user changing configuration, and user changing their own or another user's settings.

  • New Users report: a detailed report of each user in the platform, the group they belong to, their last login time, and their permissions at platform level.

  • New Container Descriptor: added a new type of descriptor for the metadata container. Now link descriptors can be recognized in a container metadata schema.

  • Added information message: if a metadata validation fails, a pop-up message will warn of the error detailing the cause, IECode, DataType, and value. Additionally, an information line has been added if the validation fails using the API to inform the user.

  • Improvements in the Copy/Move function: if no file is selected and this function is run, the whole content of the container will be affected. Now a warning message tells the user this is going to happen as long as they don't select a particular set of files. Additional logs have been added to track the function's progress.

  • Metadata schema imports: now it is possible to upload a metadata schema in JSON format, as well as XML.

  • Function parameter improvement: "file" is now added as a type of parameter when creating custom functions.

  • Improved permissions section: each permission label is now listed along with a brief description, which eases group and user management.

  • Recalculating data: now the system takes into consideration any running and/or pending tasks when recalculating the amount of data ingested, giving more accurate readings.

Fixed

  • Container metadata update: information in the container metadata section was not being saved properly if the screen was refreshed. Now the newly input information is saved and refreshed automatically, avoiding any data loss.

  • Performance display: an error message would take up to 30 seconds to show on screen if the platform was unresponsive. The fix allows the platform to load in the background without stopping any ongoing processes. If the user tries to interact with the platform and re-run a process that is currently running in the background (like editing a field), a pop-up message will inform them of the background processes. The update includes new code that uses AJAX instead of running cURL in PHP.

  • Functions permissions restriction: it was possible for a user without permission to use or view functions otherwise restricted. This has now been fixed and made available according to the appropriate permissions the user holds.

  • General code fixes and improvements in parameters, permission fields, and bulk action errors.

Version 2022.01.0

New and Improved

  • Function performance message: when a function cannot be created in Nuclio, a pop-up message is displayed with further information. After the update, a pop-up message is displayed if Nuclio is not functional, with the following warning: "Error inserting function to nuclio".

  • Ingestion Metadata Descriptor: the descriptor "LS_ingestion_status" was added to the platform to monitor through logs the status of ingestion between LIBSAFE Advanced and Flexible Intake. It also allows bulk metadata editions (Advanced and FI clients only).

  • Cookie policy: changed the default cookies to lax (more information here).
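
Assuming the cookie policy change above refers to the SameSite cookie attribute, the sketch below shows what a cookie with `SameSite=Lax` looks like when built with Python's standard library. The cookie name `session` is a hypothetical example and is not taken from the platform.

```python
from http.cookies import SimpleCookie


def build_session_cookie(value: str) -> str:
    """Build a Set-Cookie header line with SameSite=Lax (Python 3.8+)."""
    cookie = SimpleCookie()
    cookie["session"] = value  # hypothetical cookie name
    cookie["session"]["samesite"] = "Lax"
    return cookie.output(header="Set-Cookie:")
```

With `Lax`, browsers still send the cookie on top-level navigations to the site but withhold it on most cross-site subrequests.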

Fixed

  • Metadata fields: when a metadata value was deleted by erasing the information rather than removing the field, the changes for those particular fields would not be saved. Now deleting the information is enough for the metadata field to be removed.

  • Function Copy/Move: when using the Copy/Move function, the destination could only be chosen from an auto-populated list. After the update, you can either use the auto-populated field or type the destination address.

Version 2021.11.0

Fixed

  • API Docs broken link: the option "try it now" was adding characters to the URL and causing errors. This has been corrected appropriately, recording the available options in a separate widely available document.

Version 2021.10.0

New and Improved

  • Information disclosure protection: implementation of Content Security Policy and headers against external requests.

  • Nginx implementation and .htaccess removal: increasing performance and deleting files detected as vulnerable automatically.

  • Safe passwords: implementation of safe password policies, allowing only those with a minimum of 12 characters, at least one digit, at least one capital letter, and at least one special character.

  • Cookie update: cookies are now restricted to secure connections when HTTPS is active.
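
The safe password rules listed above can be expressed as a short validation sketch. The notes do not define which characters count as "special", so treating ASCII punctuation as special is an assumption, as is the function name.

```python
import string


def is_safe_password(password: str) -> bool:
    """Check the four rules: length, digit, capital letter, special character."""
    return (
        len(password) >= 12
        and any(c.isdigit() for c in password)
        and any(c.isupper() for c in password)
        # Assumption: "special character" means ASCII punctuation.
        and any(c in string.punctuation for c in password)
    )
```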

Fixed

  • Identifier saves: when managing a user, pasting an identifier in the correct field and pressing the 'Save' button was not working correctly, as the system requires the user to press Enter for a correct save. A message below the identifier box has been added detailing the extra step needed to save the new or modified entry.

  • Duplicated users and groups: users and groups with the same ID were being displayed duplicated in the container and archival node permissions sections. This was just a visual error and has been fixed.

  • Smart Wizard implementation: after the last update to jQuery v3.5.1 library, the step plugin was not working correctly. It has been replaced with jQuery Smart Wizard.

Version 2021.09.0

New and Improved

  • SAML identification: changed the configuration so now it is easier to log in to the platform using the institution credentials.

  • Reports permissions: the access to reports was given as a whole, where the user could run any regardless of their individual permissions across the platforms and archival structures. Now there is more control, allowing the user or group to access reports specified in their individualized permissions, rather than all of them.

  • Submission areas: this tool has been modified so it can also be created through S3. Other improvements have been added like check-in, content template, and container templates options.

  • Transfer Connector integration: created the Transfer Connector integration from LIBSAFE to OpenAccess, defining profiles for the different formats handled. More information on the Transfer Connector and its changelog here.

  • New Copy/Move function: parameters have been defined so a specific container and route can be defined, as well as selecting between copying or moving those files to the new destination. This function allows running this process in the background, assuring no interruptions depending on the user's connection.

  • Escrow backups and exports: overall improvement in the backup process. Exports in XML format are now possible, and gzip compression is supported. Backups are faster and use more secure connections, and errors are thoroughly documented in an internal log for easier troubleshooting.

  • Hash params: this value is now mandatory when using this API endpoint and handling hash algorithms, so it returns a more accurate list of the hashes in one or more containers (more information here)

  • Custom functions improvement: implementation of a 'build command' option to easily install Nuclio dependencies when creating custom functions.
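
The XML export with gzip compression mentioned in the escrow item above can be illustrated with the Python standard library. This is only a sketch under assumptions: the element names (`export`, `item`) and the record fields are hypothetical, and the platform's actual export format is not described in these notes.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import Dict, List


def export_xml_gz(records: List[Dict[str, str]]) -> bytes:
    """Serialize records to XML and return the gzip-compressed bytes.

    The <export>/<item> layout is a hypothetical example, not the
    platform's real export schema. Keys must be valid XML element names.
    """
    root = ET.Element("export")
    for record in records:
        item = ET.SubElement(root, "item")
        for key, value in record.items():
            ET.SubElement(item, key).text = value
    xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    return gzip.compress(xml_bytes)


def read_xml_gz(blob: bytes) -> ET.Element:
    """Decompress a gzip blob and parse the XML it contains."""
    return ET.fromstring(gzip.decompress(blob))
```

Reading the export back with `read_xml_gz` is a quick way to check that a compressed file is still well-formed XML.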

Fixed

  • Filename and path search: fixed the search engine so results are narrowed down to the exact input (whether it is a file, path, extension, or portion of the path), to get accurate results.

  • Interface timeout: a server error was common when the user did not interact with the platform for a short period of time. This has been fixed, allowing more idle time per session.

  • Container permissions: when a user was creating a node, they could not assign permissions. This risked locking them out of the newly created archival structure. Now permissions can be selected at the time of node creation, and a warning message has been added to remind the user to add themselves to the permission list.

  • Restore and delete file versions: these options are now restricted to users who have write permissions.

  • File format on restored items: when a file was restored, the system would rename it adding the timestamp but ignoring the format. This has been fixed so the file can be opened.

Version 2021.08.0

New and Improved

  • Metadata schema creation: a metadata schema can now be imported in either JSON or XML format, instead of having to create each field manually from the platform.

Fixed

  • Container creation: creating a new container failed under certain circumstances. This was addressed and fixed.

  • User permissions: the "Run Function" user permission was not available for all deployments.

Version 2021.07.1

New and Improved

  • Jupyter Notebooks integration: now you can preserve Jupyter notebooks with your data, opening and executing them right from the platform. This allows you to keep your data and the code that reads and analyses it together, which is not only convenient but a great way of documenting your data and creating provenance metadata.

  • Bag-it fetch function: if you have bag-it packages that contain a fetch.txt, and you would like the platform to download the files it lists to your container, you just need to execute the Bag-it Fetch function.

  • Bag-it 0.97 and 1.00 creation function: if you have a folder with some content, and you would like to create a Bag-it from it, you just need to execute the Bag-it creation function. This function brings all your data together in a bag and creates required structures and manifests. Compatible with 0.97 and 1.00 bags.

  • Bag-it 0.97 and 1.00 validation function: when you have uploaded your bag-its to a container, this function validates them, checking all mandatory elements and the integrity/fixity manifests.

  • Platform's functions performance improvement: we have been working on improving the functions' performance when executing more than 960 functions in parallel. Performance was good when the load was below 960 functions, but dropped above that level due to a permissions-related overhead. We have removed this limitation and tested the platform with 15,012 parallel functions with great performance.

  • General stability and performance improvements in several areas: we are ingesting 50 million files and 1 PB of data as fast as possible on the platform. While testing, we identified several areas for improvement in queue management and function execution. Time in the queue per event/file has been reduced from 0.016 seconds (16 milliseconds) to 0.009 seconds (9 milliseconds), and the function execution delay, which was over half a second, is now usually below 0.0012 seconds (1.2 milliseconds).
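
For readers unfamiliar with the fetch.txt file mentioned in the Bag-it items above, the sketch below parses one. It assumes the layout from the BagIt specification (one entry per line: URL, length in bytes or "-", then the path inside the bag). It is not the platform's Bag-it Fetch function, and `parse_fetch_txt` is a hypothetical helper name.

```python
from typing import List, NamedTuple, Optional


class FetchEntry(NamedTuple):
    url: str
    length: Optional[int]  # None when the length is given as "-"
    path: str


def parse_fetch_txt(text: str) -> List[FetchEntry]:
    """Parse the contents of a fetch.txt file into a list of entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split into at most three fields so paths may contain spaces.
        url, length, path = line.split(None, 2)
        entries.append(
            FetchEntry(url, None if length == "-" else int(length), path)
        )
    return entries
```

A download step would then iterate over the entries, fetch each `url`, and store the result at `path` inside the bag.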

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • XrootD has no support for white spaces in the file names. Files with spaces in the file name are NOT available when using XrootD protocol. LIBNOVA has requested the XrootD team to consider this for a future release.

  • The authentication module used by XrootD (user/pass) does not support accounts without a username/password. anon/anon needs to be used until we find a workaround.

  • When managing workloads over 2 million files, some parts of the management interface become slow or fail. We are working on it.

  • HTTP method has been disabled for the Management interface and the API.

  • While we test the XrootD integration, we are running it behind a reverse proxy. This makes the XrootD server request the authentication details twice instead of once. This limitation will be removed once the issue is solved.

Version 2021.07.0

New and Improved

  • Platform's functions: new functions let you run code in the platform itself, in response to certain events or triggers, without needing any client-side script. Functions are useful when you want the platform to behave in a specific way in response to internal or external events, or when you want to add your own code to be executed on demand (typically by yourself). See functions.

  • File Power Search API: the file search capabilities have been extended based on the use cases and requests we have received. See Advanced API File Search.

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • XrootD has no support for white spaces in the file names. Files with spaces in the file name are NOT available when using XrootD protocol. LIBNOVA has requested the XrootD team to consider this for a future release.

  • The authentication module used by XrootD (user/pass) does not support accounts without a username/password. anon/anon needs to be used until we find a workaround.

  • When managing workloads over 2 million files, some parts of the management interface become slow or fail. We are working on it.

  • HTTP method has been disabled for the Management interface and the API.

  • While we test the XrootD integration, we are running it behind a reverse proxy. This makes the XrootD server request the authentication details twice instead of once. This limitation will be removed once the issue is solved.

Version 2021.06.0

New and Improved

  • Storage classes: the integration that allows the user to do managed transitions over multiple types of storage has been completed. Users can now easily transfer content to and from AWS S3 Standard, Glacier Deep Archive, etc.

  • Automated storage classes transition: users can use lambda functions, the API, or scheduled processes to trigger automated storage migrations. The platform itself will schedule some of them, allowing the user for instance to set the default storage to Deep Archive for a Data Container, but keep the objects in S3 Standard for a few days in order to process them.

  • Federated authentication automated account creation: the federated authentication engine has been improved, so now it automatically creates the accounts for new users that try to log in to the platform, if the Identity Provider they are coming from is within a list of authorized IdPs.

  • Federated authentication automated permission assignment: when a new user account is created using federated authentication, it is now possible to create SAML attribute-based rules that make users automatically belong to a certain group. For instance, if the user attribute indicates that they belong to a certain group, the user in the platform will also be included in a group.

  • Federated authentication wire-tapping: diagnosing SAML-related issues is really hard. We have added a function to the platform that allows the administrators to get every authentication-related event, with all its details. It is now possible to see what the SP is sending to the IdP, what is received from the IdP, etc.

  • Multiple federated authentication identities per user: in the previous version, it was only possible to add one federated identity to a platform user. Now it is possible to have multiple identities associated with the same account, so users with multiple federated accounts can use any of them to log in.

  • Improved bag-it validation functions: we are improving the Bag-it validation rules to increase performance and scalability. This was suggested by Manuel Delfino (Port d'Informació Científica PIC).

  • Shared files: we are improving how files are publicly shared with an improved engine, following the ideas and suggestions that Sergey Yakubov (Deutsches Elektronen-Synchrotron DESY) sent us.
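
The attribute-based group assignment described above can be pictured as a small rule matcher. This is a sketch under assumptions: the rule format (attribute name, required value, target group) and the group names are hypothetical, and the platform's real rule syntax is not documented in these notes.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rule: (SAML attribute name, required value, platform group).
Rule = Tuple[str, str, str]


def groups_for_user(attributes: Dict[str, List[str]], rules: List[Rule]) -> Set[str]:
    """Return the platform groups a user should join, given their SAML attributes."""
    groups = set()
    for attribute, required_value, group in rules:
        # SAML attributes are multi-valued, so check membership in the list.
        if required_value in attributes.get(attribute, []):
            groups.add(group)
    return groups


rules = [
    ("eduPersonAffiliation", "staff", "archive-staff"),
    ("memberOf", "physics", "physics-team"),
]
user_attributes = {"eduPersonAffiliation": ["member", "staff"]}
assigned = groups_for_user(user_attributes, rules)
```

Here `assigned` contains only `archive-staff`, because the user carries no `memberOf` attribute.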

Fixed

  • Container deletion: users other than administrators were unable to delete containers. Now everyone with proper permissions can. This was discovered and reported by Sergey Yakubov (Deutsches Elektronen-Synchrotron DESY).

  • Corruption of the permissions policy: under certain circumstances, the component that applied the security policy over the AWS S3 buckets failed for pre-update users. Now the component takes these users into account and applies the policy correctly. This was discovered and reported by Tibor Šimko (European Organization for Nuclear Research CERN).

  • Problem on sharing content: when a file was publicly shared by a user with specific permissions, the platform failed to pre-sign the URLs for the requests due to a permissions inconsistency. Now URLs are pre-signed correctly. This was discovered and reported by Tibor Šimko (European Organization for Nuclear Research CERN).

  • File download 404 error: under certain circumstances a user was unable to download files using the API, even if they had enough permissions to do so, leading to a 404 error. This has been corrected. This was discovered and reported by Tibor Šimko (European Organization for Nuclear Research CERN).

  • Intermittent S3 permissions problem: during an upload process, if a policy is re-applied to the bucket, connections may be interrupted. This was handled gracefully by the platform, but we have improved the process to cause less inconvenience. This was discovered and reported by Justin Clark-Casey (The European Bioinformatics Institute EMBL-EBI).

  • Bag-it validation function: the container-wide Bag-it validation function failed when launched over a large number of bags (over 20,000). We have improved its logic. This was discovered and reported by Manuel Delfino (Port d'Informació Científica PIC).

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • XrootD has no support for white spaces in the file names. Files with spaces in the file name are NOT available when using XrootD protocol. LIBNOVA has requested the XrootD team to consider this for a future release.

  • The authentication module used by XrootD (user/pass) does not support accounts without a username/password. anon/anon needs to be used until we find a workaround.

  • When managing workloads over 2 million files, some parts of the management interface become slow or fail. We are working on it.

  • HTTP method has been disabled for the Management interface and the API.

  • While we test the XrootD integration, we are running it behind a reverse proxy. This makes the XrootD server request the authentication details twice instead of once. This limitation will be removed once the issue is solved.

Version 2021.05.1

New and Improved

  • AWS S3 first-level integration completed: the integration of S3 as a first-class storage platform in the platform has been completed. Uploads and downloads can be routed to AWS S3 in order to get massive scalability and parallelization capabilities. Note that the following has changed:

    • The access key/private key to access the service has changed. Check your scripts. A new method to retrieve them programmatically has been created.

    • How containers are accessed when using S3 has changed. The path was container\file, and now it is bucket\container\file. New documentation exists to retrieve it programmatically or using the management interface.

    • File separators have changed to be consistent in the future. Path names are built using the standard "/" instead of the previously-used "\".

    • In order to massively improve performance, the approach to uploads has changed. Previously, each file was processed as it was uploaded, so upload capacity was limited by the existing infrastructure. We have changed this synchronous process to an asynchronous one: uploading and processing are now totally independent (uploads are much faster, but the platform reaches consistency progressively). We have created a widget on the home page to track the containers with active processes.

  • Performance improvement in public sharing: in order to benefit from the AWS S3 scalability, shared URLs are now redirected to S3 using a pre-signed URL. When an anonymous user asks for a file, the platform checks if the file is publicly shared. If it is, it creates a signed request to AWS and redirects the user to the S3 URL. This way, public sharing has virtually no limits, even for huge workloads and concurrent users.

  • Kubernetes-based first-level integration completed: every platform component has been migrated to a Kubernetes architecture. Every component is now running inside a Kubernetes pod, so we can scale them up and down as needed.

    • An undesired outcome is that the XrootD hostname has also changed (check it in the Management Interface > Access methods or using the /api/user/internal/access_methods API method). We are working on improving this.

  • Improved parallelization in hashing workload: the hashing component (in charge of creating the MD5, SHA1, etc.) has been rearchitected to work better in parallel and to be ready to auto-scale for huge workloads.

  • Improved parallelization in characterization: the characterization component has been modified to work with object-based streams instead of the previous filesystem-based approach.

  • Improved API documentation: the API documentation has been improved, adding methods, removing the deprecated ones, and ensuring all parameters are documented.

  • New lambda function: CSV validation.

  • New lambda function: format conversion demonstration code.

  • New lambda function: source code language detection.

  • New lambda function: Bag-it repackage. If a Bag-it structure is modified in the platform (adding a file to it for example), this function re-generates the bag-it structure to include new files.

  • Platform's documentation (this site): new sections created, examples, etc.

  • New documentation section on OAIS and ISO 16363 adoption: David Giaretta has created a great documentation piece to help users understand OAIS and apply ISO 16363 to a platform-based Archive.
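
Scripts that still build paths in the old container\file form may need adjusting after the S3 path and separator changes listed above. The helper below is a minimal sketch of that conversion. `to_s3_path` is a hypothetical name, and the bucket value has to be retrieved from your own instance as described in the documentation.

```python
def to_s3_path(legacy_path: str, bucket: str) -> str:
    """Convert a legacy 'container\\file' path to 'bucket/container/file'.

    Backslashes are replaced by '/', empty segments are dropped, and the
    bucket name (an input you must supply) is prepended.
    """
    parts = [part for part in legacy_path.replace("\\", "/").split("/") if part]
    return "/".join([bucket] + parts)


# Example with made-up names: a legacy path inside "container1".
new_path = to_s3_path("container1\\reports\\2021.csv", "my-bucket")
```

Paths that already use "/" pass through the same function unchanged apart from the added bucket prefix.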

Fixed

  • Database connections exhaustion: when launching workloads over a million files, too many processes were connected to the database servers, exhausting connections. We have adopted a queue-based architecture and made the database auto-scale up to 128 nodes.

  • S3-events management: when launching workloads over a million files using S3, the AWS events for new files were received in parallel in huge amounts. This overwhelmed the service in charge of processing them. After testing, we adopted a new architecture to handle them.

  • API method /api/file/path/ was not properly working: the method was not working since the "\" to "/" change in the file-name separator.

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • XrootD has no support for white spaces in the file names. Files with spaces in the file name are NOT available when using XrootD protocol. LIBNOVA has requested the XrootD team to consider this for a future release.

  • The authentication module used by XrootD (user/pass) does not support accounts without a username/password. anon/anon needs to be used until we find a workaround.

  • When managing workloads over 2 million files, some parts of the management interface become slow or fail. We are working on it.

  • HTTP method has been disabled for the Management interface and the API.

  • While we test the XrootD integration, we are running it behind a reverse proxy. This makes the XrootD server request the authentication details twice instead of once. This will be removed in the future.

Version 2021.05.0

New and Improved

  • Improved public sharing: based on feedback from several users, the "public" sharing process has been improved. It is now much easier to share/un-share multiple files, and the new development allows future sharing methods to be made available using the same interfaces.

  • New public sharing URLs: the naming schema for the shared URLs has been changed to prevent a malicious user from detecting shared links.

  • Improved API documentation: the API documentation has been improved, adding methods, removing the deprecated ones, and ensuring all parameters are documented.

  • Improvement on file name-based search: results are now sorted better when searching by file name.

  • New API site: the product API documentation has been published on a unified site (previously each instance had its own). Now we can link API methods in the product documentation.

  • Platform's documentation (this site): new sections created, examples, etc.
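
One common way to make shared links hard to detect, as the new URL naming schema above aims to do, is to build them from long random tokens. The sketch below uses Python's `secrets` module. The platform's actual naming schema is not described in these notes, so the token length and the URL layout here are assumptions.

```python
import secrets


def new_share_token(nbytes: int = 32) -> str:
    """Return a URL-safe random token built from `nbytes` random bytes."""
    return secrets.token_urlsafe(nbytes)


def share_url(base_url: str, token: str) -> str:
    """Join a base URL (hypothetical layout) and a token into a share link."""
    return f"{base_url.rstrip('/')}/{token}"


link = share_url("https://example.org/shared/", new_share_token())
```

Because each token comes from a cryptographically secure random source, knowing one link gives no practical help in guessing another.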

Fixed

  • S3 session exhaustion: S3 sessions/slots were not closed properly when using certain clients, leading to session exhaustion that made transfers slower and slower until they eventually failed. This was reported by Justin Clark-Casey from EMBL-EBI.

  • Linked metadata was not displayed properly: "link"-type metadata was not included in all queries, or was included inconsistently. Now it works as expected. This was reported by David Giaretta.

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • XrootD has no support for white spaces in the file names. Files with spaces in the file name are NOT available when using XrootD protocol. LIBNOVA has requested the XrootD team to consider this for a future release.

  • The authentication module used by XrootD (user/pass) does not support accounts without a username/password. anon/anon needs to be used until we find a workaround.

  • The platform needs from 0.3 to 2.2 seconds from when a file is uploaded to the point it is returned as a search result when using the advanced search or API interface. Notices have been included in the documentation and a solution is being researched. Reported by Justin Clark-Casey (EMBL-EBI) and Jakub Urban (CERN).
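
Client scripts that search for a file right after uploading it can work around the indexing delay described above by polling with a timeout. The following is a sketch, not an official client. `search` stands in for whatever function your script uses to call the search API, and the default timeout and interval are assumptions chosen to cover the 0.3 to 2.2 second window.

```python
import time
from typing import Callable, List


def wait_until_searchable(
    search: Callable[[str], List[str]],
    filename: str,
    timeout: float = 5.0,
    interval: float = 0.25,
) -> bool:
    """Poll `search` until `filename` appears in its results or `timeout` expires.

    `search` is a caller-supplied function (hypothetical here) that takes a
    file name and returns the list of matching file names.
    """
    deadline = time.monotonic() + timeout
    while True:
        if filename in search(filename):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Passing the search call as a parameter keeps the helper independent of any particular API client and makes it easy to test with a stub.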

Version 2021.04.1

New and Improved

  • API response codes: API responses now return more informative error codes. This is based on Jakub Urban's (CERN) suggestion.

  • Improved API documentation: the API documentation has been improved, adding methods, removing the deprecated ones, and ensuring all parameters are documented.

  • Improvement on reports for big volume data: when the platform was loaded with over 5-6 million files, some reports started to slow down. This has been improved.

  • Data container templates: every time a container was created, the user needed to define the (growing) list of configuration parameters (metadata schema, policies, etc). With this feature, it is now possible to create templates, thus pre-defining several settings and making container creation easier.

  • API support for data container templates: files can be publicly shared using the API. Endpoints have been created to list shared files and to get their endpoints.

  • Data container templates policies: the node administrator can now enforce the use of a data container template for all containers created in a node, thereby enforcing a certain policy for users.

  • Data container templates policies API support: templates can be created and applied using the API.

  • Public share: all access restrictions were governed by user permissions, which required every person using the system to have an account. With this new feature, it is possible to share files, folders, or whole containers so that they are publicly accessible, even for users without an account.

  • Improved workflows editor: the workflow editor has been improved. Now, when editing a step, the user is not returned to the general view.

  • New help section: this documentation and the API doc are now available from within the platform (question mark top right in the interface).

  • Platform documentation (this site): new sections created, examples, etc.

Fixed

  • API support request integrated with the LIBNOVA support portal: queries made when using the API are now routed to the LSP.

  • Pressing enter canceled metadata creation: when editing a metadata field, the Enter key closed the form without saving the changes. Enter key now saves the changes and closes the dialogue.

  • 'Next' and 'Previous' buttons were not visible when using the bulk metadata editor: they are now visible.

  • Error when assigning permissions to a non-existing archival node using the API: now, an error is returned.

Operations

  • Improved connectivity: the 2x100 Gbps network link between the LIBNOVA Frankfurt datacentres and Geant is now live. It was 2x20 Gbps.

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • XrootD has no support for white spaces in the file names. Files with spaces in the file name are NOT available when using XrootD protocol. LIBNOVA has requested the XrootD team to consider this for a future release.

  • The authentication module used by XrootD (user/pass) does not support accounts without a username/password. anon/anon needs to be used until we find a workaround.

  • The platform needs from 0.3 to 2.2 seconds from when a file is uploaded to the point it is returned as a search result when using the advanced search or API interface. Notices have been included in the documentation and a solution is being researched. Reported by Justin Clark-Casey (EMBL-EBI) and Jakub Urban (CERN).

Version 2021.04.0

New and Improved

  • Report notifications: when a scheduled report is executed by the system, an email is sent to the user to inform them about it.

  • Warning on creating a restricted container: when a container is created with permissions that allow no one to access it (the creating user removes themselves from the permissions), a warning message is shown.

  • Search now works also for file and pathnames: the search interface worked for metadata and content but was not able to match filenames and paths. Now the user can search for them.

  • Improved API support for metadata-related operations: several nested queries were needed for the user to update metadata using the API. Now metadata can be changed using the metadata fields' friendly names.

  • More granular API methods for metadata: when the user needed to update an item's metadata, the platform cleared all existing metadata and inserted the newly included one. Now, the user is able to indicate metadata operations in the query (ADD, REPLACE, DELETE).

  • Federated identity: now users can use their respective Identity Providers to log in to the platform.

  • Multi-user file browser in the management interface: due to simple multi-session handling in the previous version of the platform, only one user could work in a container at a given time (check-in/out process). We are keeping that feature, as it is useful for certain use cases, but it is now optional. Multiple users can now work in a single container at the same time.

  • Containers quota: it is now possible to specify a per-container storage quota.

  • XrootD protocol support: it is now possible to use XrootD protocol for accessing the containers (in read-only mode for now). User permissions to containers are retained.

  • Md5sum manifest generation function: it is now possible to use a new function to generate an md5sum manifest for the selected content (using the interface or the API). This is useful for users downloading with the management interface, as they can now easily verify the integrity of the downloaded content.

  • Platform documentation (this site): new sections created, examples, etc.
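
The difference between the ADD, REPLACE, and DELETE metadata operations introduced above can be shown on an in-memory dictionary. This is only a sketch of the semantics. The operation format (`action`, `field`, `values`) is hypothetical, and treating DELETE as removing the whole field is an assumption, so check the API documentation for the real request shape.

```python
from typing import Dict, List


def apply_metadata_ops(
    metadata: Dict[str, List[str]], ops: List[dict]
) -> Dict[str, List[str]]:
    """Return a copy of `metadata` with ADD/REPLACE/DELETE operations applied."""
    result = {field: list(values) for field, values in metadata.items()}
    for op in ops:
        action, field = op["action"], op["field"]
        values = op.get("values", [])
        if action == "ADD":
            # Keep existing values and append the new ones.
            result.setdefault(field, []).extend(values)
        elif action == "REPLACE":
            # Discard existing values for this field only.
            result[field] = list(values)
        elif action == "DELETE":
            # Assumption: DELETE removes the whole field.
            result.pop(field, None)
        else:
            raise ValueError(f"Unknown action: {action}")
    return result


current = {"title": ["Old"], "keyword": ["a"], "creator": ["x"]}
updated = apply_metadata_ops(current, [
    {"action": "ADD", "field": "keyword", "values": ["b"]},
    {"action": "REPLACE", "field": "title", "values": ["New"]},
    {"action": "DELETE", "field": "creator"},
])
```

Unlike the earlier clear-and-reinsert behaviour, fields not named in any operation are left untouched.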

Fixed

  • Formats and files dashboard chart froze: under certain circumstances, the chart in the dashboard froze. Fixed.

  • Permissions problem: in the previous version, it was not clear how to select users and groups to assign them permissions when editing a container. It worked, but it was not intuitive. We have improved the UX so that it is now much clearer how to do it.

  • Character "+" is now allowed in the S3 uploads: the "+" character was not allowed in file names when uploading to S3 in the previous version. It can now be used.

Known limitations and issues

  • Many of the new integrations are in alpha status.

  • The right-click menu in the Management interface sometimes does not work immediately after dragging and dropping a file to a data container. Refresh with Ctrl+F5 or with the circled arrow icon next to the folder name in the container.

Version 2021.03.1

New and Improved

  • Functions prototype: initial implementation of the lambda (serverless) functions working in a distributed environment. Lambda functions allow users to define in the platform how they want their content to be processed. This implementation only runs actions on demand. Additional support will be added in the next four releases.

  • New unified jobs management data structure: the concept of jobs allows the platform to have a scalable and parallelized distribution of workloads while keeping all processing elements orchestrated. After researching existing architectures on high throughput and scalable platforms, we have adopted a jobs management architecture for the platform. This release includes the first implementation.

  • S3 partial uploads support: in order to increase the upload/download capabilities, S3 multipart uploads are now supported to parallelize workloads. This includes the capability to see orphan parts in order to complete or discard them. Related to this topic, we have discovered a potential improvement that we have not found in the existing community code base: taking advantage of multi-part hashing algorithms (not possible with traditional single-thread/single-file hashing algorithms, but possible with newer ones) to hash uploaded parts in advance. Coupling hashing with the upload process, while the bit-stream is still in memory, greatly decreases the resources used in this task.

  • Search API: in order to support multiple search-related use cases, support for sending complex queries using the API has been included. We have researched existing domain-specific languages and measured their adoption in the research community. We have also researched how to overcome the challenges of permission-based access to search results, and identified best practices that are now part of the platform.

  • IECODEs API initial support in the add metadata using GET and POST metadata API methods: it was possible to add metadata to objects using the metadata id, but the user first needed to run a query to get the metadata id for a file, based on the field code or descriptor (IECODE). This was not aligned with the usage described in some of the use cases. We have researched how other platforms approach this, and the API now supports using either the metadata id or the IECODE when sending the request, making it easier for the user to create metadata.

  • Multihash: the platform is now capable of keeping more than one file hash for a given item. This brings support for several use cases in the research environment in which using parallelizable hashing algorithms (when working with large datasets) is relevant. The data structures now exist, the interface shows any available hash, and the API can be used to query and load them. Logic has been added to support Adler-32 out of the box in addition to MD5, SHA1, SHA256, and SHA512. We have been researching reference architectures and algorithm implementations in several languages. The architecture allows fully parallelized multithreaded generation of hashes (for the algorithms supporting it).

  • Platform's documentation (this site): new sections created, examples, etc.
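
To verify a downloaded file against the several hashes the platform can now keep for an item, all of the listed algorithms can be computed in one pass with the Python standard library. The sketch below is sequential and is not the platform's parallel implementation. `multihash` is a hypothetical helper name.

```python
import hashlib
import io
import zlib
from typing import BinaryIO, Dict


def multihash(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> Dict[str, str]:
    """Compute MD5, SHA1, SHA256, SHA512 and Adler-32 in a single read pass."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256", "sha512")}
    adler = 1  # Adler-32 starting value
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for hasher in hashers.values():
            hasher.update(chunk)
        adler = zlib.adler32(chunk, adler)  # running checksum across chunks
    digests = {name: hasher.hexdigest() for name, hasher in hashers.items()}
    digests["adler32"] = f"{adler & 0xFFFFFFFF:08x}"
    return digests


# Usage with an in-memory stream; a real script would open a file in "rb" mode.
digests = multihash(io.BytesIO(b"hello world"), chunk_size=4)
```

Reading the data once and feeding every hasher from the same chunk avoids re-reading large files for each algorithm.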

Fixed

  • Unstructured and objects folders: it was possible for certain users to delete the Unstructured and Objects folders. User feedback indicates that this data structure is confusing, so we are going to remove it in a future release. While it is still in the product, we have solved this problem. Reported by Manuel Delfino (PIC). Fixed.

  • Work queue discarded elements: under some circumstances (over 8KB messages) the work queue was discarding some tasks and leaving them unprocessed. Fixed.

  • S3 endpoint performance degradation: under load (over 682MB/s for a multi-stream upload), the platform was not autoscaling S3 nodes up, which limited the upload capability. Fixed.

Known limitations and issues

  • Problems using files with the "+" (and other) characters using S3 protocol.

  • Many of the new integrations are in alpha status.

  • The right-click menu in the management interface sometimes does not work immediately after dragging and dropping a file to a data container. Refresh with Ctrl+F5 or with the circled arrow icon next to the folder name in the container.

Version 2021.03.0

Fixed

  • Platform's API key not showing for new users. Fixed.

Known limitations and issues

  • Problems using files with the "+" (and other) characters using S3 protocol.
