# Structural Representation Information

## Informal structural Representation Information

A Data Object may be described down to the bit level in a text document. Such documents are meant for human readers rather than for direct use by a computer (this may change as computers use deep learning to truly understand the text).

As an example, the Internet Engineering Task Force standards, on which much of the Internet is based, are documented in [RFCs](https://www.ietf.org/standards/rfcs/). For example, the RFC describing the ASCII format for Network Interchange is [RFC20](https://datatracker.ietf.org/doc/html/rfc20), which includes a table, constructed using text characters, showing how ASCII alphanumeric characters and a number of control characters should be encoded in bits.
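The bit-level encoding that such a table documents can be illustrated with a minimal Python sketch. The function name `ascii_bits` is chosen here for illustration only; the sketch relies on Python's built-in ASCII codec rather than on anything defined in the RFC text itself.

```python
def ascii_bits(text: str) -> list[str]:
    """Return the 7-bit ASCII pattern for each character in `text`.

    Raises UnicodeEncodeError if `text` contains non-ASCII characters.
    """
    # encode() yields the integer code of each character; format() renders
    # it as a zero-padded 7-digit binary string (ASCII is a 7-bit code).
    return [format(code, "07b") for code in text.encode("ascii")]


bits = ascii_bits("A1")
print(bits)  # ['1000001', '0110001']
```

A human reads the RFC table to learn that `A` is `1000001`; the code above only works because someone has already turned that informal description into software.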

Other RFCs include complex diagrams constructed using ASCII characters, for example [RFC5755](https://www.rfc-editor.org/rfc/rfc5755.txt).

Other text descriptions include PDF documents, such as the one for [SOHO](https://sohowww.nascom.nasa.gov/publications/soho-documents/ICD/icd.pdf), or indeed paper documents in the case of older instruments.

## Formal structural Representation Information

Several formal description languages are available which allow the description of Data Objects down to the bit level in a way which can be used by computers.
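To illustrate the general idea of a machine-usable description, the following Python sketch keeps the layout of a record as data (a list of field names and types) and uses a generic routine to decode bytes from it. This is not EAST, DRB or DFDL syntax; the record layout, the field names and the `parse_record` helper are all hypothetical and were invented for this example.

```python
import struct

# Hypothetical record layout, invented for illustration only:
# 16-bit unsigned packet id, 32-bit unsigned timestamp, 32-bit float value.
# Each entry pairs a field name with a `struct` format code.
LAYOUT = [
    ("packet_id", "H"),
    ("timestamp", "I"),
    ("value", "f"),
]


def parse_record(data: bytes, layout=LAYOUT, byte_order: str = ">") -> dict:
    """Decode one record from `data` according to a declarative layout."""
    # Build the struct format string from the description, e.g. ">HIf".
    fmt = byte_order + "".join(code for _, code in layout)
    values = struct.unpack(fmt, data[: struct.calcsize(fmt)])
    return {name: value for (name, _), value in zip(layout, values)}


# Build a sample record, then decode it using only the description.
raw = struct.pack(">HIf", 7, 1700000000, 1.5)
record = parse_record(raw)
print(record)
```

Because the layout is data rather than code, the same routine can read a different record type if given a different description, which is the property that formal description languages provide in a far more complete and standardised way.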

### EAST (Enhanced Ada SubseT)

This is a [CCSDS and ISO standard language](https://public.ccsds.org/Pubs/644x0b3.pdf) which allows data to be described. There is a support website for EAST Based Access Tools – the [BEST toolkit](http://debat.c-s.fr/), and [other tools](http://vds.cnes.fr/) supported by the French Space Agency. An example complex structure for a communications package which can be described by EAST is shown below.

![](/files/KdK4O7aut2uOTcXNurIH)

[CDPP (Plasma Physics Data Centre)](http://www.cdpp.eu/) uses EAST to describe, provide access to, and combine many types of data, and provides many examples.

### Data Resource Broker (DRB)

[DRB](http://www.gael.fr/drb/) is an Open Source Java application programming interface for reading, writing and processing heterogeneous data.

### Data Format Description Language (DFDL)

[DFDL](https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl) is a modeling language from the Open Grid Forum that allows description of text, dense binary, and legacy data formats in a vendor-neutral declarative manner. DFDL is an extension to the XML Schema Definition Language (XSD).

DFDL is a way of describing the data. It is not a data format. DFDL should be able to describe many data formats, including:

* Textual and binary
* Commercial record-oriented
* Scientific and numeric
* Modern and legacy
* Industry standards

An open source interpreter is available as [Apache Daffodil](https://daffodil.apache.org/).

DFDL is used to describe various scientific data files, particularly those used in Earth Observation, by ESA in the Standard Archive Format for Europe ([SAFE](https://earth.esa.int/eogateway/activities/safe-the-standard-archive-format-for-europe)).

### LABDRIVE Automated File Format Identification <a href="#file-format-identification" id="file-format-identification"></a>

On ingestion, LABDRIVE uses a number of tools (see <https://public.docs.libnova.com/labdrive/api/#/PRONOM>) that enable automatic identification of a file's format, typically by examining file signatures, usually the first few bytes of the file.

![](/files/CmybjdoUb4js1WCv0fPQ)
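Signature-based identification of this kind can be sketched in a few lines of Python. This is not how LABDRIVE or its tools are implemented; it is a minimal illustration that checks the leading bytes of a file against a small, hand-picked table of well-known signatures. The `SIGNATURES` table and the `identify` function are assumptions made for this example, and real tools use much larger signature registries.

```python
# A few well-known leading byte sequences ("magic numbers").
# This table is illustrative and far from complete.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(header: bytes) -> str:
    """Return a format name for the given leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


print(identify(b"%PDF-1.7\n"))            # PDF
print(identify(b"plain text, no magic"))  # unknown
```

To apply this to a file on disk, read its first few bytes, for example `identify(open(path, "rb").read(16))`. Note that a match only names the format; it says nothing about the internal structure or meaning of the content.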

The file formats are identified using [PRONOM](https://www.nationalarchives.gov.uk/PRONOM/) identifiers. However, PRONOM provides very little information about the format, as illustrated in the following.

![](/files/BFDZWsJ0lA5RmigO5HnC)

The format identification allows the file to be displayed but does not normally identify the various components to which semantics may be attached.

### Tools to create Structural Representation Information

* the [BEST toolkit](http://debat.c-s.fr/), and [other tools](http://vds.cnes.fr/)
* [Apache Daffodil](https://daffodil.apache.org/)
* [DROID](https://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/)
* [ASN.1](https://www.itu.int/en/ITU-T/asn1/Pages/asn1_project.aspx)

