File Operations

Data files can be converted in the following directions without any database access:

Convert SDD file: Read an XML file according to the SDD schema 1.1 rev 5 and generate DELTA or EML files.

Convert DELTA file: Read DELTA text file(s) and generate SDD or EML files.

In addition, the following XML file check tools are available:

Check SDD file: Check if a text file is an XML file according to the SDD schema 1.1 rev 5.

Check EML file: Check if a text file is an XML file according to the EML schema 2.1.1.
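Both checks presuppose that the file is well-formed XML. As a rough illustration outside the application, the following Python sketch tests only this precondition with the standard library. It does not validate against the SDD or EML schema, and the function name check_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str) -> list[str]:
    """Return a list of problem messages; an empty list means the text parsed as XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ParseError.position holds the (line, column) where parsing failed.
        line, column = exc.position
        return [f"line {line}, column {column}: {exc}"]
    return []


if __name__ == "__main__":
    print(check_well_formed("<Datasets><Dataset/></Datasets>"))
    print(check_well_formed("<Datasets><Dataset></Datasets>"))
```

A file that passes this test can still fail the schema check, which is what the forms described in the following subsections report.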

Jan 14, 2025

Subsections of File Operations

Check EML

Check EML file

With this form you can check whether an XML file complies with the EML 2.1.1 or EML 2.2.0 schema. Choose Data -> File operations -> Check EML file … from the menu. After the window shown below opens, the schema files are loaded automatically. You can select the schema to be used with the combo box Schema version (see image below). Starting with DiversityDescriptions v. 4.3.5, the EML export uses EML 2.2.0.

 

In the window, click the button to select the file you want to check. The check results will be displayed in the center part of the window. By clicking the reload button you can start a new check (see image below).

 

May 3, 2024

Check SDD

Check SDD file

With this form you can check whether an XML file complies with the SDD 1.1 rev 5 schema. Choose Data -> File operations -> Check SDD file … from the menu. After the window shown below opens, the schema files are loaded automatically.

 

In the window, click the button to select the file you want to check. The check results will be displayed in the center part of the window. If you generated an SDD file using Diversity Descriptions with the Compatible option deactivated, the check result may show warnings for elements with missing schema information. In that case you may check the option Include specific schema extensions, so that the Diversity Descriptions specific schema definitions are included. By clicking the reload button or selecting another file you can start a new check (see image below).

 

May 3, 2024

Convert DELTA

Convert DELTA file to SDD or EML

With this form you can directly convert data from a file in DELTA format into an XML file according to the SDD 1.1 rev 5 schema. No connection to a database is needed for the conversion. Choose Data -> File operations -> Convert data file -> DELTA to SDD … from the menu to open the window. In the window, click the button to select the file with the data you want to convert. If the Multi-file option is selected before pressing the button, a folder selection window opens in which you select the folder containing the DELTA files. For multi-file processing, the files “chars”, “items”, “specs” and “extra” are currently evaluated. If any problem occurs during analysis, you may click the button to reload the file and re-initialize the window.

 

The contents of the file will be shown in the upper part of the File tree tab page. If special characters are not displayed correctly, try a different Encoding setting, e.g. “ANSI”, and reload the document using the button.
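Why a different encoding setting helps can be shown with a small Python sketch. It is an illustration only, not the application's own logic. It assumes that “ANSI” corresponds to Windows code page 1252, which is typical for Western European systems but not guaranteed; the function name decode_with_fallback is hypothetical.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "cp1252")) -> tuple[str, str]:
    """Try each encoding in turn; return the decoded text and the encoding that worked."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte to a character, so it never fails.
    return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    # The same word stored in two different encodings.
    print(decode_with_fallback("Blüte".encode("utf-8")))
    print(decode_with_fallback("Blüte".encode("cp1252")))
```

Bytes written as code page 1252 are usually not valid UTF-8, so reading them with the wrong setting yields garbled or rejected characters until the matching encoding is chosen.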

The Insert “variable” state option controls the handling of the DELTA state “V” for categorical summary data. If possible, a categorical state “variable” is inserted into the descriptor data and set in the summary data when the state “V” is present in the description data.

If the Check strings for illegal characters option is checked, all string literals that are to be exported are scanned for illegal non-printable characters, and matches are replaced by a double exclamation mark ("‼"). Activating this option may increase the analysis processing time.
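The replacement of illegal characters can be pictured with the following Python sketch. Which characters the application treats as illegal is not stated here, so the sketch assumes Unicode control characters (category “Cc”) other than tab, line feed and carriage return; the function name is hypothetical.

```python
import unicodedata

REPLACEMENT = "\u203c"  # double exclamation mark, as named in the text above


def replace_illegal_characters(text: str) -> str:
    """Replace control characters, except tab, line feed and carriage return."""
    allowed = {"\t", "\n", "\r"}  # assumption: these control characters are legal
    return "".join(
        REPLACEMENT if unicodedata.category(ch) == "Cc" and ch not in allowed else ch
        for ch in text
    )


if __name__ == "__main__":
    print(replace_illegal_characters("leaf\x00 shape"))
```

Every character has to be inspected once, which is consistent with the note that the option may increase the processing time.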

In the file tree you may deselect entries that shall not be converted. Use that option very carefully: if you deselect entries that are referenced by other parts of the input tree, e.g. descriptors referenced by descriptions, the analysis step might become erroneous!
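The risk of deselecting referenced entries can be illustrated with a short Python sketch. The data model is invented for this example (descriptions as a mapping from a description name to the descriptor identifiers it references) and says nothing about how the application stores its trees.

```python
def dangling_references(
    descriptions: dict[str, list[str]], selected_descriptors: set[str]
) -> dict[str, list[str]]:
    """Return, per description, the referenced descriptors that were deselected."""
    problems = {}
    for name, referenced in descriptions.items():
        missing = [d for d in referenced if d not in selected_descriptors]
        if missing:
            problems[name] = missing
    return problems


if __name__ == "__main__":
    descriptions = {"Item A": ["1", "2"], "Item B": ["2"]}
    # Descriptor "1" was deselected although "Item A" still refers to it.
    print(dangling_references(descriptions, {"2"}))
```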

If expressions cannot be interpreted while the files are being read, suspicious entries are marked with a yellow background (warning) in the file tree. When you move the mouse cursor over the marked entries, you get additional information as a tool tip, or the tree node text itself describes the problem (see example below).

 

 

Analysis

To analyse the data in the file, click the Analyse data button. During the analysis the program checks the dependencies between the different parts of the data and builds up an analysis tree in the lower part of the window. The analysis tree contains all data in a suitable format for the final step. During data analysis the icon of the button changes, and you may abort processing by clicking the button.

In the Analysis settings section (see image below) you set the document’s Language. You may change the display and sorting of the entries in the Language combo box from “<code> - <description>” to “<description> - <code>” (and back) by clicking the button. If you need language codes that are not included in the list, click the button. For more details see Edit language codes.

In DELTA, text in angle brackets (<text>) usually denotes comments, which are by default imported into the “Details” fields of the database. In the lower part of the Analysis settings you may choose a different handling for description, descriptor and categorical state items.

  • For DELTA comments in descriptions you may Move comments to details (default) or Keep comments in description titles.
  • For DELTA comments in descriptors you may Move comments to details (default), Move comments to notes or Keep comments in descriptor titles.
  • For DELTA comments in categorical states you may Move comments to details (default) or Keep comments in categorical state titles.

After changing one of these settings click on the Analyse data button to make the changes effective.
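The difference between moving comments and keeping them in the title can be sketched in Python as follows. This is a simplified illustration that assumes comments are not nested; the function name split_comments and the “; ” separator are choices made for this example only.

```python
import re

# Matches one DELTA comment in angle brackets; assumes comments are not nested.
COMMENT = re.compile(r"<([^<>]*)>")


def split_comments(title: str) -> tuple[str, str]:
    """Return the title without comments and the comment texts joined by '; '."""
    comments = [c.strip() for c in COMMENT.findall(title)]
    stripped = COMMENT.sub("", title)
    stripped = re.sub(r"\s+", " ", stripped).strip()
    return stripped, "; ".join(comments)


if __name__ == "__main__":
    title, details = split_comments("leaf shape <in adult plants>")
    print(title)    # text that stays in the title
    print(details)  # text that would go to details (or notes)
```

With a “keep comments” setting the original title would simply be used unchanged.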

After analysis a message window informs you whether any warnings or errors occurred. You can find detailed error and warning information in the file and/or analysis trees, where entries are marked with red text (error) or yellow background (warning). When you move the mouse cursor over the marked entries, you get additional information as a tool tip, or the tree node text itself describes the problem (see example below). By clicking on the status text beside the progress bar, you can open an analysis protocol (see below, right).

 

If an analysis error occurred, you cannot proceed. You first have to correct the problem, e.g. by excluding the erroneous descriptor in the example above (after reloading the file). A warning might not cause further problems, but you should check carefully whether the converted data will be correct.

 

Write data

Pressing the Generate file button in the Write SDD group box opens a window to select the target XML file. By default the target file has the same name as the DELTA file, followed by the extension “.xml”. The Compatible option produces files that are as compatible as possible with the SDD standard. However, some data might be missing from the generated file if this option is activated.

As an additional option you may generate files according to the EML schema, which consist of a data table (tab-separated text file) and an XML file that contains the metadata including column descriptions. Click the Generate file button in the Write EML group box. The generated file names will have the endings "_EML_DataTable.txt" and "_EML_Metadata.xml".

Pressing the drop-down button EML settings in the Write EML group box opens the EML writer options. You can choose to include a special sign for empty column values or to enclose the column values in quotes (see left image below). Furthermore you may choose the column separator (tab stop or comma) and decide whether multiple categorical states shall be inserted as separate data columns. If you have already generated EML files, the settings used are saved automatically and you may restore them using the option Load last write settings. Finally, click the button EML settings to close the option panel.
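How the separator, quoting and empty-value options affect the data table can be sketched with Python's csv module. The function and parameter names are hypothetical and the output is not guaranteed to match the application's files byte for byte; the option for multiple categorical states is not covered.

```python
import csv
import io


def write_data_table(rows, separator="\t", empty_marker="", quote_values=False) -> str:
    """Serialise rows (lists of strings or None) as a separated text table."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=separator,
        # Either quote every value or only those that need it.
        quoting=csv.QUOTE_ALL if quote_values else csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    for row in rows:
        # Empty values are replaced by the chosen marker.
        writer.writerow([empty_marker if value in (None, "") else value for value in row])
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [["Item 1", None, "5"]]
    print(write_data_table(rows, separator=",", empty_marker="-"))
    print(write_data_table(rows, empty_marker="-", quote_values=True))
```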

 

 

Handling of special DELTA states

In the DELTA format the special states “-” (not applicable), “U” (unknown) and “V” (variable) are available for categorical and quantitative characters. These states are treated in the following manner during import:

  • “-” (not applicable)
    The data status “Not applicable” is set.
  • “U” (unknown)
    The data status “Data unavailable” is set.
  • “V” (variable)
    The data status “Not interpretable” is set.
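The mapping listed above can be written down as a small lookup table. The following Python sketch is only an illustration; the names SPECIAL_STATE_STATUS and data_status are made up, and exact, case-sensitive matching of the state symbols is an assumption.

```python
from typing import Optional

# Special DELTA state -> data status set during import, as listed above.
SPECIAL_STATE_STATUS = {
    "-": "Not applicable",
    "U": "Data unavailable",
    "V": "Not interpretable",
}


def data_status(delta_value: str) -> Optional[str]:
    """Return the data status for a special DELTA state, or None for ordinary values."""
    # Assumption: the state symbols are matched exactly after trimming whitespace.
    return SPECIAL_STATE_STATUS.get(delta_value.strip())


if __name__ == "__main__":
    for value in ("-", "U", "V", "2"):
        print(repr(value), "->", data_status(value))
```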

 

Jan 14, 2025

Convert SDD

Convert SDD file to DELTA or EML

With this form you can directly convert data from an XML file according to the SDD 1.1 rev 5 schema into a DELTA file. No connection to a database is needed for the conversion. Choose Data -> File operations -> Convert data file -> SDD to DELTA … from the menu to open the window. In the window, click the button to select the file with the data you want to convert. If any problem occurs during analysis, you may click the button to reload the file and re-initialize the window.

 

The contents of the file will be shown in the upper part of the File tree tab page. In the Analysis settings part you find the document’s default language. If the document contains additional languages, you may select one of them as the new language of the DELTA file. By checking Import translations you select all additional document languages for the analysis step. This option is pre-selected automatically if more than one language has been found in the file. In the bottom part of the window you find the current processing state.

In the file tree you may deselect entries that shall not be converted. Use that option very carefully: if you deselect entries that are referenced by other parts of the input tree, e.g. descriptors referenced by descriptions, the analysis step might become erroneous!

 

Analysis

To analyse the data in the file, click the Analyse data button. During the analysis the program checks the dependencies between the different parts of the data and builds up an analysis tree in the lower part of the window. The analysis tree contains all data in a suitable format for the final step. During data analysis the icon of the button changes, and you may abort processing by clicking the button.

 

After analysis a message window informs you whether any warnings or errors occurred. You can find detailed error and warning information in the file and/or analysis trees, where entries are marked with red text (error) or yellow background (warning). When you move the mouse cursor over the marked entries, you get additional information as a tool tip, or the tree node text itself describes the problem (see examples below). By clicking on the status text beside the progress bar, you can open an analysis protocol (see below, right).

  

If an analysis error occurred, you cannot proceed. You first have to correct the problem, e.g. by excluding the erroneous descriptor in the example above (after reloading the file). A warning might not cause further problems, but you should check carefully whether the converted data will be correct.

 

Write data

Pressing the Generate file button in the Write DELTA group box opens a window to select the target DELTA file. By default the target file has the same name as the SDD file, followed by the extension “.dat”. The Compatible option produces files that are as compatible as possible with the DELTA standard. However, some data might be missing from the generated file if this option is activated.

As an additional option you may generate files according to the EML schema, which consist of a data table (tab-separated text file) and an XML file that contains the metadata including column descriptions. Click the Generate file button in the Write EML group box. The generated file names will have the endings "_EML_DataTable.txt" and "_EML_Metadata.xml".

Pressing the drop-down button DELTA settings in the Write DELTA group box opens the DELTA writer options. You can choose to include some detail text and notes in the DELTA output (see left image below). For descriptions, descriptors or categorical states the details will be appended to the respective titles as DELTA comments (enclosed in angle brackets “< … >”). The notes will be appended as DELTA comments to the corresponding summary data. If you have already generated DELTA files, the settings used are saved automatically and you may restore them using the option Load last write settings. Finally, click the button DELTA settings to close the option panel.

Pressing the drop-down button EML settings in the Write EML group box opens the EML writer options. You can choose to include a special sign for empty column values or to enclose the column values in quotes (see right image above). Furthermore you may choose the column separator (tab stop or comma) and decide whether multiple categorical states shall be inserted as separate data columns. If you have already generated EML files, the settings used are saved automatically and you may restore them using the option Load last write settings. Finally, click the button EML settings to close the option panel.

 

Handling of special sequence data

SDD can handle molecular sequence data, but for DELTA export these data are exported as text data. To preserve the sequence-specific descriptor data, they are inserted into the text character as a special comment, e.g. “#6. Sequence descriptor <[SequenceCharacter][ST:N][SL:1][GS:-][/SequenceCharacter]>/”.
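Anyone who needs to read this special comment back from a DELTA file could extract the bracketed fields roughly as in the following Python sketch. It is based solely on the single example shown above; the meaning of the keys (ST, SL, GS) is not explained here and is not interpreted by the code, and the function name is hypothetical.

```python
import re

# The block delimiters and the [KEY:value] layout are taken from the example above.
BLOCK = re.compile(r"\[SequenceCharacter\](.*?)\[/SequenceCharacter\]")
FIELD = re.compile(r"\[([A-Z]+):([^\]]*)\]")


def parse_sequence_comment(text: str) -> dict[str, str]:
    """Return the key/value pairs of a sequence comment, or an empty dict if absent."""
    match = BLOCK.search(text)
    if match is None:
        return {}
    return dict(FIELD.findall(match.group(1)))


if __name__ == "__main__":
    line = "#6. Sequence descriptor <[SequenceCharacter][ST:N][SL:1][GS:-][/SequenceCharacter]>/"
    print(parse_sequence_comment(line))
```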

If the analysis tree includes sample data, they will be included as items at the end of the DELTA file. Those special items are named <description name> - <event name> - Unit <number>. Sampling event data will not be included in the DELTA file.