Diversity Agents

Import Export

The chapter below refers to DiversityCollection but is valid for DiversityAgents as well. The CacheDatabase is accessed via DiversityCollection.

An overview of some import and export options in the module DiversityCollection is shown in the image below. The options Import wizard, Export wizard, Archive and Replication are available in most of the modules.

Example for a data flow

The image below shows an example of a data flow from the original source to the final GBIF portal. As a first step the data are imported into the database via the Import wizard. After the data are released for publication, they are transferred into the cache database. From there they are transferred into a Postgres database containing a package for conversion into ABCD. Finally the BioCASE tool for mapping the data is used to provide the data for GBIF.

Jan 14, 2025


Archive

Creating an archive

The data related to a project can be exported into an archive. Choose Data - Archive - Create archive... from the menu. A window as shown below will open.

Select the project you want to archive and click on the Find the data button. The data related to the project will be imported into temporary tables so that you can inspect them before the archive is created (use the buttons to see the data). To create the archive, click on the Create the archive button. A directory will be created containing an XML file for every table. For a general introduction see the tutorial: Start video.

You can include the log data by selecting the option as described in the tutorial: Start video.

 

Resetting the database

Before you restore an archive, please make sure that the data from the archive do not interfere with the data in the database. To avoid problems you should remove all user data from the database. To do so, choose Data - Archive - Reset database... from the menu. A window as shown below will open listing all tables and the number of datasets within these tables. Click on the Reset database button to remove all of these data, including any data in the log tables.

 

Restoring an archive

To restore an archive choose Data - Archive - Restore archive... from the menu. A window as shown below will open listing the tables in the database. To restore an archive click on the Choose archive directory button and select the directory containing the archive files. Next click on the Read data button to import the data from the XML files into temporary tables.

With a click on the buttons you can inspect the content of the temporary tables. Finally click on the Restore from archive button. If you select the corresponding option, the import will stop and ask you how to proceed in case of an error.

You can include the log data by selecting the option as described in the tutorial: Start video.

Planning

Plan schedule based archive creation

To administrate the schedule based creation of archives choose Data - Archive - Administrate archives... from the menu. A window as shown below will open listing the projects in the database. Select the project that should be included in the schedule based creation of archives. To create an archive for all selected projects, click on the Create archives button. The protocol of a previous archiving is shown as in the image below. Successful runs are indicated with a green color while failures have a red background (see below).

 

Creation of xsd schemata

In addition to the data, the archive files contain an xsd description of the tables. To create xsd schemata independent of the content, select Data - Archive - Create schema from the menu. A window as shown below will open with the list of all tables, where the main tables of the database are preselected.

To change this selection you may use the all and none buttons or the Add to selection and Remove from selection options, using * as a wildcard. Click on the Create schemata button to create the schemata for the selected tables in the predefined directory. The button will open this directory containing the created files. The schemata contain the name of the DiversityWorkbench module and its version, the definition of the table, the primary key and the columns together with their datatype and description (see the example below).
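The wildcard selection can be pictured with a small Python sketch. It is only an illustration of how a * pattern adds or removes table names; the table names and both helper functions are hypothetical and not part of the application.

```python
from fnmatch import fnmatchcase

def add_to_selection(selected, all_tables, pattern):
    """Return the selection extended by every table name matching the pattern."""
    # Note: fnmatchcase also treats ? and [...] as wildcards, not only *.
    return set(selected) | {t for t in all_tables if fnmatchcase(t, pattern)}

def remove_from_selection(selected, pattern):
    """Return the selection without the table names matching the pattern."""
    return {t for t in selected if not fnmatchcase(t, pattern)}

# Hypothetical table names, chosen only for illustration.
tables = ["Agent", "AgentContact", "AgentImage", "ProjectList"]

selection = add_to_selection(set(), tables, "Agent*")
selection = remove_from_selection(selection, "*Contact")
print(sorted(selection))
```

Adding "Agent*" selects all three Agent tables, and removing "*Contact" then drops AgentContact again.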

Creation of archives as a background process

To archive the data in a scheduler based background process, you can start the application with the following arguments:

  • Archive
  • Server of the SQL-server database
  • Port of SQL-server
  • Database with the source data
  • Optional: Directory where the archive directories should be created

C:\DiversityWorkbench\DiversityAgents> DiversityAgents.exe Archive snsb.diversityworkbench.de 5432 DiversityAgents C:\DiversityWorkbench\DiversityAgents\Archive

The application will create the archives, generate the protocols as described above and quit automatically after the job is done. The user starting the process needs a Windows authentication with access to the SQL-Server database and proper rights to archive the data. If the last argument is not given the default directory …\Application directory\Archive\ will be used.
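For a scheduler that is driven by a script, the argument list described above could be assembled as in the following Python sketch. The order of the arguments follows the list above; the helper names build_archive_command and run_archive are hypothetical, and whether the application reports failures through its exit code is not stated in this manual.

```python
import subprocess

def build_archive_command(exe, server, port, database, target_dir=None):
    """Assemble the argument list: Archive, server, port, database, optional directory."""
    args = [exe, "Archive", server, str(port), database]
    if target_dir is not None:
        # Without this argument the application uses its default Archive directory.
        args.append(target_dir)
    return args

def run_archive(args):
    """Start the application and wait until it quits (assumes Windows authentication)."""
    return subprocess.run(args).returncode

cmd = build_archive_command(
    "DiversityAgents.exe",
    "snsb.diversityworkbench.de",
    5432,
    "DiversityAgents",
    r"C:\DiversityWorkbench\DiversityAgents\Archive",
)
print(" ".join(cmd))
```

The printed line corresponds to the command shown above; run_archive(cmd) would start the process from a scheduled task.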

 

 

 

May 16, 2024


Replication

The chapter below refers to DiversityCollection but is valid for DiversityAgents as well.

Replication

If you wish to work with your data on a local database (called subscriber), e.g. on your laptop, that is not linked to the database on a central server (called publisher), and these data should be synchronized with the data on the server, you may use the replication function of DiversityCollection. To install the database on your local computer see the installation section.

To use the replication function you need the role Replicator or Administrator.

Add Publisher

To define a publishing database choose Data → Replication → Add Publisher from the menu. A window will open where you choose the publisher. After the publisher is set, you may transfer data between your local database (subscriber) and the publisher. This function is only available for administrators.

Remove Publisher

To remove a publisher from the list choose Data → Replication → [Publisher] → Remove from the menu (where [Publisher] is the name of the publishing database on the publishing server). This function is only available for administrators.

Clean database

Initially you may wish to remove all previous data from your local database (subscriber). Choose Data → Replication → Clean database … from the menu. A window will open as shown below where you may choose the ranges which should be cleared:

  • Definitions = the basic definitions within the database, e.g. the available taxonomic groups.
  • Descriptions = the descriptions and their translations of the tables and columns of the database.
  • Project, User = the available projects and users.
  • Basic data = basic data like the collection.
  • Data = the specimen, organisms etc.

Choose the data ranges you wish to clear and click on the button. All tables which contain data will be listed as shown below.

 

Choose the tables which should be cleared and click on the Clean database button. Please keep in mind that you cannot delete data from a table as long as a related table still contains data that depend on the data you wish to delete. The sequence of the tables is organized to avoid these problems.
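Why the sequence of the tables matters can be shown with a short Python sketch: dependent tables have to be cleared before the tables they refer to. The table names and the dependency map are hypothetical; the application determines its own sequence.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: table -> tables it refers to (e.g. via foreign keys).
depends_on = {
    "Agent": set(),
    "AgentContact": {"Agent"},
    "AgentImage": {"Agent"},
}

def deletion_order(dependencies):
    """Return the tables so that every dependent table comes before its parent."""
    # static_order() lists referenced tables first; deletion needs the reverse.
    creation_order = list(TopologicalSorter(dependencies).static_order())
    return creation_order[::-1]

order = deletion_order(depends_on)
print(order)
```

In this order Agent is cleared last, after the tables that depend on it.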

Download

To download data from the publisher choose Data → Replication → [Publisher] → Download from the menu (where [Publisher] is the name of the publishing database on the publishing server). A form will open as shown below. Choose the project of the data and the data ranges (see above) which you wish to download. Click on the button to list the tables containing data. To start the download click on the Start download button. With the Force download, ignore conflicts option you can decide whether or not the data in your local database (= Subscriber) should be checked for changes before you download the data from the publisher.

If not all data should be included in the replication, you have the option to set a filter. Click on the button for the table where the data should be filtered to set this filter. A window as shown below will open.

All columns of the table will be listed and allow you to set the filter. To inspect the filtered data, click on the button. Click on the button to see the current filter. If a filter is set, this will be indicated with a blue background.

Merge

To merge data from your local subscriber database with the publisher you must first choose a project. Choose Data → Replication → [Publisher] → Merge from the menu ([Publisher] is the name of the publishing database on the publishing server). As described for the download, choose the data ranges and click on the button. To start the merge click on the Start merge button.

 

Upload

To transfer data from your local subscriber database to the publisher you must first choose a project. Choose Data → Replication → [Publisher] → Upload from the menu ([Publisher] is the name of the publishing database on the publishing server). As described for the download, choose the data ranges and click on the button. To start the upload click on the Start upload button. With the Force upload, ignore conflicts option you can decide whether or not the data on the server (= Publisher) should be checked for changes before you upload the data from your local database (= Subscriber).

As described for the download, data may be filtered with a click on the button (see above).

 

Tools

To fix problems that may interfere with the replication, some tools are available under Data → Replication → [Publisher] → Tools… in the menu ([Publisher] is the name of the publishing database on the publishing server). A window will open as shown below.

You may synchronize the RowGUIDs between basic subscriber and publisher tables if for any reason these are differing, e.g. due to manual insert. Choose the table that should be synchronized. The tables will be compared for both publisher and subscriber. The datasets with identical key but different RowGUID will be listed (see above). Click on the Start update button to synchronize the RowGUIDs.  
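The comparison behind this tool can be sketched in Python as follows. The sketch assumes that each table is available as a mapping from key to RowGUID; the function name and the sample values are hypothetical.

```python
def rowguid_mismatches(publisher, subscriber):
    """Return the keys present in both tables whose RowGUIDs differ."""
    shared_keys = publisher.keys() & subscriber.keys()
    return {
        key: (publisher[key], subscriber[key])
        for key in sorted(shared_keys)
        # GUIDs are compared case-insensitively (an assumption of this sketch).
        if publisher[key].lower() != subscriber[key].lower()
    }

# Hypothetical sample data: key -> RowGUID.
publisher = {1: "A1B2", 2: "C3D4", 3: "E5F6"}
subscriber = {1: "a1b2", 2: "FFFF", 4: "0000"}

print(rowguid_mismatches(publisher, subscriber))
```

Only key 2 is listed: key 1 differs in letter case only, and keys 3 and 4 exist in just one of the two tables.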

Conflict

If the transfer of data was successful, the numbers of the transferred data will be shown as below.

During the download or upload a conflict may occur, if the data has been edited in both databases. This will be indicated as shown below.

Click on the button to open a window as shown below where you can choose between the two versions of the data as found in the publisher and the subscriber database.

The conflicting columns are marked red. For text values the program will create a combination of both values (see above) in a merged version of the data. Choose the preferred version of the data and click on the Solve conflict button. If you cannot solve a conflict, use the Ignore conflict or Stop conflict resolution buttons respectively.
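As a rough illustration, the columns marked red correspond to the columns in which the two versions of a dataset differ. The Python sketch below only compares two versions and keeps the chosen one; how the program detects edits and builds the merged version is not covered, and the names and sample values are hypothetical.

```python
def conflicting_columns(publisher_row, subscriber_row):
    """Return the names of the columns whose values differ between both versions."""
    shared = publisher_row.keys() & subscriber_row.keys()
    return sorted(c for c in shared if publisher_row[c] != subscriber_row[c])

def resolve(publisher_row, subscriber_row, keep):
    """Return a copy of the version the user decided to keep."""
    if keep not in ("publisher", "subscriber"):
        raise ValueError("keep must be 'publisher' or 'subscriber'")
    return dict(publisher_row if keep == "publisher" else subscriber_row)

# Hypothetical versions of the same dataset.
pub = {"AgentID": 7, "AgentName": "Smith, John", "Notes": "checked"}
sub = {"AgentID": 7, "AgentName": "Smith, J.", "Notes": "checked"}

print(conflicting_columns(pub, sub))
print(resolve(pub, sub, keep="publisher"))
```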

 

Report

At the end of each transfer a report will be created with a summary for every table which has been included.

 

May 16, 2024


Export

  • Archive
  • Export via Table editors as tab separated text file or SQLite database
  • Export as CSV file via bcp
  • Export via CacheDatabase in DiversityCollection and DiversityDescriptions
  • Create a backup
May 16, 2024


Backup

The chapter below refers to the module DiversityCollection but is valid for DiversityAgents as well.

If you want to create a backup of your database, there are two options. You may either export the data as csv files to your local computer or create a backup on the server.

Export data as csv

To export your data as csv files to your local computer, choose Data → Export → CSV(bcp) ... from the menu. A window will open as shown below, where you can select the tables that should be exported. Click on the Start Export button to export your data. If you choose the option as shown below 2 files will be created for every table. The first file (*.csv) contains the data while the second file (*.xml) contains the structure of the table.

Create backup on the server

To create a backup of your database on the server, choose Data → Backup database from the menu. This will create a SQL-Server backup on the server where the database is located. Ensure that there is enough space on the server.

Another option is to create a direct copy of the database files on the server. For this you have to use the functions provided by SQL-Server. However, you need administration rights for the database you want to create a backup of. Open the Enterprise Manager for SQL-Server, choose the database and detach it from the server as shown in the image below.

After detaching the database, you can save a copy of the ..._Data.MDF file to keep it as a backup.

After storing the backup you have to reattach the database.

A dialog will appear where you have to select the original database file in your directory.

May 16, 2024


Export CSV

The chapter below refers to the module DiversityCollection but is valid for DiversityAgents as well.

To export the tables of the database in a tabulator, comma or semicolon separated format, choose Data → Export → Export CSV... from the menu. A window as shown below will open where you can select the tables to be exported in the sections Selection criteria and Tables for export.

A prerequisite for this export is that the bcp program is installed on your computer. This has either been installed together with the installation of SQL-Server or you have to install the Microsoft Command Line Utilities for SQL Server.
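Whether bcp can be found on your computer may be checked with a few lines of Python. The second function sketches what a bcp call for one table could look like; the flags -c, -t, -S and -T are standard bcp options, but the schema name dbo and the exact call issued by the application are assumptions of this sketch, and both function names are hypothetical.

```python
import shutil

def bcp_available():
    """Return True if a bcp executable is found on the search path."""
    return shutil.which("bcp") is not None

def build_bcp_export(database, table, out_file, server, separator="\t"):
    """Assemble a bcp call exporting one table in character mode."""
    return [
        "bcp", f"{database}.dbo.{table}", "out", out_file,
        "-c",             # character data
        "-t", separator,  # field separator (tab, comma or semicolon)
        "-S", server,     # SQL Server instance
        "-T",             # Windows authentication (trusted connection)
    ]

print("bcp found:", bcp_available())
# "myserver" and the file name are placeholders.
print(build_bcp_export("DiversityAgents", "Agent", "Agent.csv", "myserver", ";"))
```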

To start the export click on the Start export button. By default the data will be exported into a directory ...\Export\<database_name> below your application directory. Click on the button to select a different target directory before starting export.

 

After the export, the tables are marked with a green background if both the table schema and the data were exported successfully. If only the data were exported, the background is yellow; if nothing was exported, the background is red. A detailed export report can be viewed by clicking on the export result file name.

May 16, 2024


Export Wizard

The export wizard allows you to export the data selected in the main form. The data are exported as a tab-separated text file. The export may include transformations of the data as well as information provided by linked modules and webservices. Choose Data - Export - Export wizard from the menu and then select one of the export targets (Event, Specimen, ...). For a short introduction see the tutorial.

Adding tables

There are the following ways to add tables:

  • One parallel table
  • Several parallel tables according to selected data
  • Dependent table

All options will include the depending tables as defined for the default table. The option for several tables will add as many tables as are found in the data.

If you added parallel tables, you should set the sequence of the datasets within these tables: For the columns that should be used for sorting the data, set the ordering sequence to a value > 0 and choose whether the ordering sequence should be ascending or descending.

Certain columns in the database may provide information linked to another table, a module or a webservice. Click on the button to add a linked value.

Adding and editing file columns

To add columns to the exported file, use the buttons. In the textbox at the top of the file column, you can change the header for the column. To change the position of a file column, use the corresponding button. To fuse a column with the previous column, click in the gray bar on the left side of the column; the bar changes its appearance for fused columns. To remove a file column, use the button. Prefixes and postfixes for the columns can be entered directly in the corresponding fields. To apply transformations on the data click on the button.

Filter

To filter the exported data, use the filter function. Click on the button and enter the text for the filter. Only data matching the filter string will be exported. If a filter is set, the button will have a red background to remind you of the filter. The filter may be set for any number of columns you need for the restriction of the exported data.  

Rowfilter

In contrast to the filter above, this filter applies strictly to the row according to the sequence of the data. For an explanation see a short tutorial: Start video.

 

Test

To test the export choose the Test tab, set the number of lines that should be included in the test and click on the Test export button. To inspect the result in a separate window, click on the button.

SQL

If you want to inspect the SQL commands created during the test check this option. To see the generated SQL click on the SQL button after the Test export. A window containing all commands including their corresponding tables will be shown.

 

Export

To export your data to a file, choose the Export tab. If you want to store the file in a different place, use the button to choose the directory and edit the name of the file if necessary. Check the include a schema option if you want to save a schema together with your export. To start the export, click on the Export data button. To open the exported file, use the button.

 

Export to SQLite

To export your data into a SQLite database, choose the Export to SQLite tab. You may change the preset name of the database in order to keep previous exports. Otherwise you overwrite previous exports with the same filename. To start the export, click on the Export data   button. To view the exported data, use the button.

 

Schema

To handle the settings of your export, choose the Schema tab. To load a predefined schema, click on the button. To reset the settings to the default, click on the button. To save the current schema click on the button. With the button you can inspect the schema in a separate window.

May 16, 2024


Export Wizard Transformation

The exported data may be transformed, e.g. to adapt them to a format demanded by the user. Click on the button to open a window as shown below. For an introduction see a short tutorial: Start video.

Here you can enter six types of transformation that should be applied to your data: Cut out parts, Translate contents from the file, RegEx to apply regular expressions, Replace text, Calculations (Σ) and Filters on the data from the file. All transformations are applied in the sequence in which they were entered. Finally, if a prefix and/or a postfix are defined, these are added after the transformation. To remove a transformation, select it and click on the button.
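The order of processing can be illustrated with a minimal Python sketch: each transformation receives the result of the previous one, and prefix and postfix are attached at the very end. The function name and the two sample steps are hypothetical.

```python
def apply_transformations(value, transformations, prefix="", postfix=""):
    """Apply the transformations in the given sequence, then add prefix and postfix."""
    for transform in transformations:
        value = transform(value)
    return f"{prefix}{value}{postfix}"

# Two sample steps: a replacement followed by a cut at the new separator.
steps = [
    lambda v: v.replace(".", "-"),  # "14.07.2013" -> "14-07-2013"
    lambda v: v.split("-")[1],      # second part   -> "07"
]

result = apply_transformations("14.07.2013", steps, prefix="Month ", postfix=".")
print(result)
```

The second step relies on the separator introduced by the first one, so swapping the two steps would fail; the sequence of the transformations is therefore part of the result.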

 

Cut

With the cut transformation you can restrict the data taken from the file to a part of the text in the file. This is done by splitters and the position after splitting. In the example below, the month of a date should be extracted from the information. To achieve this, the splitter '.' is added and then the position is set to 2. With the Seq button you can change the direction of the sequence, counting either from the first position or from the last position. Click on the button Test the transformation to see the result of your transformation.

With the Start at Pos. option the given splitters will be converted into space (' ') and the whole string starting with the given position will be used (see below).
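The two variants of the cut transformation can be sketched in Python as follows. The function names are hypothetical, and the handling of positions outside the available parts (an empty result) is an assumption of this sketch.

```python
import re

def _split(value, splitters):
    """Split the value at any of the given splitter strings."""
    if not splitters:
        return [value]
    pattern = "|".join(re.escape(s) for s in splitters)
    return re.split(pattern, value)

def cut(value, splitters, position, from_end=False):
    """Return the part at the given 1-based position, counted from the start or the end."""
    parts = _split(value, splitters)
    if from_end:
        parts.reverse()
    if 1 <= position <= len(parts):
        return parts[position - 1]
    return ""  # assumption: nothing is exported for a missing position

def cut_start_at(value, splitters, position):
    """Replace the splitters by spaces and keep everything from the given position on."""
    return " ".join(_split(value, splitters)[position - 1:])

print(cut("14.07.2013", ["."], 2))                 # month
print(cut("14.07.2013", ["."], 1, from_end=True))  # year, counted from the end
print(cut_start_at("14.07.2013", ["."], 2))        # month and year
```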

 

Translate

The translate transformation translates values from the file into values entered by the user. In the example above, the values of the month should be translated from Roman into numeric notation. To do this, click on the button to add a translation transformation (see below). To list all different values present in the data, click on the button. A list as shown below will be created. You may also use the corresponding buttons to add or remove values from the list or to clear the list. Then enter the translations as shown below. Use the save button to save entries and the Test the transformation button to see the result.

To load a predefined list for the transformation use the   button. A window as shown below will open. Choose the Encoding of the data in your translation source and indicate if the First line contains column definition. Click OK to use the values from the file for the translation.
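A translation list is essentially a lookup table. The Python sketch below reads such a list from tab-separated text and applies it; the two-column layout of the source and the behaviour for values without a translation (passed through unchanged) are assumptions of this sketch, and the function names are hypothetical.

```python
import csv
import io

def load_translations(stream, first_line_is_header=False):
    """Read a two-column, tab-separated list: original value, translated value."""
    rows = list(csv.reader(stream, delimiter="\t"))
    if first_line_is_header:
        rows = rows[1:]
    return {row[0]: row[1] for row in rows if len(row) >= 2}

def translate(value, translations):
    """Return the translation of the value, or the value itself if none is defined."""
    return translations.get(value, value)

# For a real file: open(path, encoding="utf-8", newline="") with the chosen encoding.
source = io.StringIO("Roman\tNumeric\nI\t1\nVII\t7\nXII\t12\n")
months = load_translations(source, first_line_is_header=True)

print(translate("VII", months))
print(translate("XIII", months))
```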

 

Regular expression

The transformation using regular expressions will transform the values according to the entered Regular expression and Replace by values. For more details please see the documentation on regular expressions.
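The principle of a pattern with a replacement can be tried out in Python as sketched below. The example uses Python's regular expression syntax; the dialect used by the application may differ, e.g. in how captured groups are referenced in the Replace by value. The function name is hypothetical.

```python
import re

def regex_transform(value, regular_expression, replace_by):
    """Replace every match of the regular expression by the replacement."""
    return re.sub(regular_expression, replace_by, value)

# Reorder a date from day.month.year to year-month-day using three groups.
print(regex_transform("14.07.2013", r"(\d{2})\.(\d{2})\.(\d{4})", r"\3-\2-\1"))
```

Values that do not match the expression are left unchanged by re.sub.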

 

Replacement

The replacement transformation replaces any text in the data by a text specified by the user. In the example shown below, the text "." is replaced by "-". 

 

Calculation 

The calculation transformation Σ performs a calculation on numeric values, dependent on an optional condition. In the example below, 2 calculations were applied to convert 2-digit values into 4-digit years.
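A conversion of 2-digit into 4-digit years by two conditional calculations could look like the following Python sketch. The thresholds (values below 30 belong to the 2000s, other values below 100 to the 1900s) are an assumption chosen for illustration and are not taken from the example image; the names are hypothetical.

```python
import operator

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}
COMPARISONS = {"<": operator.lt, ">": operator.gt, "=": operator.eq}

def calculate(value, operator_symbol, operand, condition=None):
    """Apply the calculation if the optional condition (comparison, threshold) holds."""
    if condition is not None:
        comparison, threshold = condition
        if not COMPARISONS[comparison](value, threshold):
            return value  # condition not met: value stays unchanged
    return OPERATORS[operator_symbol](value, operand)

def to_four_digit_year(value):
    """Two calculations applied one after the other."""
    value = calculate(value, "+", 2000, condition=("<", 30))
    value = calculate(value, "+", 1900, condition=("<", 100))
    return value

for year in (13, 85, 1999):
    print(year, "->", to_four_digit_year(year))
```

The sequence matters here: after the first calculation the value 13 has become 2013, so the second condition no longer applies to it.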

 

Filter 

The filter transformation compares the values from the data with a value entered by the user. As a result you can either Export content into file or Export fixed value. To select another column that should be compared, click on the button and choose a column from the file in the window that will open. If the column that should be compared is not the column of the transformation, the number of the column will be shown instead of the symbol. To add further filter conditions use the button. For the combination of the conditions you can choose among AND and OR. 
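The logic of the filter transformation can be sketched in Python as shown below. The sketch assumes that nothing (an empty text) is exported when the conditions are not met, that columns are addressed by their position in the row, and that values are compared as text; the function name and the set of comparisons are hypothetical.

```python
import operator

COMPARISONS = {"=": operator.eq, "<>": operator.ne, "<": operator.lt, ">": operator.gt}

def filter_transform(row, own_column, conditions, combine="AND", fixed_value=None):
    """Export the content of own_column or a fixed value if the conditions are met.

    conditions is a list of (column_index, comparison, value) tuples.
    """
    results = [COMPARISONS[cmp](row[col], value) for col, cmp, value in conditions]
    matched = all(results) if combine == "AND" else any(results)
    if not matched:
        return ""  # assumption: no content is exported
    return row[own_column] if fixed_value is None else fixed_value

# Hypothetical row: name, country, year.
row = ["Smith", "DE", "1985"]

print(filter_transform(row, 0, [(1, "=", "DE")]))
print(filter_transform(row, 0, [(1, "=", "US"), (2, "=", "1985")], combine="OR"))
print(filter_transform(row, 0, [(1, "=", "DE")], fixed_value="German author"))
```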

 

 

 

 

 

 

May 16, 2024


Export Wizard Tutorial

This tutorial demonstrates the export of a small sample from the database. For an introduction see a short tutorial: Start video.

Choosing the data

In the main form, select the data that should be exported (only the data displayed in the query results are exported).

Exporting the data

Choose Data → Export → Wizard → Organism ... from the menu. A window as shown below will open where the available tables for export are listed in the upper left area. To show the data columns of a table, select this table in the list.

 

Adding additional tables

In this example, we want to add as many parallel identification tables as present in the data. To do this, click on the button of the Identification table. At the end of the list (depending on your data) the additional tables are added (see below).

 

Setting the sequence for the tables

To set the sequence of the Identifications, select the first table and, for the column IdentificationSequence, set the sorting sequence to 1 and the direction for sorting to descending.

 

Choosing data from linked modules

Some columns provide the possibility to add data from linked tables or modules. In this example we choose the column NameURI linking to the module DiversityTaxonNames (see below).

To provide linked values, click on the button. A window as shown below will open, where you can choose among the provided services.

After the service is selected, you will be asked for the value provided by the service (see below).

Now the selected link is added underneath the column as shown below. You can add as many links as you need for your export.

For some modules there are values that refer to other modules with a name like [Link to ...] as shown in the example below.

If you select one of these values, you will be asked to select the service or database linked to this module (see below)

... and then to select one of the provided columns (see below).

Within the form these linked values will be marked as shown below. If several results are retrieved, these will be separated by " | ".

 

Adding columns to the file

To add columns to the exported file, click on the buttons for the columns resp. linked values. In this example select all Family values and the TaxonomicName (see below).

 

Fusing columns

The families should appear as one column and as the sources can exist only once for each identification we can fuse these columns. To do so, click on the delimiters between these columns (see below).

 

Setting the headers

By default the headers for the exported data are set according to the names of the columns in the database. To change this, edit them as shown below, where TaxonomicName has been changed to Taxon. For fused columns only the header in the first column will be used.

 

Testing

To test the export, click on the Test export button. The result depends on the content of your data but should look similar to the example shown below.

 

Export

To finally export the data, choose the Export tab. By default the data will be exported into a tab-separated file in a directory within the application directory (see below). You can change the directory (click on the button). You can choose the Include schema option to create a schema that you may reuse in a later export.

May 16, 2024


Import

There are several import mechanisms available for most modules. Some modules like DiversityDescriptions have special import mechanisms.

Import wizard

For tab-separated lists: Import data from foreign sources and attach further data to existing data sets in the database.

Replication

Replication with a local database.

Archive

Restoring an archive.

May 16, 2024


Import wizard

The examples below are from the module DiversityAgents, but are valid for any other module as well.

With the current solution please ensure that there are no concurrent imports into the same database.

For some imports, e.g. for Collections in DiversityCollection, you will be reminded to update the cache tables for the hierarchies.

With this import routine, you can import data from text files (as tab-separated lists) into the database. A short introduction is provided in a video: Start video. Choose Data → Import → Wizard → Agent from the menu. A window as shown below will open that will lead you through the import of the data. The window is separated into 3 areas. On the left side, you see a list of possible data-related import steps according to the type of data you choose for the import. On the right side you see the list of currently selected import steps. In the middle part the details of the selected import steps are shown.

Choosing the File and Settings

  • File: As a first step, choose the File from where the data should be imported. The currently supported format is tab-separated text. Choosing a file will automatically set the default directory for the import files. To avoid setting this directory, deselect the option Adapt default directory in the context menu of the button to open the file.
  • Encoding: Choose the Encoding of the file, e.g. Unicode. The preferred encoding is UTF8.
  • Lines: The Start line and End line will automatically be set according to your data. You may change these to restrict the data lines that should be imported. The parts of the file that are not imported are indicated with a gray background as shown below.
  • First line: The option First line contains the column definition determines whether the first line is excluded from the import.
  • Duplicates: To avoid duplicate imports you can Use the default duplicate check. For an explanation see a video: Start video.
  • Language: If your data contains e.g. date information where notations differ between countries (e.g. 31.4.2013 - 4.31.2013), choose the Language / Country to ensure a correct interpretation of your data.
  • Line break: With the option Translate \r\n to line break the character sequence \r\n in the data will be translated into a line break in the database.
  • SQL statements: To save all SQL statements that are generated during a test or import, you can check the option Record all SQL statements. See the video: Start video.
  • Schema: Finally you can select a prepared Schema (see chapter Schema below) for the import.
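How some of these settings act on the file can be sketched in Python. The sketch covers the line range, the first line with the column definition, the translation of \r\n and the language-dependent interpretation of dates. All function names, the two date formats and the use of a plain line feed as the resulting line break are assumptions made for illustration.

```python
from datetime import datetime

def select_lines(lines, start_line, end_line, first_line_is_header=True):
    """Return the lines within the 1-based range, skipping a header in line 1."""
    selected = []
    for number, line in enumerate(lines, start=1):
        if number == 1 and first_line_is_header:
            continue
        if start_line <= number <= end_line:
            selected.append(line)
    return selected

def translate_line_breaks(value):
    """Turn the literal character sequence \\r\\n into a real line break."""
    return value.replace("\\r\\n", "\n")

# Assumed notations: day first for DE, month first for US.
DATE_FORMATS = {"DE": "%d.%m.%Y", "US": "%m.%d.%Y"}

def parse_date(text, language):
    """Interpret the text according to the notation of the chosen language."""
    return datetime.strptime(text, DATE_FORMATS[language]).date()

lines = ["Name\tNotes", "Smith\tfirst\\r\\nsecond", "Doe\t", "Miller\t"]
print(select_lines(lines, 2, 3))
print(translate_line_breaks("first\\r\\nsecond"))
print(parse_date("03.04.2013", "DE"), parse_date("03.04.2013", "US"))
```

The same text 03.04.2013 is read as 3 April with the DE notation and as 4 March with the US notation, which is why the language setting matters.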

Choosing the data ranges

In the selection list on the left side of the window (see below) all possible import steps for the data are listed according to the type of data you want to import.

The import of certain tables can be paralleled. To add parallels click on the button (see below). To remove parallels, use the button. Only selected ranges will appear in the list of the steps on the right (see below).

To import information of logging columns like who created and changed the data, click on the include logging columns button in the header line. This will include additional substeps for every step containing the logging columns (see below). If you do not import these data, they will be automatically filled by default values like the current time and user.

Attaching data

You can either import your data as new data or Attach them to data in the database. Select the import step Attachment from the list. All tables that are selected and contain columns to which you can attach data are listed (see below). Either choose the first option Import as new data or one of the attachment columns offered, like SeriesCode in the table Series in the example below.

If you select a column for attachment, this column will be marked with a blue background (see below and chapter Table data).

Merging data

You can either import your data as new data or Merge them with data in the database. Select the import step Merge from the list. For every table you can choose between Insert, Merge, Update and Attach (see below).

The Insert option will import the data from the file independent of existing data in the database.

The Merge option will compare the data from the file with those in the database according to the Key columns (see below). If no matching data are found in the database, the data from the file will be imported. Otherwise the data will be updated.

The Update option will compare the data from the file with those in the database according to the Key columns. Only matching data found in the database will be updated.

The Attach option will compare the data from the file with those in the database according to the Key columns. The found data will not be changed, but used as a reference data in depending tables. 

Empty content will be ignored, e.g. for the Merge or Update option. To remove content you have to enter the value NULL. As long as the column allows empty values, the content will be removed using the NULL value.
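The four options can be summarized in a small Python sketch that treats a table as a list of rows. It is a simplified model, not the import routine itself: the function name, the column names AgentID and Notes and the result "skipped" for Update or Attach without matching data are assumptions of this sketch.

```python
def import_row(table, row, key_columns, mode):
    """Handle one row from the file according to Insert, Merge, Update or Attach."""
    if mode not in ("Insert", "Merge", "Update", "Attach"):
        raise ValueError(f"unknown mode: {mode}")

    def key_of(r):
        return tuple(r.get(c) for c in key_columns)

    match = None
    if mode != "Insert":  # Insert ignores existing data
        match = next((r for r in table if key_of(r) == key_of(row)), None)

    if mode == "Insert" or (mode == "Merge" and match is None):
        # Empty content is ignored, the text NULL stands for "no value".
        table.append({c: (None if v == "NULL" else v) for c, v in row.items() if v != ""})
        return "inserted"
    if match is None:
        return "skipped"   # assumption: nothing happens without matching data
    if mode == "Attach":
        return "attached"  # found data stay unchanged and serve as reference
    for column, value in row.items():  # Merge or Update with matching data
        if value == "":
            continue                   # empty content is ignored
        match[column] = None if value == "NULL" else value
    return "updated"

table = [{"AgentID": 1, "AgentName": "Smith, J.", "Notes": "old"}]
print(import_row(table, {"AgentID": 1, "AgentName": "", "Notes": "NULL"}, ["AgentID"], "Update"))
print(import_row(table, {"AgentID": 2, "AgentName": "Doe, J."}, ["AgentID"], "Merge"))
print(table)
```

In the first call the empty AgentName leaves the stored name untouched, while the value NULL removes the content of Notes.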

Table data

To set the source for the columns in the file, select the step of a table listed underneath the Merge step. All columns available for importing data will be listed in the central part of the window. In the example shown below, the first column is used to attach the new data to data in the database.

A reminder in the header line will show you which actions are still needed to import the data into the table:

  • Please select at least one column   = No column has been selected so far.
  • Please select at least one decisive column   = Whether data are imported depends on the content of decisive columns, so at least one must be selected.
  • Please select the position in the file   = The position in the file must be given if the data for a column should be taken from the file.
  • Please select at least one column for comparison   = For all merge types other than insert, columns for comparison with data in the database are needed.
  • From file or For all   = For every column you have to decide whether the data are taken from the file or a value is entered for all datasets.
  • Please select a value from the list   = You have to select a value from the provided list.
  • Please enter a value   = You have to enter a value used for all datasets.

The handling of the columns is described in the chapter columns.

Testing

To test if all requirements for the import are met, use the Testing step. You can use a certain line in the file for your test and then click on the Test data in line: button. If there are still unmet requirements, these will be listed in a window as shown below.

If finally all requirements are met, the testing function will try to write the data into the database and display any errors that occurred as shown below. All datasets marked with a red background produced an error.

To see the list of all errors, double click in the error list window in the header line (see below).

If finally no errors are left, your data are ready for import. The colors in the table nodes in the tree indicate the handling of the datasets:

  • INSERT
  • MERGE
  • UPDATE
  • No difference
  • Attach
  • No data

The colors of the table columns indicate whether a column is decisive, a key column or an attachment column.

If you suspect that the import file contains data already present in the database, you may test this and extract only the missing lines into a new file. Choose the attachment column (see chapter Attaching data) and click on the button Check for already present data. The data already present in the database will be marked red (see below). Click on the button Save missing data as text file to store the data not present in the database in a new file for the import. The import of agents contains the option Use default duplicate check for AgentName that is selected by default. For this option to take effect, the column AgentName must be filled according to the generation of the name by the insert trigger of the table Agent (InheritedNamePrefix + ' ' + Inheritedname + ', ' + GivenName  + ' ' + GivenNamePostfix + ', ' + InheritedNamePostfix + ', ' + AgentTitle - for details, see the documentation of the database).

If you happen to get a file with a content as shown below, you may have selected the wrong encoding or the encoding is incompatible. Please try to save the original file as UTF8 and select this encoding for the import.
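The separation of already present and missing lines can be pictured with the following Python sketch. It assumes that the values of the attachment column found in the database are available as a set; the function names and sample values are hypothetical.

```python
import csv
import io

def split_by_presence(rows, attachment_column, existing_values):
    """Separate the rows whose attachment value is already present from the others."""
    present, missing = [], []
    for row in rows:
        target = present if row[attachment_column] in existing_values else missing
        target.append(row)
    return present, missing

def to_tab_separated(header, rows):
    """Write the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([row[column] for column in header])
    return buffer.getvalue()

# Hypothetical file content and database values.
rows = [{"AgentName": "Smith, John"}, {"AgentName": "Doe, Jane"}]
in_database = {"Smith, John"}

present, missing = split_by_presence(rows, "AgentName", in_database)
print(to_tab_separated(["AgentName"], missing))
```

Only the line for "Doe, Jane" ends up in the new text, which can then be used as the import file.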

Import

With the last step you can finally start to import the data into the database. If you want to repeat the import with the same settings and data of the same structure, you can save a schema of the current settings (see below). Optionally, you can include a description of your schema, and with the button you can generate a file containing only the description.


Schedule for import of tab-separated text files into DiversityAgents

  • Target within DiversityAgents: Agent
  • Database version: 02.01.13
  • Schedule version: 1
  • Use default duplicate check:
  • Lines: 2 - 7
  • First line contains column definition:
  • Encoding: UTF8
  • Language: US

Lines that could not be imported will be marked with a red background while imported lines are marked green (see below).

If you want to save lines that produce errors during the import in a separate file, use the Save failed lines option. The protocol of the import will contain all settings according to the used schema and an overview containing the number of inserted, updated, unchanged and failed lines (see below).

Description

A description of the schema may be included in the schema itself or, with a click on the Import button, generated as a separate file. This file will be located in a separate directory Description to avoid confusion with import schemas. An example for a description file is shown below, containing common settings, the treatment of the file columns and interface settings as defined in the schema.