Import Wizard

Import wizard for tab separated lists
With this import routine, you can import data from text files (as
tab-separated lists) into the database. For an introduction see a short
tutorial.
Choose Data -> Import -> Wizard and then the type of data that should be
imported, e.g. import Series ..., from the menu. A window as shown below
will open that leads you through the import of the data. The window is
divided into 3 areas. On the left side, you see a list of possible
data-related import steps according to the type of data you chose for
the import. On the right side you see the list of currently selected
import steps. In the middle part the details of the selected import
steps are shown.

Choosing the File
As a first step, choose the File from which the data should be imported.
For an introduction see a short tutorial. The currently supported format
is tab-separated text. Then choose the Encoding of the file, e.g.
Unicode. The preferred encoding is UTF8. The Start line and End line
will automatically be set according to your data. You may change these
to restrict the data lines that should be imported. The parts of the
file that are not imported are indicated with a gray background as shown
below. If the option First line contains the column definition is
selected, this line will not be imported either. If your data contain
e.g. date information where notations differ between countries (e.g.
31.4.2013 - 4.31.2013), choose the Language / Country to ensure a
correct interpretation of your data. Finally you can select a prepared
Schema (see chapter Schema below) for the import.
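
The effect of the Start line, End line and Language / Country settings
can be illustrated with a small Python sketch. It is not part of the
software; the function names read_lines and parse_date and the two date
formats are assumptions made only for this illustration.

```python
import csv
import io
from datetime import date, datetime

def read_lines(text, start_line, end_line, first_line_is_header=True):
    """Return (header, rows) for a 1-based, inclusive range of file lines."""
    all_rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header = all_rows[0] if first_line_is_header else None
    # A header line is never imported, even if it lies inside the range.
    first_data_line = 2 if first_line_is_header else 1
    start = max(start_line, first_data_line)
    return header, all_rows[start - 1:end_line]

def parse_date(value, day_first):
    """Interpret a dotted date as day.month.year or month.day.year."""
    fmt = "%d.%m.%Y" if day_first else "%m.%d.%Y"
    return datetime.strptime(value, fmt).date()
```

With this sketch, parse_date("3.4.2013", day_first=True) yields 3 April
2013, while day_first=False yields 4 March 2013, which is why the
country setting matters for ambiguous dates.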

Choosing the data ranges
In the selection list on the left side of the window (see below) all
possible import steps for the data are listed according to the type of
data you want to import.

Certain tables can be imported in parallel. To add parallels click on
the button for adding parallels (see below). To remove parallels, use
the button for removing them. Only selected ranges will appear in the
list of the steps on the right (see below).

To import the information of logging columns, e.g. who created and
changed the data, click on the button in the header line. This will
include an additional substep for every step containing the logging
columns (see below). If you do not import these data, they will be
automatically filled with default values like the current time and
user.

Attaching data
You can either import your data as new data or Attach them to data in
the database. For an introduction see a short tutorial. Select the
import step Attachment from the list. All tables that are selected and
contain columns at which you can attach data are listed (see below).
Either choose the first option Import as new data or one of the
attachment columns offered, like SeriesCode in the table Series in the
example below.

If you select a column for attachment, this column will be marked with a
blue background (see below and chapter Table data).

Merging data
You can either import your data as new data or Merge them with data in
the database. For an introduction see a short tutorial. Select the
import step Merge from the list. For every table you can choose between
Insert, Merge, Update and Attach (see below).
The Insert option will import the data from the file independent of
existing data in the database.
The Merge option will compare the data from the file with those in the
database according to the Key columns (see below). If no matching data
are found in the database, the data from the file will be imported.
Otherwise the data will be updated.
The Update option will compare the data from the file with those in the
database according to the Key columns. Only matching data found in the
database will be updated.
The Attach option will compare the data from the file with those in the
database according to the Key columns. The data found will not be
changed, but used as reference data in dependent tables.
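
The difference between the four options can be sketched in Python on an
in-memory list of rows. This is only an illustration of the behaviour
described above, not the software's implementation; the function name
apply_row and the returned labels are hypothetical, and the handling of
an Attach without a match is an assumption.

```python
def apply_row(table, row, key_columns, mode):
    """Apply one file row to a table (a list of dicts) and report the result."""
    key = tuple(row[c] for c in key_columns)
    match = next(
        (r for r in table if tuple(r[c] for c in key_columns) == key), None
    )
    if mode == "insert":
        table.append(dict(row))      # imported regardless of existing data
        return "inserted"
    if mode == "merge":
        if match is None:
            table.append(dict(row))  # no match: import as new data
            return "inserted"
        match.update(row)            # match: update the existing data
        return "updated"
    if mode == "update":
        if match is None:
            return "skipped"         # only matching data are updated
        match.update(row)
        return "updated"
    if mode == "attach":
        # The matching data are left unchanged and only serve as reference.
        # Assumption: a row without a match is skipped.
        return "attached" if match is not None else "skipped"
    raise ValueError(f"unknown mode: {mode}")
```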

Empty content will be ignored, e.g. for the Merge or Update option. To
remove content you have to enter the value NULL. As long as the column
allows empty values, the content will be removed using the NULL value.
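
The treatment of empty content and of the value NULL can be sketched as
follows. The function name values_to_write is hypothetical and the
sketch assumes that every column allows empty values.

```python
def values_to_write(file_row):
    """Return the column values that an update would actually write."""
    changes = {}
    for column, value in file_row.items():
        if value == "":
            continue                 # empty content is ignored
        if value == "NULL":
            changes[column] = None   # explicit NULL removes the content
        else:
            changes[column] = value
    return changes
```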
Table data
To set the source for the columns in the file, select the step of a
table listed underneath the Merge step. All columns available for
importing data will be listed in the central part of the window. In the
example shown below, the first column is used to attach the new data to
data in the database.

A reminder in the header line will show you what actions are still
needed to import the data into the table:
- Please select at least one column = No column has been selected so
far.
- Please select at least one decisive column = Whether data will be
imported depends on the content of decisive columns, so at least one
must be selected.
- Please select the position in the file = The position in the file
must be given if the data for a column should be taken from the file.
- Please select at least one column for comparison = For all merge
types other than Insert, columns for comparison with data in the
database are needed.
- From file or For all = For every column you have to decide whether
the data are taken from the file or a value is entered for all
datasets.
- Please select a value from the list = You have to select a value from
the provided list.
- Please enter a value = You have to enter a value used for all
datasets.
The handling of the columns is described in the chapter Columns.
Testing
To test if all requirements for the import are met, use the Testing
step. You can use a certain line in the file for your test and then
click on the Test data in line: button. If there are still unmet
requirements, these will be listed in a window as shown below.

If finally all requirements are met, the testing function will try to
write the data into the database and display any errors that occurred as
shown below. All datasets marked with a red background produced an
error.

To see the list of all errors, double-click in the error list window in
the header line (see below).

If finally no errors are left, your data are ready for import. The
colors of the table nodes in the tree indicate the handling of the
datasets: INSERT, MERGE, UPDATE, No difference, Attach, No data. The
colors of the table columns indicate whether a column is decisive, a
key column or an attachment column.
If you suspect that the import file contains data already present in
the database, you may test this and extract only the missing lines into
a new file. Choose the attachment column (see chapter Attaching data)
and click on the button Check for already present data. The data already
present in the database will be marked red (see below). Click on the
button Save missing data as text file to store the data not present in
the database in a new file for the import. The import of specimens
contains the option Use default duplicate check for AccessionNumber,
which is selected by default.
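
Conceptually, the check for already present data separates the lines of
the file by the value in the attachment column. A minimal sketch,
assuming the values already present in the database are available as a
set; the function name split_by_presence is hypothetical:

```python
def split_by_presence(file_rows, existing_values, attachment_column):
    """Separate file rows into those already present and those missing."""
    present, missing = [], []
    for row in file_rows:
        if row[attachment_column] in existing_values:
            present.append(row)   # would be marked red in the wizard
        else:
            missing.append(row)   # would go into the new text file
    return present, missing
```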

If you happen to get a file with content as shown below, you may have
selected the wrong encoding or the encoding is incompatible. Please try
to save the original file as UTF8 and select this encoding for the
import.
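
If no editor is at hand, a file can also be converted to UTF8 with a few
lines of Python. This is a sketch that assumes you know the original
encoding of the file (cp1252 is used only as an example); the function
names are hypothetical.

```python
from pathlib import Path

def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from the original encoding and encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

def resave_as_utf8(source_path, target_path, source_encoding="cp1252"):
    """Write a UTF-8 copy of the file, leaving the original untouched."""
    raw = Path(source_path).read_bytes()
    Path(target_path).write_bytes(to_utf8(raw, source_encoding))
```

Decoding with the wrong source encoding either raises an error or
produces garbled characters, so the result should be checked before the
import.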

Import
With the last step you can finally start to import the data into the
database. If you want to repeat the import with the same settings and
data of the same structure, you can save a schema of the current
settings (see below). You can optionally include a description of your
schema, and with the button you can generate a file containing only the
description.
Lines that could not be imported will be marked with a red background
while imported lines are marked green (see below).

If you want to save lines that produce errors during the import in a
separate file, use the Save failed lines option. The protocol of the
import will contain all settings according to the used schema and an
overview containing the number of inserted, updated, unchanged and
failed lines (see below).

Description
A description of the schema may be included in the schema itself or,
with a click on the button, generated as a separate file. This file will
be located in a separate directory Description to avoid confusion with
import schemas. An example of a description file is shown below,
containing common settings, the treatment of the file columns and
interface settings as defined in the schema.
Import Wizard Columns


If the content of a file should be imported into a certain column of a
table, mark it with the checkbox.
Decisive columns
The import depends on the data found in the file, where certain columns
can be selected as decisive. Only those lines will be imported where
data are found in any of these decisive columns. To mark a column as
decisive, click on the icon at the beginning of the line (see below).

In the example shown below, the file column Organims 2 was marked as
decisive. Therefore only the two lines containing content in this
column will be imported.
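
The rule for decisive columns can be written as a short filter. This is
a sketch with hypothetical names; it assumes that content consisting
only of whitespace counts as empty.

```python
def lines_to_import(rows, decisive_columns):
    """Keep only rows with content in at least one decisive column."""
    return [
        row for row in rows
        if any(row.get(column, "").strip() for column in decisive_columns)
    ]
```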

Key columns
For the options Merge, Update and Attach the import compares the data
from the file with those already present in the database. This
comparison is done via key columns. To make a column a key column,
click on the icon at the beginning of the line. You can define as many
key columns as you need to ensure a valid comparison of the data.
Source
The data imported into the database can either be taken From file, or
the same value that you enter into the window or select from a list can
be used For all datasets. If you choose the From file option, a window
as shown below will pop up. Just click in the column from which the data
should be taken and click OK (see below).

If you choose the For all option, you can either enter text, select a
value from a list or use a checkbox for YES or NO.
The data imported may be transformed, e.g. to adapt them to a format
demanded by the database. For further details please see the chapter
Transformation.
Copy
If data in the source file are missing in subsequent lines as shown
below,

you can use the Copy line option to fill in missing data as shown below,
where the blue values are copied into empty fields during the import.
Click on the button to ensure that missing values are filled in from
previous lines.
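
The Copy line behaviour corresponds to filling empty fields with the
last value seen in the same column. A minimal sketch with hypothetical
names, assuming that an empty string marks a missing value:

```python
def copy_line(rows, columns):
    """Fill empty fields in the given columns from the previous lines."""
    last_seen = {}
    filled = []
    for row in rows:
        new_row = dict(row)
        for column in columns:
            if new_row.get(column, "") == "":
                # Nothing to copy yet on the first lines: stays empty.
                new_row[column] = last_seen.get(column, "")
            else:
                last_seen[column] = new_row[column]
        filled.append(new_row)
    return filled
```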

Prefix and Postfix
In addition to the transformation of the values from the file, you may
add a prefix and a postfix. These will be added after the transformation
of the text. Double-click in the field to see or edit the content. The
prefix and postfix values will only be used if the file contains data
for the current position.
Column selection
If, for any reason, a column that should take its content from the
imported file lacks the position in the file, or you want to change the
position, click on the button. In case a position is present, this
button will show the number of the column. A window as shown below will
pop up where you can select and change the position in the file.

Multi column
The content of a column can be composed from the content of several
columns in the file. To add additional file columns, click on the
button. A window as shown below will pop up, showing you the columns
selected so far, where the sequence is indicated in the header line. The
first column is marked with a blue background while the added columns
are marked with a green background (see below).

To remove an added column, use the button (see below).

The button opens a window displaying the information about the column.
For certain datatypes additional options are included (see Pre- and
Postfix).

Import Wizard Transformation

The data imported may be transformed, e.g. to adapt them to a format
demanded by the database. A short introduction is provided in a video.
Click on the button to open a window as shown below.

Here you can enter the types of transformation that should be applied
to your data: Cut out parts, Translate contents from the file, RegEx
(apply regular expressions) or Replace text in the data from the file.
The further types Calculation and Filter are described below. All
transformations will be applied in the sequence in which they were
entered. Finally, if a prefix and/or a postfix are defined, these will
be added after the transformation. To remove a transformation, select
it and click on the button.
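
The order of processing can be sketched as a small pipeline in Python.
The function name transform and the representation of transformations
as plain functions are assumptions for this illustration only.

```python
def transform(value, steps, prefix="", postfix=""):
    """Apply the steps in the order given, then add prefix and postfix."""
    for step in steps:
        value = step(value)
    # Prefix and postfix are only used if there is data to attach them to.
    return f"{prefix}{value}{postfix}" if value else value

# Two example steps: cut out the second part, then replace a text.
steps = [
    lambda v: v.split(".")[1],
    lambda v: v.replace("IV", "4"),
]
```

Because every step receives the result of the previous one, changing
the sequence of the transformations can change the final value.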
Cut
With the cut transformation you can restrict the data taken from the
file to a part of the text in the file. This is done by splitters and
the position after splitting. In the example below, the month of a date
should be extracted from the information. To achieve this, the splitter
'.' is added and then the position is set to 2. You can change the
direction of the sequence with the button Seq, starting either at the
first position or at the last position. Click on the button Test the
transformation to see the result of your transformation.
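
In Python terms, the cut transformation resembles splitting a string and
picking one part. The sketch below uses a single splitter and a
hypothetical function name; returning an empty string for a position
that does not exist is an assumption.

```python
def cut(value, splitter, position, from_end=False):
    """Split the value and return the part at the 1-based position."""
    parts = value.split(splitter)
    if from_end:
        parts.reverse()   # count positions starting at the last part
    if position < 1 or position > len(parts):
        return ""
    return parts[position - 1]
```

For a date written as 31.IV.2013, the splitter '.' and the position 2
return the month IV; counting from the end, position 1 returns the year.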

Translate
The translate transformation translates values from the file into
values entered by the user. In the example above, the values of the
month cut out from the date string should be translated from Roman into
numeric notation. To do this, click on the button to add a translation
transformation (see below). To list all different values present in the
data, click on the button. A list as shown below will be created. You
may also use the buttons to add or remove values from the list or the
button to clear the list. Then enter the translations as shown below.
Use the save button to save entries and the Test the transformation
button to see the result.
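
A translation list behaves like a lookup table. The following sketch
translates Roman month numerals into numbers; the names are hypothetical
and leaving values without an entry unchanged is an assumption.

```python
ROMAN_MONTHS = {
    "I": "1", "II": "2", "III": "3", "IV": "4", "V": "5", "VI": "6",
    "VII": "7", "VIII": "8", "IX": "9", "X": "10", "XI": "11", "XII": "12",
}

def translate(value, translations):
    """Return the translation of a value, or the value itself if unlisted."""
    return translations.get(value, value)
```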

To load a predefined list for the transformation, use the button. A
window as shown below will open. Choose the encoding of the data in
your translation source, state whether the first line contains the
column definition and click on the button to open a file. Click OK to
use the values from the file for the translation.

Regular expression
The RegEx transformation will transform the values according to the
entered Regular expression and Replace by values. For more details
please see the documentation about regular expressions.
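
As an illustration of a pattern and its replacement, the following
sketch uses Python's re module. The function name is hypothetical, and
the syntax accepted by the wizard may differ in details from Python's,
e.g. in how groups are referenced in the Replace by value.

```python
import re

def regex_transform(value, regular_expression, replace_by):
    """Replace every match of the regular expression in the value."""
    return re.sub(regular_expression, replace_by, value)

# Example pattern: reorder day.month.year into year-month-day.
DATE_PATTERN = r"^(\d+)\.(\d+)\.(\d{4})$"
DATE_REPLACEMENT = r"\3-\2-\1"
```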

Replacement
The replacement transformation replaces any text in the data by a text
specified by the user. In the example shown below, the text "." is
replaced by "-".

Calculation
The Σ calculation transformation performs a calculation on a numeric
value, dependent on an optional condition. In the example below, 2
calculations were applied to convert 2-digit values into 4-digit years.
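
Two conditional calculations of this kind could look as follows in
Python. The conditions are not given in the text, so the pivot value 30
(values below 30 become 20xx, other 2-digit values become 19xx) and the
function name are purely assumptions for this sketch.

```python
def to_four_digit_year(value, pivot=30):
    """Convert a 2-digit year into a 4-digit year using two conditions."""
    year = int(value)
    if year < pivot:
        return str(year + 2000)   # first calculation: add 2000
    if year < 100:
        return str(year + 1900)   # second calculation: add 1900
    return str(year)              # no condition applies: unchanged
```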

Filter
The filter transformation compares the values from the file with a
value entered by the user. As a result you can either Import content of
column in file or Import a fixed value. To select another column that
should be compared, click on the button and choose a column from the
file in the window that will open. If the column that should be compared
is not the column of the transformation, the number of the column will
be shown instead of the symbol. To add further filter conditions, use
the button. For the combination of the conditions you can choose
between AND and OR.
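
The logic of the filter can be sketched like this. The function name,
the restriction to equality comparisons and the empty result for lines
that do not pass the filter are assumptions made for the illustration.

```python
def apply_filter(row, conditions, combine="AND",
                 source_column=None, fixed_value=None):
    """Return the value to import for one line of the file.

    row        -- list of the values in one line of the file
    conditions -- list of (column index, expected value) pairs
    """
    results = [row[column] == expected for column, expected in conditions]
    passed = all(results) if combine == "AND" else any(results)
    if not passed:
        return ""                  # assumption: nothing is imported
    if fixed_value is not None:
        return fixed_value         # "Import a fixed value"
    return row[source_column]      # "Import content of column in file"
```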

Import Wizard Tutorial

Import wizard - tutorial
This tutorial demonstrates the import of a small file into the database.
The following data should be imported (the example file is included in
the software). At the end of this tutorial you will have imported
several datasets and practiced most of the possibilities provided by the
import wizard. The import is done in 2 steps to demonstrate the
attachment functionality of the wizard.
Step 1 - Import of the collection events
Choose Data -> Import -> Wizard -> import Specimen ... from the menu. A
window as shown below will open. This will lead you through the import
of the data. The window is divided into 3 areas. On the left side, you
see a list of possible data-related import steps according to the type
of data you chose for the import. On the right side you see the list of
currently selected import steps. In the middle part the details of the
selected import steps are shown.

Choosing the File
As a first step, choose the File from which the data should be imported.
The currently supported format is tab-separated text. Then choose the
Encoding of the file, e.g. Unicode. The Start line and End line will
automatically be set according to your data. You may change these to
restrict the data lines that should be imported. The parts of the file
that are not imported are indicated with a gray background as shown
below. If the option First line contains the column definition is
selected, this line will not be imported either. If your data contain
e.g. date information where notations differ between countries (e.g.
31.4.2013 - 4.31.2013), choose the Language / Country to ensure a
correct interpretation of your data. Finally you can select a prepared
Schema (see chapter Schema below) for the import.

Choosing the data ranges
In the selection list on the left side of the window (see below) all
possible import steps for the data are listed according to the type of
data you want to import.

Certain tables can be imported in parallel. To add parallels click on
the button for adding parallels (see below). To remove parallels, use
the button for removing them. Only selected ranges will appear in the
list of the steps on the right (see below).

To import the information of logging columns, e.g. who created and
changed the data, click on the button in the header line. This will
include an additional substep for every step containing the logging
columns (see below). If you do not import these data, they will be
automatically filled with default values like the current time and
user.

Attaching data
You can either import your data as new data or Attach them to data in
the database. Select the import step Attachment from the list. All
tables that are selected and contain columns at which you can attach
data are listed (see below). Either choose the first option Import as
new data or one of the attachment columns offered, like SeriesCode in
the table Series in the example below.

If you select a column for attachment, this column will be marked with a
blue background (see below and chapter Table data).

Merging data
You can either import your data as new data or Merge them with data in
the database. Select the import step Merge from the list. For every
table you can choose between Insert, Merge, Update and Attach (see
below).
The Insert option will import the data from the file independent of
existing data in the database.
The Merge option will compare the data from the file with those in the
database according to the Key columns (see below). If no matching data
are found in the database, the data from the file will be imported.
Otherwise the data will be updated.
The Update option will compare the data from the file with those in the
database according to the Key columns. Only matching data found in the
database will be updated.
The Attach option will compare the data from the file with those in the
database according to the Key columns. The data found will not be
changed, but used as reference data in dependent tables.

Table data
To set the source for the columns in the file, select the step of a
table listed underneath the Merge step. All columns available for
importing data will be listed in the central part of the window. In the
example shown below, the first column is used to attach the new data to
data in the database.

A reminder in the header line will show you what actions are still
needed to import the data into the table:
- Please select at least one column = No column has been selected so
far.
- Please select at least one decisive column = Whether data will be
imported depends on the content of decisive columns, so at least one
must be selected.
- Please select the position in the file = The position in the file
must be given if the data for a column should be taken from the file.
- Please select at least one column for comparison = For all merge
types other than Insert, columns for comparison with data in the
database are needed.
- From file or For all = For every column you have to decide whether
the data are taken from the file or a value is entered for all
datasets.
- Please select a value from the list = You have to select a value from
the provided list.
- Please enter a value = You have to enter a value used for all
datasets.
The handling of the columns is described in the chapter Columns.
Testing
To test if all requirements for the import are met, use the Testing
step. You can use a certain line in the file for your test and then
click on the Test data in line: button. If there are still unmet
requirements, these will be listed in a window as shown below.

If finally all requirements are met, the testing function will try to
write the data into the database and display any errors that occurred as
shown below. All datasets marked with a red background produced an
error.

To see the list of all errors, double-click in the error list window in
the header line (see below).

If finally no errors are left, your data are ready for import. The
colors of the table nodes in the tree indicate the handling of the
datasets: INSERT, MERGE, UPDATE, No difference, Attach, No data. The
colors of the table columns indicate whether a column is decisive, a
key column or an attachment column.
In case you get an error because you cannot specify the analysis, you
may have to enter an analysis. Choose Administration - Analysis from the
menu. If no analysis is available, create a new analysis and link it to
your project and the taxonomic groups that are imported. For more
details see the chapter Analysis.
Import
With the last step you can finally start to import the data into the
database. If you want to repeat the import with the same settings and
data of the same structure, you can save a schema of the current
settings.
Lines that could not be imported will be marked with a red background
while imported lines are marked green (see below).

If you want to save lines that produce errors during the import in a
separate file, use the Save failed lines option. The protocol of the
import will contain all settings according to the used schema and an
overview containing the number of inserted, updated, unchanged and
failed lines (see below).
