Table Validator
This node validates the structure and content of a table. The configuration is based on a
reference specification, which must be connected to the input port during configuration and
serves as the template for the output table. The node makes the structure of the result table
largely identical to the reference specification by reordering columns, inserting missing
columns (filled with missing values), and optionally removing additional columns. For each
column (or group of columns) you can also choose whether it is required and whether its data
type or domain should be checked or converted. To set up this per-column handling, select a
column or a list of columns, drag them to the "+" button that appears, and set the parameters.
To remove the per-column handling (and fall back to the default handling), click the "Remove"
button for that column. If the
validation succeeds, the data is output at the first port (potentially renamed, sorted according
to the reference specification, and with converted types). If the validation fails, either the
first port is inactive and the second port contains a table that lists all conflicts, or the
node fails, depending on the configuration. All options below that are marked with (Data) also
force a traversal of the input data.
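The structural alignment described above can be pictured with a minimal sketch. This is not KNIME's implementation: the function name `align_to_reference`, the list-of-lists table representation, and the use of `None` as a stand-in for a missing value are all assumptions made for illustration.

```python
MISSING = None  # assumption: None stands in for a missing cell


def align_to_reference(rows, input_columns, reference_columns, remove_unknown=True):
    """Return (columns, rows) rearranged to follow reference_columns.

    Hypothetical helper: reorders known columns, fills columns that are
    absent from the input with MISSING, and either drops unknown columns
    or moves them to the end.
    """
    unknown = [c for c in input_columns if c not in reference_columns]
    out_columns = list(reference_columns)
    if not remove_unknown:
        out_columns += unknown  # keep additional columns, shifted to the end

    position = {name: i for i, name in enumerate(input_columns)}
    out_rows = [
        [row[position[c]] if c in position else MISSING for c in out_columns]
        for row in rows
    ]
    return out_columns, out_rows


# Usage: "score" is missing from the input, "extra" is not in the reference.
columns, aligned = align_to_reference(
    rows=[[1, "a", True], [2, "b", False]],
    input_columns=["id", "label", "extra"],
    reference_columns=["label", "id", "score"],
)
```

With `remove_unknown=False`, the same call would keep `extra` as a fourth column after `score`.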
Dialog Options
General settings
- Behavior on validation issues
-
Defines how validation faults affect the downstream workflow.
-
Fail node
- Forces the node to fail; the exception carries an appropriate message containing
detailed descriptions of the validation faults. The traversal of the data is skipped
if the structural comparison has already failed.
-
Deactivate first output port
- The node never fails, but the first output port is set inactive. Validation results
are presented at the second output port as a data table that contains, for each validation
fault, the Column name, an Error ID (one of: COLUMN_NOT_CONTAINED, CONTAINS_MISSING_VALUE,
INVALID_DATATYPE, CONVERTION_FAILED, OUT_OF_DOMAIN), and a human-readable Description.
The data is traversed completely, regardless of any structural differences. This option is
useful if a complete validation of the input data is desired, for example to avoid
trial-and-error passes when the workflow is used within the WebPortal.
- Handling of unknown columns
-
Defines how columns that are not included in the reference table spec are handled.
-
Don't allow unknown columns
- Unknown columns will cause a validation issue.
-
Remove unknown columns
- Unknown columns will be removed.
-
Sort them to the end
- Unknown columns will be shifted to the end of the table.
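The difference between the two "Behavior on validation issues" modes can be sketched as follows. This is an illustrative sketch only, not the node's code: the names `validate` and `ValidationFailed`, the description texts, and the use of `None` for both a missing value and an inactive port are assumptions. Only the Error IDs are taken from the list above. Returning an empty conflict list on success is also an assumption.

```python
class ValidationFailed(Exception):
    """Hypothetical exception for 'Fail node' mode; the message lists the faults."""


def validate(columns, rows, required, fail_node):
    """Return (first_port, conflicts); first_port is None when it would be inactive."""
    conflicts = []  # each entry: (Column, Error ID, Description)

    # Structural check: every required column must be present.
    for name in required:
        if name not in columns:
            conflicts.append((name, "COLUMN_NOT_CONTAINED",
                              f"Column '{name}' is missing."))
    if fail_node and conflicts:
        # 'Fail node': the data traversal is skipped after a structural failure.
        raise ValidationFailed("; ".join(c[2] for c in conflicts))

    # Data check: in 'Deactivate first output port' mode this always runs.
    for position, name in enumerate(columns):
        if name in required and any(row[position] is None for row in rows):
            conflicts.append((name, "CONTAINS_MISSING_VALUE",
                              f"Column '{name}' contains missing values."))
    if fail_node and conflicts:
        raise ValidationFailed("; ".join(c[2] for c in conflicts))

    first_port = None if conflicts else rows
    return first_port, conflicts


# Usage: one structural fault and one data fault are both reported.
first_port, conflicts = validate(
    columns=["id", "label"],
    rows=[[1, None]],
    required=["id", "label", "score"],
    fail_node=False,
)
```

In this sketch the non-failing mode reports the missing `score` column and the missing value in `label` in a single pass, which is the reason the text recommends it for complete validation.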
Validation Settings
- Fail if column is missing (Structure)
-
Ensures that the configured columns exist in the input table. If "Case insensitive name
matching" is selected, the first matching column satisfies this condition.
- Case insensitive name matching (Structure)
-
Columns with a similar name (differing only in case) are also validated according to this
configuration. Use this option with care, because the assignment of a column to a
configuration is computed at runtime and is not trivial. The rules are explained below.
-
Exact name match
- Assigns the configuration with the exact name. The name is marked as used and cannot
match any following input columns again.
-
First matching configuration
- Assigns the first configuration with a matching name to the column. The name is marked
as used and cannot match any following input columns again.
- Fail on missing value (Data)
- Fails if the column contains any missing value.
- Check data type (Structure|Data)
-
Ensures a correct data type.
-
Fail if different
- Fails if the reference data type is not a supertype of the input column's data type,
i.e. it checks that the input column implements all DataValue classes that are also
implemented by the reference column's data type.
-
Try to convert; fail if not compatible
-
Try to convert; insert missing if not compatible
- Check possible values (Data)
-
Checks if each data object is contained in the possible values of the reference domain. The
option is only enabled if any configured column defines possible values.
-
Fail if out of domain
-
Replace with missing values
- Check min & max (Data)
-
Checks if each data object lies between the min and max defined by the domain of the reference
specification. The option is only enabled if any configured column defines a min or max.
-
Fail if out of domain
-
Replace with missing values
- Set input table as reference
- Sets the input table specification as reference specification.
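Two of the settings above, the name-matching rules and the domain checks, lend themselves to a short sketch. The code below is one possible reading, not KNIME's implementation. The function names are hypothetical. Resolving all exact matches before any case-insensitive match is an assumption, as is treating missing values (`None`) as always inside the domain.

```python
def assign_configurations(input_columns, configured_names):
    """Map input column names to configured names (hypothetical helper)."""
    used = set()
    assignment = {}

    # Rule 1: exact name match. The configured name is then marked as used.
    for column in input_columns:
        if column in configured_names and column not in used:
            assignment[column] = column
            used.add(column)

    # Rule 2: first matching configuration, ignoring case, among unused names.
    for column in input_columns:
        if column in assignment:
            continue
        for name in configured_names:
            if name not in used and name.lower() == column.lower():
                assignment[column] = name
                used.add(name)
                break
    return assignment


def check_domain(values, possible_values=None, lower=None, upper=None,
                 replace_with_missing=False):
    """Apply 'Check possible values' and 'Check min & max' to one column."""
    checked = []
    for value in values:
        out_of_domain = value is not None and (
            (possible_values is not None and value not in possible_values)
            or (lower is not None and value < lower)
            or (upper is not None and value > upper)
        )
        if not out_of_domain:
            checked.append(value)
        elif replace_with_missing:
            checked.append(None)  # 'Replace with missing values'
        else:
            raise ValueError(f"OUT_OF_DOMAIN: {value!r}")  # 'Fail if out of domain'
    return checked


# Usage: "ID" finds no configuration because "id" already took the exact match.
assignment = assign_configurations(["ID", "id", "Label"], ["id", "label"])
cleaned = check_domain([1, 5, 12], lower=0, upper=10, replace_with_missing=True)
```

The first call shows why the text advises care: once a configured name is used, a later or differently cased column can no longer be matched to it.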
Reference Spec
- Reference Spec
- The reference specification.
- Input Spec
- The input specification. Only visible if it differs from the reference
specification.
Ports
Output Ports
0 |
Table with corrected and validated structure. Depending on the validation result and the
"Behavior on validation issues" setting, this port may be inactive.
|
1 |
Table listing the validation conflicts (Column, Error ID, Description). Depending on the
validation result and the "Behavior on validation issues" setting, this port may be inactive.
|
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.