CAIM Binner
This node implements the CAIM binning (discretization) algorithm according to
Kurgan and Cios (2004) URL:http://citeseer.ist.psu.edu/kurgan04caim.html.
The binning (discretization) is performed with respect to
a selected class column. CAIM creates all possible binning boundaries
and chooses those that minimize the class interdependancy measure. To
reduce the runtime, this implementation creates only those boundaries
where the value and the class changes.
The algorithm finds a minimum number of bins (guided by the number
of possible class values) and labels them "Interval_X". Only
columns compatible with double values are binned and the
column's type of the output table is changed to "String".
Dialog Options
- Class Column
-
The class column.
According to this column the binning is optimized.
- Column selection
-
Allows to include those columns which should be included in the
discretization. Just the included columns are discretized and
changed to "String" type.
- General Node Settings
-
To increase performance, select the memory policy 'Keep all in memory'
at the PREVIOUS node (if possible) from the General Node Settings Tab.
Ports
Input Ports
0 |
The data table to bin (discretize). |
Output Ports
0 |
The binned data table. |
1 |
The model representing the
binning. Contains the intervals for each bin of each column.
|
Views
- Binning Model
-
The view shows the column's binning scheme. For each column a ruler
is displayed on which the bin boundaries are marked.
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.