Auto-Binner
This node groups numeric data into intervals, called bins.
There are three naming options for the bins and two methods that define
the number of bins and the range of values that fall into each bin.
Use the "Numeric Binner" node if you want to define custom bins.
Dialog Options
- Column Selection:
  Columns in the include list are processed separately. Columns in the
  exclude list are omitted by the node.
- Binning Method:
  Use Fixed number of bins for bins of equal width over the domain range,
  or for bins with an equal frequency of element occurrences. Use Sample
  quantiles to produce bins corresponding to the given list of
  probabilities. The smallest element corresponds to a probability of 0
  and the largest to a probability of 1. The applied estimation method is
  Type 7, which is the default method in R, S and Excel.
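The two binning methods can be sketched in a few lines of Python. This is an illustration only, not the node's implementation: the function names are hypothetical, and deriving equal-frequency edges from evenly spaced quantiles is an assumption, since the description does not say how the node computes them. The Type 7 quantile interpolates linearly at position h = (n - 1) * p in the sorted data.

```python
import math

def equal_width_edges(values, n_bins):
    """Edges of n_bins intervals of equal width spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]

def quantile_type7(values, p):
    """Type 7 sample quantile: linear interpolation at h = (n - 1) * p."""
    xs = sorted(values)
    h = (len(xs) - 1) * p
    k = math.floor(h)
    if k + 1 >= len(xs):          # p == 1 maps to the largest element
        return xs[-1]
    return xs[k] + (h - k) * (xs[k + 1] - xs[k])

def quantile_edges(values, probabilities):
    """Bin edges for a user-supplied list of probabilities."""
    return [quantile_type7(values, p) for p in probabilities]

def equal_frequency_edges(values, n_bins):
    """Assumption: equal-frequency edges taken at probabilities i / n_bins."""
    return quantile_edges(values, [i / n_bins for i in range(n_bins + 1)])
```

For example, `quantile_edges(data, [0, 0.25, 0.5, 0.75, 1])` yields quartile-based edges, with the first edge equal to the minimum of `data` and the last equal to its maximum.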
- Bin Naming:
  Use Numbered for bins labeled by an integer with the prefix "Bin",
  Borders for labels using "(a,b]" interval notation, or Midpoints for
  labels that show the midpoint of the interval.
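As a rough sketch of the naming options, the hypothetical helper below builds labels from a sorted list of bin edges. The exact label text the node produces is an assumption here: the "Bin 1" spelling, numbering from 1, and the closed left bracket on the first interval are guesses based on the description and on the interval examples given for Force integer bounds.

```python
def bin_labels(edges, naming):
    """Return one label per interval defined by consecutive edges."""
    labels = []
    for i in range(len(edges) - 1):
        a, b = edges[i], edges[i + 1]
        if naming == "numbered":
            # Assumption: numbering starts at 1 and is separated by a space.
            labels.append(f"Bin {i + 1}")
        elif naming == "borders":
            # Assumption: the first interval is closed on the left.
            left = "[" if i == 0 else "("
            labels.append(f"{left}{a},{b}]")
        elif naming == "midpoints":
            labels.append(str((a + b) / 2))
        else:
            raise ValueError(f"unknown naming option: {naming}")
    return labels
```

With edges `[0, 1, 2]`, the "borders" option gives `[0,1]` and `(1,2]`, and the "midpoints" option gives `0.5` and `1.5`.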
- Force integer bounds
  Forces the bounds of the intervals to be integers. The decimal bounds
  are converted so that the lower bound of the first interval is the
  floor of the lowest value and the upper bound of the last interval is
  the ceiling of the highest value. The edges that separate the intervals
  become the ceiling of the decimal edges. Duplicate edges are removed.
  Examples:
  [0.1,0.9], (0.9,1.8] -> [0,1], (1,2]
  [3.9,4.1], (4.1,4.9], (4.9,5.1] -> [3,5], (5,6]
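The conversion rule can be reproduced with a short Python sketch. The function name `force_integer_edges` is hypothetical, and the input is assumed to be a sorted list of edges (lowest bound, inner edges, highest bound); the node's actual implementation may differ.

```python
import math

def force_integer_edges(edges):
    """Floor the first edge, take the ceiling of all others, drop duplicates."""
    first = math.floor(edges[0])
    inner = [math.ceil(e) for e in edges[1:-1]]
    last = math.ceil(edges[-1])
    result = []
    for edge in [first, *inner, last]:
        if edge not in result:    # keep order, remove duplicate edges
            result.append(edge)
    return result
```

Applied to the two examples, `force_integer_edges([0.1, 0.9, 1.8])` returns `[0, 1, 2]` and `force_integer_edges([3.9, 4.1, 4.9, 5.1])` returns `[3, 5, 6]`, which correspond to the intervals [0,1], (1,2] and [3,5], (5,6].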
- Replace target column(s):
  If set, the columns in the include list are replaced by the binned
  columns; otherwise, columns named with the suffix '[binned]' are
  appended.
- Advanced formatting
  If enabled, the format of the doubles in the labels can be configured
  with the options in this tab.
- Output format
  Specify the output format. The number 0.00000035239 will be displayed
  as 3.52E-7 with Standard String, 0.000000352 with Plain String (no
  exponent) and 352E-9 with Engineering String.
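The three formats can be reproduced with Python's `decimal` module, used here only as a stand-in for the node's own formatting. The sketch assumes the value is first rounded to three significant figures with half-up rounding, which is what the displayed digits in the example suggest; the description does not state these settings.

```python
from decimal import Context, ROUND_HALF_UP

# Assumption: three significant figures, half-up rounding.
ctx = Context(prec=3, rounding=ROUND_HALF_UP)
value = ctx.create_decimal("0.00000035239")

standard = str(value)                # scientific notation when needed
plain = format(value, "f")           # no exponent
engineering = value.to_eng_string()  # exponent is a multiple of 3

print(standard, plain, engineering)  # 3.52E-7 0.000000352 352E-9
```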
- Precision
  The scale of the double values to round to. If the scale is reduced,
  the specified rounding mode is applied.
- Precision mode
  The type of precision to which the values are rounded. Decimal places,
  the default option, rounds to the specified number of decimal places,
  whereas Significant figures rounds to the specified number of
  significant figures.
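The difference between the two precision modes is easiest to see on one value. The helpers below are hypothetical and use Python's `decimal` module with half-up rounding as an assumed rounding mode; they illustrate the two notions of precision rather than the node's code.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

def round_decimal_places(text, places):
    """Keep `places` digits after the decimal point."""
    quantum = Decimal(1).scaleb(-places)   # e.g. places=2 -> 0.01
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)

def round_significant(text, figures):
    """Keep `figures` significant digits, wherever the decimal point is."""
    return Context(prec=figures, rounding=ROUND_HALF_UP).create_decimal(text)
```

With a precision of 2, `round_decimal_places("123.456", 2)` gives 123.46, while `round_significant("123.456", 2)` keeps only the two leading digits and gives a value equal to 120.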
- Rounding mode
  The rounding mode that is applied when double values are rounded.
  Seven different rounding modes are available:
  - UP: Round away from zero.
  - DOWN: Round towards zero.
  - CEILING: Round towards positive infinity.
  - FLOOR: Round towards negative infinity.
  - HALF_UP: Round towards the "nearest neighbor" unless both neighbors
    are equidistant, in which case round up.
  - HALF_DOWN: Round towards the "nearest neighbor" unless both neighbors
    are equidistant, in which case round down.
  - HALF_EVEN: Round towards the "nearest neighbor" unless both neighbors
    are equidistant, in which case round towards the even neighbor.
  For a detailed description of each rounding mode, please see the Java
  documentation.
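The seven modes behave differently mainly on ties and on negative values. The sketch below maps each mode name to the equivalent constant in Python's `decimal` module. This mapping is an illustration chosen here; the node itself refers to the Java rounding modes. The helper `round_with` is hypothetical.

```python
from decimal import (
    Decimal, ROUND_UP, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR,
    ROUND_HALF_UP, ROUND_HALF_DOWN, ROUND_HALF_EVEN,
)

MODES = {
    "UP": ROUND_UP,
    "DOWN": ROUND_DOWN,
    "CEILING": ROUND_CEILING,
    "FLOOR": ROUND_FLOOR,
    "HALF_UP": ROUND_HALF_UP,
    "HALF_DOWN": ROUND_HALF_DOWN,
    "HALF_EVEN": ROUND_HALF_EVEN,
}

def round_with(text, mode, places=0):
    """Round the decimal string `text` to `places` decimal places."""
    quantum = Decimal(1).scaleb(-places)
    return Decimal(text).quantize(quantum, rounding=MODES[mode])

# Compare all modes on a positive and a negative tie.
for name in MODES:
    print(name, round_with("2.5", name), round_with("-2.5", name))
```

For the tie 2.5, UP, CEILING and HALF_UP give 3, and the other four modes give 2. For -2.5, CEILING gives -2 and FLOOR gives -3, which shows that these two modes depend on the sign while UP and DOWN depend only on the distance from zero.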
Ports
Output Ports
0 | Data with bins defined
1 | The PMML Model fragment containing information on how to bin
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.