Naive Bayes Learner
The node creates a Bayesian model from the given training data. It
counts the number of rows per attribute value per class for
nominal attributes and computes the Gaussian distribution (mean and
standard deviation) per class for numerical attributes. The created
model can be used in the Naive Bayes Predictor node to predict the
class membership of unclassified data.
The node displays a warning message if any columns are ignored due to unsupported data types.
For example, bit vector columns are ignored when the PMML compatibility flag is enabled, since
the PMML standard does not support them.
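The statistics described above can be sketched in a few lines of Python. This is a minimal illustration and not the node's actual implementation: the function name `learn_naive_bayes`, the dictionary-based row format, and the toy data are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev


def learn_naive_bayes(rows, class_col, nominal_cols, numeric_cols):
    """Collect per-class counts and Gaussian parameters (illustrative only)."""
    # Number of training rows per class value.
    class_counts = Counter(row[class_col] for row in rows)

    # Nominal attributes: number of rows per (attribute, class) and value.
    nominal_counts = defaultdict(Counter)
    for row in rows:
        for col in nominal_cols:
            nominal_counts[(col, row[class_col])][row[col]] += 1

    # Numerical attributes: mean and standard deviation per (attribute, class).
    # Assumption: sample standard deviation; 0.0 if a class has a single row.
    gaussians = {}
    for col in numeric_cols:
        for cls in class_counts:
            values = [row[col] for row in rows if row[class_col] == cls]
            sd = stdev(values) if len(values) > 1 else 0.0
            gaussians[(col, cls)] = (mean(values), sd)

    return class_counts, nominal_counts, gaussians


# Hypothetical toy training data.
rows = [
    {"outlook": "sunny", "temp": 30.0, "play": "no"},
    {"outlook": "sunny", "temp": 26.0, "play": "no"},
    {"outlook": "rainy", "temp": 18.0, "play": "yes"},
    {"outlook": "sunny", "temp": 22.0, "play": "yes"},
]
class_counts, nominal_counts, gaussians = learn_naive_bayes(
    rows, "play", ["outlook"], ["temp"]
)
```

Here `nominal_counts[("outlook", "no")]` holds the row counts per outlook value within class "no", and `gaussians[("temp", "no")]` holds the mean and standard deviation of the temperature within that class.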
Dialog Options
- Classification Column
-
The class value column.
- Maximum number of unique nominal values per attribute
-
All nominal columns with more unique values than the defined number
will be skipped during learning.
- Default probability
-
A probability of zero for a given attribute/class value pair requires special attention.
Without adjustment, a probability of zero would exercise an absolute veto over a likelihood in which that
probability appears as a factor. Therefore, the Bayes model incorporates a default probability parameter
that specifies a default (usually very small) probability to use in lieu of zero probability for a
given attribute/class value pair. Set to zero for no correction.
- Ignore missing values
-
By default, the node uses missing value information to improve the prediction result.
Select this option to ignore missing values instead. Because the PMML standard ignores missing values,
missing values are always ignored and this option is disabled if the PMML compatibility option is selected.
- Create PMML 4.2 compatible model
-
Select this option to create a model which is compliant with the
PMML 4.2 standard.
The PMML 4.2 standard ignores missing values and does not support bit vectors. Therefore, bit vector columns
and missing values are ignored during learning and prediction if this option is selected.
Even if this option is not selected, the node creates a valid PMML model. However, the model then contains
KNIME-specific information to store missing value and bit vector information. This information is used by
the KNIME Naive Bayes Predictor to improve the prediction result, but it is ignored by any other PMML-compatible
predictor, which might lead to different prediction results.
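Two of the options above, the maximum number of unique nominal values and the default probability, can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the node's code: the helper names `select_nominal_columns` and `likelihood`, the row format, and the numbers are hypothetical.

```python
def select_nominal_columns(rows, nominal_cols, max_unique):
    """Keep only nominal columns with at most max_unique distinct values."""
    return [
        col for col in nominal_cols
        if len({row[col] for row in rows}) <= max_unique
    ]


def likelihood(probabilities, default_probability):
    """Multiply conditional probabilities, replacing zeros by the default."""
    result = 1.0
    for p in probabilities:
        # A zero factor would force the whole product to zero (the "veto").
        # With default_probability == 0.0 no correction takes place.
        result *= p if p > 0.0 else default_probability
    return result


# Hypothetical data: "id" has three unique values, "color" has two.
rows = [
    {"id": "a", "color": "red"},
    {"id": "b", "color": "red"},
    {"id": "c", "color": "blue"},
]
kept = select_nominal_columns(rows, ["id", "color"], max_unique=2)

# Hypothetical per-attribute probabilities for one class; one of them is zero.
factors = [0.5, 0.0, 0.25]
vetoed = likelihood(factors, default_probability=0.0)
corrected = likelihood(factors, default_probability=0.0001)
```

In this sketch, the "id" column is skipped because it exceeds the limit, `vetoed` is zero because of the single zero factor, and `corrected` stays small but positive, so the remaining attributes still influence the result.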
Ports
Input Ports
0 | Training data
1 | Optional PMML port object containing preprocessing operations.
Output Ports
0 | Learned naive Bayes model. The model can be used to classify data with an unknown target (class) attribute. To do so, connect the model out port to the "Naive Bayes Predictor" node.
1 | Data table with attribute statistics, e.g. counts per attribute/class pair, mean, and standard deviation.
Views
- Naive Bayes Learner View
-
The view displays the learned model: the number of rows per class
value, the number of rows per attribute value per class for nominal
attributes, and the Gaussian distribution per class
for numerical attributes.
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.