Linear Regression Learner
Performs a multivariate linear regression. In the dialog, select a
target column (combo box at the top), i.e. the response. The two
lists in the center of the dialog let you choose which columns are
used as (independent) variables.
Make sure the columns you want to include are in the
"include" list on the right.
See the Wikipedia article on
linear regression
for an overview of the topic.
If the optional PMML input port is connected and contains
preprocessing operations in the TransformationDictionary, those are
added to the learned model.
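As an illustration of what a multivariate linear regression computes, here is a minimal ordinary least squares sketch in Python with NumPy. It is not the node's implementation; the function name `fit_linear_regression` and the toy data are assumptions made for this example.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares with a constant term; returns (intercept, coefficients)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first fitted weight is the constant term.
    design = np.column_stack([np.ones(len(X)), X])
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights[0], weights[1:]

# Toy data generated from y = 1 + 2*x1 - 3*x2 without noise.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

intercept, coefficients = fit_linear_regression(X, y)
print(intercept, coefficients)
```

Because the toy response is an exact linear function of the two inputs, the fit recovers the generating intercept and coefficients up to floating-point error.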
Dialog Options
- Target
-
To select the target column. Only columns with numeric data are allowed.
- Values
-
To specify the independent columns that should be included in the regression model.
Numeric and nominal data can be included; for nominal data, dummy variables are created
automatically, as described in the section
Categorical variables in regression.
- Predefined Offset Value
-
By default, the regression model includes a constant term.
When this option is selected, the given value is used as the constant term,
so it works like a user-defined intercept.
- Missing Values in Input Data
-
Defines whether missing values in the input are ignored or whether the node execution should fail
on missing values.
- Scatter Plot View
-
Specify the rows that shall be available as data points in the scatter plot view.
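The dialog options above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the node's code: the helper names are hypothetical, dropping the first category as the baseline is one common dummy-coding convention, "ignored" is assumed to mean that rows with missing values are skipped, and the predefined offset is assumed to be handled by subtracting it from the response before fitting the remaining coefficients.

```python
import numpy as np

def dummy_encode(values):
    """Encode a nominal column as 0/1 columns; the first category is the baseline."""
    categories = sorted(set(values))
    kept = categories[1:]  # drop one level to avoid collinearity with a constant term
    encoded = np.array([[1.0 if v == c else 0.0 for c in kept] for v in values])
    return kept, encoded

def drop_or_fail_on_missing(X, y, fail_on_missing=False):
    """Either skip rows containing NaN or raise, mirroring the two dialog choices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(X).any(axis=1) | np.isnan(y)
    if fail_on_missing and missing.any():
        raise ValueError(f"{int(missing.sum())} row(s) contain missing values")
    return X[~missing], y[~missing]

def fit_with_fixed_intercept(X, y, offset):
    """Least squares for the coefficients only; the constant term is fixed to `offset`."""
    slopes, *_ = np.linalg.lstsq(X, y - offset, rcond=None)
    return slopes

# Hypothetical input: one numeric column (with a missing value) and one nominal column.
size = np.array([1.0, 2.0, 3.0, 4.0, 5.0, np.nan])
color = ["red", "green", "blue", "green", "red", "blue"]

kept, color_dummies = dummy_encode(color)  # baseline category: "blue"
X = np.column_stack([size, color_dummies])
# Response built as 5 + 2*size + 3*[green] - 1*[red]; NaN where size is missing.
y = 5.0 + 2.0 * size + 3.0 * color_dummies[:, 0] - 1.0 * color_dummies[:, 1]

X_clean, y_clean = drop_or_fail_on_missing(X, y)
slopes = fit_with_fixed_intercept(X_clean, y_clean, offset=5.0)
print(kept, slopes)
```

Calling `drop_or_fail_on_missing(X, y, fail_on_missing=True)` on the same data raises a `ValueError` instead of skipping the incomplete row.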
Ports
Input Ports
0 | Table on which to perform regression.
1 | Optional PMML port object containing preprocessing operations.
Output Ports
0 | Model to connect to a predictor node.
1 | Coefficients and statistics of the linear regression model.
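To give a sense of what coefficient statistics for a linear model can look like, the following Python sketch computes standard errors and t-values with NumPy. Which statistics the node actually reports is not stated here, so this selection is an assumption, as are the function name `coefficient_statistics` and the simulated data.

```python
import numpy as np

def coefficient_statistics(X, y):
    """Return (coefficients, standard errors, t-values), constant term first."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])
    n, p = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    # Unbiased estimate of the residual variance (n - p degrees of freedom).
    sigma2 = residuals @ residuals / (n - p)
    covariance = sigma2 * np.linalg.inv(design.T @ design)
    std_err = np.sqrt(np.diag(covariance))
    t_values = beta / std_err
    return beta, std_err, t_values

# Simulated data: y = 1 + 2*x1 - 3*x2 plus a little Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=50)

beta, std_err, t_values = coefficient_statistics(X, y)
for name, b, se, t in zip(["intercept", "x1", "x2"], beta, std_err, t_values):
    print(f"{name}: coeff={b:.3f} std_err={se:.3f} t={t:.1f}")
```

A large absolute t-value indicates that the coefficient is estimated precisely relative to its size.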
Views
- Linear Regression Result View
-
Displays the estimated coefficients and error statistics.
- Linear Regression Scatterplot View
-
Displays the input data along with the regression line in a
scatterplot. The y-coordinate is fixed to the response column
(the column that has been approximated) while the x-column can be
chosen among the independent variables with numerical values.
Note: If you have multiple input
variables, this view is only an approximation. It fixes the value
of each variable that is not shown in the view to its mean. Thus,
this view generally only makes sense if you have only a few input variables.
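The approximation described in the note can be sketched as follows in Python with NumPy: the line drawn for one chosen variable is obtained by holding every other variable at its mean. The function `regression_line_for_view` and its parameters are hypothetical names for this example, not part of the node.

```python
import numpy as np

def regression_line_for_view(intercept, coefficients, X, shown, num_points=50):
    """Points of the regression line over column `shown`, other columns fixed at their mean."""
    X = np.asarray(X, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    means = X.mean(axis=0)
    # Indices of the variables that are not on the x-axis.
    hidden = np.delete(np.arange(X.shape[1]), shown)
    # Their contribution is constant because each is fixed at its mean.
    base = intercept + means[hidden] @ coefficients[hidden]
    xs = np.linspace(X[:, shown].min(), X[:, shown].max(), num_points)
    ys = base + coefficients[shown] * xs
    return xs, ys

# Hypothetical model y = 1 + 2*x1 + 0.5*x2 and three input rows.
X = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
xs, ys = regression_line_for_view(1.0, [2.0, 0.5], X, shown=0, num_points=3)
print(xs, ys)
```

Here the hidden variable x2 has mean 20, so the plotted line is y = 11 + 2*x1. Data points whose x2 value is far from 20 will not lie on that line, which is why the view becomes less informative as the number of input variables grows.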
This node is contained in KNIME Base Nodes
provided by KNIME GmbH, Konstanz, Germany.