This section is pretty straightforward and - as you might have guessed - deals with predicting target values for new observations. Prediction is implemented the same way as in most of the other predict methods in R.
Classification methods can predict class labels (as factors) and, if the algorithm supports it, class probabilities (as matrices whose columns correspond to the classes). You can check the wrapped learner description to find out whether probabilities are supported (see Wrapped learner).
Normally you simply choose between the two with the "type" parameter of the predict method, but in some rare cases you have to specify that you want to predict probabilities when you create the learning task (see example below).
Regression methods work the same way, but simply predict numerical values.
    # Classification task with iris dataset
    ct <- make.classif.task(data = iris, formula = Species~.)

    # Some indices for training and test set definition, every second example
    train.set <- seq(from = 1, to = 150, by = 2)
    test.set <- seq(from = 2, to = 150, by = 2)

    # Train a Decision Tree on the training set
    m <- train("rpart.classif", ct, subset = train.set)

    # Predict classes for training data
    p1 = predict(m, newdata = iris[train.set,])

    # Predict classes for new data
    predict(m, newdata = iris[test.set,])

         2      4      6      8     10     12
    setosa setosa setosa setosa setosa setosa
    ...
          146       148       150
    virginica virginica virginica
    Levels: setosa versicolor virginica

    # Let's predict some probabilities
    probs <- predict(m, newdata = iris[test.set,], type = "prob")

        setosa versicolor virginica
    2        1       0.00      0.00
    4        1       0.00      0.00
    ...
    150      0       0.04      0.96
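For comparison, the same two kinds of prediction can be requested from rpart directly, without the wrapper. This is only a sketch: it assumes the rpart package is installed, it uses rpart's own type = "class" and type = "prob" arguments rather than the wrapper's, and the variable names are illustrative. It also shows how the probability matrix is laid out: one column per class, with each row summing to 1.

```r
library(rpart)

# Same split as above: odd rows for training, even rows for testing
train.set <- seq(from = 1, to = 150, by = 2)
test.set <- seq(from = 2, to = 150, by = 2)

# Fit a classification tree with rpart directly (no wrapper involved)
fit <- rpart(Species ~ ., data = iris, subset = train.set)

# Class labels: a factor with one entry per test observation
labels <- predict(fit, newdata = iris[test.set, ], type = "class")

# Class probabilities: a matrix with one column per class
probs <- predict(fit, newdata = iris[test.set, ], type = "prob")

# Each row of the probability matrix should sum to 1
row.sums <- rowSums(probs)

# Most probable class per row, read off the column names
# (ties.method = "first" makes the choice deterministic)
top.class <- colnames(probs)[max.col(probs, ties.method = "first")]
```

Barring ties between class probabilities, top.class should agree with labels, which is a quick way to convince yourself that the columns of the matrix really correspond to the factor levels.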
We again use the BostonHousing data set, this time to train a Gradient Boosting Machine.
Every second observation goes into the training set; the remaining observations form the test set.
The procedure is analogous to the one above.
    # Regression task
    rt <- make.regr.task(data = BostonHousing, formula = medv~.)

    # Training and test set
    train.set <- seq(from = 1, to = 506, by = 2)
    test.set <- seq(from = 2, to = 506, by = 2)

    # Gradient Boosting Machine on training set
    m <- train("gbm.regr", rt, subset = train.set, parset = list(n.trees = 10000))

    # Predict test set data
    predict(m, newdata = BostonHousing[test.set,])

    [1] 22.395020 36.117794 25.031917 16.369151 17.798636 20.611659 22.606421
    [8] 22.671347 20.158904 20.559315 20.347968 15.692052 16.319054 16.349748
    ...
    [253] 23.055107
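The same train/predict pattern can be tried with nothing but base R. The sketch below is not the setup used in this section: it swaps in lm for the Gradient Boosting Machine and the built-in mtcars data for BostonHousing, purely so that it runs without extra packages. The point is only that a regression prediction comes back as a plain numeric vector with one value per test observation.

```r
# Every second observation for training, the others for testing
n <- nrow(mtcars)
train.set <- seq(from = 1, to = n, by = 2)
test.set <- seq(from = 2, to = n, by = 2)

# Linear model as a simple stand-in for the Gradient Boosting Machine
fit <- lm(mpg ~ wt + hp, data = mtcars, subset = train.set)

# Predictions: a numeric vector, one value per test observation
pred <- predict(fit, newdata = mtcars[test.set, ])
```

Printing pred shows the predicted mpg values, named by the row names of the test observations.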