Predictions

This section deals with predicting target values for new observations. Prediction is implemented the same way as most other predict methods in R.

Classification methods can predict class labels (as factors) and, if the algorithm supports it, probabilities (as matrices whose columns correspond to classes). To find out whether a learner supports probabilities, check the wrapped learner description (see Wrapped learner).
Normally you choose between the two with the "type" parameter of the predict method, but in some rare cases you have to request probabilities when the learning task is created (see example below).
Regression methods work the same way, but predict numerical values.
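
The difference between the two kinds of classification output can be tried without the wrapper. The following is a minimal sketch that calls the rpart package directly, which is an assumption made for illustration and not the train/predict interface described in this section. It relies only on the "type" argument of rpart's own predict method.

```r
# Sketch only: plain rpart, not the wrapped learner from this tutorial.
library(rpart)

# Every second observation for training, the rest for testing
train.set <- seq(from = 1, to = 150, by = 2)
test.set <- seq(from = 2, to = 150, by = 2)

# Fit a classification tree on the training rows
fit <- rpart(Species ~ ., data = iris[train.set, ])

# Class labels: a factor with one entry per test observation
labels <- predict(fit, newdata = iris[test.set, ], type = "class")

# Class probabilities: a matrix with one column per class
probs <- predict(fit, newdata = iris[test.set, ], type = "prob")

str(labels)
head(probs)
```

Each row of the probability matrix sums to one, and its column names are the class levels of the target.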

Classification example

	# Classification task with iris dataset 
	ct <- make.classif.task(data = iris, formula = Species~.)
	
	# Some indices for training and test set definition, every second example
	train.set <- seq(from = 1, to = 150, by = 2)
	test.set <- seq(from = 2, to = 150, by = 2)

	# Train a Decision Tree on the training set 
	m <- train("rpart.classif", ct, subset = train.set)
 
	# Predict classes for training data
	p1 <- predict(m, newdata = iris[train.set,])

	# Predict classes for new data
	predict(m, newdata = iris[test.set,])
	
	         2          4          6          8         10         12 
	    setosa     setosa     setosa     setosa     setosa     setosa 
	...
	       146        148        150 
	 virginica  virginica  virginica 
	Levels: setosa versicolor virginica
	
	
	# Let's predict some probabilities
	probs <- predict(m, newdata = iris[test.set,], type = "prob")
	
	    setosa versicolor virginica
	2        1       0.00      0.00
	4        1       0.00      0.00
	...
	150      0       0.04      0.96
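
A probability matrix like the one above can be turned back into class labels by picking the column with the largest value in each row. The sketch below uses a small hand-written matrix so that it runs on its own. The rows "2" and "150" repeat the values shown above, while the row "52" is hypothetical and was added only for illustration.

```r
# Hand-written probability matrix; row "52" is a made-up example.
probs <- matrix(
  c(1, 0.00, 0.00,
    0, 0.90, 0.10,
    0, 0.04, 0.96),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("2", "52", "150"),
                  c("setosa", "versicolor", "virginica"))
)

# Index of the most probable class per row (first column wins on ties)
best <- max.col(probs, ties.method = "first")

# Map the column indices back to class names and keep the row names
labels <- factor(colnames(probs)[best], levels = colnames(probs))
names(labels) <- rownames(probs)

labels
```

Taking the most probable class is only one possible rule. Working from the probabilities lets you apply a different rule, such as a custom threshold, when the default labels are not what you need.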
	
	

	

Regression example

We again use the BostonHousing data set and fit a Gradient Boosting Machine. Every second observation is used for training and the remaining observations for testing.
The procedure is analogous to the one above.


	# Regression task (the BostonHousing data set ships with the mlbench package)
	library(mlbench)
	data(BostonHousing)
	rt <- make.regr.task(data = BostonHousing, formula = medv~.)

	# Training and test set 
	train.set <- seq(from = 1, to = 506, by = 2)
	test.set <- seq(from = 2, to = 506, by = 2)

	# Gradient Boosting Machine on training set 
	m <- train("gbm.regr", rt, subset = train.set, parset = list(n.trees = 10000)) 

	# Predict test set data 
	predict(m, newdata = BostonHousing[test.set,])
	
	[1] 22.395020 36.117794 25.031917 16.369151 17.798636 20.611659 22.606421
	[8] 22.671347 20.158904 20.559315 20.347968 15.692052 16.319054 16.349748
	...
	[253] 23.055107
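
The shape of a regression prediction, one numeric value per row of the new data, can also be checked with base R alone. As a rough sketch, and not the gbm workflow shown above, the following fits a linear model with lm on the built-in mtcars data. Both the model and the data set are chosen here only for illustration.

```r
# Sketch only: lm on mtcars stands in for the wrapped gbm learner.
# Every second observation for training, the rest for testing
train.set <- seq(from = 1, to = nrow(mtcars), by = 2)
test.set <- seq(from = 2, to = nrow(mtcars), by = 2)

# Fit a linear regression on the training rows
fit <- lm(mpg ~ wt + hp, data = mtcars[train.set, ])

# Predict the test rows: a numeric vector with one value per observation
pred <- predict(fit, newdata = mtcars[test.set, ])

head(pred)
length(pred) == length(test.set)
```

Unlike classification, there is no choice between labels and probabilities here. The prediction is always a plain numeric vector.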