Training

"Training" a learner simply means fitting a model to a given data set. We are not concerned here with the specifics of the fitting process; the underlying R methods that this package employs take care of that. What matters more is that training and all subsequent operations can be performed through a unified interface.

This is achieved by calling the method "train" with the learner and the task (a classification task in the first example below). The most important parameters are "subset", the indices of the observations used for training, and "parset", a list of named elements that specify the hyperparameters of the learner. The hyperparameters are generally named as in the underlying R method; any differences are documented in the R help page of the learning method. The return value is always an object of class "model", which wraps the concrete model of the R classification or regression method used. It can subsequently be used to predict new observations.
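The subset is simply a vector of row indices. The following base R sketch shows a few common ways to build such a vector for the iris data; it does not call the package itself, and it is only an assumption that "train" accepts any integer index vector of this kind (the examples in this section only use seq).

```r
# Number of observations in the iris data set used in the examples
n <- nrow(iris)

# Every second observation, as in the examples in this section
idx_every_second <- seq(from = 1, to = n, by = 2)

# A random two-thirds sample (the seed is fixed only to make the sketch repeatable)
set.seed(1)
idx_random <- sort(sample(n, size = round(2 * n / 3)))

# The held-out observations are the complement of the training indices
idx_holdout <- setdiff(seq_len(n), idx_random)

length(idx_every_second)
length(idx_random)
length(idx_holdout)
```

Keeping the held-out indices around is convenient when the trained model is later used to predict observations it has not seen.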

Classification example

Let's have a look at the iris dataset:

	# Classification task: 
	ct <- make.classif.task(data = iris, target="Species")
	
	# Let's train some Decision Trees:
	#   on whole data set
	m1 <- train("rpart.classif", ct)
 
	#   on a subset (every second observation)
	m2 <- train("rpart.classif", ct, subset=seq(from=1, to=150, by=2))

	#   with hyperparameters
	m3 <- train("rpart.classif", ct, subset=seq(from=1, to=150, by=2), parset=list(minsplit=7, cp=0.03)) 
	
	# You can print some basic information about the model to the console
	m3
	
	Learner model for RPART
	Hyperparameters: minsplit=7 cp=0.03
	Trained on obs: 75
	
	
	# Some accessors
	m3["parset"]
	
	$minsplit
	[1] 7

	$cp
	[1] 0.03
				
	
	m3["subset"]
		
	 [1]   1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31  33  35  37
	[20]  39  41  43  45  47  49  51  53  55  57  59  61  63  65  67  69  71  73  75
	[39]  77  79  81  83  85  87  89  91  93  95  97  99 101 103 105 107 109 111 113
	[58] 115 117 119 121 123 125 127 129 131 133 135 137 139 141 143 145 147 149
	

	# access the wrapped rpart model - in most cases you won't need to...
	m3["learner.model"]
	
	n= 75 

	node), split, n, loss, yval, (yprob)
		  * denotes terminal node

	1) root 75 50 setosa (0.3333333 0.3333333 0.3333333)  
		2) Petal.Length< 2.45 25  0 setosa (1.0000000 0.0000000 0.0000000) *
		3) Petal.Length>=2.45 50 25 versicolor (0.0000000 0.5000000 0.5000000)  
			6) Petal.Width< 1.65 25  1 versicolor (0.0000000 0.9600000 0.0400000) *
			7) Petal.Width>=1.65 25  1 virginica (0.0000000 0.0400000 0.9600000) *
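
For comparison, here is a rough sketch of what a direct call to rpart on the same subset might look like without the unified interface. It assumes that the wrapper forwards the hyperparameters to rpart.control; how the package actually passes them is not shown in this section, so treat this as an illustration rather than as the package's implementation.

```r
library(rpart)

# Same subset as in the example above: every second observation
train_idx <- seq(from = 1, to = 150, by = 2)

# Assumption: the hyperparameters end up in rpart.control();
# the package may pass them to rpart in a different way
fit <- rpart(Species ~ ., data = iris[train_idx, ],
             control = rpart.control(minsplit = 7, cp = 0.03))

# Number of observations at the root node
fit$frame$n[1]

# Predicted classes for the observations that were not used for training
pred <- predict(fit, newdata = iris[-train_idx, ], type = "class")
table(truth = iris$Species[-train_idx], predicted = pred)
```

With the unified interface, the subsetting and the handling of hyperparameters look the same for every learner, whereas direct calls differ from method to method.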
		

Regression example

As a regression example, we use the BostonHousing data set (available in the mlbench package):

	# Regression task: 
	rt <- make.regr.task(data = BostonHousing, formula = medv~.)
	
	# Let's train some Gradient Boosting Machines:
	#   on whole data set
	m1 <- train("gbm.regr", rt) 
	
	#   on a subset (every second observation)
	m2 <- train("gbm.regr", rt, subset=seq(1, 506, 2))
	
	#   with a set of hyperparameters
	m3 <- train("gbm.regr", rt, subset = seq(1, 506, 2),
	            parset = list(n.trees = 500,
	                          distribution = "laplace",
	                          interaction.depth = 3))
	
	
	# You can print some basic information about the model to the console
	m3
	
	Learner model for Gradient Boosting Machine
	Trained on obs: 253
	Hyperparameters: n.trees=500 distribution=laplace interaction.depth=3
	
	
	# rest is analogous to example above

As described in the section Wrapped learners, there is another way to access the learning algorithm. We take the regression example from above again and show how to build two models with different hyperparameter sets:

	# Regression task: 
	rt <- make.regr.task(data = BostonHousing, formula = medv~.)
	
	# Construct the wrapped learner
	wl <- make.learner("gbm.regr")

	# First setting of hyperparameters 
	wl_1 <- set.train.par(wl, n.trees = 500, distribution = "laplace", interaction.depth = 3)

	# Second setting of hyperparameters 
	wl_2 <- set.train.par(wl, n.trees = 250, distribution = "laplace", interaction.depth = 5)


	# Train one model per hyperparameter setting
	model_1 <- train(wl_1, rt)
	model_2 <- train(wl_2, rt)

Normally it is better to define your hyperparameters in parset. Use the set.train.par function for technical parameters that you do not intend to change afterwards. See ---INSERT LINK--- for details.
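The idea of a learner object that carries its own parameters can be illustrated with a small toy sketch in plain R. All function names here (make_toy_learner, set_toy_par, train_toy) are hypothetical and are not part of the package. The sketch uses rpart on iris instead of gbm only because rpart ships with R.

```r
library(rpart)

# Hypothetical constructor: a learner is a fitting function plus a parameter list
make_toy_learner <- function(fit_fun) {
  list(fit_fun = fit_fun, par = list())
}

# Hypothetical setter: returns a modified copy, the input learner stays unchanged
set_toy_par <- function(learner, ...) {
  learner$par <- modifyList(learner$par, list(...))
  learner
}

# Hypothetical train function: adds the stored parameters to the fitting call
train_toy <- function(learner, formula, data) {
  do.call(learner$fit_fun, c(list(formula = formula, data = data), learner$par))
}

wl <- make_toy_learner(rpart)
wl_1 <- set_toy_par(wl, minsplit = 7, cp = 0.03)
wl_2 <- set_toy_par(wl, minsplit = 20, cp = 0.01)

model_1 <- train_toy(wl_1, Species ~ ., iris)
model_2 <- train_toy(wl_2, Species ~ ., iris)
```

Because the setter returns a copy, the base learner can be reused for any number of parameter settings without one setting overwriting another.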