Ensemble learning is a type of supervised learning that combines multiple learning methods to produce a single result. The combined methods can include regression models, classifiers, and other functions, which allows several types of learners to be applied to one dataset, and averaging over all of the models can give a better result. The advantage of ensemble learning is that, because the models are averaged, the predictions tend to be less noisy and have smaller variance, and the ensemble is less prone to the overfitting that can come with a single model.
Here is a simple example of an ensemble learning method called bagging:
- Take your training set and split it into 5 subsets of 5 data points each.
- Fit a 3rd-order polynomial to each of the 5 subsets.
- Average the predictions of the 5 fitted models.
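The three steps above can be sketched in Python. This is a hypothetical illustration with synthetic data, since the article's housing dataset isn't given; the data is shuffled before splitting so each subset spans the input range (the article doesn't specify how the split is made):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 25 training points (x = time, y = price).
x = np.linspace(0.0, 1.0, 25)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, size=x.size)

# Step 1: split the training set into 5 sets of 5 data points.
# Shuffling first is an assumption so each subset covers the whole range.
idx = rng.permutation(x.size)
subsets = np.array_split(idx, 5)

# Step 2: fit a 3rd-order polynomial to each of the 5 subsets.
fits = [np.polyfit(x[s], y[s], deg=3) for s in subsets]

# Step 3: average the 5 models' predictions at a new point.
def bagged_predict(x_new):
    return float(np.mean([np.polyval(c, x_new) for c in fits]))
```

Each subset has only 5 points, barely more than the 4 coefficients of a cubic, so every individual model is noisy; it is the averaging step that smooths the combined prediction.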
The result is that you often predict the training set better than you would by fitting a single 3rd- or 4th-degree polynomial over the whole set. The graph below shows housing data over time:
In the graph above, the red X's show the training data set and the green X shows a test data point. The red line is derived from the result of the bagging algorithm, and the blue line is derived from fitting a 4th-order polynomial over the whole training set. You can see that the red line predicts the test point more closely than the blue line, which indicates that in this situation bagging gives a slightly better outcome than applying a single regression over the training set.
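The comparison in the graph can be reproduced in outline, again with made-up data since the actual housing values aren't given: fit the bagged cubics and a single 4th-order polynomial, then evaluate both at a held-out test point. Which error comes out smaller depends on the particular data, so no winner is asserted here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up training data and one held-out test point (the green X).
x_train = np.linspace(0.0, 1.0, 25)
y_train = np.sin(2.0 * np.pi * x_train) + rng.normal(0.0, 0.2, size=25)
x_test, y_test = 0.62, np.sin(2.0 * np.pi * 0.62)

# Bagged model (red line): average of five 3rd-order fits on shuffled subsets.
idx = rng.permutation(25)
fits = [np.polyfit(x_train[s], y_train[s], 3) for s in np.array_split(idx, 5)]
bagged = float(np.mean([np.polyval(c, x_test) for c in fits]))

# Single model (blue line): one 4th-order polynomial over the whole set.
single = float(np.polyval(np.polyfit(x_train, y_train, 4), x_test))

# Compare absolute errors at the test point.
print(abs(bagged - y_test), abs(single - y_test))
```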