Alharthi, Amirah and Taylor, Charles C. and Voss, Jochen - Archives of Data Science, Series A

Article Details

Title Forests of Stumps
Authors Alharthi, Amirah and Taylor, Charles C. and Voss, Jochen
Year 2018
Volume 5(1)
Abstract Many numerical studies (Hansen and Salamon, 1990; Schapire, 1990) indicate that bagged decision stumps are more accurate than a single stump. In this work, we investigate two approaches to creating a forest of stumps for classification. The first method is bagging with stumps, that is, growing stumps on bootstrap samples of different sizes drawn from the training dataset. The second method is Gini-sampled stumps, where we sample split points with probability proportional to the Gini index. Each method is combined with two aggregation rules: majority vote and weighted vote. We use simulation studies to compare the accuracy and computing time of the two methods. Generating split points by Gini-sampled stumps takes less than half the time needed to generate split points from bootstrap samples. In addition, weighted vote aggregation is more accurate than majority vote aggregation.
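
The two constructions and the two aggregation rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: labels are binary (0/1), the sampling weight of a split is its decrease in Gini impurity, the weighted vote uses that same decrease as the stump weight, and every bootstrap sample has the size of the training set. All function names are hypothetical.

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary 0/1 label vector."""
    if len(y) == 0:
        return 0.0
    p = y.mean()
    return 2.0 * p * (1.0 - p)

def candidate_stumps(X, y):
    """Enumerate stumps as (feature, threshold, left_label, right_label, gain)."""
    n, d = X.shape
    parent = gini(y)
    stumps = []
    for j in range(d):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            stumps.append((j, t, int(left.mean() >= 0.5),
                           int(right.mean() >= 0.5), parent - child))
    return stumps

def bagged_stumps(X, y, n_stumps, rng):
    """Bagging: keep the best-gain stump found on each bootstrap sample."""
    n = len(y)
    forest = []
    for _ in range(n_stumps):
        idx = rng.integers(0, n, size=n)  # assumption: full-size bootstrap sample
        cands = candidate_stumps(X[idx], y[idx])
        if cands:
            forest.append(max(cands, key=lambda s: s[4]))
    return forest

def gini_sampled_stumps(X, y, n_stumps, rng):
    """Sample stumps once from the full data, with probability proportional
    to the Gini decrease (assumed reading of 'proportional to the Gini index')."""
    cands = candidate_stumps(X, y)
    gains = np.clip([s[4] for s in cands], 0.0, None)  # guard against rounding
    picks = rng.choice(len(cands), size=n_stumps, p=gains / gains.sum())
    return [cands[i] for i in picks]

def predict(forest, X, weighted=False):
    """Aggregate stump predictions by majority vote or gain-weighted vote."""
    votes = np.array([np.where(X[:, j] <= t, l, r) for j, t, l, r, _ in forest])
    w = np.array([s[4] for s in forest]) if weighted else np.ones(len(forest))
    return (w @ votes >= 0.5 * w.sum()).astype(int)

# Toy data: the class depends only on the sign of the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)

bagged = bagged_stumps(X, y, n_stumps=25, rng=rng)
sampled = gini_sampled_stumps(X, y, n_stumps=25, rng=rng)
print("bagged, majority vote:", (predict(bagged, X) == y).mean())
print("Gini-sampled, weighted vote:", (predict(sampled, X, weighted=True) == y).mean())
```

In this sketch the Gini-sampled forest scans the candidate splits only once, whereas bagging rescans them for every bootstrap sample. That is one plausible reason for a difference in computing time, but the sketch is not a reproduction of the timing result reported in the abstract.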