Statistics Seminar: Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers | Department of Statistics and Data Science

Date: Monday, 09/11/2015, 15:30-16:30
Location: Social Sciences Building, Room 1710

Speaker: Prof. Adi Wyner, U. of Pennsylvania

Title: Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers

Abstract: There is a large literature explaining why AdaBoost is a successful classifier. This literature focuses on classifier margins and on boosting's interpretation as the optimization of an exponential likelihood function. These existing explanations, however, have been shown to be incomplete. The random forest is another popular ensemble method, for which there is substantially less explanation in the literature.

We introduce a novel perspective on AdaBoost and random forests, proposing that the two algorithms work for essentially similar reasons. While both classifiers achieve similar predictive accuracy, a random forest cannot be conceived as a direct optimization procedure. Rather, it is a self-averaging, interpolating algorithm that creates a "spikey-smooth" classifier. We show that AdaBoost has the same property, and we conjecture that both AdaBoost and random forests therefore succeed because of this mechanism. We provide a number of examples and some theoretical justification to support this explanation. In the process, we question the conventional wisdom that boosting algorithms for classification require regularization or early stopping and should be limited to low-complexity classes of learners, such as decision stumps. We conclude that boosting should be used like random forests: with large decision trees and without direct regularization or early stopping.
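The "self-averaging, interpolating" mechanism described in the abstract can be illustrated with a small sketch (our own toy construction, not code from the talk): an ensemble of fully grown 1-D trees with uniformly random split points, trained on data containing one deliberately flipped label. Each tree is grown until its leaves are pure, so every tree, and hence the averaged forest, interpolates the training set exactly; yet the influence of the noisy label stays confined to a narrow "spike" around it, and the averaged fit is clean elsewhere. The helper names (`fit_tree`, `forest_predict`) and the data are illustrative.

```python
import numpy as np

def fit_tree(x, y, rng):
    """Grow an unpruned 1-D classification tree with uniformly random
    split points, recursing until every leaf is pure. Pure leaves mean
    the tree interpolates the training set (assuming distinct x)."""
    if np.all(y == y[0]):
        return int(y[0])                     # pure leaf: store the label
    while True:
        t = rng.uniform(x.min(), x.max())    # random split location
        left = x <= t
        if left.any() and (~left).any():     # keep only non-degenerate splits
            break
    return (t,
            fit_tree(x[left], y[left], rng),
            fit_tree(x[~left], y[~left], rng))

def predict_tree(node, x0):
    while isinstance(node, tuple):           # descend to a leaf
        t, lo, hi = node
        node = lo if x0 <= t else hi
    return node

def forest_predict(trees, x0):
    # Self-averaging: the forest output is the mean of the tree votes.
    return np.mean([predict_tree(t, x0) for t in trees])

# Toy data: labels follow a step at x = 0.5, except one "noise" point
# at x = 0.3 whose label is flipped to 1.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0,   0,   1,   0,   1,   1,   1,   1])

rng = np.random.default_rng(0)
trees = [fit_tree(x, y, rng) for _ in range(300)]

# Every tree, and hence the average, reproduces all training labels
# exactly -- including the flipped label at x = 0.3 (interpolation):
assert all(forest_predict(trees, xi) == yi for xi, yi in zip(x, y))

# Away from the noise point the averaged vote is unanimous and clean;
# the flipped label only distorts the fit near x = 0.3.
print(forest_predict(trees, 0.05), forest_predict(trees, 0.95))
```

Growing each tree to purity, rather than regularizing or stopping early, is exactly the regime the abstract advocates: the averaging across random split points, not pruning, is what keeps the classifier smooth away from noisy points.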