Statistics Seminar: Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers | Department of Statistics and Data Science

Date: Monday, 09/11/2015, 15:30-16:30
Location: Social Sciences Building, Room 1710

Speaker: Prof. Adi Wyner, U. of Pennsylvania

Title: Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers

Abstract: There is a large literature explaining why AdaBoost is a successful classifier. This literature focuses on classifier margins and on boosting's interpretation as the optimization of an exponential likelihood function. These existing explanations, however, have been shown to be incomplete. The random forest is another popular ensemble method, for which there is substantially less explanation in the literature.

We introduce a novel perspective on AdaBoost and random forests, proposing that the two algorithms work for essentially similar reasons. While both classifiers achieve similar predictive accuracy, a random forest cannot be conceived as a direct optimization procedure. Rather, it is a self-averaging, interpolating algorithm that creates a "spikey-smooth" classifier. We show that AdaBoost has the same property, and we conjecture that both AdaBoost and random forests therefore succeed because of this mechanism. We provide a number of examples and some theoretical justification to support this explanation. In the process, we question the conventional wisdom that boosting algorithms for classification require regularization or early stopping and should be limited to low-complexity classes of learners, such as decision stumps. We conclude that boosting should be used like random forests: with large decision trees and without direct regularization or early stopping.
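The "self-averaging, interpolating" mechanism described in the abstract can be illustrated with a small sketch (our own toy construction, not code from the talk): an ensemble of fully grown 1-D trees with uniformly random split points, trained on data containing one deliberately flipped label. Each tree is grown until its leaves are pure, so every tree, and hence the averaged forest, interpolates the training set exactly; yet the influence of the noisy label stays confined to a narrow "spike" around it, and the averaged fit is clean elsewhere. The helper names (`fit_tree`, `forest_predict`) and the data are illustrative.

```python
import numpy as np

def fit_tree(x, y, rng):
    """Grow an unpruned 1-D classification tree with uniformly random
    split points, recursing until every leaf is pure. Pure leaves mean
    the tree interpolates the training set (assuming distinct x)."""
    if np.all(y == y[0]):
        return int(y[0])                     # pure leaf: store the label
    while True:
        t = rng.uniform(x.min(), x.max())    # random split location
        left = x <= t
        if left.any() and (~left).any():     # keep only non-degenerate splits
            break
    return (t,
            fit_tree(x[left], y[left], rng),
            fit_tree(x[~left], y[~left], rng))

def predict_tree(node, x0):
    while isinstance(node, tuple):           # descend to a leaf
        t, lo, hi = node
        node = lo if x0 <= t else hi
    return node

def forest_predict(trees, x0):
    # Self-averaging: the forest output is the mean of the tree votes.
    return np.mean([predict_tree(t, x0) for t in trees])

# Toy data: labels follow a step at x = 0.5, except one "noise" point
# at x = 0.3 whose label is flipped to 1.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0,   0,   1,   0,   1,   1,   1,   1])

rng = np.random.default_rng(0)
trees = [fit_tree(x, y, rng) for _ in range(300)]

# Every tree, and hence the average, reproduces all training labels
# exactly -- including the flipped label at x = 0.3 (interpolation):
assert all(forest_predict(trees, xi) == yi for xi, yi in zip(x, y))

# Away from the noise point the averaged vote is unanimous and clean;
# the flipped label only distorts the fit near x = 0.3.
print(forest_predict(trees, 0.05), forest_predict(trees, 0.95))
```

Growing each tree to purity, rather than regularizing or stopping early, is exactly the regime the abstract advocates: the averaging across random split points, not pruning, is what keeps the classifier smooth away from noisy points.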