Weka [25]: Bagging Source Code Analysis
Author: Koala++ (屈伟)
First, a translated introduction to Bagging: Breiman's bagging algorithm, short for bootstrap aggregating, is one of the earliest Ensemble algorithms [...] makes m_NumIterations copies of the classifier and stores them in the m_Classifiers array.

int bagSize = data.numInstances() * m_BagSizePercent / 100;
Random random = new Random(m_Seed);

boolean[][] inBag = null;
if (m_CalcOutOfBag)
  inBag = new boolean[m_Classifiers.length][];

for (int j = 0; j < m_Classifiers.length; j++) {
  Instances bagData = null;

  // create the in-bag dataset
  if (m_CalcOutOfBag) {
    inBag[j] = new boolean[data.numInstances()];
    bagData = resampleWithWeights(data, random, inBag[j]);
  } else {
    bagData = data.resampleWithWeights(random);
    if (bagSize < data.numInstances()) {
      bagData.randomize(random);
      Instances newBagData = new Instances(bagData, 0, bagSize);
      bagData = newBagData;
    }
  }

  if (m_Classifier instanceof Randomizable) {
    ((Randomizable) m_Classifiers[j]).setSeed(random.nextInt());
  }

  // build the classifier
  m_Classifiers[j].buildClassifier(bagData);
}

bagSize is the size of one bag, that is, the number of samples it contains.
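To make the bag construction in the loop above more concrete, here is a minimal sketch that does not depend on Weka. The class BaggingSketch and the method drawBag are hypothetical names made up for this illustration. It assumes every instance has the same weight, so sampling with replacement reduces to uniform draws over row indices, which is simpler than what resampleWithWeights does in general. Unlike the loop above, it also fills the inBag mask whenever one is passed, regardless of the bag size.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration, not Weka code: builds the row indices of one bag.
public class BaggingSketch {

    /**
     * Draws numInstances row indices with replacement (assumes equal weights),
     * then, if bagSizePercent is below 100, shuffles and keeps the first bagSize.
     * If inBag is non-null, inBag[i] is set to true for every row i in the bag.
     */
    public static int[] drawBag(int numInstances, int bagSizePercent,
                                Random random, boolean[] inBag) {
        // Same integer arithmetic as the bagSize line in the listing.
        int bagSize = numInstances * bagSizePercent / 100;

        // Bootstrap sample: same size as the data, drawn with replacement.
        int[] sample = new int[numInstances];
        for (int i = 0; i < numInstances; i++) {
            sample[i] = random.nextInt(numInstances);
        }

        // Smaller bag requested: shuffle (Fisher-Yates) and truncate.
        if (bagSize < numInstances) {
            for (int i = numInstances - 1; i > 0; i--) {
                int k = random.nextInt(i + 1);
                int tmp = sample[i];
                sample[i] = sample[k];
                sample[k] = tmp;
            }
            sample = Arrays.copyOf(sample, bagSize);
        }

        // Record which rows are in the bag; unmarked rows are out-of-bag.
        if (inBag != null) {
            for (int idx : sample) {
                inBag[idx] = true;
            }
        }
        return sample;
    }

    public static void main(String[] args) {
        Random random = new Random(1); // plays the role of m_Seed
        int numInstances = 10;
        for (int j = 0; j < 3; j++) {  // three bags, one per base classifier
            boolean[] inBag = new boolean[numInstances];
            int[] bag = drawBag(numInstances, 100, random, inBag);
            System.out.println("bag " + j + ": " + Arrays.toString(bag));
        }
    }
}
```

Because the draws are made with replacement, some rows usually appear more than once in a bag and others not at all. The rows that are left out are the ones the inBag mask leaves as false.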
Setting the m_CalcOutOfBag case aside for now, the key piece is of course resampleWithWeights:
/**
 * Creates a new dataset of the same size using random sampling with
 * replacement according to the current instance weights. The weights of
 * the instances in the new dataset are set to one.
 */
public Instances resampleWithWeights(Random random) {
  double[] weights = new double[numInstances()];
f