CART: Classification and Regression Trees

Presented by: Pavla Smetanova, Lütfiye Arslan, Stefan Lhachimi

Based on the book "Classification and Regression Trees" by L. Breiman, J. Friedman, R. Olshen, and C. Stone (1984).

Outline

1- INTRODUCTION
- What is CART?
- An example
- Terminology
- Strengths

2- METHOD: 3 steps in CART
- Tree building
- Pruning
- The final tree

What is CART?
- A non-parametric technique that uses the methodology of tree building.
- Classifies objects or predicts outcomes by selecting, from a large number of variables, the most important ones in determining the outcome variable.
- CART analysis is a form of binary recursive partitioning.

An example from clinical research
- Development of a reliable clinical decision rule to classify new patients into categories.
- 19 measurements (age, blood pressure, etc.) are taken from each heart-attack patient during the first 24 hours after admission to San Diego Hospital.
- The goal: identify high-risk patients.

Classification of patients into high-risk or no-risk groups
- Is the minimum systolic blood pressure over the initial 24 hours > 91? (yes / no)
- Is age > ? (yes / no)
- Is sinus tachycardia present? (yes / no)
- Terminal nodes: G, F, G, F

Terminology
- The classification problem: a systematic way of predicting the class of an object based on measurements.
- C = {1,...,J}: classes
- x: measurement vector
- d(x): a classifying function assigning every x to one of the classes 1,...,J
- s: split
- learning sample (L): measurement data on N cases observed in the past, together with their actual classification.
- R*(d): true misclassification rate, R*(d) = P(d(x) ≠ Y), Y ∈ C

Strengths
- No distributional assumptions are required.
- No assumption of homogeneity.
- The explanatory variables can be a mixture of categorical, interval and continuous.
- Especially good for high-dimensional and large data sets. Produces useful results by using a few important variables.
- Sophisticated methods for dealing with missing variables.
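
To make the terminology concrete, here is a minimal Python sketch of a classifying function d(x) in the shape of the clinical decision rule, with an empirical estimate of its misclassification rate on a small sample. Several things here are assumptions and not taken from the slides: the `Patient` fields, the labels `HIGH` and `LOW`, which branch leads to which class, and the four sample cases, which are invented for illustration. The age cutoff is missing from the slides, so it is left as a parameter; the value 60.0 used below is only a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Class labels (hypothetical names; the slides only speak of risk groups).
HIGH = "high_risk"
LOW = "not_high_risk"


@dataclass
class Patient:
    """Measurement vector x, reduced to the three variables the tree asks about."""
    min_systolic_bp: float    # minimum systolic blood pressure, first 24 hours
    age: float
    sinus_tachycardia: bool


def make_rule(age_cutoff: float) -> Callable[[Patient], str]:
    """Build d(x) as a sequence of binary yes/no questions.

    The assignment of branches to classes is an assumption; the tree layout
    did not survive in the slides. age_cutoff is not given there either.
    """
    def d(x: Patient) -> str:
        if not x.min_systolic_bp > 91:    # question 1 answered "no"
            return HIGH
        if not x.age > age_cutoff:        # question 2 answered "no"
            return LOW
        # question 3: is sinus tachycardia present?
        return HIGH if x.sinus_tachycardia else LOW
    return d


def misclassification_rate(d: Callable[[Patient], str],
                           sample: List[Tuple[Patient, str]]) -> float:
    """Fraction of cases in the sample with d(x) != y.

    This is a sample estimate of R*(d) = P(d(x) != Y), not R*(d) itself.
    """
    errors = sum(1 for x, y in sample if d(x) != y)
    return errors / len(sample)


# Invented cases (x, actual class) standing in for a learning sample L.
sample = [
    (Patient(85.0, 70.0, False), HIGH),
    (Patient(120.0, 50.0, False), LOW),
    (Patient(110.0, 75.0, True), HIGH),
    (Patient(130.0, 80.0, False), HIGH),
]

rule = make_rule(age_cutoff=60.0)  # placeholder cutoff, not from the slides
print(misclassification_rate(rule, sample))
```

Each question sends a case down one of exactly two branches, which is what "binary" refers to in binary recursive partitioning. Measuring the error rate on the same cases used to build a rule tends to be optimistic, so a real analysis would estimate it on cases that were held out.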