Document name:

Statistic_Venter,J.H., Rey,T. - 2008 - Detecting Outliers Using Weights in Logistic Regression.pdf

Detecting outliers using weights in logistic regression
JH Venter & T de la Rey
1 Introduction
Logistic regression (LR) is concerned with explaining the probability of a specific response in
terms of a number of regressors using a sample of relevant data. Pregibon (1981) states that
the estimated LR relationship may be severely affected by outliers; this motivates the need for
robust logistic regression procedures. Studies in this direction have been reported by Pregibon
(1981), Copas (1988), Rousseeuw & Christmann (2003), Huber (1973), Rousseeuw & Leroy
(1987) and Yohai (1987). Trimming is a broad approach to robustifying statistical
procedures. It allows one to identify outliers and remove them from the data used in the
estimation process. Trimming has been developed extensively by a number of authors in least
squares regression, multivariate analysis and other fields (see for example Rousseeuw (1984),
Rousseeuw & Van Driessen (1999a,b), where further references can be found). At first thought
it seems attractive to use trimming in LR as well, to identify outliers and to limit their effects. When
trimming, a subset of the data that is highly likely to be free from outliers is needed and a
method is required to select such a subset. One possibility is to use maximum likelihood
considerations, but this approach tends to run into the separation problem. The problem is
that those observations that are considered as outliers are usually the same observations that
will provide some overlap in the data. Therefore, trimming these observations removes the
overlap and may lead to non-existence (indeterminacy) of the maximum likelihood estimator
(MLE) applied to the remaining data, as pointed out by Christmann and Rousseeuw (2001).
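
The separation problem described above can be illustrated with a minimal sketch. It assumes a single regressor and a small made-up data set; the data values, the cut-off 4.5 and the function names are illustrative assumptions, not taken from the paper. The two observations that create the overlap are exactly the ones a trimming rule would be tempted to discard, and once they are removed the log-likelihood keeps increasing as the slope grows, so no finite MLE exists.

```python
import math

def is_separated(x, y):
    """True if some threshold on x splits the two response classes perfectly."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    return max(x0) < min(x1) or max(x1) < min(x0)

def log_likelihood(x, y, intercept, slope):
    """Bernoulli log-likelihood of a one-regressor logistic model."""
    ll = 0.0
    for xi, yi in zip(x, y):
        eta = intercept + slope * xi
        # log p = -log(1 + exp(-eta)),  log(1 - p) = -log(1 + exp(eta))
        ll += -math.log1p(math.exp(-eta)) if yi == 1 else -math.log1p(math.exp(eta))
    return ll

# Hypothetical data: the points (4, 1) and (5, 0) provide the only overlap.
x_full = [1, 2, 3, 4, 5, 6, 7, 8]
y_full = [0, 0, 0, 1, 0, 1, 1, 1]

# "Trim" the two overlapping observations, as an outlier rule might.
x_trim = [1, 2, 3, 6, 7, 8]
y_trim = [0, 0, 0, 1, 1, 1]

def ll_at_slope(x, y, slope, cut=4.5):
    """Log-likelihood along the ray of models whose decision boundary is x = cut."""
    return log_likelihood(x, y, -cut * slope, slope)

for slope in (1, 10, 100):
    print(slope, ll_at_slope(x_full, y_full, slope), ll_at_slope(x_trim, y_trim, slope))
```

On the full data the overlapping points penalise steep slopes, so the likelihood has a finite maximiser. On the trimmed data every steeper slope fits better, which is the indeterminacy referred to in the text.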
These authors developed a method for measuring this overlap, enabling the user to judge the closeness
to indeterminacy. In a further contribution Rousseeuw and Christmann (2003) overcame the
non-existence problem by introducing the