Applied Regression Analysis Using STATA
Josef Brüderl
Regression analysis is the statistical method most often used in
social research. The reason is that most social researchers are
interested in identifying "causal" effects from non-experimental
data. Regression is the method for doing this.
The term "regression": In 1889 Sir Francis Galton investigated
the relationship between the body size of fathers and sons. In doing
so he "invented" regression analysis. He estimated
Ss .
This means that the size of the son regresses towards the mean.
Therefore, he named his method regression. Thus, the term
regression stems from the first application of this method! In
most later applications, however, there is no regression towards
the mean.
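Regression towards the mean can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: the population mean of 175 cm, the true slope of 0.5, the noise levels, and the helper `ols` are all made up and are not Galton's data or estimates. It shows that a fitted slope between 0 and 1 pulls the predicted size of the son towards the mean.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

MEAN_HEIGHT = 175.0  # assumed population mean in cm (made up)
TRUE_SLOPE = 0.5     # assumed slope below 1 (made up, not Galton's estimate)

# Simulate father heights, then son heights that depend only partly on them.
fathers = [random.gauss(MEAN_HEIGHT, 7.0) for _ in range(2000)]
sons = [
    MEAN_HEIGHT + TRUE_SLOPE * (f - MEAN_HEIGHT) + random.gauss(0.0, 6.0)
    for f in fathers
]


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = ols(fathers, sons)

# Predicted son height for a father who is 10 cm above the mean.
tall_father = MEAN_HEIGHT + 10.0
predicted_son = intercept + slope * tall_father

print(f"estimated slope: {slope:.2f}")
print(f"father {tall_father:.0f} cm -> predicted son {predicted_son:.1f} cm")
```

With a slope below 1, the son of an unusually tall father is still predicted to be taller than average, but less extreme than his father.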
1a) The Idea of a Regression
We consider two variables (Y, X). Data are realizations of these
variables
$(y_1, x_1), \ldots, (y_n, x_n)$
or, equivalently,
$(y_i, x_i)$, for $i = 1, \ldots, n$.
Y is the dependent variable, X is the independent variable
(regression of Y on X). The general idea of a regression is to
consider the conditional distribution
$f(Y = y \mid X = x)$.
This is hard to interpret. The major function of statistical
methods, namely to reduce the information of the data to a few
numbers, is not fulfilled. Therefore one characterizes the
conditional distribution by some of its aspects:
• Y metric: conditional arithmetic mean
• Y metric, ordinal: conditional quantile
• Y nominal: conditional frequencies (cross tabulation!)
Thus, we can formulate a regression model for every level of
measurement of Y.
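The three summaries listed above can be sketched in a few lines of Python. The toy data, the category labels, and all variable names are hypothetical and serve only to show one conditional summary per level of measurement of Y.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical toy data: (x, metric y, nominal y) for six persons.
data = [
    ("low", 1200, "rent"), ("low", 1500, "rent"), ("low", 1800, "own"),
    ("high", 2500, "own"), ("high", 3000, "own"), ("high", 4100, "rent"),
]

# Group the Y-values by the value of X.
metric_by_x = defaultdict(list)
nominal_by_x = defaultdict(list)
for x, y_metric, y_nominal in data:
    metric_by_x[x].append(y_metric)
    nominal_by_x[x].append(y_nominal)

# Y metric: conditional arithmetic mean.
cond_mean = {x: mean(ys) for x, ys in metric_by_x.items()}

# Y metric or ordinal: conditional quantile (here the median).
cond_median = {x: median(ys) for x, ys in metric_by_x.items()}

# Y nominal: conditional relative frequencies,
# i.e. one row of a cross tabulation per X-value.
cond_freq = {
    x: {cat: count / len(ys) for cat, count in Counter(ys).items()}
    for x, ys in nominal_by_x.items()
}

print(cond_mean)
print(cond_median)
print(cond_freq)
```

Each dictionary reduces the conditional distribution of Y given X = x to a single number (or one row of shares) per X-value.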
Regression with discrete X
In this case we compute for every X-value a summary statistic of
the conditional distribution.
Example: Income and Education (ALLBUS 1994)
Y is the income. X is the highest educational level. Y is
metric, so we compute conditional means $E(Y \mid x)$. Comparing
these means tells us something about the effect of education on
income (analysis of variance).
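As a rough sketch of this comparison, the following Python code computes one conditional mean per education level and the sums of squares that an analysis of variance is based on. The six observations and the education labels are invented for illustration and are not ALLBUS data; the variable names are assumptions as well.

```python
from collections import defaultdict

# Hypothetical (education, monthly income) pairs; not ALLBUS data.
observations = [
    ("lower secondary", 1800), ("lower secondary", 2200),
    ("intermediate", 2400), ("intermediate", 3000),
    ("university", 3600), ("university", 4400),
]

# Collect the incomes observed at each education level.
incomes_by_educ = defaultdict(list)
for educ, income in observations:
    incomes_by_educ[educ].append(income)

# Conditional means E(Y|x): one mean per education level.
cond_means = {educ: sum(v) / len(v) for educ, v in incomes_by_educ.items()}

# Analysis of variance: split the total sum of squares into a part
# between the education groups and a part within them.
all_incomes = [income for _, income in observations]
grand_mean = sum(all_incomes) / len(all_incomes)
ss_total = sum((y - grand_mean) ** 2 for y in all_incomes)
ss_between = sum(
    len(v) * (cond_means[educ] - grand_mean) ** 2
    for educ, v in incomes_by_educ.items()
)
ss_within = sum(
    (y - cond_means[educ]) ** 2
    for educ, v in incomes_by_educ.items()
    for y in v
)

# Share of the income variation that lies between education groups.
eta_squared = ss_between / ss_total

for educ, m in cond_means.items():
    print(f"{educ}: mean income {m:.0f}")
print(f"eta squared: {eta_squared:.3f}")
```

The larger the share of the variation that lies between the groups, the more the conditional means differ relative to the spread within each education level.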
The following graph is the scatterg