Resampling: The New Statistics
CHAPTER 23
Correlation and Causation
Preview
Introduction to Correlation and Causation
Correlation: Sum of Products
Preview
The correlation (speaking in a loose way for now) between two
variables measures the strength of the relationship between
them. A positive “linear” correlation between two variables x
and y implies that high values of x are associated with high
values of y, and that low values of x are associated with low
values of y. A negative correlation implies the opposite: high
values of x are associated with low values of y. By definition, a
“correlation coefficient” close to zero indicates little or no linear
relationship between two variables; correlation coefficients
close to 1 and -1 denote a strong positive or negative relationship,
respectively. We will generally use a simpler measure of correlation
than the correlation coefficient, however.
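
To make the sign and size of a correlation coefficient concrete, here is a minimal sketch in Python (not the notation used elsewhere in this book). It computes the conventional Pearson coefficient by hand. The function name `correlation_coefficient` and the two small data sets are made up for illustration only.

```python
import math

def correlation_coefficient(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of the deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Illustrative data (an assumption, not taken from the text).
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]    # high x tends to go with high y
falling = [6, 4, 5, 4, 2]   # high x tends to go with low y

r_pos = correlation_coefficient(x, rising)
r_neg = correlation_coefficient(x, falling)
print(round(r_pos, 3), round(r_neg, 3))
```

The first coefficient comes out close to 1 and the second close to -1, matching the verbal description of positive and negative correlation given above.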
One way to measure correlation with the resampling method
is to rank both variables from highest to lowest, and investigate
how often in randomly-generated samples the rankings
of the two variables are as close to each other as the rankings
in the observed variables. A better approach, because it uses
more of the quantitative information contained in the data
though it requires more computation, is to multiply the values
in each corresponding pair for the two variables,
and compare the sum of the resulting products to the analogous
sum for randomly-generated pairs of the observed variable
values. The last section of the chapter shows how the
strength of a relationship can be determined when the data
are counted, rather than measured. First comes some discussion
of the philosophical issues involved in correlation and
causation.
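
The sum-of-products idea described in this preview can be sketched as follows, again in Python rather than this book's own notation. The sketch assumes a one-sided question (is the observed sum unusually high?) and forms the randomly-generated pairs by shuffling one variable against the other. The names `sum_of_products` and `shuffle_test`, the number of trials, the seed, and the data are all hypothetical choices made for illustration.

```python
import random

def sum_of_products(x, y):
    """Multiply each corresponding pair of values and add up the products."""
    return sum(a * b for a, b in zip(x, y))

def shuffle_test(x, y, trials=10000, seed=0):
    """Proportion of random pairings whose sum of products is at least
    as large as the observed sum (one-sided, an assumption of this sketch)."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    observed = sum_of_products(x, y)
    shuffled = list(y)             # copy, so the caller's data is untouched
    count = 0
    for _ in range(trials):
        rng.shuffle(shuffled)      # pair the x values with y values at random
        if sum_of_products(x, shuffled) >= observed:
            count += 1
    return observed, count / trials

# Illustrative data (an assumption): y rises almost in step with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

observed, proportion = shuffle_test(x, y)
print(observed, proportion)
```

A small proportion suggests that random pairing seldom produces a sum of products as large as the observed one, which is evidence of a positive relationship in these data. A rank-based version would follow the same pattern, with the values replaced by their ranks before the products are formed.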
Introduction to correlation and causation
The questions in Examples 7-1 to 8-3 have been stated in the
following form: Does the independent variable (say