Variable definitions:
FM = Sex (1 = F, 0 = M)
AG = Age, in years
SS = Socioeconomic status (1 = High, 0 = Low), determined by the occupation of the household's principal wage earner
YR = Years of smoking prior to diagnosis or examination
CD = Average rate of smoking, in cigarettes per day
BK = Indicator of birdkeeping (caged birds in the home for more than 6 consecutive months from 5 to 14 years before diagnosis (cases) or examination (controls))
YR~cancer+sex+wage+bird+AG+CD
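The variable list codes sex and socioeconomic status as 0/1 indicators, whereas the R output in this document shows labelled levels such as FEMALE/MALE and HIGH/LOW. A minimal sketch of how such a recoding could be done with `factor()` follows. The vectors `fm` and `ss` hold made-up values for illustration only, and the names `sex` and `status` are assumptions, not objects from the original analysis.

```r
# Hypothetical 0/1 codes for illustration only (not taken from the data file)
fm <- c(1, 0, 0, 1, 0)   # FM: 1 = F, 0 = M
ss <- c(1, 0, 0, 0, 1)   # SS: 1 = High, 0 = Low

# Map each numeric code to a descriptive label
sex    <- factor(fm, levels = c(0, 1), labels = c("MALE", "FEMALE"))
status <- factor(ss, levels = c(0, 1), labels = c("LOW", "HIGH"))

# Count how many observations fall into each level
table(sex)
table(status)
```

Storing the indicators as factors lets `summary()` report a count per level instead of a mean of the 0/1 codes.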
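II. Statistical analysis in R
After starting R, I named the CSV file ””. I then ran each of the following steps in turn.
(1) First, use R to obtain preliminary descriptive statistics for each variable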
Running the commands
data = read.csv('E:/R/')
attach(data)
library(AER)
summary(data)
gives the following results:
         LC          SEX          SS          BK           AG
LUNGCANCER:49   FEMALE: 36   HIGH: 45   BIRD  :67   Min.   :
NOCANCER  :98   MALE  :111   LOW :102   NOBIRD:80   1st Qu.:
                                                    Median :
                                                    Mean   :
                                                    3rd Qu.:
                                                    Max.   :
      YR              CD
Min.   :        Min.   :
1st Qu.:        1st Qu.:
Median :        Median :
Mean   :        Mean   :
3rd Qu.:        3rd Qu.:
Max.   :        Max.   :
> dim(data)
[1] 147 7
The data set contains 147 observations on 7 variables.
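Because the file name is not shown, the loading and summary steps can be sketched on a small stand-in instead. The example writes the six rows printed by `head(data)` in this document to a temporary CSV file and reads them back. The object name `demo` and the temporary file are assumptions for illustration. Since R 4.0.0, `read.csv` no longer converts strings to factors by default, so `stringsAsFactors = TRUE` is passed here to obtain per-level counts of the kind shown for LC, SEX, SS and BK.

```r
# Stand-in data: the six rows shown by head(data) in this document
csv_text <- "LC,SEX,SS,BK,AG,YR,CD
LUNGCANCER,MALE,LOW,BIRD,37,19,12
LUNGCANCER,MALE,LOW,BIRD,41,22,15
LUNGCANCER,MALE,HIGH,NOBIRD,43,19,15
LUNGCANCER,MALE,LOW,BIRD,46,24,15
LUNGCANCER,MALE,LOW,BIRD,49,31,20
LUNGCANCER,MALE,HIGH,NOBIRD,51,24,15"

# Write to a temporary file so that read.csv() is used as in the text
tmp <- tempfile(fileext = ".csv")
writeLines(csv_text, tmp)

# stringsAsFactors = TRUE makes summary() report counts per level
demo <- read.csv(tmp, stringsAsFactors = TRUE)

summary(demo)   # per-variable descriptive statistics
dim(demo)       # number of rows, then number of columns
str(demo)       # storage type of each column
```

Columns are accessed as `demo$AG` in this sketch rather than through `attach()`, which avoids name clashes with other objects in the workspace.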
> head(data)
          LC  SEX   SS     BK AG YR CD
1 LUNGCANCER MALE  LOW   BIRD 37 19 12
2 LUNGCANCER MALE  LOW   BIRD 41 22 15
3 LUNGCANCER MALE HIGH NOBIRD 43 19 15
4 LUNGCANCER MALE  LOW   BIRD 46 24 15
5 LUNGCANCER MALE  LOW   BIRD 49 31 20
6 LUNGCANCER MALE HIGH NOBIRD 51 24 15
>
> tail(