Chapter 6: Data Structures
Data Structures: Classifying the Various Types of Data Sets
Basic Terminology
Data Set
Measurements of items
e.g., Yearly sales volume for your 23 salespeople
e.g., Cost and number produced, daily, for the past month
Elementary Units
The items being measured
e.g., Salespeople, Days, Companies, Catalogs, …
Variables
The type of measurement being done
e.g., Sales volume, Cost, Productivity, Number of defects, …
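The three terms above can be made concrete with a small sketch. It assumes a data set stored as a list of records, one record per elementary unit, with one key per variable. The salesperson labels and all figures are hypothetical, invented only for illustration.

```python
# A data set: one record per elementary unit, one key per variable.
# All labels and numbers below are hypothetical illustration values.
data_set = [
    {"salesperson": "A", "sales_volume": 120, "defects": 3},
    {"salesperson": "B", "sales_volume": 95, "defects": 1},
    {"salesperson": "C", "sales_volume": 143, "defects": 0},
]

# The key that identifies the elementary unit (an assumption of this sketch).
UNIT_KEY = "salesperson"

# Elementary units: the items being measured.
elementary_units = [record[UNIT_KEY] for record in data_set]

# Variables: the types of measurement taken on each unit.
variables = [name for name in data_set[0] if name != UNIT_KEY]

print(elementary_units)
print(variables)
```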
Univariate data set: One variable measured for each elementary unit
e.g., Sales for the top companies.
Can do: Typical summary, diversity, special features
Bivariate data set: Two variables
e.g., Sales and Employees for top computer firms
Can also do: relationship, prediction
Multivariate data set: Three or more variables
e.g., Sales, Employees, Inventories, Profits, …
Can also do: predict one from all other variables
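The classification by number of variables can be sketched as follows. The function names `classify` and `correlation` are hypothetical helpers written for this example, and the sales and employee figures are made up. The mean stands in for a "typical summary" and the correlation coefficient for a "relationship"; other summaries would serve equally well.

```python
from statistics import mean


def classify(num_variables: int) -> str:
    """Name the data set type from the number of variables per unit."""
    if num_variables < 1:
        raise ValueError("a data set needs at least one variable")
    if num_variables == 1:
        return "univariate"
    if num_variables == 2:
        return "bivariate"
    return "multivariate"


def correlation(xs, ys):
    """Pearson correlation: one way to measure a bivariate relationship."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical figures for four firms.
sales = [10.0, 20.0, 30.0, 40.0]
employees = [1.0, 2.0, 3.0, 4.0]

print(classify(1), mean(sales))                    # univariate: typical summary
print(classify(2), correlation(sales, employees))  # bivariate: relationship
print(classify(4))                                 # multivariate
```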
What Kinds of Variables?
Numerical or Categorical (Categories)
Quantitative Variable: measured on a numeric scale
e.g., Sales, # Employees
Can add, rank, count
Qualitative Variable: categorical (ordinal or nominal)
Ordinal Variable: Categories with meaningful ordering
e.g., Bond rating (AA, A, B, …), Diamonds (VSI, SI, …)
Can rank, count
Nominal Variable: Categories without meaningful ordering
e.g., State, Type of business, Field of study
Can count
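The "can add, rank, count" rules for the three variable types can be written down as a small lookup, sketched below. The names `ALLOWED` and `can` are hypothetical, the sample values are invented, and the rating order (AA best, then A, then B) is an assumption based on the example list above.

```python
from collections import Counter

# Which operations are meaningful for each variable type.
ALLOWED = {
    "quantitative": {"add", "rank", "count"},
    "ordinal": {"rank", "count"},
    "nominal": {"count"},
}


def can(kind: str, operation: str) -> bool:
    """Return True if the operation is meaningful for this variable type."""
    return operation in ALLOWED[kind]


# Ordinal: ranking needs an explicit category order (assumed best first).
RATING_ORDER = ["AA", "A", "B"]
ratings = ["A", "AA", "B", "A"]  # hypothetical bond ratings
ranked = sorted(ratings, key=RATING_ORDER.index)

# Nominal: counting is the only meaningful operation.
business_types = ["retail", "tech", "retail"]  # hypothetical values
counts = Counter(business_types)

print(ranked)
print(counts["retail"])
print(can("nominal", "rank"))
```

Sorting the nominal values alphabetically would still run, but the resulting order would carry no meaning, which is the distinction the lookup is meant to capture.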
Time-Series or Cross-Sectional?
Time-Series Data: Data values recorded in a meaningful sequence, such as a stock market index
Elementary units might be days or quarters or years
., D