Data Mining in Pharmaceutical Marketing and Sales Analysis
Pavel Brusilovskiy, PhD
Merck
Contents
What is Data Mining?
Data Mining vs. Statistics: what is the difference?
Why is Data Mining an important tool in pharmaceutical marketing research and sales analysis?
Case Study
What is Data Mining?
“The magic phrase to put in every funding proposal you write to NSF, DARPA, NASA, etc.”
“Data Mining is a process of torturing the data until they confess”
“The magic phrase you use to sell your…..
- database software
- statistical analysis software
- computing hardware
- consulting services”
Data Mining
Data Mining
is a cutting-edge technology for analyzing diverse, multidisciplinary, and complex data
is defined as the non-trivial, iterative process of extracting implicit, previously unknown, and potentially useful information from your data
Data mining can identify relationships in your multidimensional and heterogeneous data that cannot be identified in any other way
Successful application of state-of-the-art data mining technology to marketing, sales, and outcomes research problems (not to mention drug discovery) is indicative of analytic maturity and the success of a company
Data Mining and Related Fields
[Diagram: Data Mining shown in relation to Visualization, Machine Learning, Statistics, and Database]
Is Data Mining an extension of Statistics?
Statistics vs. Data Mining: Concepts
Feature | Statistics | Data Mining
------- | ---------- | -----------
Type of Problem | Well structured | Unstructured / Semi-structured
Inference Role | Explicit inference plays a major role in any analysis | No explicit inference
Objective of the Analysis and Data Collection | The objective is formulated first, then data are collected | Data are rarely collected for the objective of the analysis/modeling
Size of Data Set | Data set is small and hopefully homogeneous | Data set is large and heterogeneous
Paradigm/Approach | Theory-based (deductive) | Synergy of theory-based and heuristic-based approaches (inductive)
Signal-to-Noise Ratio | STNR > 3 | 0 < ST