Data Mining: Classification Schemes

General functionality
- Descriptive data mining
- Predictive data mining
Different views, different classifications
- Kinds of data to be mined
- Kinds of knowledge to be discovered
- Kinds of techniques utilized
- Kinds of applications adapted
Data Mining Goals
- Descriptive goals: uncover patterns and relationships
- Predictive goals: produce models
Classification
Classification is the process of identifying characteristics that determine a (predefined) class for a sample whose class label is unknown. These characteristics are useful not only to understand the data but also to predict the class for a new instance.
For example, we may want to assign credit ratings to customers based on their attributes, such as age, income, and occupation.
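The credit-rating example can be sketched with a k-nearest-neighbour classifier. This is only one of many classification techniques and is chosen here for brevity. The training samples, the two attributes (age and income), and the class labels "low" and "high" are all hypothetical.

```python
from collections import Counter
import math

# Hypothetical labelled samples: (age, annual income in thousands) -> credit rating.
TRAINING = [
    ((25, 20), "low"),
    ((30, 25), "low"),
    ((22, 18), "low"),
    ((45, 80), "high"),
    ((50, 95), "high"),
    ((38, 70), "high"),
]

def classify(sample, training=TRAINING, k=3):
    """Predict the class of `sample` by majority vote among its k nearest neighbours."""
    # Sort the labelled samples by Euclidean distance to the new sample.
    by_distance = sorted(training, key=lambda item: math.dist(item[0], sample))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# A new customer whose class label is unknown.
rating = classify((42, 75))
```

In practice the attributes would be rescaled first so that income does not dominate the distance simply because its values are larger.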
Clustering
Cluster analysis examines data samples without consulting a known class label. Records are grouped on the principle of maximizing intra-cluster similarity while minimizing inter-cluster similarity.
Clustering helps marketers discover distinct customer groups and their characteristics based on purchase patterns.
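As a minimal sketch of grouping records without class labels, the code below uses k-means, one common clustering technique (the slides do not name a specific algorithm). The purchase data, the choice of two clusters, and the initial centroids are assumptions made for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Index of the centroid closest to p (Euclidean distance).
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, average basket value).
purchases = [(1, 10), (2, 12), (1, 8), (9, 60), (10, 55), (8, 65)]
centroids, clusters = kmeans(purchases, centroids=[purchases[0], purchases[3]])
```

Each resulting cluster corresponds to one customer group, and its centroid summarizes that group's typical purchase pattern.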
Web Search engine
- Google ranks Web pages using a combination of page content and the hyperlinks among pages
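To illustrate how hyperlinks alone can yield a ranking, the sketch below runs a simplified PageRank-style iteration over a tiny, hypothetical link graph. It is not a description of how any real search engine works. The damping factor, the iteration count, and the page names are assumptions, and every link target is assumed to appear as a key in the graph.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = link_rank(links)
```

Page C, which receives links from three other pages, ends up with the highest score, while D, which nothing links to, ends up with the lowest.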
Content filtering
- Deliver content that matches a user’s profile
- My Yahoo, My Washington Post, MyInformIT
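A minimal sketch of profile-based filtering follows, assuming that both the user profile and each article are represented as keyword sets and that the match is measured by Jaccard overlap. The profile, the articles, and the threshold are hypothetical, and real services may match content in quite different ways.

```python
def jaccard(a, b):
    """Overlap between two keyword sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def filter_content(profile, articles, threshold=0.1):
    """Return titles of articles that overlap the profile enough, best match first."""
    scored = [(jaccard(profile, keywords), title) for title, keywords in articles.items()]
    return [title for score, title in sorted(scored, reverse=True) if score >= threshold]

# Hypothetical user profile and article keywords.
profile = {"football", "technology", "travel"}
articles = {
    "Cup final preview": {"football", "sports"},
    "New phone review": {"technology", "gadgets", "phone"},
    "Election results": {"politics", "election"},
}
matches = filter_content(profile, articles)
```

Only the two articles that share a keyword with the profile are delivered; the election article is filtered out.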
Study relationships among Web pages
Content Categorization
- yahoo.com, sanook.com