DWDM \ Association Mining Rules

Association rule mining is used in data mining to find relationships among items that occur together in data sets.

Example:
Market Basket Analysis (it allows retailers to identify relationships between items that people frequently buy together).

Association Rule
It is used to predict the occurrence of an item based on the occurrences of other items in the set of transactions.

A rule is an implication expression of the form X → Y, where X and Y are itemsets.
Example: {Milk, Diaper} → {Beer}
That is, people who buy milk and diapers are likely to buy beer as well.

Rule Evaluation Metrics
Itemset: A collection of one or more items. Example: {Milk, Bread, Diaper}
k-itemset: An itemset that contains k items.
Support count (σ): The number of transactions that contain an itemset. E.g. σ({Milk, Bread, Diaper}) = 2
Support (s): The fraction of transactions that contain both X and Y.
Confidence (c): How often the items in Y appear in transactions that contain X.
Frequent itemset: An itemset whose support is greater than or equal to a minsup threshold.

Example
TID ITEMS
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
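
The table above can be encoded as a list of Python sets to check the support count quoted in the definitions. This is only a minimal sketch; the names `transactions` and `support_count` are illustrative choices and do not come from any library.

```python
# Transactions from the example table, one set of items per TID.
transactions = [
    {"Bread", "Milk"},                    # TID 1
    {"Bread", "Diaper", "Beer", "Eggs"},  # TID 2
    {"Milk", "Diaper", "Beer", "Coke"},   # TID 3
    {"Bread", "Milk", "Diaper", "Beer"},  # TID 4
    {"Bread", "Milk", "Diaper", "Coke"},  # TID 5
]

def support_count(itemset, transactions):
    """Return sigma(itemset): the number of transactions containing every item."""
    itemset = set(itemset)
    # `itemset <= t` is the subset test: every item of itemset is in t.
    return sum(1 for t in transactions if itemset <= t)

# {Milk, Bread, Diaper} is a 3-itemset; it appears in TIDs 4 and 5.
print(support_count({"Milk", "Bread", "Diaper"}, transactions))  # 2
```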


Example: support and confidence calculation for the rule {Milk, Diaper} → {Beer}.
Support s = σ({Milk, Diaper, Beer}) / Total Number of Transactions (T) = 2 / 5 = 0.4
Confidence c = σ({Milk, Diaper, Beer}) / σ({Milk, Diaper}) = 2 / 3 ≈ 0.67
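
The same calculation can be reproduced in a few lines of Python. The sketch below assumes the five transactions from the example table; the function names `support` and `confidence` are hypothetical helpers written for this illustration.

```python
# Transactions from the example table (TIDs 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(X, Y):
    """s(X -> Y) = sigma(X ∪ Y) / total number of transactions."""
    return support_count(X | Y) / len(transactions)

def confidence(X, Y):
    """c(X -> Y) = sigma(X ∪ Y) / sigma(X); assumes X occurs at least once."""
    return support_count(X | Y) / support_count(X)

X, Y = {"Milk", "Diaper"}, {"Beer"}
print(support(X, Y))               # 0.4
print(round(confidence(X, Y), 2))  # 0.67
```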

Mining Association Rules
Two-step approach
1. Frequent Itemset Generation: generate all itemsets whose support ≥ minsup.
2. Rule Generation: generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset.
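
The two steps can be sketched as a brute-force Python program over the example transactions. This is an illustration under stated assumptions, not an efficient miner: the confidence threshold `minconf` and the function names are introduced here for the example, and the exhaustive enumeration grows exponentially with the number of items, so it only suits tiny data sets.

```python
from itertools import combinations

# Transactions from the example table (TIDs 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(minsup):
    """Step 1: enumerate every itemset and keep those with support >= minsup."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    freq = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = support_count(frozenset(combo)) / n
            if s >= minsup:
                freq[frozenset(combo)] = s
    return freq

def generate_rules(freq, minconf):
    """Step 2: split each frequent itemset into X -> Y (a binary partition)
    and keep the rules whose confidence is at least minconf (assumed threshold)."""
    rules = []
    for itemset, s in freq.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                X = frozenset(lhs)
                Y = itemset - X
                # Every subset of a frequent itemset is frequent, so freq[X] exists.
                conf = s / freq[X]
                if conf >= minconf:
                    rules.append((set(X), set(Y), s, conf))
    return rules

freq = frequent_itemsets(minsup=0.4)
for X, Y, s, c in generate_rules(freq, minconf=0.6):
    print(sorted(X), "->", sorted(Y), f"s={s:.2f} c={c:.2f}")
```

With minsup = 0.4 and minconf = 0.6, the rule {Milk, Diaper} → {Beer} from the worked example is among the rules printed, with s = 0.40 and c = 0.67.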

