What Are Support and Confidence in Data Mining?

Support and confidence are the two core measures used in Market Basket Analysis (MBA), a common technique in retail and e-commerce for uncovering associations between products that are frequently purchased together.

  1. Support:
  • Support is a measure of how frequently an itemset (a set of items or attributes) appears in a dataset. It indicates the proportion of transactions or records in the dataset that contain the itemset.
  • Mathematically, support for an itemset X is calculated as Support(X) = (Number of transactions containing X) / (Total number of transactions)
  • Support values range from 0 to 1: 1 means the itemset X is present in every transaction, and 0 means it is present in none.
  • High support values suggest that the itemset is common in the dataset.
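
The support formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `support` and the five sample baskets are made up for this example, and each transaction is assumed to be a set of item names.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Return the fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        raise ValueError("support is undefined for an empty dataset")
    wanted = frozenset(itemset)
    # A transaction "contains" the itemset when the itemset is a subset of it.
    hits = sum(1 for basket in transactions if wanted <= basket)
    return hits / len(transactions)


# Hypothetical sample data: five shopping baskets.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "milk", "eggs"}),
]

print(support({"bread"}, baskets))          # 4 of 5 baskets -> 0.8
print(support({"bread", "milk"}, baskets))  # 3 of 5 baskets -> 0.6
```

In this sample, bread alone is more common than bread and milk together, which is always the case: adding items to an itemset can only keep its support the same or lower it.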

  2. Confidence:
  • Confidence measures the strength of association between two itemsets, often referred to as the antecedent (X) and consequent (Y) of an association rule.
  • Confidence is calculated as the conditional probability of finding the consequent Y in a transaction given that the antecedent X is present in that transaction.
  • Mathematically, confidence for a rule X -> Y is calculated as: Confidence(X -> Y) = Support(X ∪ Y) / Support(X). Here Support(X ∪ Y) is the proportion of transactions that contain every item in both X and Y.
  • Confidence values range from 0 to 1: 1 means every transaction that contains X also contains Y, and 0 means no transaction that contains X also contains Y.
  • High confidence values suggest that if the antecedent X is present in a transaction, there is a strong likelihood that the consequent Y will also be present.
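
As a rough sketch of the confidence formula, the Python snippet below computes Confidence(X -> Y) directly from transaction counts. The function name `confidence` and the sample data are assumptions made for illustration. Because both supports share the same denominator (the total number of transactions), the ratio Support(X ∪ Y) / Support(X) reduces to a ratio of counts.

```python
from typing import FrozenSet, Iterable, List


def confidence(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """Return Confidence(X -> Y) = Support(X ∪ Y) / Support(X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)  # every item from both X and Y
    count_x = sum(1 for t in transactions if x <= t)
    if count_x == 0:
        # Support(X) is 0, so the ratio is undefined.
        raise ValueError("antecedent never occurs; confidence is undefined")
    count_xy = sum(1 for t in transactions if xy <= t)
    # The total transaction count cancels out of the support ratio.
    return count_xy / count_x


# Hypothetical sample data: five shopping baskets.
sample = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "milk", "eggs"}),
]

print(confidence({"bread"}, {"milk"}, sample))    # 3 of 4 bread baskets -> 0.75
print(confidence({"butter"}, {"bread"}, sample))  # 2 of 2 butter baskets -> 1.0
print(confidence({"eggs"}, {"butter"}, sample))   # 0 of 1 egg baskets -> 0.0
```

Note that confidence is directional: Confidence(X -> Y) and Confidence(Y -> X) divide by different supports, so they generally differ.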

Hridhya Manoj

Hello, I’m Hridhya Manoj. I’m passionate about technology and its ever-evolving landscape. With a deep love for writing and a curious mind, I enjoy translating complex concepts into understandable, engaging content. Let’s explore the world of tech together.
