Association rules: Introduction, Large Item sets, Apriori Algorithms and applications

Association rule learning is an unsupervised learning technique that checks how the occurrence of one data item depends on the occurrence of another and records these dependencies as rules, so that they can be put to practical use, for example to increase sales. It tries to find interesting relations or associations among the variables of a dataset.

Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.

Association rule learning is an important concept in machine learning, and it is employed in market basket analysis, Web usage mining, continuous production, and other areas. Market basket analysis is a technique used by large retailers to discover associations between items. A supermarket is a simple example: products that are frequently purchased together are placed close to each other.

In addition to the above example from market basket analysis, association rules are employed today in many application areas, including Web usage mining, intrusion detection, continuous production, and bioinformatics. In contrast with sequence mining, association rule learning typically does not consider the order of items either within a transaction or across transactions.

Association rule learning can be divided into three types of algorithms:

Apriori

This algorithm uses frequent itemsets to generate association rules. It is designed to work on databases that contain transactions. It uses a breadth-first search and a hash tree to count candidate itemsets efficiently.

It is mainly used for market basket analysis, where it helps identify products that are likely to be bought together. It can also be used in healthcare to find adverse drug reactions in patients.
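The breadth-first, level-by-level search used by Apriori can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `apriori`, the toy `transactions` list, and the threshold are made up for this example, and support is counted with a plain scan over the transactions instead of a hash tree.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate with a plain scan (no hash tree here).
        level = {}
        for cand in current:
            count = sum(1 for t in transactions if cand <= t)
            if count / n >= min_support:
                level[cand] = count / n
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        keys = list(level)
        joined = {a | b for a in keys for b in keys if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

frequent_itemsets = apriori(transactions, min_support=0.6)
for itemset, supp in sorted(frequent_itemsets.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), supp)
```

The prune step is what makes the search practical: a candidate is counted only if all of its smaller subsets were already found to be frequent.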

Eclat

Eclat stands for Equivalence Class Transformation. This algorithm uses a depth-first search to find frequent itemsets in a transaction database. It often executes faster than the Apriori algorithm.
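A rough sketch of the depth-first idea follows. It assumes the commonly described vertical layout, in which each item is mapped to the set of transaction ids that contain it and itemsets are extended by intersecting those sets. The function name `eclat`, the toy data, and the `min_count` threshold are illustrative assumptions.

```python
# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def eclat(transactions, min_count):
    """Return {itemset: count} for itemsets found in at least min_count transactions."""
    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that are frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with the remaining items to go one level deeper.
            deeper = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_count:
                    deeper.append((other, shared))
            extend(itemset, deeper)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent

eclat_result = eclat(transactions, min_count=3)
for itemset, count in eclat_result.items():
    print(sorted(itemset), count)
```

Because support is read off the size of an intersection, the database does not have to be rescanned for each candidate.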

F-P Growth Algorithm

In the F-P growth algorithm, F-P stands for Frequent Pattern. It is an improved version of the Apriori algorithm. It represents the database as a tree structure known as a frequent pattern tree (FP-tree). The purpose of this tree is to extract the most frequent patterns.
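To make the tree structure concrete, here is a small sketch of how a frequent pattern tree might be built. It covers only the construction step, not the pattern-mining step that follows it. The class `FPNode`, the function `build_fp_tree`, the toy data, and the tie-breaking rule (alphabetical order among equally frequent items) are assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

class FPNode:
    """One node of the tree: an item, a count, and child nodes keyed by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    # First pass: count items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in t)
    frequent = {item: c for item, c in counts.items() if c >= min_count}
    root = FPNode(None)
    # Second pass: insert each transaction as a path from the root.
    for t in transactions:
        # Most frequent items first, so that transactions share prefixes.
        ordered = sorted(
            (item for item in t if item in frequent),
            key=lambda item: (-frequent[item], item),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent

def show(node, depth=0):
    # Print the tree with indentation to make shared prefixes visible.
    for child in node.children.values():
        print("  " * depth + f"{child.item}: {child.count}")
        show(child, depth + 1)

root, frequent_items = build_fp_tree(transactions, min_count=3)
show(root)
```

Transactions that start with the same frequent items share a path, which is why the tree is usually much smaller than the original database.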

Applications of Association Rule Learning

Market Basket Analysis: This is one of the most popular applications of association rule mining. The technique is commonly used by large retailers to determine the associations between items.
Medical Diagnosis: Association rules can support diagnosis by helping to identify the probability of an illness given the symptoms or characteristics of a patient.
Protein Sequence: Association rules help in determining the synthesis of artificial proteins.
Association rules are also used in catalog design, loss-leader analysis, and many other applications.

Working of Association Rule Learning

Association rule learning works on the concept of an if-then statement, such as: if A, then B.

If A -> Then B

Here the "if" element is called the antecedent, and the "then" element is called the consequent. A relationship of this kind between two single items is known as single cardinality. Rule learning is all about creating such rules, and as the number of items in a rule increases, the cardinality increases accordingly. To measure the associations among thousands of data items, several metrics are used. These metrics are given below:

Support
Confidence
Lift

Let’s understand each of them:

Support

Support is how frequently an itemset appears in the dataset. It is defined as the fraction of the transactions that contain the itemset X. If Freq (X) is the number of transactions that contain X and T is the total number of transactions, it can be written as:

Supp (X) = Freq (X) / T
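As a small worked example, the support formula can be computed directly. The `transactions` list and the function name `support` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Supp(X) = Freq(X) / T: the fraction of transactions containing all of X."""
    itemset = set(itemset)
    freq = sum(1 for t in transactions if itemset <= t)
    return freq / len(transactions)

print(support({"bread"}, transactions))            # 4 of 5 transactions -> 0.8
print(support({"diapers", "beer"}, transactions))  # 3 of 5 transactions -> 0.6
```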

Confidence

Confidence indicates how often the rule has been found to be true, that is, how often the items X and Y occur together in the dataset given that X occurs. It is the ratio of the number of transactions that contain both X and Y to the number of transactions that contain X.

Confidence = Freq (X,Y) / Freq (X)
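The confidence of a rule X -> Y can be computed in the same spirit. The sketch below assumes a hypothetical `transactions` list and the illustrative function names `freq` and `confidence`.

```python
# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def freq(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent, transactions):
    """Confidence(X -> Y) = Freq(X, Y) / Freq(X)."""
    both = set(antecedent) | set(consequent)
    return freq(both, transactions) / freq(antecedent, transactions)

# Diapers appear in 4 transactions, and 3 of those also contain beer.
print(confidence({"diapers"}, {"beer"}, transactions))  # 0.75
# Beer appears in 3 transactions, and all of them contain diapers.
print(confidence({"beer"}, {"diapers"}, transactions))  # 1.0
```

The two printed values differ, which shows that confidence is not symmetric: the rule diapers -> beer is weaker than the rule beer -> diapers on this data.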

Lift

Lift measures the strength of a rule and is defined by the following formula:

Lift = Supp (X,Y) / {Supp (X) * Supp (Y)}

It is the ratio of the observed support to the support that would be expected if X and Y were independent of each other. There are three cases:

Lift = 1: The antecedent and the consequent occur independently of each other.

Lift > 1: The two itemsets are positively dependent on each other; the larger the value, the stronger the dependence.

Lift < 1: The two itemsets are negatively associated, so the presence of one makes the other less likely. This can mean that one item is a substitute for the other.
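The lift formula and its interpretation can be checked on a small example. The `transactions` list and the function names `support` and `lift` are assumptions made for this sketch.

```python
# Hypothetical toy data: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def lift(x, y, transactions):
    """Lift = Supp(X, Y) / (Supp(X) * Supp(Y))."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))

# Supp(diapers, beer) = 0.6, Supp(diapers) = 0.8, Supp(beer) = 0.6,
# so the lift is about 0.6 / 0.48 = 1.25: bought together more often than expected.
print(round(lift({"diapers"}, {"beer"}, transactions), 2))

# Supp(bread, milk) = 0.6, Supp(bread) = Supp(milk) = 0.8,
# so the lift is about 0.6 / 0.64, which is below 1: slightly less often than expected.
print(round(lift({"bread"}, {"milk"}, transactions), 2))
```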
