{"id":19997,"date":"2019-05-23T14:31:46","date_gmt":"2019-05-23T12:31:46","guid":{"rendered":"https:\/\/relaunch.striped-giraffe.com\/?p=19997"},"modified":"2026-05-25T14:26:17","modified_gmt":"2026-05-25T12:26:17","slug":"looking-for-similarities-with-association-rule-mining","status":"publish","type":"post","link":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/","title":{"rendered":"Looking for Similarities With Association Rule Mining"},"content":{"rendered":"<section class=\"wpb-content-wrapper\"><p>[vc_row][vc_column width=&#8221;1\/3&#8243;][\/vc_column][vc_column width=&#8221;2\/3&#8243;][vc_column_text]<\/p>\n<h3 style=\"color: #ef6c00; font-weight: bold;\">We like having our ten stripes in a row and we ask ourselves who else does?<\/h3>\n<p>&nbsp;<\/p>\n<p>Life is unfair! Some geniuses are acknowledged and honored during their lifetime, some even long after it, but there are a lot of fundamental research pioneers who have unfortunately fallen into obscurity in society\u2019s collective memory. Petr H\u00e1jek (6 Feb. 1940 \u2013 26 Dec. 2016), Professor and Director of Research at the Institute of Mathematics of the Czech Republic\u2019s Academy of Sciences, was one of those pioneers who built the foundation of modern machine learning theory, and it is he to whom I devote this article.<\/p>\n<p>More than 50 years ago, in 1966 \u2013 the year of the miniskirt, Vietnam War protests and the space race between the United States and the Soviet Union \u2013 Petr H\u00e1jek, together with his colleagues Tom\u00e1\u0161 Havr\u00e1nek, Ivan Havel and Metod\u011bj Chytil, formulated the GUHA (General Unary Hypotheses Automation) principle. The GUHA principle describes the idea of using computers to generate series of hypotheses that describe relations between the properties of objects. 
Ultimately, the GUHA principle forms the foundation for Knowledge Discovery in Databases (KDD), better known as data mining or knowledge extraction, and for unsupervised machine learning techniques, research areas that emerged several decades later.<\/p>\n<p>In this article, I would like to talk about association analysis, a simple but extremely useful unsupervised learning technique that traces back to H\u00e1jek\u2019s GUHA principle. Generally speaking, association analysis addresses the problem of finding subsets of items that occur with high probability, and combinations of such subsets that are likely to occur together.<\/p>\n<p>&nbsp;<\/p>\n<p>As we know, many businesses have enormous amounts of customer and customer purchasing data. For example, a grocery store typically records the names of the products purchased in each transaction. Other retailers and the majority of online dealers also have similar data on purchases at their stores. Using these data, we might want to find subgroups of products that tend to co-occur, in either purchasing or viewing behavior.<\/p>\n<p>Retailers might be interested in cross-promoting product combinations with different deals if they know that people tend to purchase these products together. Online platforms can use this information to generate content recommendations. Another huge topic is strategic product placement within grocery stores, because certain product arrangements encourage consumers to purchase the products together. And of course, the variety of use cases goes far beyond the grocery store example! For instance, Netflix-like applications can recommend content based on viewing behavior, even without any explicit feedback from the customer. 
Further embodiments of these ideas could lead to use cases covering the analysis of social trends and even the promotion of political campaigns.<\/p>\n<p>&nbsp;<\/p>\n<p>Let\u2019s look at the simplest version of association analysis based on grocery basket analysis \u2013 the canonical way of thinking about finding associations among data. We\u2019re assuming that six people have a basket at a grocery store, and they have different objects in their basket at checkout time. Imagine now that we have millions of these checkout transactions and they\u2019re across thousands of products; we now might want to use this type of data to analyze\u00a0<strong>patterns of co-occurrence<\/strong>!<\/p>\n<p>Which products tend to appear in baskets at what rate, and what are the association rules? And\u00a0<strong>knowing those association rules, given that they\u2019ve got one object in their basket, we\u2019re going to predict that they are more likely to have a second object in their basket<\/strong>.<\/p>\n<p>For example, consider the following \u201cgrocery baskets\u201d of six customers:<\/p>\n<table class=\"table1\">\n<tbody>\n<tr>\n<td class=\"td1\"><strong>Basket ID<\/strong><\/td>\n<td class=\"td1\"><strong>Items<\/strong><\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">Cola, Butter, Bread, Cheese<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">2<\/td>\n<td class=\"td1\">Bread, Cheese, Milk<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">3<\/td>\n<td class=\"td1\">Bread, Fanta, Beer, Eggs, Butter<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">4<\/td>\n<td class=\"td1\">Eggs, Salami, Butter, Beer, Cola, Toast<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">5<\/td>\n<td class=\"td1\">Toast, Eggs, Butter, Fanta<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">6<\/td>\n<td class=\"td1\">Bread, Milk, Butter, Fanta, Cheese<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>Based on these data, we want to analyze patterns of co-occurrence:<\/p>\n<p><span id=\"MathJax-Element-1-Frame\" 
class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-1\" class=\"math\"><span id=\"MathJax-Span-2\" class=\"mrow\"><span id=\"MathJax-Span-3\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-4\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-5\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-6\" class=\"mi\">\ud835\udc4e<\/span><span id=\"MathJax-Span-7\" class=\"mi\">\ud835\udc51<\/span><span id=\"MathJax-Span-8\" class=\"mo\">\u27f6<\/span><span id=\"MathJax-Span-9\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-10\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-11\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-12\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-13\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-14\" class=\"mi\">\ud835\udc5f<\/span><\/span><\/span><\/span><\/p>\n<p>Let\u2019s look at this problem from a more abstract perspective.<\/p>\n<p>First, imagine that we have\u00a0<em><span id=\"MathJax-Element-2-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-15\" class=\"math\"><span id=\"MathJax-Span-16\" class=\"mrow\"><span id=\"MathJax-Span-17\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span><\/em>\u00a0different objects. For example, there could be\u00a0<em><span id=\"MathJax-Element-3-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-18\" class=\"math\"><span id=\"MathJax-Span-19\" class=\"mrow\"><span id=\"MathJax-Span-20\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span><\/em>\u00a0different items for sale in a grocery store, and each of these items has a unique index from one to\u00a0<span id=\"MathJax-Element-4-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-21\" class=\"math\"><span id=\"MathJax-Span-22\" class=\"mrow\"><span id=\"MathJax-Span-23\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span><\/p>\n<p><span id=\"MathJax-Element-5-Frame\" class=\"MathJax\" tabindex=\"0\"><span 
id=\"MathJax-Span-24\" class=\"math\"><span id=\"MathJax-Span-25\" class=\"mrow\"><span id=\"MathJax-Span-26\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-27\" class=\"mo\">\u2208<\/span><span id=\"MathJax-Span-28\" class=\"mo\">{<\/span><span id=\"MathJax-Span-29\" class=\"mn\">1<\/span><span id=\"MathJax-Span-30\" class=\"mo\">,<\/span><span id=\"MathJax-Span-31\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-32\" class=\"mo\">,<\/span><span id=\"MathJax-Span-33\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-34\" class=\"mo\">}<\/span><\/span><\/span><\/span><\/p>\n<p>Then let us assume we have a collection of subsets of these items, i.e.\u00a0<span id=\"MathJax-Element-6-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-35\" class=\"math\"><span id=\"MathJax-Span-36\" class=\"mrow\"><span id=\"MathJax-Span-37\" class=\"msubsup\"><span id=\"MathJax-Span-38\" class=\"mi\">\ud835\udc37<\/span><span id=\"MathJax-Span-39\" class=\"mi\">\ud835\udc5b<\/span><\/span><\/span><\/span><\/span>\u00a0for the\u00a0<span id=\"MathJax-Element-7-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-40\" class=\"math\"><span id=\"MathJax-Span-41\" class=\"mrow\"><span id=\"MathJax-Span-42\" class=\"msubsup\"><span id=\"MathJax-Span-43\" class=\"mi\">\ud835\udc5b<\/span><span id=\"MathJax-Span-44\" class=\"texatom\"><span id=\"MathJax-Span-45\" class=\"mrow\"><span id=\"MathJax-Span-46\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-47\" class=\"mi\">\u210e<\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0subset would be a subset of items from one to\u00a0<span id=\"MathJax-Element-8-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-48\" class=\"math\"><span id=\"MathJax-Span-49\" class=\"mrow\"><span id=\"MathJax-Span-50\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span>. 
In other words, we should think of\u00a0<span id=\"MathJax-Element-9-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-51\" class=\"math\"><span id=\"MathJax-Span-52\" class=\"mrow\"><span id=\"MathJax-Span-53\" class=\"msubsup\"><span id=\"MathJax-Span-54\" class=\"mi\">\ud835\udc37<\/span><span id=\"MathJax-Span-55\" class=\"mi\">\ud835\udc5b<\/span><\/span><\/span><\/span><\/span>\u00a0as the list of items purchased by customer\u00a0<span id=\"MathJax-Element-10-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-56\" class=\"math\"><span id=\"MathJax-Span-57\" class=\"mrow\"><span id=\"MathJax-Span-58\" class=\"mi\">\ud835\udc5b<\/span><span id=\"MathJax-Span-59\" class=\"mo\">=<\/span><span id=\"MathJax-Span-60\" class=\"mn\">1<\/span><span id=\"MathJax-Span-61\" class=\"mo\">,<\/span><span id=\"MathJax-Span-62\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-63\" class=\"mo\">,<\/span><span id=\"MathJax-Span-64\" class=\"mspace\"><\/span><span id=\"MathJax-Span-65\" class=\"mi\">\ud835\udc41<\/span><\/span><\/span><\/span><\/p>\n<p><span id=\"MathJax-Element-11-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-66\" class=\"math\"><span id=\"MathJax-Span-67\" class=\"mrow\"><span id=\"MathJax-Span-68\" class=\"msubsup\"><span id=\"MathJax-Span-69\" class=\"mi\">\ud835\udc37<\/span><span id=\"MathJax-Span-70\" class=\"mi\">\ud835\udc5b<\/span><\/span><span id=\"MathJax-Span-72\" class=\"mspace\"><\/span><span id=\"MathJax-Span-73\" class=\"mo\">\u2282<\/span><span id=\"MathJax-Span-74\" class=\"mo\">{<\/span><span id=\"MathJax-Span-75\" class=\"mn\">1<\/span><span id=\"MathJax-Span-76\" class=\"mo\">,<\/span><span id=\"MathJax-Span-77\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-78\" class=\"mo\">,<\/span><span id=\"MathJax-Span-79\" class=\"mspace\"><\/span><span id=\"MathJax-Span-80\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-81\" 
class=\"mo\">}<\/span><\/span><\/span><\/span><\/p>\n<p>So now there are\u00a0<strong>two objectives<\/strong>\u00a0we want to accomplish here. The first is to perform an association analysis. This is simply about locating subsets of products that have a high probability of occurring together. For instance, let\u2019s agree on<\/p>\n<p><span id=\"MathJax-Element-12-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-82\" class=\"math\"><span id=\"MathJax-Span-83\" class=\"mrow\"><span id=\"MathJax-Span-84\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>, being\u00a0<span id=\"MathJax-Element-13-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-85\" class=\"math\"><span id=\"MathJax-Span-86\" class=\"mrow\"><span id=\"MathJax-Span-87\" class=\"mi\">\ud835\udc4f<\/span><span id=\"MathJax-Span-88\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-89\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-90\" class=\"mi\">\ud835\udc4e<\/span><span id=\"MathJax-Span-91\" class=\"mi\">\ud835\udc51<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-14-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-92\" class=\"math\"><span id=\"MathJax-Span-93\" class=\"mrow\"><span id=\"MathJax-Span-94\" class=\"mi\">\ud835\udc5a<\/span><span id=\"MathJax-Span-95\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-96\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-97\" class=\"mi\">\ud835\udc58<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-15-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-98\" class=\"math\"><span id=\"MathJax-Span-99\" class=\"mrow\"><span id=\"MathJax-Span-100\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-101\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-102\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-103\" class=\"mi\">\ud835\udc60<\/span><\/span><\/span><\/span><\/p>\n<p>and then 
we\u2019ll count how many of the grocery baskets contain those three products together. By dividing this figure by the total number of baskets, we get the fraction of customers that purchased bread, milk and eggs together:<\/p>\n<p><span id=\"MathJax-Element-16-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-104\" class=\"math\"><span id=\"MathJax-Span-105\" class=\"mrow\"><span id=\"MathJax-Span-106\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-107\" class=\"mo\">(<\/span><span id=\"MathJax-Span-108\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-109\" class=\"mo\">)<\/span><span id=\"MathJax-Span-110\" class=\"mo\">=<\/span><span id=\"MathJax-Span-111\" class=\"mfrac\"><span id=\"MathJax-Span-112\" class=\"mrow\"><span id=\"MathJax-Span-113\" class=\"mi\">#<\/span><span id=\"MathJax-Span-114\" class=\"mo\">{<\/span><span id=\"MathJax-Span-115\" class=\"mi\">\ud835\udc5b<\/span><span id=\"MathJax-Span-116\" class=\"mtext\">\u00a0<\/span><span id=\"MathJax-Span-117\" class=\"mi\">\ud835\udc60<\/span><span id=\"MathJax-Span-118\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-119\" class=\"mi\">\ud835\udc50<\/span><span id=\"MathJax-Span-120\" class=\"mi\">\u210e<\/span><span id=\"MathJax-Span-121\" class=\"mtext\">\u00a0<\/span><span id=\"MathJax-Span-122\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-123\" class=\"mi\">\u210e<\/span><span id=\"MathJax-Span-124\" class=\"mi\">\ud835\udc4e<\/span><span id=\"MathJax-Span-125\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-126\" class=\"mtext\">\u00a0<\/span><span id=\"MathJax-Span-127\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-128\" class=\"mspace\"><\/span><span id=\"MathJax-Span-129\" class=\"mo\">\u2286<\/span><span id=\"MathJax-Span-130\" class=\"mspace\"><\/span><span id=\"MathJax-Span-131\" class=\"msubsup\"><span id=\"MathJax-Span-132\" class=\"mi\">\ud835\udc37<\/span><span id=\"MathJax-Span-133\" 
class=\"mi\">\ud835\udc5b<\/span><\/span><span id=\"MathJax-Span-134\" class=\"mo\">}<\/span><\/span><span id=\"MathJax-Span-135\" class=\"mi\">\ud835\udc41<\/span><\/span><\/span><\/span><\/span><\/p>\n<p>So, our goal is to find these subsets\u00a0<span id=\"MathJax-Element-17-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-136\" class=\"math\"><span id=\"MathJax-Span-137\" class=\"mrow\"><span id=\"MathJax-Span-138\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>, where\u00a0<span id=\"MathJax-Element-18-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-139\" class=\"math\"><span id=\"MathJax-Span-140\" class=\"mrow\"><span id=\"MathJax-Span-141\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-142\" class=\"mo\">(<\/span><span id=\"MathJax-Span-143\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-144\" class=\"mo\">)<\/span><\/span><\/span><\/span>\u00a0is a large number.<\/p>\n<p>Another objective is to discover association rules. This is the problem of finding objects that are highly correlated. 
Let\u00a0<span id=\"MathJax-Element-19-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-145\" class=\"math\"><span id=\"MathJax-Span-146\" class=\"mrow\"><span id=\"MathJax-Span-147\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-20-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-148\" class=\"math\"><span id=\"MathJax-Span-149\" class=\"mrow\"><span id=\"MathJax-Span-150\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0be two disjoint subsets\u00a0<span id=\"MathJax-Element-21-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-151\" class=\"math\"><span id=\"MathJax-Span-152\" class=\"mrow\"><span id=\"MathJax-Span-153\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-154\" class=\"mspace\"><\/span><span id=\"MathJax-Span-155\" class=\"mo\">\u2229<\/span><span id=\"MathJax-Span-156\" class=\"mspace\"><\/span><span id=\"MathJax-Span-157\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-158\" class=\"mspace\"><\/span><span id=\"MathJax-Span-159\" class=\"mo\">=<\/span><span id=\"MathJax-Span-160\" class=\"mspace\"><\/span><span id=\"MathJax-Span-161\" class=\"mi\">\u2205<\/span><\/span><\/span><\/span>\u00a0of the products\u00a0<span id=\"MathJax-Element-22-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-162\" class=\"math\"><span id=\"MathJax-Span-163\" class=\"mrow\"><span id=\"MathJax-Span-164\" class=\"mo\">{<\/span><span id=\"MathJax-Span-165\" class=\"mn\">1<\/span><span id=\"MathJax-Span-166\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-167\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-168\" class=\"mo\">}<\/span><\/span><\/span><\/span>. 
Then\u00a0<span id=\"MathJax-Element-23-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-169\" class=\"math\"><span id=\"MathJax-Span-170\" class=\"mrow\"><span id=\"MathJax-Span-171\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-172\" class=\"mo\">\u27f9<\/span><span id=\"MathJax-Span-173\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0can be interpreted to mean that purchasing\u00a0<span id=\"MathJax-Element-24-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-174\" class=\"math\"><span id=\"MathJax-Span-175\" class=\"mrow\"><span id=\"MathJax-Span-176\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0increases the likelihood of purchasing\u00a0<span id=\"MathJax-Element-25-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-177\" class=\"math\"><span id=\"MathJax-Span-178\" class=\"mrow\"><span id=\"MathJax-Span-179\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0as well.<\/p>\n<p>In order to compute all those figures, we need to represent our baskets in a different way:<\/p>\n<table class=\"table1\">\n<tbody>\n<tr>\n<td class=\"td1\"><strong>Basket ID<\/strong><\/td>\n<td class=\"td1\"><strong>Beer<\/strong><\/td>\n<td class=\"td1\"><strong>Bread<\/strong><\/td>\n<td class=\"td1\"><strong>Butter<\/strong><\/td>\n<td class=\"td1\"><strong>Cheese<\/strong><\/td>\n<td class=\"td1\"><strong>Cola<\/strong><\/td>\n<td class=\"td1\"><strong>Eggs<\/strong><\/td>\n<td class=\"td1\"><strong>Fanta<\/strong><\/td>\n<td class=\"td1\"><strong>Milk<\/strong><\/td>\n<td class=\"td1\"><strong>Salami<\/strong><\/td>\n<td class=\"td1\"><strong>Toast<\/strong><\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td 
class=\"td1\">0<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">2<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">3<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">4<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">5<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\">6<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">1<\/td>\n<td class=\"td1\">0<\/td>\n<td class=\"td1\">0<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Following our first objective, we\u2019re looking for a subset of those 10 items that co-occur with high probability. This probability could be defined by some threshold, let\u2019s say 49%. 
In this simple case, we can easily see that eggs and butter appear together in 3 of the 6 baskets (baskets 3, 4 and 5), i.e. in 50% of all baskets. This exceeds the threshold, so we can state that these two items co-occur with high probability in this dataset.<\/p>\n<p>Unfortunately, or luckily \ud83d\ude09, real-life situations are much more complex. We will most likely have more than 6 customers and far more items for sale. So how about having\u00a0<span id=\"MathJax-Element-26-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-180\" class=\"math\"><span id=\"MathJax-Span-181\" class=\"mrow\"><span id=\"MathJax-Span-182\" class=\"mi\">\ud835\udc41<\/span><span id=\"MathJax-Span-183\" class=\"mo\">\u2248<\/span><span id=\"MathJax-Span-184\" class=\"msubsup\"><span id=\"MathJax-Span-185\" class=\"mn\">10<\/span><span id=\"MathJax-Span-186\" class=\"mn\">8<\/span><\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-27-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-187\" class=\"math\"><span id=\"MathJax-Span-188\" class=\"mrow\"><span id=\"MathJax-Span-189\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-190\" class=\"mo\">\u2248<\/span><span id=\"MathJax-Span-191\" class=\"msubsup\"><span id=\"MathJax-Span-192\" class=\"mn\">10<\/span><span id=\"MathJax-Span-193\" class=\"mn\">4<\/span><\/span><\/span><\/span><\/span>?<\/p>\n<p>Straightforward combinatorics shows that a simple brute-force search is not feasible in this case.<\/p>\n<p>The first question we could ask ourselves is how many product subsets\u00a0<span id=\"MathJax-Element-28-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-194\" class=\"math\"><span id=\"MathJax-Span-195\" class=\"mrow\"><span id=\"MathJax-Span-196\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-197\" class=\"mo\">\u2286<\/span><span id=\"MathJax-Span-198\" class=\"mo\">{<\/span><span 
id=\"MathJax-Span-199\" class=\"mn\">1<\/span><span id=\"MathJax-Span-200\" class=\"mo\">,<\/span><span id=\"MathJax-Span-201\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-202\" class=\"mo\">,<\/span><span id=\"MathJax-Span-203\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-204\" class=\"mo\">}<\/span><\/span><\/span><\/span>\u00a0are there at all? Each subset can be represented by a binary indicator vector of length\u00a0<span id=\"MathJax-Element-29-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-205\" class=\"math\"><span id=\"MathJax-Span-206\" class=\"mrow\"><span id=\"MathJax-Span-207\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span>\u00a0and the total number of possible vectors is\u00a0<span id=\"MathJax-Element-30-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-208\" class=\"math\"><span id=\"MathJax-Span-209\" class=\"mrow\"><span id=\"MathJax-Span-210\" class=\"msubsup\"><span id=\"MathJax-Span-211\" class=\"mn\">2<\/span><span id=\"MathJax-Span-212\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span><\/span>.<\/p>\n<p>And what if we didn\u2019t check all those possible product combinations, as only very few people (and I think I know them all personally \ud83d\ude0a) will have a basket with all of the grocery store items in it? How about if we only check up to\u00a0<span id=\"MathJax-Element-31-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-213\" class=\"math\"><span id=\"MathJax-Span-214\" class=\"mrow\"><span id=\"MathJax-Span-215\" class=\"mi\">\ud835\udc4f<\/span><\/span><\/span><\/span>\u00a0items? 
Hmm, the number of sets of size\u00a0<span id=\"MathJax-Element-32-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-216\" class=\"math\"><span id=\"MathJax-Span-217\" class=\"mrow\"><span id=\"MathJax-Span-218\" class=\"mi\">\ud835\udc4f<\/span><\/span><\/span><\/span>\u00a0picked from\u00a0<span id=\"MathJax-Element-33-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-219\" class=\"math\"><span id=\"MathJax-Span-220\" class=\"mrow\"><span id=\"MathJax-Span-221\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span>\u00a0items is<\/p>\n<p><span id=\"MathJax-Element-34-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-222\" class=\"math\"><span id=\"MathJax-Span-223\" class=\"mrow\"><span id=\"MathJax-Span-224\" class=\"mo\">(<\/span><span id=\"MathJax-Span-225\" class=\"texatom\"><span id=\"MathJax-Span-226\" class=\"mrow\"><span id=\"MathJax-Span-227\" class=\"msubsup\"><span id=\"MathJax-Span-228\" class=\"mi\"><\/span><span id=\"MathJax-Span-229\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-230\" class=\"mi\">\ud835\udc4f<\/span><\/span><\/span><\/span><span id=\"MathJax-Span-231\" class=\"mo\">)<\/span><span id=\"MathJax-Span-232\" class=\"mo\">=<\/span><span id=\"MathJax-Span-233\" class=\"mfrac\"><span id=\"MathJax-Span-234\" class=\"mrow\"><span id=\"MathJax-Span-235\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-236\" class=\"mo\">!<\/span><\/span><span id=\"MathJax-Span-237\" class=\"mrow\"><span id=\"MathJax-Span-238\" class=\"mi\">\ud835\udc4f<\/span><span id=\"MathJax-Span-239\" class=\"mo\">!<\/span><span id=\"MathJax-Span-240\" class=\"mo\">(<\/span><span id=\"MathJax-Span-241\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-242\" class=\"mspace\"><\/span><span id=\"MathJax-Span-243\" class=\"mo\">\u2013<\/span><span id=\"MathJax-Span-244\" class=\"mspace\"><\/span><span id=\"MathJax-Span-245\" class=\"mi\">\ud835\udc4f<\/span><span id=\"MathJax-Span-246\" 
class=\"mo\">)<\/span><span id=\"MathJax-Span-247\" class=\"mo\">!<\/span><\/span><\/span><\/span><\/span><\/span><\/p>\n<p>and checking that many sets might become a lifetime job with a lot of follow-up tasks for your descendants. If we had only\u00a0<span id=\"MathJax-Element-35-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-248\" class=\"math\"><span id=\"MathJax-Span-249\" class=\"mrow\"><span id=\"MathJax-Span-250\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-251\" class=\"mo\">=<\/span><span id=\"MathJax-Span-252\" class=\"msubsup\"><span id=\"MathJax-Span-253\" class=\"mn\">10<\/span><span id=\"MathJax-Span-254\" class=\"mn\">4<\/span><\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-36-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-255\" class=\"math\"><span id=\"MathJax-Span-256\" class=\"mrow\"><span id=\"MathJax-Span-257\" class=\"mi\">\ud835\udc4f<\/span><span id=\"MathJax-Span-258\" class=\"mo\">=<\/span><span id=\"MathJax-Span-259\" class=\"mn\">5<\/span><\/span><\/span><\/span>, the total number of possible product combinations would be roughly<\/p>\n<p><span id=\"MathJax-Element-37-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-260\" class=\"math\"><span id=\"MathJax-Span-261\" class=\"mrow\"><span id=\"MathJax-Span-262\" class=\"mo\">(<\/span><span id=\"MathJax-Span-263\" class=\"texatom\"><span id=\"MathJax-Span-264\" class=\"mrow\"><span id=\"MathJax-Span-265\" class=\"msubsup\"><span id=\"MathJax-Span-266\" class=\"mi\"><\/span><span id=\"MathJax-Span-267\" class=\"texatom\"><span id=\"MathJax-Span-268\" class=\"mrow\"><span id=\"MathJax-Span-269\" class=\"msubsup\"><span id=\"MathJax-Span-270\" class=\"mn\">10<\/span><span id=\"MathJax-Span-271\" class=\"mn\">4<\/span><\/span><\/span><\/span><span id=\"MathJax-Span-272\" class=\"mn\">5<\/span><\/span><\/span><\/span><span id=\"MathJax-Span-273\" class=\"mo\">)<\/span><span id=\"MathJax-Span-274\" class=\"mo\">\u2248<\/span><span 
id=\"MathJax-Span-275\" class=\"msubsup\"><span id=\"MathJax-Span-276\" class=\"mn\">10<\/span><span id=\"MathJax-Span-277\" class=\"texatom\"><span id=\"MathJax-Span-278\" class=\"mrow\"><span id=\"MathJax-Span-279\" class=\"mn\">18<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/p>\n<p>As you can imagine, the solution to this tricky problem needs an algorithm that reduces the calculation effort and somehow reduces the count of subsets\u00a0<span id=\"MathJax-Element-38-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-280\" class=\"math\"><span id=\"MathJax-Span-281\" class=\"mrow\"><span id=\"MathJax-Span-282\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-283\" class=\"mo\">\u2286<\/span><span id=\"MathJax-Span-284\" class=\"mo\">{<\/span><span id=\"MathJax-Span-285\" class=\"mn\">1<\/span><span id=\"MathJax-Span-286\" class=\"mo\">,<\/span><span id=\"MathJax-Span-287\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-288\" class=\"mo\">,<\/span><span id=\"MathJax-Span-289\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-290\" class=\"mo\">}<\/span><\/span><\/span><\/span>\u00a0that need to be taken into consideration.<\/p>\n<p>But first, let us decide which terms need to be calculated. 
As already mentioned, we\u2019re still looking at all relevant product subsets\u00a0<span id=\"MathJax-Element-39-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-291\" class=\"math\"><span id=\"MathJax-Span-292\" class=\"mrow\"><span id=\"MathJax-Span-293\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-294\" class=\"mo\">\u2286<\/span><span id=\"MathJax-Span-295\" class=\"mo\">{<\/span><span id=\"MathJax-Span-296\" class=\"mn\">1<\/span><span id=\"MathJax-Span-297\" class=\"mo\">,<\/span><span id=\"MathJax-Span-298\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-299\" class=\"mo\">,<\/span><span id=\"MathJax-Span-300\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-301\" class=\"mo\">}<\/span><\/span><\/span><\/span>. And of course, we\u2019re still bearing in mind those disjoint product subsets\u00a0<span id=\"MathJax-Element-40-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-302\" class=\"math\"><span id=\"MathJax-Span-303\" class=\"mrow\"><span id=\"MathJax-Span-304\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-41-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-305\" class=\"math\"><span id=\"MathJax-Span-306\" class=\"mrow\"><span id=\"MathJax-Span-307\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0so that\u00a0<span id=\"MathJax-Element-42-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-308\" class=\"math\"><span id=\"MathJax-Span-309\" class=\"mrow\"><span id=\"MathJax-Span-310\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-311\" class=\"mo\">,<\/span><span id=\"MathJax-Span-312\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-313\" class=\"mo\">\u2282<\/span><span id=\"MathJax-Span-314\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-43-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-315\" class=\"math\"><span 
id=\"MathJax-Span-316\" class=\"mrow\"><span id=\"MathJax-Span-317\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-318\" class=\"mo\">\u222a<\/span><span id=\"MathJax-Span-319\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-320\" class=\"mo\">=<\/span><span id=\"MathJax-Span-321\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-44-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-322\" class=\"math\"><span id=\"MathJax-Span-323\" class=\"mrow\"><span id=\"MathJax-Span-324\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-325\" class=\"mo\">\u2229<\/span><span id=\"MathJax-Span-326\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-327\" class=\"mo\">=<\/span><span id=\"MathJax-Span-328\" class=\"mi\">\u2205<\/span><\/span><\/span><\/span>.<\/p>\n<p>So the first term of interest is the probability of\u00a0<span id=\"MathJax-Element-45-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-329\" class=\"math\"><span id=\"MathJax-Span-330\" class=\"mrow\"><span id=\"MathJax-Span-331\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>, which can also be written as the joint probability of subsets\u00a0<span id=\"MathJax-Element-46-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-332\" class=\"math\"><span id=\"MathJax-Span-333\" class=\"mrow\"><span id=\"MathJax-Span-334\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-47-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-335\" class=\"math\"><span id=\"MathJax-Span-336\" class=\"mrow\"><span id=\"MathJax-Span-337\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>.<\/p>\n<p><span id=\"MathJax-Element-48-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-338\" class=\"math\"><span id=\"MathJax-Span-339\" class=\"mrow\"><span id=\"MathJax-Span-340\" class=\"mi\">\ud835\udc43<\/span><span 
id=\"MathJax-Span-341\" class=\"mo\">(<\/span><span id=\"MathJax-Span-342\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-343\" class=\"mo\">)<\/span><span id=\"MathJax-Span-344\" class=\"mo\">=<\/span><span id=\"MathJax-Span-345\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-346\" class=\"mo\">(<\/span><span id=\"MathJax-Span-347\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-348\" class=\"mo\">,<\/span><span id=\"MathJax-Span-349\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-350\" class=\"mo\">)<\/span><\/span><\/span><\/span><\/p>\n<p>We call this term the prevalence or\u00a0<strong>support<\/strong>\u00a0of the items in\u00a0<span id=\"MathJax-Element-49-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-351\" class=\"math\"><span id=\"MathJax-Span-352\" class=\"mrow\"><span id=\"MathJax-Span-353\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>. In essence, we\u2019re looking for all subsets\u00a0<span id=\"MathJax-Element-50-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-354\" class=\"math\"><span id=\"MathJax-Span-355\" class=\"mrow\"><span id=\"MathJax-Span-356\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0such that the items in\u00a0<span id=\"MathJax-Element-51-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-357\" class=\"math\"><span id=\"MathJax-Span-358\" class=\"mrow\"><span id=\"MathJax-Span-359\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0co-occur often.<\/p>\n<p>E.g. 
if\u00a0<span id=\"MathJax-Element-52-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-360\" class=\"math\"><span id=\"MathJax-Span-361\" class=\"mrow\"><span id=\"MathJax-Span-362\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-363\" class=\"mo\">=<\/span><span id=\"MathJax-Span-364\" class=\"mo\">{<\/span><span id=\"MathJax-Span-365\" class=\"mi\">\ud835\udc38<\/span><span id=\"MathJax-Span-366\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-367\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-368\" class=\"mi\">\ud835\udc60<\/span><span id=\"MathJax-Span-369\" class=\"mo\">,<\/span><span id=\"MathJax-Span-370\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-371\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-372\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-373\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-374\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-375\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-376\" class=\"mo\">,<\/span><span id=\"MathJax-Span-377\" class=\"mi\">\ud835\udc40<\/span><span id=\"MathJax-Span-378\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-379\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-380\" class=\"mi\">\ud835\udc58<\/span><span id=\"MathJax-Span-381\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-53-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-382\" class=\"math\"><span id=\"MathJax-Span-383\" class=\"mrow\"><span id=\"MathJax-Span-384\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-385\" class=\"mo\">=<\/span><span id=\"MathJax-Span-386\" class=\"mo\">{<\/span><span id=\"MathJax-Span-387\" class=\"mi\">\ud835\udc38<\/span><span id=\"MathJax-Span-388\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-389\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-390\" class=\"mi\">\ud835\udc60<\/span><span 
id=\"MathJax-Span-391\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-54-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-392\" class=\"math\"><span id=\"MathJax-Span-393\" class=\"mrow\"><span id=\"MathJax-Span-394\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-395\" class=\"mo\">=<\/span><span id=\"MathJax-Span-396\" class=\"mo\">{<\/span><span id=\"MathJax-Span-397\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-398\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-399\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-400\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-401\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-402\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-403\" class=\"mo\">,<\/span><span id=\"MathJax-Span-404\" class=\"mi\">\ud835\udc40<\/span><span id=\"MathJax-Span-405\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-406\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-407\" class=\"mi\">\ud835\udc58<\/span><span id=\"MathJax-Span-408\" class=\"mo\">}<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-55-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-409\" class=\"math\"><span id=\"MathJax-Span-410\" class=\"mrow\"><span id=\"MathJax-Span-411\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-412\" class=\"mo\">(<\/span><span id=\"MathJax-Span-413\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-414\" class=\"mo\">)<\/span><span id=\"MathJax-Span-415\" class=\"mo\">=<\/span><span id=\"MathJax-Span-416\" class=\"mn\">0<\/span><span id=\"MathJax-Span-417\" class=\"mo\">.<\/span><span id=\"MathJax-Span-418\" class=\"mn\">2<\/span><\/span><\/span><\/span>\u00a0then eggs, butter and milk are purchased together in 20% of all baskets.<\/p>\n<p>Next, we\u2019d like to know the conditional probability of set\u00a0<span 
id=\"MathJax-Element-56-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-419\" class=\"math\"><span id=\"MathJax-Span-420\" class=\"mrow\"><span id=\"MathJax-Span-421\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0given set\u00a0<span id=\"MathJax-Element-57-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-422\" class=\"math\"><span id=\"MathJax-Span-423\" class=\"mrow\"><span id=\"MathJax-Span-424\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>. In other words, if you know that\u00a0<span id=\"MathJax-Element-58-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-425\" class=\"math\"><span id=\"MathJax-Span-426\" class=\"mrow\"><span id=\"MathJax-Span-427\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0is in the grocery basket, what is the probability that\u00a0<span id=\"MathJax-Element-59-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-428\" class=\"math\"><span id=\"MathJax-Span-429\" class=\"mrow\"><span id=\"MathJax-Span-430\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0will also appear in the basket? 
We call this probability the\u00a0<strong>confidence<\/strong>; it serves as the basis for the assumption\u00a0<span id=\"MathJax-Element-60-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-431\" class=\"math\"><span id=\"MathJax-Span-432\" class=\"mrow\"><span id=\"MathJax-Span-433\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-434\" class=\"mo\">\u27f9<\/span><span id=\"MathJax-Span-435\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0(purchasing\u00a0<span id=\"MathJax-Element-61-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-436\" class=\"math\"><span id=\"MathJax-Span-437\" class=\"mrow\"><span id=\"MathJax-Span-438\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0increases the likelihood of also purchasing\u00a0<span id=\"MathJax-Element-62-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-439\" class=\"math\"><span id=\"MathJax-Span-440\" class=\"mrow\"><span id=\"MathJax-Span-441\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>).<\/p>\n<p><span id=\"MathJax-Element-63-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-442\" class=\"math\"><span id=\"MathJax-Span-443\" class=\"mrow\"><span id=\"MathJax-Span-444\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-445\" class=\"mo\">(<\/span><span id=\"MathJax-Span-446\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-447\" class=\"mo\">|<\/span><span id=\"MathJax-Span-448\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-449\" class=\"mo\">)<\/span><span id=\"MathJax-Span-450\" class=\"mo\">=<\/span><span id=\"MathJax-Span-451\" class=\"mfrac\"><span id=\"MathJax-Span-452\" class=\"mrow\"><span id=\"MathJax-Span-453\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-454\" class=\"mo\">(<\/span><span id=\"MathJax-Span-455\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-456\" class=\"mo\">,<\/span><span id=\"MathJax-Span-457\" class=\"mi\">\ud835\udc35<\/span><span 
id=\"MathJax-Span-458\" class=\"mo\">)<\/span><\/span><span id=\"MathJax-Span-459\" class=\"mrow\"><span id=\"MathJax-Span-460\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-461\" class=\"mo\">(<\/span><span id=\"MathJax-Span-462\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-463\" class=\"mo\">)<\/span><\/span><\/span><span id=\"MathJax-Span-464\" class=\"mo\">=<\/span><span id=\"MathJax-Span-465\" class=\"mfrac\"><span id=\"MathJax-Span-466\" class=\"mrow\"><span id=\"MathJax-Span-467\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-468\" class=\"mo\">(<\/span><span id=\"MathJax-Span-469\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-470\" class=\"mo\">)<\/span><\/span><span id=\"MathJax-Span-471\" class=\"mrow\"><span id=\"MathJax-Span-472\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-473\" class=\"mo\">(<\/span><span id=\"MathJax-Span-474\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-475\" class=\"mo\">)<\/span><\/span><\/span><\/span><\/span><\/span><\/p>\n<p>E.g. 
if\u00a0<span id=\"MathJax-Element-64-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-476\" class=\"math\"><span id=\"MathJax-Span-477\" class=\"mrow\"><span id=\"MathJax-Span-478\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-479\" class=\"mo\">=<\/span><span id=\"MathJax-Span-480\" class=\"mo\">{<\/span><span id=\"MathJax-Span-481\" class=\"mi\">\ud835\udc38<\/span><span id=\"MathJax-Span-482\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-483\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-484\" class=\"mi\">\ud835\udc60<\/span><span id=\"MathJax-Span-485\" class=\"mo\">,<\/span><span id=\"MathJax-Span-486\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-487\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-488\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-489\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-490\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-491\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-492\" class=\"mo\">,<\/span><span id=\"MathJax-Span-493\" class=\"mi\">\ud835\udc40<\/span><span id=\"MathJax-Span-494\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-495\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-496\" class=\"mi\">\ud835\udc58<\/span><span id=\"MathJax-Span-497\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-65-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-498\" class=\"math\"><span id=\"MathJax-Span-499\" class=\"mrow\"><span id=\"MathJax-Span-500\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-501\" class=\"mo\">=<\/span><span id=\"MathJax-Span-502\" class=\"mo\">{<\/span><span id=\"MathJax-Span-503\" class=\"mi\">\ud835\udc38<\/span><span id=\"MathJax-Span-504\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-505\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-506\" class=\"mi\">\ud835\udc60<\/span><span 
id=\"MathJax-Span-507\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-66-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-508\" class=\"math\"><span id=\"MathJax-Span-509\" class=\"mrow\"><span id=\"MathJax-Span-510\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-511\" class=\"mo\">=<\/span><span id=\"MathJax-Span-512\" class=\"mo\">{<\/span><span id=\"MathJax-Span-513\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-514\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-515\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-516\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-517\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-518\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-519\" class=\"mo\">,<\/span><span id=\"MathJax-Span-520\" class=\"mi\">\ud835\udc40<\/span><span id=\"MathJax-Span-521\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-522\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-523\" class=\"mi\">\ud835\udc58<\/span><span id=\"MathJax-Span-524\" class=\"mo\">}<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-67-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-525\" class=\"math\"><span id=\"MathJax-Span-526\" class=\"mrow\"><span id=\"MathJax-Span-527\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-528\" class=\"mo\">(<\/span><span id=\"MathJax-Span-529\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-530\" class=\"mo\">|<\/span><span id=\"MathJax-Span-531\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-532\" class=\"mo\">)<\/span><span id=\"MathJax-Span-533\" class=\"mo\">=<\/span><span id=\"MathJax-Span-534\" class=\"mn\">0<\/span><span id=\"MathJax-Span-535\" class=\"mo\">.<\/span><span id=\"MathJax-Span-536\" class=\"mn\">65<\/span><\/span><\/span><\/span>\u00a0then butter and milk were purchased in 65% of all checkouts where eggs were 
also purchased.<\/p>\n<p>Finally, we want to know how much more certain we are of getting\u00a0<span id=\"MathJax-Element-68-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-537\" class=\"math\"><span id=\"MathJax-Span-538\" class=\"mrow\"><span id=\"MathJax-Span-539\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0in the basket if\u00a0<span id=\"MathJax-Element-69-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-540\" class=\"math\"><span id=\"MathJax-Span-541\" class=\"mrow\"><span id=\"MathJax-Span-542\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0is already there, and that compared to the simple prevalence of\u00a0<span id=\"MathJax-Element-70-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-543\" class=\"math\"><span id=\"MathJax-Span-544\" class=\"mrow\"><span id=\"MathJax-Span-545\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>.<\/p>\n<p><span id=\"MathJax-Element-71-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-546\" class=\"math\"><span id=\"MathJax-Span-547\" class=\"mrow\"><span id=\"MathJax-Span-548\" class=\"mi\">\ud835\udc3f<\/span><span id=\"MathJax-Span-549\" class=\"mo\">(<\/span><span id=\"MathJax-Span-550\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-551\" class=\"mo\">|<\/span><span id=\"MathJax-Span-552\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-553\" class=\"mo\">)<\/span><span id=\"MathJax-Span-554\" class=\"mo\">=<\/span><span id=\"MathJax-Span-555\" class=\"mfrac\"><span id=\"MathJax-Span-556\" class=\"mrow\"><span id=\"MathJax-Span-557\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-558\" class=\"mo\">(<\/span><span id=\"MathJax-Span-559\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-560\" class=\"mo\">,<\/span><span id=\"MathJax-Span-561\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-562\" class=\"mo\">)<\/span><\/span><span id=\"MathJax-Span-563\" 
class=\"mrow\"><span id=\"MathJax-Span-564\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-565\" class=\"mo\">(<\/span><span id=\"MathJax-Span-566\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-567\" class=\"mo\">)<\/span><span id=\"MathJax-Span-568\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-569\" class=\"mo\">(<\/span><span id=\"MathJax-Span-570\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-571\" class=\"mo\">)<\/span><\/span><\/span><span id=\"MathJax-Span-572\" class=\"mo\">=<\/span><span id=\"MathJax-Span-573\" class=\"mfrac\"><span id=\"MathJax-Span-574\" class=\"mrow\"><span id=\"MathJax-Span-575\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-576\" class=\"mo\">(<\/span><span id=\"MathJax-Span-577\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-578\" class=\"mo\">)<\/span><\/span><span id=\"MathJax-Span-583\" class=\"mrow\"><span id=\"MathJax-Span-584\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-585\" class=\"mo\">(<\/span><span id=\"MathJax-Span-586\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-587\" class=\"mo\">)<\/span><span id=\"MathJax-Span-588\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-589\" class=\"mo\">(<\/span><span id=\"MathJax-Span-590\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-591\" class=\"mo\">)<\/span><\/span><\/span><span id=\"MathJax-Span-592\" class=\"mo\">=<\/span><span id=\"MathJax-Span-593\" class=\"mfrac\"><span id=\"MathJax-Span-594\" class=\"mrow\"><span id=\"MathJax-Span-595\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-596\" class=\"mo\">(<\/span><span id=\"MathJax-Span-597\" class=\"mi\">\ud835\udc35<\/span><span class=\"mo\">|<\/span><span class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-598\" 
class=\"mo\">)<\/span><\/span><span id=\"MathJax-Span-599\" class=\"mrow\"><span id=\"MathJax-Span-600\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-601\" class=\"mo\">(<\/span><span id=\"MathJax-Span-602\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-603\" class=\"mo\">)<\/span><\/span><\/span><\/span><\/span><\/span><\/p>\n<p>This is called the\u00a0<strong>lift<\/strong>\u00a0of rule\u00a0<span id=\"MathJax-Element-72-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-604\" class=\"math\"><span id=\"MathJax-Span-605\" class=\"mrow\"><span id=\"MathJax-Span-606\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-607\" class=\"mo\">\u27f9<\/span><span id=\"MathJax-Span-608\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0(purchasing\u00a0<span id=\"MathJax-Element-73-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-609\" class=\"math\"><span id=\"MathJax-Span-610\" class=\"mrow\"><span id=\"MathJax-Span-611\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0increases the likelihood of also purchasing\u00a0<span id=\"MathJax-Element-74-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-612\" class=\"math\"><span id=\"MathJax-Span-613\" class=\"mrow\"><span id=\"MathJax-Span-614\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>).<\/p>\n<p>E.g. 
if\u00a0<span id=\"MathJax-Element-75-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-615\" class=\"math\"><span id=\"MathJax-Span-616\" class=\"mrow\"><span id=\"MathJax-Span-617\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-618\" class=\"mo\">=<\/span><span id=\"MathJax-Span-619\" class=\"mo\">{<\/span><span id=\"MathJax-Span-620\" class=\"mi\">\ud835\udc38<\/span><span id=\"MathJax-Span-621\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-622\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-623\" class=\"mi\">\ud835\udc60<\/span><span id=\"MathJax-Span-624\" class=\"mo\">,<\/span><span id=\"MathJax-Span-625\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-626\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-627\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-628\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-629\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-630\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-631\" class=\"mo\">,<\/span><span id=\"MathJax-Span-632\" class=\"mi\">\ud835\udc40<\/span><span id=\"MathJax-Span-633\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-634\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-635\" class=\"mi\">\ud835\udc58<\/span><span id=\"MathJax-Span-636\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-76-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-637\" class=\"math\"><span id=\"MathJax-Span-638\" class=\"mrow\"><span id=\"MathJax-Span-639\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-640\" class=\"mo\">=<\/span><span id=\"MathJax-Span-641\" class=\"mo\">{<\/span><span id=\"MathJax-Span-642\" class=\"mi\">\ud835\udc38<\/span><span id=\"MathJax-Span-643\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-644\" class=\"mi\">\ud835\udc54<\/span><span id=\"MathJax-Span-645\" class=\"mi\">\ud835\udc60<\/span><span 
id=\"MathJax-Span-646\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-77-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-647\" class=\"math\"><span id=\"MathJax-Span-648\" class=\"mrow\"><span id=\"MathJax-Span-649\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-650\" class=\"mo\">=<\/span><span id=\"MathJax-Span-651\" class=\"mo\">{<\/span><span id=\"MathJax-Span-652\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-653\" class=\"mi\">\ud835\udc62<\/span><span id=\"MathJax-Span-654\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-655\" class=\"mi\">\ud835\udc61<\/span><span id=\"MathJax-Span-656\" class=\"mi\">\ud835\udc52<\/span><span id=\"MathJax-Span-657\" class=\"mi\">\ud835\udc5f<\/span><span id=\"MathJax-Span-658\" class=\"mo\">,<\/span><span id=\"MathJax-Span-659\" class=\"mi\">\ud835\udc40<\/span><span id=\"MathJax-Span-660\" class=\"mi\">\ud835\udc56<\/span><span id=\"MathJax-Span-661\" class=\"mi\">\ud835\udc59<\/span><span id=\"MathJax-Span-662\" class=\"mi\">\ud835\udc58<\/span><span id=\"MathJax-Span-663\" class=\"mo\">}<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-78-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-664\" class=\"math\"><span id=\"MathJax-Span-665\" class=\"mrow\"><span id=\"MathJax-Span-666\" class=\"mi\">\ud835\udc3f<\/span><span id=\"MathJax-Span-667\" class=\"mo\">(<\/span><span id=\"MathJax-Span-668\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-669\" class=\"mo\">|<\/span><span id=\"MathJax-Span-670\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-671\" class=\"mo\">)<\/span><span id=\"MathJax-Span-672\" class=\"mo\">=<\/span><span id=\"MathJax-Span-673\" class=\"mn\">1<\/span><span id=\"MathJax-Span-674\" class=\"mo\">.<\/span><span id=\"MathJax-Span-675\" class=\"mn\">6<\/span><\/span><\/span><\/span>\u00a0then, compared with baskets overall, it is 1.6 times as likely that butter and milk will be purchased given 
that eggs are in the grocery basket.<\/p>\n<p>Again, remember that we agreed on setting a threshold\u00a0<span id=\"MathJax-Element-79-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-676\" class=\"math\"><span id=\"MathJax-Span-677\" class=\"mrow\"><span id=\"MathJax-Span-678\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0and that the empirical probability of\u00a0<span id=\"MathJax-Element-80-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-679\" class=\"math\"><span id=\"MathJax-Span-680\" class=\"mrow\"><span id=\"MathJax-Span-681\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0needs to exceed this threshold. The\u00a0<strong>Apriori<\/strong>\u00a0algorithm, which builds on the GUHA principle, gives us a way to find all such subsets\u00a0<span id=\"MathJax-Element-81-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-682\" class=\"math\"><span id=\"MathJax-Span-683\" class=\"mrow\"><span id=\"MathJax-Span-684\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0without having to search through all\u00a0<span id=\"MathJax-Element-82-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-685\" class=\"math\"><span id=\"MathJax-Span-686\" class=\"mrow\"><span id=\"MathJax-Span-687\" class=\"msubsup\"><span id=\"MathJax-Span-688\" class=\"mn\">2<\/span><span id=\"MathJax-Span-689\" class=\"mi\">\ud835\udc3c<\/span><\/span><\/span><\/span><\/span>\u00a0product combinations.<\/p>\n<p>First, we need to set\u00a0<span id=\"MathJax-Element-83-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-690\" class=\"math\"><span id=\"MathJax-Span-691\" class=\"mrow\"><span id=\"MathJax-Span-692\" class=\"mn\">0<\/span><span id=\"MathJax-Span-693\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-694\" class=\"mi\">\ud835\udf0f<\/span><span id=\"MathJax-Span-695\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-696\" 
class=\"mn\">1<\/span><\/span><\/span><\/span>\u00a0such that only a relatively small fraction of all subsets\u00a0<span id=\"MathJax-Element-84-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-697\" class=\"math\"><span id=\"MathJax-Span-698\" class=\"mrow\"><span id=\"MathJax-Span-699\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0satisfies\u00a0<span id=\"MathJax-Element-85-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-700\" class=\"math\"><span id=\"MathJax-Span-701\" class=\"mrow\"><span id=\"MathJax-Span-702\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-703\" class=\"mo\">(<\/span><span id=\"MathJax-Span-704\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-705\" class=\"mo\">)<\/span><span id=\"MathJax-Span-706\" class=\"mo\">&gt;<\/span><span id=\"MathJax-Span-707\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>. The Apriori algorithm restricts the number of subsets\u00a0<span id=\"MathJax-Element-86-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-708\" class=\"math\"><span id=\"MathJax-Span-709\" class=\"mrow\"><span id=\"MathJax-Span-710\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0that need to be checked by comparing their probabilities with the above-mentioned threshold. 
And of course, if\u00a0<span id=\"MathJax-Element-87-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-711\" class=\"math\"><span id=\"MathJax-Span-712\" class=\"mrow\"><span id=\"MathJax-Span-713\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0satisfies\u00a0<span id=\"MathJax-Element-88-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-714\" class=\"math\"><span id=\"MathJax-Span-715\" class=\"mrow\"><span id=\"MathJax-Span-716\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-717\" class=\"mo\">(<\/span><span id=\"MathJax-Span-718\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-719\" class=\"mo\">)<\/span><span id=\"MathJax-Span-720\" class=\"mo\">&gt;<\/span><span id=\"MathJax-Span-721\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>, it appears in more than\u00a0<span id=\"MathJax-Element-89-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-722\" class=\"math\"><span id=\"MathJax-Span-723\" class=\"mrow\"><span id=\"MathJax-Span-724\" class=\"mi\">\ud835\udc41<\/span><span id=\"MathJax-Span-725\" class=\"mo\">\u22c5<\/span><span id=\"MathJax-Span-726\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0of the\u00a0<span id=\"MathJax-Element-90-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-727\" class=\"math\"><span id=\"MathJax-Span-728\" class=\"mrow\"><span id=\"MathJax-Span-729\" class=\"mi\">\ud835\udc41<\/span><\/span><\/span><\/span>\u00a0baskets.<\/p>\n<p>Before we come to the definition of the Apriori algorithm, let me remind you of some basic probability rules that form its foundation:<\/p>\n<p>If\u00a0<span id=\"MathJax-Element-91-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-730\" class=\"math\"><span id=\"MathJax-Span-731\" class=\"mrow\"><span id=\"MathJax-Span-732\" class=\"msup\"><span id=\"MathJax-Span-733\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-734\" class=\"mo\">\u2032<\/span><\/span><span 
id=\"MathJax-Span-735\" class=\"mo\">=<\/span><span id=\"MathJax-Span-736\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-737\" class=\"mo\">\u222a<\/span><span id=\"MathJax-Span-738\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-92-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-739\" class=\"math\"><span id=\"MathJax-Span-740\" class=\"mrow\"><span id=\"MathJax-Span-741\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-742\" class=\"mo\">\u2282<\/span><span id=\"MathJax-Span-743\" class=\"mo\">{<\/span><span id=\"MathJax-Span-744\" class=\"mn\">1<\/span><span id=\"MathJax-Span-745\" class=\"mo\">,<\/span><span id=\"MathJax-Span-746\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-747\" class=\"mo\">,<\/span><span id=\"MathJax-Span-748\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-749\" class=\"mo\">}<\/span><\/span><\/span><\/span>,\u00a0<span id=\"MathJax-Element-93-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-750\" class=\"math\"><span id=\"MathJax-Span-751\" class=\"mrow\"><span id=\"MathJax-Span-752\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-753\" class=\"mo\">\u2282<\/span><span id=\"MathJax-Span-754\" class=\"mo\">{<\/span><span id=\"MathJax-Span-755\" class=\"mn\">1<\/span><span id=\"MathJax-Span-756\" class=\"mo\">,<\/span><span id=\"MathJax-Span-757\" class=\"mo\">\u2026<\/span><span id=\"MathJax-Span-758\" class=\"mo\">,<\/span><span id=\"MathJax-Span-759\" class=\"mi\">\ud835\udc3c<\/span><span id=\"MathJax-Span-760\" class=\"mo\">}<\/span><\/span><\/span><\/span><br \/>\nthen\u00a0<span id=\"MathJax-Element-94-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-761\" class=\"math\"><span id=\"MathJax-Span-762\" class=\"mrow\"><span id=\"MathJax-Span-763\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-764\" class=\"mo\">(<\/span><span id=\"MathJax-Span-765\" 
class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-766\" class=\"mo\">)<\/span><span id=\"MathJax-Span-767\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-768\" class=\"mi\">\ud835\udf0f<\/span><span id=\"MathJax-Span-769\" class=\"mo\">\u27f9<\/span><span id=\"MathJax-Span-770\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-771\" class=\"mo\">(<\/span><span id=\"MathJax-Span-772\" class=\"msup\"><span id=\"MathJax-Span-773\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-774\" class=\"mo\">\u2032<\/span><\/span><span id=\"MathJax-Span-775\" class=\"mo\">)<\/span><span id=\"MathJax-Span-776\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-777\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span><br \/>\nbecause\u00a0<span id=\"MathJax-Element-95-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-778\" class=\"math\"><span id=\"MathJax-Span-779\" class=\"mrow\"><span id=\"MathJax-Span-780\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-781\" class=\"mo\">(<\/span><span id=\"MathJax-Span-782\" class=\"msup\"><span id=\"MathJax-Span-783\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-784\" class=\"mo\">\u2032<\/span><\/span><span id=\"MathJax-Span-785\" class=\"mo\">)<\/span><span id=\"MathJax-Span-786\" class=\"mo\">=<\/span><span id=\"MathJax-Span-787\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-788\" class=\"mo\">(<\/span><span id=\"MathJax-Span-789\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-790\" class=\"mo\">,<\/span><span id=\"MathJax-Span-791\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-792\" class=\"mo\">)<\/span><span id=\"MathJax-Span-793\" class=\"mo\">=<\/span><span id=\"MathJax-Span-794\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-795\" class=\"mo\">(<\/span><span id=\"MathJax-Span-796\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-797\" class=\"mo\">|<\/span><span id=\"MathJax-Span-798\" 
class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-799\" class=\"mo\">)<\/span><span id=\"MathJax-Span-800\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-801\" class=\"mo\">(<\/span><span id=\"MathJax-Span-802\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-803\" class=\"mo\">)<\/span><span id=\"MathJax-Span-804\" class=\"mo\">\u2264<\/span><span id=\"MathJax-Span-805\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-806\" class=\"mo\">(<\/span><span id=\"MathJax-Span-807\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-808\" class=\"mo\">)<\/span><span id=\"MathJax-Span-809\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-810\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span><\/p>\n<p>And conversely, if\u00a0<span id=\"MathJax-Element-96-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-811\" class=\"math\"><span id=\"MathJax-Span-812\" class=\"mrow\"><span id=\"MathJax-Span-813\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-814\" class=\"mo\">(<\/span><span id=\"MathJax-Span-815\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-816\" class=\"mo\">)<\/span><span id=\"MathJax-Span-817\" class=\"mo\">&gt;<\/span><span id=\"MathJax-Span-818\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-97-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-819\" class=\"math\"><span id=\"MathJax-Span-820\" class=\"mrow\"><span id=\"MathJax-Span-821\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-822\" class=\"mo\">\u2282<\/span><span id=\"MathJax-Span-823\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0then\u00a0<span id=\"MathJax-Element-98-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-824\" class=\"math\"><span id=\"MathJax-Span-825\" class=\"mrow\"><span id=\"MathJax-Span-826\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-827\" class=\"mo\">(<\/span><span 
id=\"MathJax-Span-828\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-829\" class=\"mo\">)<\/span><span id=\"MathJax-Span-830\" class=\"mo\">\u2265<\/span><span id=\"MathJax-Span-831\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-832\" class=\"mo\">(<\/span><span id=\"MathJax-Span-833\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-834\" class=\"mo\">)<\/span><span id=\"MathJax-Span-835\" class=\"mo\">&gt;<\/span><span id=\"MathJax-Span-836\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span><\/p>\n<p>The problem can be represented as a lattice diagram:<\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" class=\"alignnone size-full wp-image-23810\" src=\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/lattice-diagram-frequent-subsets-vs-infrequent-subsets-of-products.png\" alt=\"Lattice diagram - frequent subsets vs. infrequent subsets of products\" width=\"1200\" height=\"978\" srcset=\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/lattice-diagram-frequent-subsets-vs-infrequent-subsets-of-products.png 1200w, https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/lattice-diagram-frequent-subsets-vs-infrequent-subsets-of-products-300x245.png 300w, https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/lattice-diagram-frequent-subsets-vs-infrequent-subsets-of-products-1024x835.png 1024w, https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/lattice-diagram-frequent-subsets-vs-infrequent-subsets-of-products-768x626.png 768w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>And here is the simplified version of the Apriori algorithm:<\/p>\n<ul>\n<li>Set the threshold\u00a0<span id=\"MathJax-Element-99-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-837\" class=\"math\"><span id=\"MathJax-Span-838\" class=\"mrow\"><span id=\"MathJax-Span-839\" class=\"mi\">\ud835\udc41<\/span><span id=\"MathJax-Span-840\" 
class=\"mo\">\u22c5<\/span><span id=\"MathJax-Span-841\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>, where\u00a0<span id=\"MathJax-Element-100-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-842\" class=\"math\"><span id=\"MathJax-Span-843\" class=\"mrow\"><span id=\"MathJax-Span-844\" class=\"mn\">0<\/span><span id=\"MathJax-Span-845\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-846\" class=\"mi\">\ud835\udf0f<\/span><span id=\"MathJax-Span-847\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-848\" class=\"mn\">1<\/span><\/span><\/span><\/span>\u00a0is chosen reasonably small<\/li>\n<li><span id=\"MathJax-Element-101-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-849\" class=\"math\"><span id=\"MathJax-Span-850\" class=\"mrow\"><span id=\"MathJax-Span-851\" class=\"texatom\"><span id=\"MathJax-Span-852\" class=\"mrow\"><span id=\"MathJax-Span-853\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-854\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-855\" class=\"texatom\"><span id=\"MathJax-Span-856\" class=\"mrow\"><span id=\"MathJax-Span-857\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-858\" class=\"mo\">=<\/span><span id=\"MathJax-Span-859\" class=\"mn\">1<\/span><\/span><\/span><\/span>: for each single item, check\u00a0<span id=\"MathJax-Element-102-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-860\" class=\"math\"><span id=\"MathJax-Span-861\" class=\"mrow\"><span id=\"MathJax-Span-862\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-863\" class=\"mo\">(<\/span><span id=\"MathJax-Span-864\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-865\" class=\"mo\">)<\/span><span id=\"MathJax-Span-866\" class=\"mo\">\u2265<\/span><span id=\"MathJax-Span-867\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>, i.e. 
it must appear in\u00a0<span id=\"MathJax-Element-103-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-868\" class=\"math\"><span id=\"MathJax-Span-869\" class=\"mrow\"><span id=\"MathJax-Span-870\" class=\"mo\">\u2265<\/span><span id=\"MathJax-Span-871\" class=\"mi\">\ud835\udc41<\/span><span id=\"MathJax-Span-872\" class=\"mo\">\u22c5<\/span><span id=\"MathJax-Span-873\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0baskets. Items with\u00a0<span id=\"MathJax-Element-104-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-874\" class=\"math\"><span id=\"MathJax-Span-875\" class=\"mrow\"><span id=\"MathJax-Span-876\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-877\" class=\"mo\">(<\/span><span id=\"MathJax-Span-878\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-879\" class=\"mo\">)<\/span><span id=\"MathJax-Span-880\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-881\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0are dropped<\/li>\n<li><span id=\"MathJax-Element-105-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-882\" class=\"math\"><span id=\"MathJax-Span-883\" class=\"mrow\"><span id=\"MathJax-Span-884\" class=\"texatom\"><span id=\"MathJax-Span-885\" class=\"mrow\"><span id=\"MathJax-Span-886\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-887\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-888\" class=\"texatom\"><span id=\"MathJax-Span-889\" class=\"mrow\"><span id=\"MathJax-Span-890\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-891\" class=\"mo\">=<\/span><span id=\"MathJax-Span-892\" class=\"mn\">2<\/span><\/span><\/span><\/span>: for all sets\u00a0<span id=\"MathJax-Element-106-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-893\" class=\"math\"><span id=\"MathJax-Span-894\" class=\"mrow\"><span id=\"MathJax-Span-895\" 
class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0built from the items that survived the previous step, check\u00a0<span id=\"MathJax-Element-107-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-896\" class=\"math\"><span id=\"MathJax-Span-897\" class=\"mrow\"><span id=\"MathJax-Span-898\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-899\" class=\"mo\">(<\/span><span id=\"MathJax-Span-900\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-901\" class=\"mo\">)<\/span><span id=\"MathJax-Span-902\" class=\"mo\">\u2265<\/span><span id=\"MathJax-Span-903\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>. Those with\u00a0<span id=\"MathJax-Element-108-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-904\" class=\"math\"><span id=\"MathJax-Span-905\" class=\"mrow\"><span id=\"MathJax-Span-906\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-907\" class=\"mo\">(<\/span><span id=\"MathJax-Span-908\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-909\" class=\"mo\">)<\/span><span id=\"MathJax-Span-910\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-911\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0are dropped<\/li>\n<li>\u2026<\/li>\n<li><span id=\"MathJax-Element-109-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-912\" class=\"math\"><span id=\"MathJax-Span-913\" class=\"mrow\"><span id=\"MathJax-Span-914\" class=\"texatom\"><span id=\"MathJax-Span-915\" class=\"mrow\"><span id=\"MathJax-Span-916\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-917\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-918\" class=\"texatom\"><span id=\"MathJax-Span-919\" class=\"mrow\"><span id=\"MathJax-Span-920\" class=\"mo\">|<\/span><\/span><\/span><span id=\"MathJax-Span-921\" class=\"mo\">=<\/span><span id=\"MathJax-Span-922\" class=\"mi\">\ud835\udefe<\/span><\/span><\/span><\/span>: for all sets\u00a0<span id=\"MathJax-Element-110-Frame\" 
class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-923\" class=\"math\"><span id=\"MathJax-Span-924\" class=\"mrow\"><span id=\"MathJax-Span-925\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0built from the sets that survived the previous step, check\u00a0<span id=\"MathJax-Element-111-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-926\" class=\"math\"><span id=\"MathJax-Span-927\" class=\"mrow\"><span id=\"MathJax-Span-928\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-929\" class=\"mo\">(<\/span><span id=\"MathJax-Span-930\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-931\" class=\"mo\">)<\/span><span id=\"MathJax-Span-932\" class=\"mo\">\u2265<\/span><span id=\"MathJax-Span-933\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>. Those with\u00a0<span id=\"MathJax-Element-112-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-934\" class=\"math\"><span id=\"MathJax-Span-935\" class=\"mrow\"><span id=\"MathJax-Span-936\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-937\" class=\"mo\">(<\/span><span id=\"MathJax-Span-938\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-939\" class=\"mo\">)<\/span><span id=\"MathJax-Span-940\" class=\"mo\">&lt;<\/span><span id=\"MathJax-Span-941\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0are dropped and the rest are kept!<\/li>\n<\/ul>\n<p>As\u00a0<span id=\"MathJax-Element-113-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-942\" class=\"math\"><span id=\"MathJax-Span-943\" class=\"mrow\"><span id=\"MathJax-Span-944\" class=\"mi\">\ud835\udefe<\/span><\/span><\/span><\/span>\u00a0increases, the number of sets that survive will decrease! 
At a certain\u00a0<span id=\"MathJax-Element-114-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-945\" class=\"math\"><span id=\"MathJax-Span-946\" class=\"mrow\"><span id=\"MathJax-Span-947\" class=\"mi\">\ud835\udefe<\/span><\/span><\/span><\/span>, no sets will survive, and we\u2019re done!<\/p>\n<p>Coming back to the above-mentioned association rules \u2013 confidence, support and lift \u2013 we\u2019ll immediately recognize that having all\u00a0<span id=\"MathJax-Element-115-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-948\" class=\"math\"><span id=\"MathJax-Span-949\" class=\"mrow\"><span id=\"MathJax-Span-950\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>\u00a0such that\u00a0<span id=\"MathJax-Element-116-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-951\" class=\"math\"><span id=\"MathJax-Span-952\" class=\"mrow\"><span id=\"MathJax-Span-953\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-954\" class=\"mo\">(<\/span><span id=\"MathJax-Span-955\" class=\"mi\">\ud835\udc3e<\/span><span id=\"MathJax-Span-956\" class=\"mo\">)<\/span><span id=\"MathJax-Span-957\" class=\"mo\">\u2265<\/span><span id=\"MathJax-Span-958\" class=\"mi\">\ud835\udf0f<\/span><\/span><\/span><\/span>\u00a0where\u00a0<span id=\"MathJax-Element-117-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-959\" class=\"math\"><span id=\"MathJax-Span-960\" class=\"mrow\"><span id=\"MathJax-Span-961\" class=\"mi\">\ud835\udc34<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-118-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-962\" class=\"math\"><span id=\"MathJax-Span-963\" class=\"mrow\"><span id=\"MathJax-Span-964\" class=\"mi\">\ud835\udc35<\/span><\/span><\/span><\/span>\u00a0are partitions of\u00a0<span id=\"MathJax-Element-119-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-965\" class=\"math\"><span id=\"MathJax-Span-966\" 
class=\"mrow\"><span id=\"MathJax-Span-967\" class=\"mi\">\ud835\udc3e<\/span><\/span><\/span><\/span>, both\u00a0<span id=\"MathJax-Element-120-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-968\" class=\"math\"><span id=\"MathJax-Span-969\" class=\"mrow\"><span id=\"MathJax-Span-970\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-971\" class=\"mo\">(<\/span><span id=\"MathJax-Span-972\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-973\" class=\"mo\">)<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-121-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-974\" class=\"math\"><span id=\"MathJax-Span-975\" class=\"mrow\"><span id=\"MathJax-Span-976\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-977\" class=\"mo\">(<\/span><span id=\"MathJax-Span-978\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-979\" class=\"mo\">)<\/span><\/span><\/span><\/span>\u00a0are already there, which means\u00a0<span id=\"MathJax-Element-122-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-980\" class=\"math\"><span id=\"MathJax-Span-981\" class=\"mrow\"><span id=\"MathJax-Span-982\" class=\"mi\">\ud835\udc43<\/span><span id=\"MathJax-Span-983\" class=\"mo\">(<\/span><span id=\"MathJax-Span-984\" class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-985\" class=\"mo\">|<\/span><span id=\"MathJax-Span-986\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-987\" class=\"mo\">)<\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-123-Frame\" class=\"MathJax\" tabindex=\"0\"><span id=\"MathJax-Span-988\" class=\"math\"><span id=\"MathJax-Span-989\" class=\"mrow\"><span id=\"MathJax-Span-990\" class=\"mi\">\ud835\udc3f<\/span><span id=\"MathJax-Span-991\" class=\"mo\">(<\/span><span id=\"MathJax-Span-992\" class=\"mi\">\ud835\udc34<\/span><span id=\"MathJax-Span-993\" class=\"mo\">|<\/span><span id=\"MathJax-Span-994\" 
class=\"mi\">\ud835\udc35<\/span><span id=\"MathJax-Span-995\" class=\"mo\">)<\/span><\/span><\/span><\/span>\u00a0can be easily calculated without any additional effort.<\/p>\n<p>There are a lot of brilliant implementations of the Apriori algorithm and association rules. In the following coding example, we\u2019d like to provide you with a Python-native implementation of the Apriori algorithm, which is based on a successive reduction of the initial dataset. For association rules we use the\u00a0<a href=\"http:\/\/rasbt.github.io\/mlxtend\/\">MLxtend framework<\/a>. For this example we use a subset of data from the\u00a0<a href=\"https:\/\/www.kaggle.com\/c\/instacart-market-basket-analysis\/data\">Instacart Market Basket Analysis dataset on Kaggle<\/a>. We will be using purchase order data specifying which products were purchased in which order.<\/p>\n<p>order_products__train.csv contains extracted previous order contents for all customers:<\/p>\n<table class=\"table1\">\n<tbody>\n<tr>\n<td class=\"td1\" width=\"25%\">order_id<\/td>\n<td class=\"td1\" width=\"25%\">product_id<\/td>\n<td class=\"td1\" width=\"25%\">add_to_cart_order<\/td>\n<td class=\"td1\" width=\"25%\">reordered<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\" width=\"25%\">\u00a01<\/td>\n<td class=\"td1\" width=\"25%\">49302<\/td>\n<td class=\"td1\" width=\"25%\">1<\/td>\n<td class=\"td1\" width=\"25%\">1<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\" width=\"25%\">\u00a01<\/td>\n<td class=\"td1\" width=\"25%\">11109<\/td>\n<td class=\"td1\" width=\"25%\">2<\/td>\n<td class=\"td1\" width=\"25%\">1<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\" width=\"25%\">\u00a01<\/td>\n<td class=\"td1\" width=\"25%\">10246<\/td>\n<td class=\"td1\" width=\"25%\">3<\/td>\n<td class=\"td1\" width=\"25%\">0<\/td>\n<\/tr>\n<tr>\n<td class=\"td1\" width=\"25%\">\u00a0\u2026<\/td>\n<td class=\"td1\" width=\"25%\"><\/td>\n<td class=\"td1\" width=\"25%\"><\/td>\n<td class=\"td1\" 
width=\"25%\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The initial dataset contains 131 209 unique orders.<\/p>\n<p>There are two questions we are going to answer in this example:<\/p>\n<p><strong>What are the sets of three items found together more than 393 times in the above dataset?<\/strong><\/p>\n<p><strong>What are the association rules between the articles within those subsets?<\/strong><\/p>\n<p>Code section:<\/p>\n<p>&nbsp;<\/p>\n<div class=\"code-toolbar\">\n<pre class=\" language-java line-numbers\" data-start=\"1\"><code class=\" language-java\"><span class=\"token keyword\">import<\/span> numpy as np\r\n<span class=\"token keyword\">import<\/span> pandas as pd\r\n<span class=\"token keyword\">import<\/span> <span class=\"token namespace\">matplotlib<span class=\"token punctuation\">.<\/span>pyplot<\/span> as plt\r\n<span class=\"token keyword\">import<\/span> itertools as it\r\nfrom mlxtend<span class=\"token punctuation\">.<\/span>frequent_patterns <span class=\"token keyword\">import<\/span> association_rules\r\n\r\n#  <span class=\"token class-name\">The<\/span> below function finds all of the product groupings of a specified size<span class=\"token punctuation\">,<\/span> \r\n#  and counts how many times they appear\r\ndef <span class=\"token function\">find_groups_of_size_n<\/span><span class=\"token punctuation\">(<\/span>data<span class=\"token punctuation\">,<\/span> size<span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n    group_by <span class=\"token operator\">=<\/span> data<span class=\"token punctuation\">.<\/span><span class=\"token function\">groupby<\/span><span class=\"token punctuation\">(<\/span><span class=\"token string\">\"order_id\"<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">[<\/span><span class=\"token string\">'product_id'<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span><span class=\"token 
function\">unique<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span>\r\n    group_by <span class=\"token operator\">=<\/span> group_by<span class=\"token punctuation\">.<\/span><span class=\"token function\">apply<\/span><span class=\"token punctuation\">(<\/span>lambda x<span class=\"token operator\">:<\/span> <span class=\"token function\">sorted<\/span><span class=\"token punctuation\">(<\/span>x<span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n    group_by <span class=\"token operator\">=<\/span> pd<span class=\"token punctuation\">.<\/span><span class=\"token class-name\">DataFrame<\/span><span class=\"token punctuation\">(<\/span>group_by<span class=\"token punctuation\">)<\/span>\r\n    def <span class=\"token function\">groupings<\/span><span class=\"token punctuation\">(<\/span>x<span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n        <span class=\"token keyword\">return<\/span> <span class=\"token function\">list<\/span><span class=\"token punctuation\">(<\/span>it<span class=\"token punctuation\">.<\/span><span class=\"token function\">combinations<\/span><span class=\"token punctuation\">(<\/span>x<span class=\"token punctuation\">,<\/span>size<span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n    \r\n    group_by<span class=\"token punctuation\">[<\/span><span class=\"token string\">'groups'<\/span><span class=\"token punctuation\">]<\/span> <span class=\"token operator\">=<\/span> group_by<span class=\"token punctuation\">[<\/span><span class=\"token string\">'product_id'<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span><span class=\"token function\">apply<\/span><span class=\"token punctuation\">(<\/span>groupings<span class=\"token punctuation\">)<\/span>\r\n    counts <span class=\"token operator\">=<\/span> pd<span class=\"token 
punctuation\">.<\/span><span class=\"token class-name\">Series<\/span><span class=\"token punctuation\">(<\/span><span class=\"token function\">list<\/span><span class=\"token punctuation\">(<\/span>it<span class=\"token punctuation\">.<\/span>chain<span class=\"token punctuation\">.<\/span><span class=\"token function\">from_iterable<\/span><span class=\"token punctuation\">(<\/span>group_by<span class=\"token punctuation\">[<\/span><span class=\"token string\">'groups'<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span>values<span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">.<\/span><span class=\"token function\">value_counts<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span>\r\n    <span class=\"token keyword\">return<\/span> counts\r\n\r\n# <span class=\"token class-name\">This<\/span> function returns product names <span class=\"token keyword\">for<\/span> all product ids in the list\r\ndef <span class=\"token function\">product_lookup<\/span><span class=\"token punctuation\">(<\/span>product_ids<span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n    <span class=\"token keyword\">try<\/span><span class=\"token operator\">:<\/span>\r\n        <span class=\"token function\">len<\/span><span class=\"token punctuation\">(<\/span>product_ids<span class=\"token punctuation\">)<\/span>\r\n        names <span class=\"token operator\">=<\/span> <span class=\"token punctuation\">[<\/span>products<span class=\"token punctuation\">[<\/span>products<span class=\"token punctuation\">.<\/span>product_id <span class=\"token operator\">==<\/span> pid<span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span>iloc<span class=\"token punctuation\">[<\/span><span class=\"token number\">0<\/span><span class=\"token 
punctuation\">,<\/span><span class=\"token number\">1<\/span><span class=\"token punctuation\">]<\/span> <span class=\"token keyword\">for<\/span> pid in product_ids<span class=\"token punctuation\">]<\/span>\r\n    except TypeError<span class=\"token operator\">:<\/span>\r\n        names <span class=\"token operator\">=<\/span> products<span class=\"token punctuation\">[<\/span>products<span class=\"token punctuation\">.<\/span>product_id <span class=\"token operator\">==<\/span> product_ids<span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span>iloc<span class=\"token punctuation\">[<\/span><span class=\"token number\">0<\/span><span class=\"token punctuation\">,<\/span><span class=\"token number\">1<\/span><span class=\"token punctuation\">]<\/span>\r\n    \r\n    <span class=\"token keyword\">return<\/span> names\r\n\r\n# <span class=\"token class-name\">This<\/span> is a service function <span class=\"token keyword\">for<\/span> the <span class=\"token class-name\">Apriori<\/span> algorithm removing duplicates from the subset of products\r\ndef <span class=\"token function\">getListOfProducts<\/span><span class=\"token punctuation\">(<\/span>ser<span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n    products<span class=\"token operator\">=<\/span><span class=\"token punctuation\">[<\/span><span class=\"token punctuation\">]<\/span>\r\n    <span class=\"token keyword\">for<\/span> element in ser<span class=\"token punctuation\">.<\/span>index<span class=\"token operator\">:<\/span>\r\n        products<span class=\"token operator\">+=<\/span><span class=\"token punctuation\">(<\/span><span class=\"token function\">list<\/span><span class=\"token punctuation\">(<\/span>element<span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n    uniquelist <span class=\"token operator\">=<\/span> <span class=\"token function\">set<\/span><span class=\"token 
punctuation\">(<\/span>products<span class=\"token punctuation\">)<\/span>\r\n    <span class=\"token keyword\">return<\/span> <span class=\"token function\">list<\/span><span class=\"token punctuation\">(<\/span>uniquelist<span class=\"token punctuation\">)<\/span> \r\n\r\n\r\n\r\n# <span class=\"token class-name\">This<\/span> is a simple <span class=\"token keyword\">native<\/span> implementation of the <span class=\"token class-name\">Apriori<\/span> algorithm \r\n#  <span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">--<\/span><span class=\"token operator\">-<\/span>\r\n# <span class=\"token class-name\">Find<\/span> all combinations of three items appearing in at least <span class=\"token punctuation\">(<\/span>number of orders <span class=\"token operator\">*<\/span> threshold<span class=\"token punctuation\">)<\/span> grocery baskets \r\n# <span class=\"token 
class-name\">Parameters<\/span><span class=\"token operator\">:<\/span>\r\n#     data <span class=\"token operator\">:<\/span> pandas<span class=\"token punctuation\">.<\/span><span class=\"token class-name\">DataFrame<\/span><span class=\"token operator\">:<\/span>   order_id <span class=\"token operator\">:<\/span>  product_id\r\n#     threshold <span class=\"token operator\">:<\/span> <span class=\"token number\">0<\/span> <span class=\"token operator\">&lt;<\/span> real <span class=\"token operator\">&lt;<\/span> <span class=\"token number\">1<\/span> \r\n# <span class=\"token class-name\">Returns<\/span><span class=\"token operator\">:<\/span>\r\n#     pandas<span class=\"token punctuation\">.<\/span><span class=\"token class-name\">Series<\/span><span class=\"token operator\">:<\/span>  <span class=\"token function\">INDEX<\/span><span class=\"token punctuation\">(<\/span>list of frequent items<span class=\"token punctuation\">)<\/span>  <span class=\"token function\">VALUE<\/span><span class=\"token punctuation\">(<\/span>count of baskets<span class=\"token punctuation\">)<\/span> \r\n#\r\ndef <span class=\"token function\">findItemSets<\/span><span class=\"token punctuation\">(<\/span>data<span class=\"token punctuation\">,<\/span> threshold<span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n    subset<span class=\"token operator\">=<\/span>data<span class=\"token punctuation\">.<\/span><span class=\"token function\">groupby<\/span><span class=\"token punctuation\">(<\/span><span class=\"token string\">\"order_id\"<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">[<\/span><span class=\"token string\">'product_id'<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span><span class=\"token function\">count<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span>\r\n    orders<span class=\"token operator\">=<\/span> 
subset<span class=\"token punctuation\">[<\/span>subset<span class=\"token operator\">&gt;=<\/span><span class=\"token number\">3<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span><span class=\"token function\">count<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span>\r\n              \r\n    subset_reduced<span class=\"token operator\">=<\/span> data<span class=\"token punctuation\">[<\/span>data<span class=\"token punctuation\">.<\/span>order_id<span class=\"token punctuation\">.<\/span><span class=\"token function\">isin<\/span><span class=\"token punctuation\">(<\/span>subset<span class=\"token punctuation\">[<\/span>subset<span class=\"token operator\">&gt;=<\/span><span class=\"token number\">3<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span>index<span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">]<\/span>\r\n    <span class=\"token function\">print<\/span><span class=\"token punctuation\">(<\/span>subset_reduced<span class=\"token punctuation\">.<\/span><span class=\"token function\">head<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n    <span class=\"token keyword\">for<\/span> group_size in <span class=\"token function\">range<\/span><span class=\"token punctuation\">(<\/span><span class=\"token number\">1<\/span><span class=\"token punctuation\">,<\/span> <span class=\"token number\">4<\/span><span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n        result<span class=\"token operator\">=<\/span><span class=\"token function\">find_groups_of_size_n<\/span><span class=\"token punctuation\">(<\/span>subset_reduced<span class=\"token punctuation\">,<\/span>group_size<span class=\"token punctuation\">)<\/span>\r\n        <span class=\"token function\">print<\/span><span 
class=\"token punctuation\">(<\/span><span class=\"token string\">\"run \"<\/span><span class=\"token punctuation\">,<\/span>group_size<span class=\"token punctuation\">,<\/span> <span class=\"token string\">\"count of elements \"<\/span><span class=\"token punctuation\">,<\/span> <span class=\"token function\">len<\/span><span class=\"token punctuation\">(<\/span>result<span class=\"token punctuation\">[<\/span>result<span class=\"token operator\">&gt;=<\/span>orders<span class=\"token operator\">*<\/span>threshold<span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n\r\n        subset_reduced<span class=\"token operator\">=<\/span> subset_reduced<span class=\"token punctuation\">[<\/span>subset_reduced<span class=\"token punctuation\">.<\/span>product_id<span class=\"token punctuation\">.<\/span><span class=\"token function\">isin<\/span><span class=\"token punctuation\">(<\/span><span class=\"token function\">getListOfProducts<\/span><span class=\"token punctuation\">(<\/span>result<span class=\"token punctuation\">[<\/span>result<span class=\"token operator\">&gt;=<\/span>orders<span class=\"token operator\">*<\/span>threshold<span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">]<\/span>\r\n    <span class=\"token keyword\">return<\/span>   result<span class=\"token punctuation\">[<\/span>result<span class=\"token operator\">&gt;=<\/span>orders<span class=\"token operator\">*<\/span>threshold<span class=\"token punctuation\">]<\/span>\r\n    \r\n\r\n# <span class=\"token class-name\">This<\/span> function returns the product names <span class=\"token keyword\">for<\/span> all product ids in the <span class=\"token number\">2D<\/span> array\r\ndef <span class=\"token function\">product_lookup_2D<\/span><span class=\"token punctuation\">(<\/span>product_ids<span class=\"token 
punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n   names <span class=\"token operator\">=<\/span><span class=\"token punctuation\">[<\/span><span class=\"token punctuation\">]<\/span>\r\n   <span class=\"token keyword\">for<\/span> element in product_ids<span class=\"token operator\">:<\/span>\r\n         name <span class=\"token operator\">=<\/span>  <span class=\"token punctuation\">[<\/span>products<span class=\"token punctuation\">[<\/span>products<span class=\"token punctuation\">.<\/span>product_id <span class=\"token operator\">==<\/span> pid<span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">.<\/span>iloc<span class=\"token punctuation\">[<\/span><span class=\"token number\">0<\/span><span class=\"token punctuation\">,<\/span><span class=\"token number\">1<\/span><span class=\"token punctuation\">]<\/span> <span class=\"token keyword\">for<\/span> pid in element<span class=\"token punctuation\">]<\/span>\r\n         names<span class=\"token punctuation\">.<\/span><span class=\"token function\">append<\/span><span class=\"token punctuation\">(<\/span>name<span class=\"token punctuation\">)<\/span>\r\n   <span class=\"token keyword\">return<\/span> names \r\n\r\n# <span class=\"token class-name\">Generates<\/span> a <span class=\"token class-name\">DataFrame<\/span> of association rules including the metrics <span class=\"token string\">'score'<\/span><span class=\"token punctuation\">,<\/span> <span class=\"token string\">'confidence'<\/span><span class=\"token punctuation\">,<\/span> and <span class=\"token string\">'lift'<\/span>\r\n#    <span class=\"token class-name\">For<\/span> usage examples<span class=\"token punctuation\">,<\/span> please see http<span class=\"token operator\">:<\/span><span class=\"token operator\">\/<\/span><span class=\"token operator\">\/<\/span>rasbt<span class=\"token punctuation\">.<\/span>github<span class=\"token punctuation\">.<\/span>io<span class=\"token 
operator\">\/<\/span>mlxtend<span class=\"token operator\">\/<\/span>user_guide<span class=\"token operator\">\/<\/span>frequent_patterns<span class=\"token operator\">\/<\/span>association_rules<span class=\"token operator\">\/<\/span>\r\n# <span class=\"token class-name\">Returns<\/span>\r\n#   pandas <span class=\"token class-name\">DataFrame<\/span> <span class=\"token keyword\">with<\/span> columns <span class=\"token string\">\"antecedents\"<\/span> and <span class=\"token string\">\"consequents\"<\/span> that store itemsets<span class=\"token punctuation\">,<\/span> plus the scoring metric columns\r\n#\r\ndef <span class=\"token function\">association<\/span><span class=\"token punctuation\">(<\/span>resultFrame<span class=\"token punctuation\">,<\/span> threshold<span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n    rules<span class=\"token operator\">=<\/span> <span class=\"token function\">association_rules<\/span><span class=\"token punctuation\">(<\/span>resultFrame<span class=\"token punctuation\">,<\/span> metric<span class=\"token operator\">=<\/span><span class=\"token string\">\"confidence\"<\/span><span class=\"token punctuation\">,<\/span> min_threshold<span class=\"token operator\">=<\/span>threshold<span class=\"token punctuation\">,<\/span> support_only<span class=\"token operator\">=<\/span><span class=\"token class-name\">True<\/span><span class=\"token punctuation\">)<\/span>\r\n    <span class=\"token function\">print<\/span><span class=\"token punctuation\">(<\/span>rules<span class=\"token punctuation\">[<\/span><span class=\"token punctuation\">[<\/span><span class=\"token string\">'antecedents'<\/span><span class=\"token punctuation\">,<\/span><span class=\"token string\">'consequents'<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">]<\/span><span class=\"token punctuation\">)<\/span>\r\n\r\n#<span class=\"token operator\">--<\/span><span class=\"token 
operator\">-<\/span>\r\n#<span class=\"token operator\">--<\/span><span class=\"token operator\">-<\/span>  ENTRY POINT MAIN\r\n#<span class=\"token operator\">--<\/span><span class=\"token operator\">-<\/span>\r\n<span class=\"token keyword\">if<\/span> __name__ <span class=\"token operator\">==<\/span> <span class=\"token string\">'__main__'<\/span><span class=\"token operator\">:<\/span>\r\n  \r\n    file_path <span class=\"token operator\">=<\/span> <span class=\"token string\">'.\/resources\/grocery\/order_products__train.csv'<\/span>\r\n    data <span class=\"token operator\">=<\/span> pd<span class=\"token punctuation\">.<\/span><span class=\"token function\">read_csv<\/span><span class=\"token punctuation\">(<\/span>file_path<span class=\"token punctuation\">)<\/span>\r\n    prod_filepath <span class=\"token operator\">=<\/span> <span class=\"token string\">\".\/resources\/grocery\/products.csv\"<\/span>\r\n    products <span class=\"token operator\">=<\/span> pd<span class=\"token punctuation\">.<\/span><span class=\"token function\">read_csv<\/span><span class=\"token punctuation\">(<\/span>prod_filepath<span class=\"token punctuation\">)<\/span>\r\n                       \r\n    unique_orders <span class=\"token operator\">=<\/span> <span class=\"token function\">len<\/span><span class=\"token punctuation\">(<\/span>data<span class=\"token punctuation\">.<\/span>order_id<span class=\"token punctuation\">.<\/span><span class=\"token function\">unique<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n    <span class=\"token function\">print<\/span><span class=\"token punctuation\">(<\/span><span class=\"token string\">\"Overall count of unique orders \"<\/span><span class=\"token punctuation\">,<\/span> unique_orders<span class=\"token punctuation\">)<\/span>\r\n    ser<span class=\"token operator\">=<\/span> <span class=\"token function\">findItemSets<\/span><span 
class=\"token punctuation\">(<\/span>data<span class=\"token punctuation\">,<\/span> <span class=\"token number\">0.003<\/span> <span class=\"token punctuation\">)<\/span>\r\n   \r\n\r\n    <span class=\"token keyword\">for<\/span> index<span class=\"token punctuation\">,<\/span> values in ser<span class=\"token punctuation\">.<\/span><span class=\"token function\">items<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span><span class=\"token operator\">:<\/span>\r\n        <span class=\"token function\">print<\/span><span class=\"token punctuation\">(<\/span><span class=\"token function\">product_lookup<\/span><span class=\"token punctuation\">(<\/span>index<span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">,<\/span><span class=\"token string\">\"           occured in \"<\/span><span class=\"token punctuation\">,<\/span>  values<span class=\"token punctuation\">,<\/span> <span class=\"token string\">\" baskets\"<\/span><span class=\"token punctuation\">)<\/span>\r\n   \r\n    resultFrame<span class=\"token operator\">=<\/span> ser<span class=\"token punctuation\">.<\/span><span class=\"token function\">to_frame<\/span><span class=\"token punctuation\">(<\/span>name<span class=\"token operator\">=<\/span><span class=\"token string\">'support'<\/span><span class=\"token punctuation\">)<\/span>\r\n    resultFrame<span class=\"token punctuation\">[<\/span><span class=\"token string\">'itemsets'<\/span><span class=\"token punctuation\">]<\/span> <span class=\"token operator\">=<\/span> <span class=\"token function\">product_lookup_2D<\/span><span class=\"token punctuation\">(<\/span>resultFrame<span class=\"token punctuation\">.<\/span>index<span class=\"token punctuation\">.<\/span><span class=\"token function\">tolist<\/span><span class=\"token punctuation\">(<\/span><span class=\"token punctuation\">)<\/span><span class=\"token punctuation\">)<\/span>\r\n    resultFrame<span class=\"token 
punctuation\">[<\/span><span class=\"token string\">'support'<\/span><span class=\"token punctuation\">]<\/span> <span class=\"token operator\">=<\/span> <span class=\"token punctuation\">(<\/span>resultFrame<span class=\"token punctuation\">[<\/span><span class=\"token string\">'support'<\/span><span class=\"token punctuation\">]<\/span> <span class=\"token operator\">\/<\/span> unique_orders<span class=\"token punctuation\">)<\/span><span class=\"token operator\">*<\/span><span class=\"token number\">100<\/span>\r\n   \r\n    <span class=\"token function\">association<\/span><span class=\"token punctuation\">(<\/span>resultFrame<span class=\"token punctuation\">,<\/span> <span class=\"token number\">0.5<\/span><span class=\"token punctuation\">)<\/span><\/code><\/pre>\n<div class=\"toolbar\">\n<div class=\"toolbar-item\"><a>Copy<\/a><\/div>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>And the console output looks as follows:<\/p>\n<pre>Overall count of unique orders  131209\r\n    order_id    product_id    add_to_cart_order    reordered\r\n0          1         49302                    1            1\r\n1          1         11109                    2            1\r\n2          1         10246                    3            0\r\n3          1         49683                    4            0\r\n4          1         43633                    5            1\r\n\r\nrun  1 count of elements  589\r\nrun  2 count of elements  406\r\nrun  3 count of elements  18\r\n\r\n['Bag of Organic Bananas', 'Organic Strawberries', 'Organic Hass Avocado']     occured in 710 baskets\r\n['Bag of Organic Bananas', 'Organic Strawberries', 'Organic Raspberries']      occured in 649 baskets\r\n['Bag of Organic Bananas', 'Organic Strawberries', 'Organic Baby Spinach']     occured in 587 baskets\r\n['Bag of Organic Bananas', 'Organic Raspberries', 'Organic Hass Avocado']      occured in 531 baskets\r\n['Bag of Organic Bananas', 'Organic Baby Spinach', 'Organic Hass Avocado']     occured 
in 497 baskets\r\n['Organic Baby Spinach', 'Banana', 'Organic Avocado']                          occured in 484 baskets\r\n['Banana', 'Large Lemon', 'Organic Avocado']                                   occured in 477 baskets\r\n['Banana', 'Limes', 'Large Lemon']                                             occured in 452 baskets\r\n['Bag of Organic Bananas', 'Organic Strawberries', 'Organic Cucumber']         occured in 424 baskets\r\n['Limes', 'Large Lemon', 'Organic Avocado']                                    occured in 389 baskets\r\n['Organic Strawberries', 'Organic Raspberries', 'Organic Hass Avocado']        occured in 381 baskets\r\n['Organic Strawberries', 'Banana', 'Organic Avocado']                          occured in 379 baskets\r\n['Organic Strawberries', 'Organic Baby Spinach', 'Banana']                     occured in 376 baskets\r\n['Bag of Organic Bananas', 'Organic Strawberries', 'Organic Blueberries']      occured in 374 baskets\r\n['Organic Baby Spinach', 'Banana', 'Large Lemon']                              occured in 371 baskets\r\n['Bag of Organic Bananas', 'Organic Cucumber', 'Organic Hass Avocado']         occured in 366 baskets\r\n['Organic Lemon', 'Bag of Organic Bananas', 'Organic Hass Avocado']            occured in 353 baskets\r\n['Banana', 'Limes', 'Organic Avocado']                                         occured in 352 baskets\r\n\r\n                                     antecedents                                      consequents\r\n\r\n0  (Bag of Organic Bananas, Organic Hass Avocado)                          (Organic Strawberries)\r\n1  (Bag of Organic Bananas, Organic Strawberries)                          (Organic Hass Avocado)\r\n2    (Organic Hass Avocado, Organic Strawberries)                        (Bag of Organic Bananas)\r\n3                        (Bag of Organic Bananas)    (Organic Hass Avocado, Organic Strawberries)\r\n4                          (Organic Hass Avocado)  (Bag of Organic Bananas, Organic Strawberries)\r\n5  
                        (Organic Strawberries)  (Bag of Organic Bananas, Organic Hass Avocado)\r\n\r\nPress any key to continue . . .<\/pre>\n<p>&nbsp;<\/p>\n<p>As you can see, these few lines of code answer the question of which product subsets are most frequent, and they also yield association rules between groups of products that frequently occur together. I encourage you to think about other interesting use cases where association analysis could provide you with profound insights.<\/p>\n<p>\u2014\u2014\u2014\u2014\u2014\u2014-<\/p>\n<p><strong>Reference for GUHA method:<\/strong><\/p>\n<p><a href=\"http:\/\/www.cs.cas.cz\/hajek\/guhabook\/guhabook.pdf\">Mechanizing Hypothesis Formation. Mathematical Foundations for a General Theory<\/a>; Petr H\u00e1jek, Tom\u00e1\u0161 Havr\u00e1nek, Springer-Verlag 1978 (ISBN 3-540-08738-9, 0-387-08738-9)[\/vc_column_text][\/vc_column][\/vc_row]<\/p>\n<p>[vc_row][vc_column width=&#8221;1\/4&#8243;][\/vc_column][vc_column width=&#8221;3\/4&#8243;][vc_btn title=&#8221;ALL BLOG ARTICLES&#8221; align=&#8221;center&#8221; link=&#8221;url:https%3A%2F%2Fwww.striped-giraffe.com%2Fen%2Fblog%2F|title:All blog articles&#8221;][\/vc_column][\/vc_row]<\/p>\n<\/section>","protected":false},"excerpt":{"rendered":"<p>[vc_row][vc_column width=&#8221;1\/3&#8243;][\/vc_column][vc_column width=&#8221;2\/3&#8243;][vc_column_text] We like having our ten stripes in a row and we ask ourselves who else does? 
&nbsp; [&hellip;]<\/p>\n","protected":false},"author":12,"featured_media":19999,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[61],"tags":[],"class_list":["post-19997","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","wpbf-post"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v20.5 (Yoast SEO v20.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Looking for Similarities With Association Rule Mining - Striped Giraffe<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Looking for Similarities With Association Rule Mining\" \/>\n<meta property=\"og:description\" content=\"[vc_row][vc_column width=&#8221;1\/3&#8243;][\/vc_column][vc_column width=&#8221;2\/3&#8243;][vc_column_text] We like having our ten stripes in a row and we ask ourselves who else does? 
&nbsp; [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\" \/>\n<meta property=\"og:site_name\" content=\"Striped Giraffe\" \/>\n<meta property=\"article:published_time\" content=\"2019-05-23T12:31:46+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-05-25T12:26:17+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/grocery-basket-analysis.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Striped Giraffe Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Striped Giraffe Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"15 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\"},\"author\":{\"name\":\"Striped Giraffe Team\",\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/person\/ec927d63d1ba2192e5547709e188c6d5\"},\"headline\":\"Looking for Similarities With Association Rule Mining\",\"datePublished\":\"2019-05-23T12:31:46+00:00\",\"dateModified\":\"2026-05-25T12:26:17+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\"},\"wordCount\":2005,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#organization\"},\"articleSection\":[\"Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\",\"url\":\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\",\"name\":\"Looking for Similarities With Association Rule Mining - Striped 
Giraffe\",\"isPartOf\":{\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#website\"},\"datePublished\":\"2019-05-23T12:31:46+00:00\",\"dateModified\":\"2026-05-25T12:26:17+00:00\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/\"]}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#website\",\"url\":\"https:\/\/www.striped-giraffe.com\/de\/\",\"name\":\"Striped Giraffe\",\"description\":\"Ihr zuverl\u00e4ssiger Anbieter f\u00fcr digitale Enterprise-L\u00f6sungen\",\"publisher\":{\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.striped-giraffe.com\/de\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#organization\",\"name\":\"Striped Giraffe\",\"url\":\"https:\/\/www.striped-giraffe.com\/de\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/giraffe_white.svg\",\"contentUrl\":\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/giraffe_white.svg\",\"caption\":\"Striped Giraffe\"},\"image\":{\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/person\/ec927d63d1ba2192e5547709e188c6d5\",\"name\":\"Striped Giraffe 
Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/avatar_giraffe_v2-96x96.jpg\",\"contentUrl\":\"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/avatar_giraffe_v2-96x96.jpg\",\"caption\":\"Striped Giraffe Team\"},\"url\":\"https:\/\/www.striped-giraffe.com\/en\/blog\/author\/striped-giraffe\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Looking for Similarities With Association Rule Mining - Striped Giraffe","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/","og_locale":"en_US","og_type":"article","og_title":"Looking for Similarities With Association Rule Mining","og_description":"[vc_row][vc_column width=&#8221;1\/3&#8243;][\/vc_column][vc_column width=&#8221;2\/3&#8243;][vc_column_text] We like having our ten stripes in a row and we ask ourselves who else does? &nbsp; [&hellip;]","og_url":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/","og_site_name":"Striped Giraffe","article_published_time":"2019-05-23T12:31:46+00:00","article_modified_time":"2026-05-25T12:26:17+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2019\/05\/grocery-basket-analysis.jpg","type":"image\/jpeg"}],"author":"Striped Giraffe Team","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Striped Giraffe Team","Est. 
reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/#article","isPartOf":{"@id":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/"},"author":{"name":"Striped Giraffe Team","@id":"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/person\/ec927d63d1ba2192e5547709e188c6d5"},"headline":"Looking for Similarities With Association Rule Mining","datePublished":"2019-05-23T12:31:46+00:00","dateModified":"2026-05-25T12:26:17+00:00","mainEntityOfPage":{"@id":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/"},"wordCount":2005,"commentCount":0,"publisher":{"@id":"https:\/\/www.striped-giraffe.com\/de\/#organization"},"articleSection":["Machine Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/","url":"https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/","name":"Looking for Similarities With Association Rule Mining - Striped Giraffe","isPartOf":{"@id":"https:\/\/www.striped-giraffe.com\/de\/#website"},"datePublished":"2019-05-23T12:31:46+00:00","dateModified":"2026-05-25T12:26:17+00:00","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.striped-giraffe.com\/en\/blog\/looking-for-similarities-with-association-rule-mining\/"]}]},{"@type":"WebSite","@id":"https:\/\/www.striped-giraffe.com\/de\/#website","url":"https:\/\/www.striped-giraffe.com\/de\/","name":"Striped Giraffe","description":"Ihr zuverl\u00e4ssiger Anbieter f\u00fcr digitale 
Enterprise-L\u00f6sungen","publisher":{"@id":"https:\/\/www.striped-giraffe.com\/de\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.striped-giraffe.com\/de\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.striped-giraffe.com\/de\/#organization","name":"Striped Giraffe","url":"https:\/\/www.striped-giraffe.com\/de\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/logo\/image\/","url":"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/giraffe_white.svg","contentUrl":"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/giraffe_white.svg","caption":"Striped Giraffe"},"image":{"@id":"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/person\/ec927d63d1ba2192e5547709e188c6d5","name":"Striped Giraffe Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.striped-giraffe.com\/de\/#\/schema\/person\/image\/","url":"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/avatar_giraffe_v2-96x96.jpg","contentUrl":"https:\/\/www.striped-giraffe.com\/wp-content\/uploads\/2021\/01\/avatar_giraffe_v2-96x96.jpg","caption":"Striped Giraffe 
Team"},"url":"https:\/\/www.striped-giraffe.com\/en\/blog\/author\/striped-giraffe\/"}]}},"_links":{"self":[{"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/posts\/19997","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/comments?post=19997"}],"version-history":[{"count":5,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/posts\/19997\/revisions"}],"predecessor-version":[{"id":43569,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/posts\/19997\/revisions\/43569"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/media\/19999"}],"wp:attachment":[{"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/media?parent=19997"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/categories?post=19997"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.striped-giraffe.com\/en\/wp-json\/wp\/v2\/tags?post=19997"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}