# 7. Bayes-Based Algorithms: Naïve and Gaussian

Bayes’ theorem (alternatively Bayes’ law or Bayes’ rule) describes the probability of an event based on prior knowledge of conditions that might be related to the event.

For example, if cancer is related to age, then, using Bayes’ theorem, a person’s age can be used to more accurately assess the probability that they have cancer, compared to the assessment of the probability of cancer made without knowledge of the person’s age.
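As a minimal sketch of this idea, the snippet below applies Bayes’ theorem, P(A | B) = P(B | A) × P(A) / P(B), to the cancer and age example. The function name `bayes_posterior` and all three input numbers are hypothetical, chosen only to illustrate the arithmetic; they are not real medical statistics.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence

# Hypothetical numbers, for illustration only (not real statistics)
p_cancer = 0.01               # P(A): prior probability of cancer
p_over65_given_cancer = 0.5   # P(B|A): probability of being over 65, given cancer
p_over65 = 0.2                # P(B): probability of being over 65

p_cancer_given_over65 = bayes_posterior(p_cancer, p_over65_given_cancer, p_over65)
print(round(p_cancer_given_over65, 4))
```

With these made-up inputs, the estimate given the person's age (0.025) is higher than the prior (0.01), which is the kind of update the theorem describes.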

Bayes-based models are built on Bayes’ theorem.

The following are two Bayes-based algorithms:

1. Naïve Bayes
2. Gaussian Naïve Bayes

# 1. Naïve Bayes

The name Naïve Bayes combines two words, Naïve and Bayes, which can be described as follows:

Naïve: It is called Naïve because it makes an assumption that may or may not turn out to be correct: that the occurrence of a certain feature is independent of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple. Each feature contributes individually to identifying it as an apple, without depending on the others.

Bayes: It is called Bayes because it depends on the principle of Bayes’ Theorem.

The Bayes Rule provides the formula for the probability of Y given X. But, in real-world problems, you typically have multiple X variables. When the features are independent, we can extend the Bayes Rule to what is called Naive Bayes.

Formula:
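In standard notation, with Y the class and X_1, …, X_n the features (the index symbols n and i are introduced here for illustration), and assuming the features are conditionally independent given Y, the rule can be written as:

$$
P(Y \mid X_1, \dots, X_n) = \frac{P(Y)\,\prod_{i=1}^{n} P(X_i \mid Y)}{P(X_1, \dots, X_n)}
$$

Because the denominator is the same for every class, comparing classes only requires the numerator.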

Understanding Naive Bayes Classifier:

Based on Bayes’ theorem, the Naive Bayes classifier gives the conditional probability of an event A given an event B. Let us use the following demo to understand the concept of a Naive Bayes classifier:

Shopping Example:

Problem statement: To predict, using a Naive Bayes classifier, whether a person will purchase a product given a specific combination of day, discount, and free delivery.

The day variable takes values such as weekday, weekend, and holiday. For any given day, check whether there is a discount and whether there is free delivery. Based on this information, we can predict whether a customer will buy the product.

Step 1: Convert the data set into a frequency table

Based on the dataset containing the three input types (day, discount, and free delivery), a frequency table is populated for each attribute.

For Bayes theorem, let the event ‘buy’ be A and the independent variables (discount, free delivery, and day) be B.
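Step 1 can be sketched in Python as follows. The six records below are hypothetical and stand in for the dataset, which is not reproduced here; `frequency_table` is an illustrative helper, not part of any library.

```python
from collections import Counter

# Hypothetical records: (day, discount, free_delivery, purchase)
records = [
    ("Weekday", "Yes", "Yes", "Yes"),
    ("Weekday", "No",  "No",  "No"),
    ("Weekend", "Yes", "Yes", "Yes"),
    ("Holiday", "Yes", "Yes", "Yes"),
    ("Holiday", "No",  "Yes", "Yes"),
    ("Weekend", "No",  "No",  "No"),
]

def frequency_table(rows, column):
    """Count how often each (attribute value, purchase outcome) pair occurs."""
    return Counter((row[column], row[-1]) for row in rows)

# Frequency table for the "day" attribute (column 0)
day_counts = frequency_table(records, 0)
for (value, outcome), count in sorted(day_counts.items()):
    print(f"day={value:8s} purchase={outcome:3s} count={count}")
```

The same helper can be called with column 1 or 2 to build the tables for discount and free delivery.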

Step 2: Create a Likelihood table by finding the probabilities

Let us calculate the likelihood for one of the attributes, “day”, which takes the values weekday, weekend, and holiday.
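A minimal sketch of this step, assuming a hypothetical frequency table for the day attribute (the counts below are invented for illustration and are not the counts from the dataset): each likelihood P(day | purchase outcome) is the count for that pair divided by the total count for that outcome.

```python
# Hypothetical counts, indexed as day_counts[outcome][day]
day_counts = {
    "Yes": {"Weekday": 9, "Weekend": 7, "Holiday": 8},
    "No":  {"Weekday": 2, "Weekend": 1, "Holiday": 3},
}

def likelihood_table(counts):
    """Convert counts into P(value | outcome) for each outcome."""
    table = {}
    for outcome, by_value in counts.items():
        total = sum(by_value.values())
        table[outcome] = {value: n / total for value, n in by_value.items()}
    return table

likelihoods = likelihood_table(day_counts)
for outcome, row in likelihoods.items():
    print(outcome, {day: round(p, 3) for day, p in row.items()})
```

Each row of the resulting table sums to 1, since it is a probability distribution over the day values for one outcome.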

The likelihood tables can then be used to estimate whether a customer will purchase a product for a specific combination of day, discount, and free delivery. Consider the following combination of factors, where B equals:

• Day = Holiday
• Discount = Yes
• Free Delivery = Yes

Let us find the probability of them not purchasing based on the conditions above.

A = No Purchase

Applying Bayes Theorem, we get P (A | B) as shown:

Similarly, let us find the probability of them purchasing a product under the conditions above.

A = Purchase

Applying Bayes Theorem, we get P (A | B) as shown:

From the two calculations above, we find that:

Probability of purchase = 0.986  |   Probability of no purchase = 0.178

These are the conditional values for purchase and no purchase under the given conditions.

Because the two values do not sum to 1, normalize them to get the likelihood of each outcome:

Sum of probabilities = 0.986 + 0.178 = 1.164

Likelihood of purchase = 0.986 / 1.164 = 84.71 %

Likelihood of no purchase = 0.178 / 1.164 = 15.29 %

Result: As 84.71 percent is greater than 15.29 percent, we can conclude that an average customer will buy on a holiday with a discount and free delivery.
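The normalization arithmetic above can be checked with a few lines of Python. The two input values are taken directly from the calculations above; the variable names are illustrative.

```python
# Values taken from the two calculations above
score_purchase = 0.986
score_no_purchase = 0.178

# Divide each value by the sum so that the two results add up to 1
total = score_purchase + score_no_purchase
p_purchase = score_purchase / total
p_no_purchase = score_no_purchase / total

print(f"Sum = {total:.3f}")                   # Sum = 1.164
print(f"Purchase: {p_purchase:.2%}")          # Purchase: 84.71%
print(f"No purchase: {p_no_purchase:.2%}")    # No purchase: 15.29%
```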

There are several types of Naive Bayes, one of which is Gaussian Naive Bayes.

1. Optimal Naive Bayes

This classifier chooses the class that has the greatest a posteriori probability of occurrence (so-called maximum a posteriori estimation, or MAP). As the name suggests, it is optimal, but going through all possible options is slow and time-consuming.
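The MAP decision rule can be sketched as follows. The priors and likelihoods are hypothetical, and `map_class` is an illustrative helper rather than a library function. The sketch relies on the naive independence assumption, so each class score is the prior multiplied by the product of the per-feature likelihoods.

```python
import math

# Hypothetical priors P(class) and likelihoods P(feature value | class)
priors = {"buy": 0.8, "no_buy": 0.2}
likelihoods = {
    "buy":    {"day=Holiday": 0.3, "discount=Yes": 0.8, "free_delivery=Yes": 0.9},
    "no_buy": {"day=Holiday": 0.5, "discount=Yes": 0.2, "free_delivery=Yes": 0.3},
}

def map_class(observed, priors, likelihoods):
    """Return the class with the largest prior * product of likelihoods."""
    scores = {
        cls: priors[cls] * math.prod(likelihoods[cls][f] for f in observed)
        for cls in priors
    }
    return max(scores, key=scores.get), scores

observed = ["day=Holiday", "discount=Yes", "free_delivery=Yes"]
best, scores = map_class(observed, priors, likelihoods)
print(best, scores)
```

The scores are unnormalized: the shared denominator is omitted because it does not change which class has the largest value.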

2. Gaussian Naive Bayes

Gaussian Naive Bayes is based on the Gaussian (normal) distribution: it assumes that the continuous features of each class are normally distributed. It significantly speeds up the search and, under some non-strict conditions, the error is only two times higher than in Optimal Bayes (that’s good!).
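A minimal from-scratch sketch of the Gaussian variant is shown below, assuming each feature follows a normal distribution within each class. The toy data, the function names, and the small constant added to the variance are all assumptions of this sketch, not part of any library.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant avoids division by zero (an assumption of this sketch)
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(y), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior + sum of log Gaussian densities."""
    def log_pdf(x, mean, var):
        return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
    scores = {
        label: math.log(prior)
        + sum(log_pdf(x, m, v) for x, m, v in zip(row, means, variances))
        for label, (prior, means, variances) in model.items()
    }
    return max(scores, key=scores.get)

# Hypothetical toy data: [height_cm, weight_kg]
X = [[150, 50], [155, 54], [160, 58], [180, 80], [185, 85], [190, 92]]
y = ["small", "small", "small", "large", "large", "large"]

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [158, 55]))  # small
print(predict_gaussian_nb(model, [183, 84]))  # large
```

Working with log probabilities instead of raw products is a common design choice here, since it avoids numerical underflow when many features are multiplied together.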

3. Multinomial Naive Bayes

It is usually applied to document classification problems. It bases its decisions on discrete features (integers), for example, on the frequency of the words present in the document.
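The word-frequency idea can be sketched in plain Python as follows. The four training documents and their labels are hypothetical, `predict` is an illustrative helper, and Laplace smoothing (the `alpha` parameter) is an added assumption so that words unseen in one class do not produce a zero probability.

```python
import math
from collections import Counter

# Hypothetical training documents with labels
docs = [
    ("free prize click now", "spam"),
    ("win free money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status meeting notes", "ham"),
]

# Count documents per class and word occurrences per class
class_docs = Counter(label for _, label in docs)
word_counts = {label: Counter() for label in class_docs}
for text, label in docs:
    word_counts[label].update(text.split())
vocab = {w for counts in word_counts.values() for w in counts}

def predict(text, alpha=1.0):
    """Score each class by log prior + sum of smoothed log word likelihoods."""
    scores = {}
    for label in class_docs:
        total = sum(word_counts[label].values())
        score = math.log(class_docs[label] / len(docs))
        for word in text.split():
            if word not in vocab:
                continue  # words outside the vocabulary are ignored in this sketch
            score += math.log(
                (word_counts[label][word] + alpha) / (total + alpha * len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("free money"))       # spam
print(predict("monday meeting"))   # ham
```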

4. Bernoulli Naive Bayes

Bernoulli Naive Bayes is similar to the multinomial type, but the predictors are Boolean variables. The features used to predict the class variable can therefore only take yes or no values, for example, whether a word occurs in the text or not.
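A minimal sketch of the Bernoulli variant, assuming two hypothetical Boolean features (whether a text contains the words "free" and "meeting"). The data, function names, and the Laplace smoothing term `alpha` are assumptions made for illustration.

```python
import math

# Hypothetical binary features: does the text contain ["free", "meeting"]?
X = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]]
y = ["spam", "spam", "spam", "ham", "ham", "ham"]

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate P(class) and P(feature = 1 | class) with Laplace smoothing."""
    model = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        n = len(rows)
        probs = [(sum(col) + alpha) / (n + 2 * alpha) for col in zip(*rows)]
        model[label] = (n / len(y), probs)
    return model

def predict_bernoulli_nb(model, row):
    """Both present (1) and absent (0) features contribute to the score."""
    scores = {}
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for x, p in zip(row, probs):
            score += math.log(p if x == 1 else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

model = fit_bernoulli_nb(X, y)
print(predict_bernoulli_nb(model, [1, 0]))  # spam
print(predict_bernoulli_nb(model, [0, 1]))  # ham
```

Unlike the multinomial type, the absence of a word is itself evidence here: a feature value of 0 contributes the term 1 − P(feature = 1 | class) to the score.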