# Chi-squared Score

This is another statistical method, commonly used for **testing relationships between categorical variables**. It is therefore suited to categorical features and a categorical (typically binary) target, and the feature values should be non-negative: typically Booleans, frequencies, or counts.

It compares the observed distribution of the target variable within each category of a feature against the distribution you would expect if the feature and the target were independent.

**How do you calculate the chi-square statistic?**

**Let’s learn the use of chi-square with an intuitive example using the Titanic dataset.**

**1 — Count the Male and Female passengers in the Survived and Not Survived categories, then sum them**

The expected frequency comes from the combined total of Male and Female.
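As a rough sketch of step 1, the contingency table and its totals can be built in plain Python. The counts below are an assumption: they are the figures usually quoted for the Kaggle Titanic training split, not values taken from this article's table, and the variable names are illustrative.

```python
# Observed counts per sex and outcome.
# ASSUMPTION: figures commonly quoted for the Kaggle Titanic training split.
observed = {
    "male": {"survived": 109, "not_survived": 468},
    "female": {"survived": 233, "not_survived": 81},
}

# Total passengers per sex.
row_totals = {sex: sum(counts.values()) for sex, counts in observed.items()}

# Male + Female summed per outcome; this combined column drives the expected frequency.
col_totals = {
    outcome: sum(observed[sex][outcome] for sex in observed)
    for outcome in ("survived", "not_survived")
}

grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```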

**2 — Calculate the frequencies as observations / total in each column**

**3 — The expected frequency is shown in green, and we can clearly see that the observed Female and Male frequencies don’t match it.**

This is evidence against the hypothesis that males and females had equal survival rates.
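Steps 2 and 3 can be sketched the same way: divide each column by its own total and compare the Male and Female columns with the combined one. The counts are again assumed from the commonly quoted Kaggle Titanic training split, and `columns` and `rates` are illustrative names.

```python
# ASSUMPTION: counts commonly quoted for the Kaggle Titanic training split.
observed = {
    "male": {"survived": 109, "not_survived": 468},
    "female": {"survived": 233, "not_survived": 81},
}

# Combined (Male + Female) column, which gives the expected frequency.
totals = {
    outcome: sum(observed[sex][outcome] for sex in observed)
    for outcome in ("survived", "not_survived")
}
columns = {**observed, "total": totals}

# Frequency = observations / column total.
rates = {}
for name, counts in columns.items():
    column_total = sum(counts.values())
    rates[name] = {outcome: n / column_total for outcome, n in counts.items()}

for name, r in rates.items():
    print(f"{name:>6}: survived={r['survived']:.2f}, not_survived={r['not_survived']:.2f}")
```

With these assumed counts, the male column rounds to 0.19 and 0.81 against an expected 0.38 and 0.62, which matches the numbers used in the example arithmetic.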

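**4 — Sum (observed − expected)² / expected over all n cells, for example:**

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{(0.19 - 0.38)^2}{0.38} + \frac{(0.81 - 0.62)^2}{0.62} + \dots$$

where $O_i$ is the observed frequency and $E_i$ the expected frequency of cell $i$.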
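The sum in step 4 can be sketched in a few lines. The arithmetic above uses proportions to keep the idea readable; the usual test statistic applies the same (observed − expected)² / expected formula to raw counts, which is what this sketch does. The counts are assumed from the commonly quoted Kaggle Titanic training split, and the expected count for each cell is taken as row total × column total / grand total.

```python
# Rows: male, female. Columns: survived, not survived.
# ASSUMPTION: counts commonly quoted for the Kaggle Titanic training split.
observed = [
    [109, 468],
    [233, 81],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected count if sex and survival were independent.
        e = row_totals[i] * col_totals[j] / n
        chi2_stat += (o - e) ** 2 / e

print(round(chi2_stat, 2))
```

For these assumed counts the statistic comes out at roughly 263, with no continuity correction applied.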

**5 — Once you have this statistic, compare it against the known chi-square distribution (with the appropriate degrees of freedom) to see how likely such a value would be by chance**
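A minimal sketch of that comparison for a 2×2 table, which has (2 − 1) × (2 − 1) = 1 degree of freedom. With one degree of freedom, the upper-tail probability of the chi-square distribution reduces to `erfc(sqrt(x / 2))`, so the standard library is enough here. The helper name `chi2_p_value_df1` is hypothetical, and the statistic of 263.05 is an assumed input (roughly what the commonly quoted Kaggle Titanic counts give), not a figure from this article.

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Only valid for df = 1 (for example a 2x2 table).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# Sanity check: 3.841 is the familiar 5% critical value for df = 1.
p_critical = chi2_p_value_df1(3.841)

# ASSUMPTION: 263.05 stands in for the statistic computed in step 4.
p_titanic = chi2_p_value_df1(263.05)

print(f"p at 3.841:  {p_critical:.4f}")
print(f"p at 263.05: {p_titanic:.3e}")
```

A p-value this small means a gap as large as the one between male and female survival would be extremely unlikely if the two rates were equal. For tables with more degrees of freedom, a library routine such as `scipy.stats.chi2.sf` or `scipy.stats.chi2_contingency` is the more usual choice. Note that `chi2_contingency` applies a continuity correction to 2×2 tables by default, so its statistic will differ slightly.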

Best used for: categorical features that are Boolean, or frequencies and counts that are non-negative.