Correlation



You have probably come across situations where two variables are related to each other, such as demand and supply, height and weight, or the share prices of stocks A and B.

In statistics, there is a technique for measuring how strongly or weakly two variables are associated. This technique is called correlation. Going a step further, we can use the relationship to predict one variable when we know the observed value of the other. This is known as regression analysis.

In this article, you will learn what correlation is and how to perform a correlation test.


Definition

The correlation coefficient measures the degree of association between two variables. It always lies between -1 and 1.

A negative correlation means the two variables move in opposite directions. That is, as one variable increases, the other decreases, and vice versa.

A positive correlation means the two variables move in the same direction. That is, as one variable increases, the other also increases, and vice versa.


Types of correlation

There are several types of correlation coefficients for measuring the degree of association. Which one to use depends on the kind of data: measurement data, ordinal data, or categorical data.

  1. Pearson correlation coefficient
    1. The Pearson coefficient of correlation measures the extent of the linear relationship between two variables x and y. We denote the correlation coefficient by r.
    2. If the absolute value of r, | r |, is close to 1, there is a strong correlation between the two variables. If | r | is close to 0, there is a weak correlation between them.
  2. Intra-class correlation
    1. When the observations are organized into groups (that is, the grouping variable is categorical), we calculate the intra-class correlation to check how strongly units in the same group resemble each other.
  3. Rank correlation
    1. Rank correlation measures the relationship between the rankings of two variables. It measures the strength of the ordinal association of two variables.
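
To see how a rank correlation can differ from the Pearson coefficient, here is a minimal sketch in Python. It assumes one common rank correlation, Spearman's (which this article does not name), computed as the Pearson coefficient of the ranks, and it assumes there are no tied values. The function names and the data are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) to n. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def rank_correlation(x, y):
    """Spearman-style rank correlation: Pearson r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

linear_r = pearson_r(x, y)
rank_r = rank_correlation(x, y)
print(round(linear_r, 3), round(rank_r, 3))
```

Because y always increases with x, the rankings agree perfectly and the rank correlation is 1, while the Pearson coefficient stays below 1 because the relationship is not linear.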


Pearson correlation test

We can find the Pearson correlation coefficient (r) for the sample using the following formula.

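r = Σ(xᵢ − x̄)(yᵢ − ȳ) / √[ Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)² ]

Here x̄ and ȳ are the sample means of x and y, and each sum runs over all n pairs of observations.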
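
The calculation of the sample coefficient can be sketched in a few lines of Python. This is a minimal illustration that divides the sum of products of deviations from the means by the square root of the product of the summed squared deviations. The function name pearson_r and both data sets are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient r for paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: taller people tend to weigh more (positive r).
height_cm = [150, 160, 165, 172, 180]
weight_kg = [52, 58, 63, 70, 79]

# Hypothetical data: demand falls as price rises (negative r).
price = [10, 12, 14, 16, 18]
demand = [95, 80, 72, 60, 41]

r_pos = pearson_r(height_cm, weight_kg)
r_neg = pearson_r(price, demand)
print(round(r_pos, 3), round(r_neg, 3))
```

The sign of the result shows the direction of the relationship, and its absolute value shows the strength.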

If the population correlation coefficient (ρ) is significantly different from zero, this indicates that there is a linear relationship between the two variables x and y, and the closer | ρ | is to 1, the stronger that relationship is. If there is a strong linear relationship between the two variables, then we can use regression analysis.

The Pearson correlation test is a special case of hypothesis testing. The procedure for carrying out the test is as follows.


Procedure

The null and alternative hypotheses are as follows.

H0: The correlation coefficient ρ is not significant (ρ = 0).

Vs

H1: The correlation coefficient ρ is significant (ρ ≠ 0), OR the correlation coefficient ρ is significantly positive (ρ > 0), OR the correlation coefficient ρ is significantly negative (ρ < 0).

Once we have calculated the sample correlation coefficient r, we can test the claim about ρ either with the traditional method of computing a test statistic or with the Pearson correlation table of critical values. In this article, we use the Pearson correlation table of critical values. The procedure is as follows.

  1. Find the tail of the test. We decide the tail of the test based on the alternative hypothesis.
    1. If the alternative hypothesis contains the ">" sign, then it is a right-tailed test (one-tailed test).
    2. If the alternative hypothesis contains the "<" sign, then it is a left-tailed test (one-tailed test).
    3. If the alternative hypothesis contains the "≠" sign, then it is a two-tailed test.
  2. Find the degrees of freedom: df = n - 2, where n is the number of paired observations.
  3. Choose the significance level (α).
  4. Look up the critical value in the body of the table corresponding to the degrees of freedom, the significance level, and the tail of the test.
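
When a printed table is not at hand, the critical value can also be computed. The sketch below is one possible approach and not the method this article uses. It assumes the data come from a bivariate normal population, in which case the density of r under ρ = 0 is proportional to (1 − r²)^((df − 2)/2). It integrates the upper tail numerically with Simpson's rule and finds the cutoff by bisection. The function names are hypothetical.

```python
import math

def upper_tail_prob(c, df, steps=2000):
    """P(r > c) when rho = 0, assuming bivariate normal data.

    With the substitution r = sin(theta), the null density of r becomes
    proportional to cos(theta) ** (df - 1), which is easy to integrate.
    """
    lo, hi = math.asin(c), math.pi / 2
    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * max(math.cos(lo + i * h), 0.0) ** (df - 1)
    integral = total * h / 3
    # Normalising constant: the beta function B(1/2, df/2).
    log_norm = math.lgamma(0.5) + math.lgamma(df / 2) - math.lgamma((df + 1) / 2)
    return integral / math.exp(log_norm)

def critical_r(n, alpha=0.05, two_tailed=True):
    """Positive critical value of r for n pairs at significance level alpha."""
    df = n - 2
    if df < 1:
        raise ValueError("need at least 3 pairs of observations")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    tail = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on the cutoff
        mid = (lo + hi) / 2
        if upper_tail_prob(mid, df) > tail:
            lo = mid  # tail still too heavy, move the cutoff up
        else:
            hi = mid
    return (lo + hi) / 2

two_sided = critical_r(10, alpha=0.05, two_tailed=True)
one_sided = critical_r(10, alpha=0.05, two_tailed=False)
print(round(two_sided, 3), round(one_sided, 3))
```

For n = 10 (df = 8) and α = 0.05, this gives roughly 0.632 for a two-tailed test and roughly 0.549 for a one-tailed test, which can be compared against the corresponding table entries.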


Decision rule

After finding the critical value, we use the decision rule to reach a conclusion. The decision rule is as follows.

    1. Two-tailed test: If | r | > critical value, we reject the null hypothesis and conclude that the correlation coefficient ρ is significant; otherwise, we fail to reject the null hypothesis and conclude that ρ is not significant.
    2. Right-tailed test: If r > critical value, we reject the null hypothesis and conclude that the correlation coefficient ρ is significantly positive; otherwise, we fail to reject the null hypothesis and conclude that ρ is not significantly positive.
    3. Left-tailed test: If -r > critical value (equivalently, r < -critical value), we reject the null hypothesis and conclude that the correlation coefficient ρ is significantly negative; otherwise, we fail to reject the null hypothesis and conclude that ρ is not significantly negative.
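
The three rules above can be written as a small Python function. This is only a sketch: the function name reject_null and the tail labels "two", "right", and "left" are hypothetical, and the critical values in the usage lines are illustrative inputs that would normally be read from the table for the chosen df and α.

```python
def reject_null(r, critical_value, tail="two"):
    """Return True if H0 (rho = 0) is rejected under the decision rule.

    r              -- sample correlation coefficient
    critical_value -- positive critical value for the chosen df, alpha, and tail
    tail           -- "two", "right", or "left"
    """
    if tail == "two":
        return abs(r) > critical_value
    if tail == "right":
        return r > critical_value
    if tail == "left":
        return -r > critical_value
    raise ValueError("tail must be 'two', 'right', or 'left'")

# Illustrative inputs: r = -0.71 from n = 10 pairs (df = 8), alpha = 0.05.
print(reject_null(-0.71, 0.632, tail="two"))    # | r | exceeds the cutoff
print(reject_null(-0.71, 0.549, tail="right"))  # r is negative, so no rejection
print(reject_null(-0.71, 0.549, tail="left"))   # -r exceeds the cutoff
```

Note that the same sample value of r can lead to different conclusions depending on the alternative hypothesis, which is why the tail of the test is fixed before looking up the critical value.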

Refer to the following table of Pearson critical values.

Pearson Correlation Table