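The formula of the Spearman coefficient is analogous to Pearson’s coefficient, but it uses the **ranks** of the values in each variable instead of the values themselves. It is usually denoted by the Greek letter ρ (rho). I will use the letter **s** to write it in Latin characters. The following formula gives the Spearman correlation coefficient:

$$
s = \frac{\sum_{i=1}^{n} (U_i - \bar{U})(V_i - \bar{V})}{\sqrt{\sum_{i=1}^{n} (U_i - \bar{U})^2}\,\sqrt{\sum_{i=1}^{n} (V_i - \bar{V})^2}}
$$

Here U and V are the ranks of x and y, the bars denote their mean values, and n is the number of observations.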

When we compare this formula with Pearson’s correlation coefficient **r**, we see that it only replaces the values of x and y with their ranks U and V. One could say that the Spearman coefficient is Pearson’s coefficient computed on the ranks! That’s why it is called Spearman’s rank-correlation coefficient. Also, because it is computed from the ranks and not the values, it is classified as *nonparametric*.
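To make the "Pearson on ranks" idea concrete, here is a minimal sketch in plain Python. The helper names (`average_ranks`, `pearson`, `spearman`) and the sample data are my own illustrations, not taken from the tables in this article, and I assume tied values share the average of their positions, which is a common convention.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson's r of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's s: Pearson's r applied to the ranks U and V."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically, but not linearly, with x.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

r = pearson(x, y)
s = spearman(x, y)
print(f"Pearson r = {r:.4f}, Spearman s = {s:.4f}")
```

Because y always increases with x, the two rank lists are identical and s comes out as 1, while r stays below 1 because the relationship is not a straight line.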

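As with Pearson’s coefficient, the p-value is calculated from the t-distribution, with the t-value given by the following formula:

$$
t = s \sqrt{\frac{n - 2}{1 - s^2}}
$$

where n is the number of observations and the t-distribution has n − 2 degrees of freedom.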
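The t-value and the resulting p-value can be sketched in a few lines. This is only an illustration under my own assumptions: the function names are hypothetical, I take n − 2 degrees of freedom, and I integrate the t-density numerically with Simpson's rule to keep the example free of external libraries. The numbers s = 0.6 and n = 10 are made up.

```python
from math import gamma, pi, sqrt


def t_value(s, n):
    """t statistic for a correlation coefficient s from n paired observations."""
    if abs(s) >= 1:
        raise ValueError("t is unbounded when |s| = 1")
    return s * sqrt((n - 2) / (1 - s * s))


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area of the density between -|t| and |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    half_area = total * h / 3  # area between 0 and |t|
    return max(0.0, 1 - 2 * half_area)


# Hypothetical numbers: s = 0.6 computed from n = 10 pairs.
s, n = 0.6, 10
t = t_value(s, n)
p = two_sided_p(t, df=n - 2)
print(f"t = {t:.3f}, p = {p:.3f}")
```

Note that the formula breaks down when s is exactly ±1, since the denominator becomes zero; the sketch raises an error in that case. For large samples `math.gamma` overflows, so in practice one would rely on a statistics library instead.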

Table 3 shows the ranks U and V of the variables x and y in Table 2.

In this case, the Spearman coefficient is exactly 1, indicating a perfect correlation between the ranks of x and y.

Now comes the question: when do we use ranks (Spearman), and when do we use the values (Pearson)? We can summarize the answer in the following two situations:

(1) When we expect that the values of the two variables in question don’t have outliers or significant errors, we should choose Pearson’s coefficient.

(2) We use the Spearman coefficient when we don’t care about the values themselves and only need to know the direction of the relationship between the two variables, or when there is a high likelihood of outliers and errors.

Pearson’s coefficient is usually a good choice for measurements originating from physical systems and for variables where the values matter. On the other hand, data from social studies, for example from questionnaires where we ask respondents to give ranked answers, are good candidates for Spearman’s coefficient.
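The effect of an outlier on the two coefficients can be illustrated with a small experiment. The data below are invented for this sketch: a nearly linear series in which the last reading is recorded as 160.0 instead of 16.0, as if a decimal point had slipped. The helper names are hypothetical, and the simple `ranks` function assumes there are no tied values.

```python
from math import sqrt


def pearson(a, b):
    """Pearson's r of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def ranks(values):
    """1-based ranks for a sequence with no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman's s as Pearson's r on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical measurements: y rises almost linearly with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_clean = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.0]
# Same series, but the last reading is a gross error (160.0 instead of 16.0).
y_outlier = y_clean[:-1] + [160.0]

for label, y in (("clean", y_clean), ("outlier", y_outlier)):
    print(f"{label:8s} Pearson r = {pearson(x, y):.3f}, Spearman s = {spearman(x, y):.3f}")
```

The erroneous value is still the largest in the series, so the ranks do not change and s is unaffected, whereas r drops noticeably. An outlier that also changes the ordering would move s as well, only usually by less than it moves r.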