Let’s learn the main methods from the scipy.stats module in Python.
In the last post, the Closer Look at Scipy Stats—Part 1, we learned about distributions, statistics and hypothesis tests with one sample.
Now, we will continue exploring this powerful module and look at a few more advanced functions available in the package.
In this post, we will cover statistical tests comparing two samples, bootstrapping, Monte Carlo simulations, and a couple of transformations using SciPy.
Let’s go.
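Comparing two samples is a common task for data scientists. In SciPy, we can use the t-test for two independent samples to check whether the samples have statistically similar means, that is, whether they could come from populations with the same average. Setting equal_var=False runs Welch's version of the test, which does not assume that the two populations have equal variances.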
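# Two-sample test: comparison of means
import scipy.stats as scs

# Sample 1: normal with mean 2 and standard deviation 5
samp1 = scs.norm.rvs(loc=2, scale=5, size=100, random_state=12)
# Sample 2: normal with mean 3 and standard deviation 3
samp2 = scs.norm.rvs(loc=3, scale=3, size=100, random_state=12)
# Hypothesis test (H0: both samples have the same mean)
scs.ttest_ind(samp1, samp2, equal_var=False)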
TtestResult(statistic=-2.1022782237188657, pvalue=0.03679301172995361, df=198.0)
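To read a result like this, we can compare the p-value against a significance level. The sketch below reuses the same two samples and assumes alpha = 0.05, a conventional choice that is not stated in the post. It also recomputes the Welch statistic by hand (difference of means divided by its standard error) and runs the pooled-variance version (equal_var=True) for comparison. The names ALPHA, welch, pooled and t_manual are illustrative only.

```python
import numpy as np
import scipy.stats as scs

ALPHA = 0.05  # assumed significance level, not taken from the post

# Same samples as in the example above
samp1 = scs.norm.rvs(loc=2, scale=5, size=100, random_state=12)
samp2 = scs.norm.rvs(loc=3, scale=3, size=100, random_state=12)

# Welch's t-test (unequal variances) and the pooled-variance t-test
welch = scs.ttest_ind(samp1, samp2, equal_var=False)
pooled = scs.ttest_ind(samp1, samp2, equal_var=True)

# Welch statistic by hand: difference of means over its standard error,
# using the unbiased sample variances (ddof=1)
se = np.sqrt(samp1.var(ddof=1) / samp1.size + samp2.var(ddof=1) / samp2.size)
t_manual = (samp1.mean() - samp2.mean()) / se

for name, res in (("Welch", welch), ("pooled", pooled)):
    decision = "reject H0" if res.pvalue < ALPHA else "fail to reject H0"
    print(f"{name}: t={res.statistic:.4f}, p={res.pvalue:.4f} -> {decision}")
print(f"manual Welch statistic: {t_manual:.4f}")
```

When both samples have the same size, as here, the two variants produce the same statistic. They differ in the degrees of freedom (n1 + n2 - 2 = 198 for the pooled version, and a smaller, usually non-integer value for Welch's), so their p-values differ slightly.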