Non-parametric significance testing with permutations
This example demonstrates how netneurotools can help perform null
hypothesis significance testing (NHST) via non-parametric permutation testing.
Many of the functions described here mirror functionality from the
scipy.stats toolbox, but use non-parametric methods for deriving
significance (i.e., p-values).
One-sample permutation tests
Similar to a one-sample t-test, one-sample permutation tests are designed to assess whether a group of values differs from some pre-specified null value.
First, we’ll generate a random array with a mean and standard deviation of approximately five:
import numpy as np
np.random.seed(1234)
rvs = np.random.normal(loc=5, scale=5, size=(100, 2))
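A quick sanity check (not part of the example's original code) confirms the sample statistics are close to five:

print(rvs.mean(), rvs.std())  # both should be approximately 5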
We can use scipy.stats for a standard parametric test to assess whether
the array is different from zero:
import scipy.stats as sstats
print(sstats.ttest_1samp(rvs, 0.0))
TtestResult(statistic=array([12.69296682, 8.00422448]), pvalue=array([1.72977734e-22, 2.35081560e-12]), df=array([99, 99]))
We can do the same thing with permutations using netneurotools.stats:
from netneurotools import stats as nnstats
print(nnstats.permtest_1samp(rvs, 0.0))
(array([5.64544455, 4.27934625]), array([0.000999, 0.000999]))
Note that rather than returning a t-statistic alongside the p-values, the function returns, for each column, the difference between the sample mean and the null mean, alongside the two-sided p-value.
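Since the statistic is simply the difference in means, we can verify it directly against the column means (a quick check, not part of the library's API):

# the reported statistic is just the column-wise sample mean minus the null (0.0)
print(rvs.mean(axis=0) - 0.0)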
The permtest_1samp() function uses 1000 permutations to generate a
null distribution. For each permutation it flips the signs of a random number
of entries in the array and recomputes the difference in means from the null
population mean. The returned p-value assesses the two-sided test of whether
the absolute value of the original difference in means is greater than the
absolute values of the permuted differences.
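To make that procedure concrete, here is a minimal from-scratch sketch of a sign-flipping permutation test. The function name permtest_1samp_sketch is hypothetical, and details such as how the random signs are drawn may differ from the library's actual implementation:

def permtest_1samp_sketch(data, popmean=0.0, n_perm=1000, seed=1234):
    # hypothetical sketch; not the netneurotools implementation
    rng = np.random.default_rng(seed)
    diff = data.mean(axis=0) - popmean           # observed difference in means
    centered = data - popmean                    # center the data on the null mean
    count = np.zeros(diff.shape)
    for _ in range(n_perm):
        signs = rng.choice([-1, 1], size=data.shape)   # randomly flip signs
        perm_diff = (signs * centered).mean(axis=0)    # permuted difference in means
        count += np.abs(perm_diff) >= np.abs(diff)
    # add one to numerator and denominator so p-values can never be zero
    return diff, (count + 1) / (n_perm + 1)

print(permtest_1samp_sketch(rvs))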
Just like with scipy, we can test each column against an independent
null mean:
print(nnstats.permtest_1samp(rvs, [5.0, 0.0]))
(array([0.64544455, 4.27934625]), array([0.15284715, 0.000999 ]))
We can also provide an axis parameter (by default, axis=0):
print(nnstats.permtest_1samp(rvs.T, [5.0, 0.0], axis=1))
(array([0.64544455, 4.27934625]), array([0.16783217, 0.000999 ]))
Finally, we can change the number of permutations we want to calculate (by default, n_perm=1000) and set a seed for reproducibility:
print(nnstats.permtest_1samp(rvs, 0.0, n_perm=500, seed=2222))
(array([5.64544455, 4.27934625]), array([0.00199601, 0.00199601]))
Note that the lowest p-value that can be obtained from a permutation test in
netneurotools is equal to 1 / (n_perm + 1).
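With the default of 1000 permutations, that floor works out to the value seen in the outputs above:

print(1 / (1000 + 1))  # ~0.000999, the smallest attainable p-value at n_perm=1000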
Permutation tests for correlations
Sometimes, rather than assessing differences in means, we want to assess the strength of a relationship between two variables. While we might normally do this with a Pearson (or Spearman) correlation, we can also assess the significance of that relationship via permutation tests.
First, we’ll generate two correlated variables:
x, y = nnstats.make_correlated_xy(corr=0.2, size=100)
We can generate the Pearson correlation with the standard parametric p-value:
print(sstats.pearsonr(x, y))
PearsonRResult(statistic=np.float64(0.19982034684460143), pvalue=np.float64(0.04623612509934489))
Or use permutation testing to derive the p-value:
print(nnstats.permtest_pearsonr(x, y))
(np.float64(0.1998203468446014), np.float64(0.028971028971028972))
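Conceptually, the null distribution here comes from shuffling one of the variables to break the pairing between observations. Below is a minimal from-scratch sketch of that idea; the function name permtest_pearsonr_sketch is hypothetical, and the library's exact permutation scheme may differ:

def permtest_pearsonr_sketch(x, y, n_perm=1000, seed=1234):
    # hypothetical sketch; not the netneurotools implementation
    rng = np.random.default_rng(seed)
    r = sstats.pearsonr(x, y)[0]                 # observed correlation
    count = 0
    for _ in range(n_perm):
        # shuffle y to destroy any true pairing with x
        perm_r = sstats.pearsonr(x, rng.permutation(y))[0]
        count += abs(perm_r) >= abs(r)
    # add one to numerator and denominator so p-values can never be zero
    return r, (count + 1) / (n_perm + 1)

print(permtest_pearsonr_sketch(x, y))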
All the same arguments as with permtest_1samp() and
permtest_rel() apply here, so you can provide same-sized arrays and
correlations will be calculated only for paired columns:
a, b = nnstats.make_correlated_xy(corr=0.9, size=100)
arr1, arr2 = np.column_stack([x, a]), np.column_stack([y, b])
print(nnstats.permtest_pearsonr(arr1, arr2))
(array([0.19982035, 0.89939585]), array([0.02897103, 0.000999 ]))
Or you can change the number of permutations and set a seed for reproducibility:
print(nnstats.permtest_pearsonr(arr1, arr2, n_perm=500, seed=2222))
(array([0.19982035, 0.89939585]), array([0.0499002 , 0.00199601]))
Note that currently the axis parameter does not apply to
permtest_pearsonr().