Non-parametric significance testing with permutations

This example demonstrates how netneurotools can help perform null hypothesis significance testing (NHST) via non-parametric permutation testing. Many of the functions described here mirror functionality from the scipy.stats module, but use non-parametric methods for deriving significance (i.e., p-values).

One-sample permutation tests

Similar to a one-sample t-test, one-sample permutation tests are designed to estimate whether a group of values is different from some pre-specified null.

First, we’ll generate a random array with a mean and standard deviation of approximately five:

import numpy as np
np.random.seed(1234)
rvs = np.random.normal(loc=5, scale=5, size=(100, 2))

We can use scipy.stats for a standard parametric test to assess whether the array is different from zero:

import scipy.stats as sstats
print(sstats.ttest_1samp(rvs, 0.0))
TtestResult(statistic=array([12.69296682,  8.00422448]), pvalue=array([1.72977734e-22, 2.35081560e-12]), df=array([99, 99]))

We can do the same thing with permutations using netneurotools.stats:

from netneurotools import stats as nnstats
print(nnstats.permtest_1samp(rvs, 0.0))
(array([5.64544455, 4.27934625]), array([0.000999, 0.000999]))

Note that rather than returning a t-statistic alongside the p-values, the function returns the difference between each column mean and the null, alongside the two-sided p-value.

The permtest_1samp() function uses 1000 permutations by default to generate a null distribution. For each permutation it flips the signs of a random subset of entries in the array and recomputes the difference in means from the null population mean. The returned p-value reflects a two-sided test: the proportion of permutations in which the absolute permuted difference is at least as large as the absolute original difference.
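The sign-flipping procedure can be sketched in plain NumPy. The function name and defaults below are illustrative, not part of netneurotools:

```python
import numpy as np

def sign_flip_test(data, popmean=0.0, n_perm=1000, seed=1234):
    """Sketch of a one-sample sign-flipping permutation test.

    Illustrative only; netneurotools' permtest_1samp() additionally
    handles multi-column arrays and an axis parameter.
    """
    rng = np.random.default_rng(seed)
    centered = np.asarray(data) - popmean
    observed = centered.mean()              # observed difference in means
    count = 0
    for _ in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=centered.shape)
        permuted = (centered * flips).mean()
        if abs(permuted) >= abs(observed):  # two-sided comparison
            count += 1
    # +1 in numerator and denominator so p is never exactly zero
    return observed, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(1234)
sample = rng.normal(loc=5, scale=5, size=100)
diff, pval = sign_flip_test(sample)
print(diff, pval)
```

Flipping signs is a valid null here because, under the null hypothesis, the data are symmetrically distributed around the population mean, so each centered observation is equally likely to be positive or negative.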

Just like with scipy, we can test each column against an independent null mean:

print(nnstats.permtest_1samp(rvs, [5.0, 0.0]))
(array([0.64544455, 4.27934625]), array([0.15284715, 0.000999  ]))

We can also provide an axis parameter (by default, axis=0):

print(nnstats.permtest_1samp(rvs.T, [5.0, 0.0], axis=1))
(array([0.64544455, 4.27934625]), array([0.16783217, 0.000999  ]))

Finally, we can change the number of permutations we want to calculate (by default, n_perm=1000) and set a seed for reproducibility:

print(nnstats.permtest_1samp(rvs, 0.0, n_perm=500, seed=2222))
(array([5.64544455, 4.27934625]), array([0.00199601, 0.00199601]))

Note that the lowest p-value that can be obtained from a permutation test in netneurotools is equal to 1 / (n_perm + 1).
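This floor follows from the "+1" correction shown above, in which the observed statistic is counted as one of the permutations; a quick check of the formula:

```python
def min_pvalue(n_perm):
    # Even if no permuted statistic exceeds the observed one (count = 0),
    # the observed statistic itself counts, giving (0 + 1) / (n_perm + 1).
    return (0 + 1) / (n_perm + 1)

for n_perm in (100, 1000, 10000):
    print(n_perm, min_pvalue(n_perm))
```

So with the default n_perm=1000, no p-value can be smaller than 1/1001, which is why 0.000999 appears repeatedly in the outputs above.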

Permutation tests for correlations

Sometimes, rather than assessing differences in means, we want to quantify the strength of a relationship between two variables. While we would normally do this with a Pearson (or Spearman) correlation, we can derive the significance of that relationship via permutation tests.

First, we’ll generate two correlated variables:

x, y = nnstats.make_correlated_xy(corr=0.2, size=100)

We can generate the Pearson correlation with the standard parametric p-value:

print(sstats.pearsonr(x, y))
PearsonRResult(statistic=np.float64(0.19982034684460145), pvalue=np.float64(0.04623612509934489))

Or use permutation testing to derive the p-value:

print(nnstats.permtest_pearsonr(x, y))
(np.float64(0.19982034684460143), np.float64(0.028971028971028972))
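Tests of this kind typically shuffle one variable to break the x–y pairing and rebuild a null distribution of correlations. A minimal sketch of that idea (not the netneurotools implementation):

```python
import numpy as np

def shuffle_corr_test(x, y, n_perm=1000, seed=1234):
    """Sketch of a two-sided permutation test for a Pearson correlation."""
    rng = np.random.default_rng(seed)
    r_obs = np.corrcoef(x, y)[0, 1]          # observed correlation
    count = 0
    for _ in range(n_perm):
        y_shuf = rng.permutation(y)          # break the x-y pairing
        r_null = np.corrcoef(x, y_shuf)[0, 1]
        if abs(r_null) >= abs(r_obs):        # two-sided comparison
            count += 1
    # +1 correction keeps the p-value strictly positive
    return r_obs, (count + 1) / (n_perm + 1)

rng = np.random.default_rng(1234)
x = rng.normal(size=100)
y = 0.9 * x + rng.normal(size=100)           # strongly correlated pair
r, p = shuffle_corr_test(x, y)
print(r, p)
```

Shuffling y preserves both marginal distributions while destroying any dependence between the variables, which is exactly the null hypothesis of zero correlation.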

All the same arguments as with permtest_1samp() and permtest_rel() apply here, so you can provide same-sized two-dimensional arrays and correlations will be calculated only between paired columns:

a, b = nnstats.make_correlated_xy(corr=0.9, size=100)
arr1, arr2 = np.column_stack([x, a]), np.column_stack([y, b])
print(nnstats.permtest_pearsonr(arr1, arr2))
(array([0.19982035, 0.89939585]), array([0.02897103, 0.000999  ]))

Or you can change the number of permutations and set a seed for reproducibility:

print(nnstats.permtest_pearsonr(arr1, arr2, n_perm=500, seed=2222))
(array([0.19982035, 0.89939585]), array([0.0499002 , 0.00199601]))

Note that currently the axis parameter does not apply to permtest_pearsonr().

Total running time of the script: (0 minutes 0.713 seconds)
