spectralbrain.statistics.analysis

Statistical analysis toolkit for spectral morphometry.

Covers the full analytical pipeline from vertex-wise group comparison to connectome-level network analysis, including dimension-collapsing methods for converting per-vertex descriptors into per-shape global vectors.

Sections

§1 Vertex-wise group comparison (t-test, Mann-Whitney, TFCE, FDR, permutation)
§2 Effect sizes (Cohen’s d, Hedges’ g; vertex-wise maps)
§3 Vertex-wise correlation with clinical scores
§4 Surprise / anomaly maps (z-score against normative)
§5 Classification (SVM, LogReg + CV)
§6 Dimension collapsing (Fisher vectors, Bag-of-Spectral-Words, kernel mean embedding)
§7 Dissimilarity measures (EMD, KL, JS divergence, energy distance)
§8 Representational Similarity Analysis (RSA)
§9 Connectome & network analysis (modularity, participation, NBS, Mantel)
§10 Asymmetry analysis (lateralisation indices)
§11 Dimensionality reduction (PCA, MDS, UMAP)

Functions

asymmetry_test(left, right, *[, test])

Test whether L and R descriptors differ significantly.

bag_of_spectral_words(descriptor, codebook, *)

Bag-of-Words encoding of per-vertex descriptors.

classify(features, labels, *[, model, ...])

Cross-validated classification with feature importance.

cohens_d_map(group_a, group_b)

Vertex-wise Cohen's d effect-size map.

emd_distance(a, b)

Earth Mover's Distance (1D Wasserstein) between distributions.

energy_distance(a, b)

Energy distance between two multivariate samples.

fisher_vector(descriptor, gmm_means, ...)

Fisher vector encoding of per-vertex descriptors.

fit_gmm_codebook(all_descriptors[, ...])

Fit a GMM codebook on pooled descriptors from a population.

hedges_g_map(group_a, group_b)

Vertex-wise Hedges' g (bias-corrected Cohen's d).

intra_inter_ratio(connectome, community_labels)

Intra- vs inter-community connectivity ratio.

js_divergence(a, b, **kwargs)

Jensen-Shannon divergence (symmetric, bounded [0, ln2]).

kernel_mean_embedding(descriptor, *[, ...])

Kernel mean embedding of a descriptor distribution.

kl_divergence(a, b, *[, bins])

KL divergence estimated via histogram binning.

lateralisation_index(left, right)

Lateralisation Index: LI = (L - R) / (|L| + |R|).

mantel_test(matrix_a, matrix_b, *[, ...])

Mantel test — correlation between two distance matrices.

modularity(connectome, community_labels, *)

Newman's modularity Q for a given community partition.

participation_coefficient(connectome, ...)

Participation coefficient per node.

rdm(features, *[, metric])

Representational Dissimilarity Matrix.

rsa_compare(rdm_a, rdm_b, *[, method, ...])

Compare two RDMs via Representational Similarity Analysis.

spectral_mds(distance_matrix[, ...])

Classical MDS embedding from a distance matrix.

spectral_pca(features[, n_components])

PCA on spectral features.

spectral_umap(features[, n_components, ...])

UMAP embedding of spectral features.

surprise_map(subject_descriptor, ...)

Z-score anomaly map against a normative distribution.

surprise_map_percentile(subject_descriptor, ...)

Percentile-based anomaly map.

tfce(statistic_map, adjacency, *[, E, H, ...])

Threshold-Free Cluster Enhancement (Smith & Nichols 2009).

vertexwise_correlation(descriptors, scores, *)

Correlate a per-vertex descriptor with a clinical score.

vertexwise_mannwhitney(group_a, group_b, *)

Non-parametric Mann-Whitney U test at each vertex.

vertexwise_permutation(group_a, group_b, *)

Permutation test at each vertex with multiple-comparison control.

vertexwise_ttest(group_a, group_b, *[, ...])

Independent two-sample t-test at each vertex (vectorised).

Classes

ClassificationResult(accuracy, accuracy_std, ...)

Output of a classification analysis.

VertexWiseResult(statistic, p_values, ...[, ...])

Results of a vertex-wise statistical test.

class spectralbrain.statistics.analysis.ClassificationResult(accuracy, accuracy_std, auc, auc_std, feature_importance, confusion_matrix, model_name)

Bases: object

Output of a classification analysis.

Parameters:
accuracy: float
accuracy_std: float
auc: float
auc_std: float
confusion_matrix: ndarray | None
feature_importance: ndarray | None
model_name: str
class spectralbrain.statistics.analysis.VertexWiseResult(statistic, p_values, p_corrected, correction, significant, alpha, effect_size=None)

Bases: object

Results of a vertex-wise statistical test.

Parameters:
alpha: float
correction: str
effect_size: ndarray | None = None
property n_significant: int

Return the count of significant results after correction.

p_corrected: ndarray
p_values: ndarray
significant: ndarray
statistic: ndarray
spectralbrain.statistics.analysis.asymmetry_test(left, right, *, test='wilcoxon')

Test whether L and R descriptors differ significantly.

Parameters:
  • left (ndarray, shape (S,)) – Per-subject global descriptors for the left hemisphere.

  • right (ndarray, shape (S,)) – Per-subject global descriptors for the right hemisphere.

  • test (str)

Returns:

statistic, p_value (float)

Return type:

tuple[float, float]

spectralbrain.statistics.analysis.bag_of_spectral_words(descriptor, codebook, *, soft=True, sigma=None)

Bag-of-Words encoding of per-vertex descriptors.

Parameters:
  • descriptor (ndarray, shape (N, T))

  • codebook (ndarray, shape (K, T)) – Cluster centres (from k-means on pooled descriptors).

  • soft (bool) – Soft assignment (Gaussian weighted) vs hard assignment.

  • sigma (float, optional) – Bandwidth for soft assignment. None = auto.

Returns:

ndarray, shape (K,) – Normalised histogram over codebook words.

Return type:

ndarray[tuple[Any, …], dtype[floating]]
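The soft and hard assignment modes described above can be sketched in plain NumPy. This is an illustration only, not the library's implementation: the function name `bosw_sketch` and the automatic bandwidth rule (root of the median squared distance) are assumptions, since the documentation does not say how `sigma=None` is resolved.

```python
import numpy as np

def bosw_sketch(descriptor, codebook, soft=True, sigma=None):
    """Illustrative Bag-of-Spectral-Words encoding (hypothetical helper)."""
    # Squared Euclidean distance from every vertex (N) to every word (K).
    d2 = ((descriptor[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    if soft:
        if sigma is None:
            # Assumed "auto" rule: root of the median squared distance.
            sigma = np.sqrt(np.median(d2)) + 1e-12
        # Gaussian weights, shifted by the row minimum for numerical stability.
        w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
        w /= w.sum(axis=1, keepdims=True)
        hist = w.sum(axis=0)
    else:
        # Hard assignment: each vertex votes for its nearest word.
        hist = np.bincount(d2.argmin(axis=1), minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 3))        # K = 4 words, T = 3 scales
descriptor = rng.normal(size=(200, 3))    # N = 200 vertices
h_soft = bosw_sketch(descriptor, codebook)
h_hard = bosw_sketch(descriptor, codebook, soft=False)
```

Either way the result has length K regardless of the number of vertices, which is what makes it usable as a per-shape feature vector.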

spectralbrain.statistics.analysis.classify(features, labels, *, model='svm', n_folds=5, seed=42)

Cross-validated classification with feature importance.

Parameters:
  • features (ndarray, shape (S, d))

  • labels (ndarray, shape (S,))

  • model (str)

  • n_folds (int)

  • seed (int)

Returns:

ClassificationResult

Return type:

ClassificationResult

spectralbrain.statistics.analysis.cohens_d_map(group_a, group_b)

Vertex-wise Cohen’s d effect-size map.

Parameters:
  • group_a (ndarray, shape (n, N))

  • group_b (ndarray, shape (n, N))

Returns:

ndarray, shape (N,)

Return type:

ndarray[tuple[Any, …], dtype[floating]]

spectralbrain.statistics.analysis.emd_distance(a, b)

Earth Mover’s Distance (1D Wasserstein) between distributions.

For multi-dimensional descriptors, averages across columns.

Parameters:
  • a (ndarray, shape (N,) or (N, T))

  • b (ndarray, shape (N,) or (N, T))

Returns:

float

Return type:

float

spectralbrain.statistics.analysis.energy_distance(a, b)

Energy distance between two multivariate samples.

Parameters:
  • a (ndarray, shape (N_a, d))

  • b (ndarray, shape (N_b, d))

Returns:

float

Return type:

float

spectralbrain.statistics.analysis.fisher_vector(descriptor, gmm_means, gmm_covs, gmm_weights)

Fisher vector encoding of per-vertex descriptors.

Projects a per-vertex descriptor distribution onto the gradient of a Gaussian Mixture Model, producing a fixed-length global vector regardless of the number of vertices.

Parameters:
  • descriptor (ndarray, shape (N, T)) – Per-vertex descriptor matrix.

  • gmm_means (ndarray, shape (K, T)) – GMM component means.

  • gmm_covs (ndarray, shape (K, T)) – GMM diagonal covariances.

  • gmm_weights (ndarray, shape (K,)) – GMM component weights (sum to 1).

Returns:

ndarray, shape (2·K·T,) – Fisher vector (concatenation of first and second order gradient statistics).

Return type:

ndarray[tuple[Any, …], dtype[floating]]

References

Perronnin F, Dance C. Fisher kernels on visual vocabularies for image categorization. CVPR 2007. Sánchez J, Perronnin F, Mensink T, Verbeek J. Image classification with the Fisher vector. IJCV 105(3):222–245, 2013.
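To make the "gradient of a Gaussian Mixture Model" concrete, here is a minimal NumPy sketch of the first- and second-order statistics in the form given by Sánchez et al. (2013) for diagonal covariances. The function name `fisher_vector_sketch` is hypothetical, and whether the library applies further normalisation (for example power or L2 normalisation) is not stated in the documentation, so none is applied here.

```python
import numpy as np

def fisher_vector_sketch(descriptor, means, covs, weights):
    """Illustrative Fisher vector for a diagonal-covariance GMM."""
    n = descriptor.shape[0]
    diff = descriptor[:, None, :] - means[None, :, :]          # (N, K, T)
    # Log-density of each vertex under each diagonal Gaussian component.
    log_prob = -0.5 * ((diff ** 2 / covs).sum(axis=2)
                       + np.log(2.0 * np.pi * covs).sum(axis=1))
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)            # stabilise exp
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                    # responsibilities (N, K)
    z = diff / np.sqrt(covs)                                   # standardised residuals
    # Gradients with respect to the means and the (diagonal) variances.
    g_mu = (post[:, :, None] * z).sum(axis=0) / (n * np.sqrt(weights)[:, None])
    g_var = (post[:, :, None] * (z ** 2 - 1.0)).sum(axis=0) / (
        n * np.sqrt(2.0 * weights)[:, None])
    return np.concatenate([g_mu.ravel(), g_var.ravel()])

rng = np.random.default_rng(0)
means = rng.normal(size=(3, 2))            # K = 3 components, T = 2 scales
covs = np.full((3, 2), 0.5)
weights = np.array([0.2, 0.3, 0.5])
fv = fisher_vector_sketch(rng.normal(size=(50, 2)), means, covs, weights)
```

The output length is 2·K·T (12 here) and does not depend on N, matching the documented return shape.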

spectralbrain.statistics.analysis.fit_gmm_codebook(all_descriptors, n_components=32, *, seed=42)

Fit a GMM codebook on pooled descriptors from a population.

Parameters:
  • all_descriptors (ndarray, shape (N_total, T)) – Pooled descriptors from all subjects.

  • n_components (int) – Number of GMM components.

  • seed (int)

Returns:

  • means (ndarray, shape (K, T))

  • covariances (ndarray, shape (K, T)) – Diagonal covariances.

  • weights (ndarray, shape (K,))

Return type:

tuple[ndarray, ndarray, ndarray]

spectralbrain.statistics.analysis.hedges_g_map(group_a, group_b)

Vertex-wise Hedges’ g (bias-corrected Cohen’s d).

Parameters:
  • group_a (ndarray, shape (n, N))

  • group_b (ndarray, shape (n, N))

Returns:

ndarray, shape (N,)

Return type:

ndarray[tuple[Any, …], dtype[floating]]
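The relationship between the two effect-size maps can be sketched as follows. This is a minimal illustration under the usual textbook definitions (pooled standard deviation for Cohen's d, the common approximate small-sample correction for Hedges' g); the helper names are hypothetical and the library may differ in details such as how zero variance is handled.

```python
import numpy as np

def cohens_d_sketch(group_a, group_b):
    """Vertex-wise Cohen's d with a pooled standard deviation."""
    n_a, n_b = group_a.shape[0], group_b.shape[0]
    var_a = group_a.var(axis=0, ddof=1)
    var_b = group_b.var(axis=0, ddof=1)
    pooled_sd = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (group_a.mean(axis=0) - group_b.mean(axis=0)) / pooled_sd

def hedges_g_sketch(group_a, group_b):
    """Cohen's d multiplied by the approximate bias correction J."""
    df = group_a.shape[0] + group_b.shape[0] - 2
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * cohens_d_sketch(group_a, group_b)

group_a = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2 subjects x 2 vertices
group_b = np.array([[0.0, 2.0], [2.0, 4.0]])
d_map = cohens_d_sketch(group_a, group_b)
g_map = hedges_g_sketch(group_a, group_b)
```

The correction factor approaches 1 as the sample grows, so the two maps differ noticeably only for small groups.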

spectralbrain.statistics.analysis.intra_inter_ratio(connectome, community_labels)

Intra- vs inter-community connectivity ratio.

Parameters:
  • connectome (ndarray, shape (R, R))

  • community_labels (ndarray, shape (R,))

Returns:

dict – Keys: "intra_mean", "inter_mean", "ratio".

Return type:

dict[str, float]

spectralbrain.statistics.analysis.js_divergence(a, b, **kwargs)

Jensen-Shannon divergence (symmetric, bounded [0, ln2]).

Parameters:
  • a (ndarray)

  • b (ndarray)

Returns:

float

Return type:

float

spectralbrain.statistics.analysis.kernel_mean_embedding(descriptor, *, kernel='rbf', sigma=None, n_landmarks=100, seed=None)

Kernel mean embedding of a descriptor distribution.

Embeds the empirical distribution of per-vertex descriptors into an RKHS, approximated by random Fourier features (Rahimi & Recht 2007) for scalability.

Parameters:
  • descriptor (ndarray, shape (N, T))

  • kernel (str)

  • sigma (float, optional)

  • n_landmarks (int) – Number of random Fourier features.

  • seed (int)

Returns:

ndarray, shape (n_landmarks,) – Approximate kernel mean embedding.

Return type:

ndarray[tuple[Any, …], dtype[floating]]
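The random Fourier feature approximation mentioned above can be sketched in a few lines for the RBF kernel k(x, y) = exp(-‖x - y‖² / (2σ²)). The helper name `kme_sketch` and the explicit `sigma` default are assumptions; how the library chooses `sigma` when it is None is not documented.

```python
import numpy as np

def kme_sketch(descriptor, sigma=1.0, n_features=100, seed=None):
    """Illustrative kernel mean embedding via random Fourier features."""
    rng = np.random.default_rng(seed)
    t = descriptor.shape[1]
    # Frequencies drawn from the spectral density of the RBF kernel.
    omega = rng.normal(scale=1.0 / sigma, size=(t, n_features))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # Feature map phi(x) such that phi(x) . phi(y) approximates k(x, y).
    phi = np.sqrt(2.0 / n_features) * np.cos(descriptor @ omega + phase)
    # The mean embedding is the average feature vector over all vertices.
    return phi.mean(axis=0)

x = np.random.default_rng(0).normal(size=(50, 3))   # N = 50 vertices, T = 3
emb = kme_sketch(x, sigma=1.5, n_features=100, seed=1)
```

Two shapes embedded with the same seed share the same random features, so the Euclidean distance between their embeddings approximates the maximum mean discrepancy between their descriptor distributions.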

spectralbrain.statistics.analysis.kl_divergence(a, b, *, bins=50)

KL divergence estimated via histogram binning.

Parameters:
  • a (ndarray, shape (N,))

  • b (ndarray, shape (N,))

  • bins (int)

Returns:

float – D_KL(a || b).

Return type:

float
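A histogram-based estimate of D_KL(a || b), and the Jensen-Shannon divergence built from the same histograms, might look like the sketch below. The shared bin range and the small additive constant `eps` (to avoid division by zero in empty bins) are assumptions; the documentation does not describe how the library treats empty bins.

```python
import numpy as np

def _shared_histograms(a, b, bins=50, eps=1e-10):
    # Bin both samples on a common range so the bins are comparable.
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    p = p.astype(float) + eps          # assumed smoothing for empty bins
    q = q.astype(float) + eps
    return p / p.sum(), q / q.sum()

def kl_sketch(a, b, bins=50):
    p, q = _shared_histograms(a, b, bins)
    return float(np.sum(p * np.log(p / q)))

def js_sketch(a, b, bins=50):
    p, q = _shared_histograms(a, b, bins)
    m = 0.5 * (p + q)
    return float(0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m)))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=2000)
b = rng.normal(0.5, 1.0, size=2000)
kl_ab = kl_sketch(a, b)
js_ab = js_sketch(a, b)
```

With natural logarithms the JS value stays within [0, ln 2], whereas the KL estimate is asymmetric and unbounded, and is sensitive to the number of bins.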

spectralbrain.statistics.analysis.lateralisation_index(left, right)

Lateralisation Index: LI = (L - R) / (|L| + |R|).

Parameters:
  • left (ndarray) – Descriptor values for the left hemisphere, matched element-wise with right.

  • right (ndarray) – Descriptor values for the right hemisphere, matched element-wise with left.

Returns:

ndarray – LI ∈ [-1, +1]. Positive = left > right.

Return type:

ndarray
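The formula is simple enough to write out directly. In this sketch the helper name `li_sketch` is hypothetical, and returning 0 where both hemispheres are zero is an assumption, because the documentation does not say how a zero denominator is handled.

```python
import numpy as np

def li_sketch(left, right, eps=1e-12):
    """Illustrative lateralisation index LI = (L - R) / (|L| + |R|)."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    denom = np.abs(left) + np.abs(right)
    # Assumed convention: LI = 0 where both values are (near) zero.
    return np.where(denom > eps, (left - right) / np.maximum(denom, eps), 0.0)

li = li_sketch([2.0, 1.0, 0.0, 3.0], [1.0, 1.0, 0.0, 0.0])
```

The four entries illustrate mild left dominance, symmetry, the zero-denominator case, and complete left lateralisation.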

spectralbrain.statistics.analysis.mantel_test(matrix_a, matrix_b, *, n_permutations=5000, method='spearman', seed=None)

Mantel test — correlation between two distance matrices.

Tests whether two distance matrices are correlated by comparing the observed correlation to a null distribution generated by row/column permutation.

Parameters:
  • matrix_a (ndarray, shape (N, N))

  • matrix_b (ndarray, shape (N, N))

  • n_permutations (int)

  • method (str)

  • seed (int)

Returns:

  • r (float)

  • p_value (float)

Return type:

tuple[float, float]
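The row/column permutation scheme can be sketched as below. Note the simplifications, all of which are assumptions for illustration: the sketch uses Pearson correlation rather than the documented default of Spearman, computes a one-sided p-value, and adds one to numerator and denominator so that the p-value is never exactly zero.

```python
import numpy as np

def mantel_sketch(matrix_a, matrix_b, n_permutations=999, seed=None):
    """Illustrative one-sided Mantel test with Pearson correlation."""
    rng = np.random.default_rng(seed)
    n = matrix_a.shape[0]
    iu = np.triu_indices(n, k=1)             # off-diagonal upper triangle
    b_vec = matrix_b[iu]
    r_obs = np.corrcoef(matrix_a[iu], b_vec)[0, 1]
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        # Permute rows and columns together so the matrix stays a valid
        # distance matrix over relabelled items.
        a_perm = matrix_a[np.ix_(perm, perm)]
        if np.corrcoef(a_perm[iu], b_vec)[0, 1] >= r_obs:
            count += 1
    p_value = (count + 1) / (n_permutations + 1)
    return float(r_obs), float(p_value)

rng = np.random.default_rng(0)
pts = rng.normal(size=(12, 2))
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
r, p = mantel_sketch(dist, 2.0 * dist, n_permutations=199, seed=1)
```

Permuting whole rows and columns, rather than shuffling individual entries, respects the dependence between entries that share an item.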

spectralbrain.statistics.analysis.modularity(connectome, community_labels, *, gamma=1.0)

Newman’s modularity Q for a given community partition.

Q = (1/2m) Σ_{ij} [A_{ij} - γ·k_i·k_j/(2m)] · δ(c_i, c_j)

Parameters:
  • connectome (ndarray, shape (R, R)) – Similarity matrix (higher = more similar).

  • community_labels (ndarray, shape (R,)) – Community assignment per node.

  • gamma (float) – Resolution parameter.

Returns:

float – Modularity Q.

Return type:

float
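The formula for Q translates almost line by line into NumPy. This sketch assumes a symmetric, non-negative connectome; the helper name is hypothetical, and how the library treats negative weights or the diagonal is not documented.

```python
import numpy as np

def modularity_sketch(connectome, community_labels, gamma=1.0):
    """Illustrative Newman modularity for a weighted undirected graph."""
    a = np.asarray(connectome, dtype=float)
    labels = np.asarray(community_labels)
    k = a.sum(axis=1)                                   # node strengths k_i
    two_m = a.sum()                                     # 2m (total weight, both directions)
    same = labels[:, None] == labels[None, :]           # delta(c_i, c_j)
    return float(((a - gamma * np.outer(k, k) / two_m) * same).sum() / two_m)

# Two disconnected pairs of nodes: 0-1 and 2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
q_split = modularity_sketch(adj, [0, 0, 1, 1])   # partition matches the structure
q_single = modularity_sketch(adj, [0, 0, 0, 0])  # everything in one community
```

The matching partition gives Q = 0.5, while placing all nodes in a single community gives Q = 0. Raising gamma above 1 penalises large communities and so favours finer partitions.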

spectralbrain.statistics.analysis.participation_coefficient(connectome, community_labels)

Participation coefficient per node.

PC_i = 1 - Σ_k (s_{ik} / s_i)²

High PC → hub connected to multiple communities. Low PC → provincial node within one community.

Parameters:
  • connectome (ndarray, shape (R, R))

  • community_labels (ndarray, shape (R,))

Returns:

ndarray, shape (R,)

Return type:

ndarray
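Here s_i is the total strength of node i and s_{ik} is the part of that strength going to community k. A minimal sketch, with a hypothetical helper name and the assumption that isolated nodes receive PC = 0:

```python
import numpy as np

def participation_sketch(connectome, community_labels):
    """Illustrative participation coefficient per node."""
    a = np.asarray(connectome, dtype=float)
    labels = np.asarray(community_labels)
    strength = a.sum(axis=1)                            # s_i
    safe = np.where(strength > 0, strength, 1.0)        # avoid division by zero
    pc = np.ones(len(a))
    for c in np.unique(labels):
        s_ic = a[:, labels == c].sum(axis=1)            # strength into community c
        pc -= (s_ic / safe) ** 2
    pc[strength == 0] = 0.0                             # assumed value for isolated nodes
    return pc

# Chain-like graph with edges 0-1, 0-2 and 2-3; communities {0, 1} and {2, 3}.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
pc = participation_sketch(adj, [0, 0, 1, 1])
```

Nodes 0 and 2 split their connections evenly across the two communities and get PC = 0.5; nodes 1 and 3 connect within a single community and get PC = 0.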

spectralbrain.statistics.analysis.rdm(features, *, metric='correlation')

Representational Dissimilarity Matrix.

Parameters:
  • features (ndarray, shape (S, d)) – S items × d features.

  • metric (str)

Returns:

ndarray, shape (S, S)

Return type:

ndarray[tuple[Any, …], dtype[floating]]

spectralbrain.statistics.analysis.rsa_compare(rdm_a, rdm_b, *, method='spearman', permutations=0, seed=None)

Compare two RDMs via Representational Similarity Analysis.

Parameters:
  • rdm_a (ndarray, shape (S, S)) – Representational Dissimilarity Matrices.

  • rdm_b (ndarray, shape (S, S)) – Representational Dissimilarity Matrices.

  • method (str) – Correlation method.

  • permutations (int) – If > 0, compute p-value via permutation test.

  • seed (int)

Returns:

  • r (float) – Correlation between upper triangles.

  • p_value (float) – p-value (parametric if permutations=0, permutation otherwise).

Return type:

tuple[float, float]

spectralbrain.statistics.analysis.spectral_mds(distance_matrix, n_components=2, *, seed=None)

Classical MDS embedding from a distance matrix.

Parameters:
  • distance_matrix (ndarray, shape (S, S))

  • n_components (int)

  • seed (int)

Returns:

ndarray, shape (S, n_components)

Return type:

ndarray
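Classical (Torgerson) MDS is usually computed by double-centring the squared distances and taking the leading eigenvectors. The sketch below follows that textbook recipe; it is deterministic and therefore has no use for `seed`, and it is not claimed to match the library's internals.

```python
import numpy as np

def classical_mds_sketch(distance_matrix, n_components=2):
    """Illustrative classical MDS via eigendecomposition."""
    d2 = np.asarray(distance_matrix, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering            # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)             # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]    # keep the largest ones
    # Clip small negative eigenvalues that arise from non-Euclidean distances.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

rng = np.random.default_rng(0)
pts = rng.normal(size=(10, 2))
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
emb = classical_mds_sketch(dist, n_components=2)
dist_emb = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
```

For distances that really come from points in the plane, the embedding reproduces all pairwise distances up to rotation and reflection.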

spectralbrain.statistics.analysis.spectral_pca(features, n_components=2)

PCA on spectral features.

Parameters:
  • features (ndarray, shape (S, d))

  • n_components (int)

Returns:

  • scores (ndarray, shape (S, n_components))

  • loadings (ndarray, shape (n_components, d))

  • explained_variance_ratio (ndarray, shape (n_components,))

Return type:

tuple[ndarray, ndarray, ndarray]

spectralbrain.statistics.analysis.spectral_umap(features, n_components=2, *, n_neighbors=15, min_dist=0.1, seed=None)

UMAP embedding of spectral features.

Parameters:
  • features (ndarray, shape (S, d))

  • n_components (int)

  • n_neighbors (int)

  • min_dist (float)

  • seed (int)

Returns:

ndarray, shape (S, n_components)

Return type:

ndarray

spectralbrain.statistics.analysis.surprise_map(subject_descriptor, normative_mean, normative_std)

Z-score anomaly map against a normative distribution.

Parameters:
  • subject_descriptor (ndarray, shape (N,) or (N, T))

  • normative_mean (ndarray, same shape)

  • normative_std (ndarray, same shape)

Returns:

ndarray, same shape – Z-scores: positive = above normative, negative = below.

Return type:

ndarray[tuple[Any, …], dtype[floating]]

spectralbrain.statistics.analysis.surprise_map_percentile(subject_descriptor, normative_distribution)

Percentile-based anomaly map.

Parameters:
  • subject_descriptor (ndarray, shape (N,))

  • normative_distribution (ndarray, shape (S, N)) – Normative values from S reference subjects.

Returns:

ndarray, shape (N,) – Percentile rank (0–100) of subject relative to normative.

Return type:

ndarray[tuple[Any, …], dtype[floating]]
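The two anomaly maps above can be illustrated with a small normative cohort. The helper names, the `eps` guard against zero standard deviation, and the "less than or equal" convention for the percentile rank are assumptions; the library may resolve ties or zero variance differently.

```python
import numpy as np

def zscore_map_sketch(subject, normative_mean, normative_std, eps=1e-12):
    """Illustrative z-score anomaly map."""
    return (subject - normative_mean) / np.maximum(normative_std, eps)

def percentile_map_sketch(subject, normative_distribution):
    """Illustrative percentile rank (0-100) of the subject at each vertex."""
    return 100.0 * (normative_distribution <= subject).mean(axis=0)

# S = 3 reference subjects, N = 2 vertices.
normative = np.array([[0.0, 10.0],
                      [2.0, 20.0],
                      [4.0, 30.0]])
subject = np.array([6.0, 10.0])

z = zscore_map_sketch(subject, normative.mean(axis=0), normative.std(axis=0, ddof=1))
pct = percentile_map_sketch(subject, normative)
```

The z-score assumes roughly Gaussian normative values at each vertex, while the percentile rank makes no distributional assumption but cannot resolve how far outside the normative range a subject lies.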

spectralbrain.statistics.analysis.tfce(statistic_map, adjacency, *, E=0.5, H=2.0, n_steps=100)

Threshold-Free Cluster Enhancement (Smith & Nichols 2009).

Enhances a statistical map by integrating cluster extent and height across all thresholds.

TFCE(v) = ∫₀^h(v) e(h)^E · h^H dh

Parameters:
  • statistic_map (ndarray, shape (N,)) – Vertex-wise test statistic (e.g. t-values).

  • adjacency (sparse matrix, shape (N, N)) – Vertex adjacency (from mesh or kNN graph).

  • E (float) – Cluster extent exponent (default 0.5).

  • H (float) – Height exponent (default 2.0).

  • n_steps (int) – Number of threshold steps for numerical integration.

Returns:

ndarray, shape (N,) – TFCE-enhanced statistic map.

Return type:

ndarray

References

Smith SM, Nichols TE. Threshold-free cluster enhancement. NeuroImage 44(1):83–98, 2009.
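The integral is approximated numerically by stepping through thresholds h, finding connected supra-threshold clusters at each step, and adding e(h)^E · h^H · dh to every vertex in a cluster of extent e(h). The sketch below does this with a plain flood fill on a dense adjacency matrix. It handles positive statistics only and measures extent as a vertex count; both are simplifying assumptions, and the library itself takes a sparse matrix.

```python
import numpy as np

def tfce_sketch(statistic_map, adjacency, E=0.5, H=2.0, n_steps=100):
    """Illustrative TFCE for positive statistics on a dense adjacency."""
    stat = np.asarray(statistic_map, dtype=float)
    adj = np.asarray(adjacency) > 0
    neighbours = [np.flatnonzero(row) for row in adj]
    out = np.zeros_like(stat)
    if stat.max() <= 0:
        return out
    dh = stat.max() / n_steps
    for h in np.arange(1, n_steps + 1) * dh:
        active = stat >= h
        seen = np.zeros(len(stat), dtype=bool)
        for start in np.flatnonzero(active):
            if seen[start]:
                continue
            # Flood-fill one connected supra-threshold cluster.
            stack, cluster = [start], []
            seen[start] = True
            while stack:
                v = stack.pop()
                cluster.append(v)
                for u in neighbours[v]:
                    if active[u] and not seen[u]:
                        seen[u] = True
                        stack.append(u)
            # Every vertex in the cluster gains extent^E * height^H * dh.
            out[cluster] += len(cluster) ** E * h ** H * dh
    return out

# Chain of 5 vertices: 0-1-2-3-4.
chain = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
stat = np.array([0.0, 3.0, 3.0, 0.0, 3.0])
enhanced = tfce_sketch(stat, chain)
```

Vertices 1 and 2 form a two-vertex cluster and are enhanced by a factor of 2^E relative to the isolated vertex 4, even though all three share the same raw statistic.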

spectralbrain.statistics.analysis.vertexwise_correlation(descriptors, scores, *, method='pearson', correction='fdr', alpha=0.05, covariates=None)

Correlate a per-vertex descriptor with a clinical score.

Parameters:
  • descriptors (ndarray, shape (S, N) or (S, N, T)) – Per-subject descriptor values. If 3D, averaged over T.

  • scores (ndarray, shape (S,)) – Clinical score per subject.

  • method (str)

  • correction (str)

  • alpha (float)

  • covariates (ndarray, shape (S, C), optional) – If given, partial correlation controlling for covariates.

Returns:

VertexWiseResult

Return type:

VertexWiseResult

spectralbrain.statistics.analysis.vertexwise_mannwhitney(group_a, group_b, *, correction='fdr', alpha=0.05)

Non-parametric Mann-Whitney U test at each vertex.

Parameters:
  • group_a (ndarray, shape (n, N))

  • group_b (ndarray, shape (n, N))

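  • correction (str) – Multiple-comparison correction; same options as vertexwise_ttest.

  • alpha (float) – Significance level; same meaning as in vertexwise_ttest.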

Returns:

VertexWiseResult

Return type:

VertexWiseResult

spectralbrain.statistics.analysis.vertexwise_permutation(group_a, group_b, *, n_permutations=5000, stat_func='t', correction='max', seed=None, alpha=0.05)

Permutation test at each vertex with multiple-comparison control.

Permuting the group labels builds a non-parametric null distribution (a Monte Carlo approximation of the exact permutation null whenever n_permutations is smaller than the number of possible relabellings). The per-vertex permutation p-value is not, on its own, corrected for multiple comparisons. This function offers three correction modes:

  • correction="max" (default): family-wise error rate (FWER) control via the maximum-statistic null distribution (Westfall & Young; Nichols & Holmes 2002). On each permutation the maximum of |statistic| across all vertices is recorded; a vertex is significant if its observed |statistic| exceeds the (1-alpha) quantile of that null. This is the standard rigorous correction for vertex/voxel-wise permutation testing.

  • correction="fdr": per-vertex permutation p-values, then Benjamini-Hochberg.

  • correction="none": raw per-vertex permutation p-values.

Parameters:
  • group_a (ndarray, shape (n, N))

  • group_b (ndarray, shape (n, N))

  • n_permutations (int)

  • stat_func ("t" or "mean_diff")

  • correction ("max", "fdr", or "none")

  • seed (int, optional)

  • alpha (float)

Returns:

VertexWiseResult – p_values are always the raw per-vertex permutation p-values; p_corrected reflects the chosen correction. For "max", p_corrected is the FWER-adjusted p-value (fraction of the max-null at or above each observed statistic).

Return type:

VertexWiseResult

References

Nichols TE, Holmes AP. Nonparametric permutation tests for functional neuroimaging. Hum Brain Mapp 15(1):1–25, 2002.
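The maximum-statistic correction can be sketched with a mean-difference statistic as follows. The helper name is hypothetical, and the add-one adjustment in both p-values (so that a p-value is never exactly zero) is an assumption that may differ from the library's "fraction of the max-null" definition.

```python
import numpy as np

def max_stat_permutation_sketch(group_a, group_b, n_permutations=1000, seed=None):
    """Illustrative label-permutation test with max-statistic FWER control."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([group_a, group_b])
    n_a = group_a.shape[0]
    observed = group_a.mean(axis=0) - group_b.mean(axis=0)
    max_null = np.empty(n_permutations)
    exceed = np.zeros(observed.shape)
    for i in range(n_permutations):
        perm = rng.permutation(pooled.shape[0])          # shuffle group labels
        stat = pooled[perm[:n_a]].mean(axis=0) - pooled[perm[n_a:]].mean(axis=0)
        max_null[i] = np.abs(stat).max()                 # one max per permutation
        exceed += np.abs(stat) >= np.abs(observed)       # per-vertex exceedances
    # Raw per-vertex p-values (uncorrected), with an assumed add-one adjustment.
    p_raw = (exceed + 1) / (n_permutations + 1)
    # FWER-adjusted p-values: compare each vertex with the max-statistic null.
    above = (max_null[:, None] >= np.abs(observed)[None, :]).sum(axis=0)
    p_fwer = (above + 1) / (n_permutations + 1)
    return observed, p_raw, p_fwer

rng = np.random.default_rng(0)
group_a = rng.normal(size=(10, 20))
group_b = rng.normal(size=(10, 20))
group_a[:, 0] += 5.0                                     # plant a strong effect at vertex 0
observed, p_raw, p_fwer = max_stat_permutation_sketch(group_a, group_b, 500, seed=1)
```

Because the maximum over vertices is at least as large as the statistic at any single vertex, the adjusted p-value can never be smaller than the raw one.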

spectralbrain.statistics.analysis.vertexwise_ttest(group_a, group_b, *, correction='fdr', alpha=0.05, equal_var=False)

Independent two-sample t-test at each vertex (vectorised).

Parameters:
  • group_a (ndarray, shape (n_a, N) or (n_a, N, T)) – Descriptor values for group A (subjects × vertices [× scales]). If 3D, tests are run on the mean across the last axis.

  • group_b (ndarray, shape (n_b, N)) – Descriptor values for group B.

  • correction (str) – One of "fdr" (Benjamini-Hochberg), "bonferroni", or "none".

  • alpha (float)

  • equal_var (bool) – If False (default), use Welch’s t-test, which does not assume equal variances and is the safer choice for groups of unequal size or spread. If True, use Student’s pooled-variance t-test.

Returns:

VertexWiseResult

Return type:

VertexWiseResult
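The "fdr" option is documented as Benjamini-Hochberg. For readers unfamiliar with the procedure, here is a minimal NumPy sketch of the adjusted p-values; the helper name is hypothetical and the sketch covers only the correction step, not the t-test itself.

```python
import numpy as np

def benjamini_hochberg_sketch(p_values, alpha=0.05):
    """Illustrative Benjamini-Hochberg adjusted p-values and decisions."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted, adjusted <= alpha

p_adj, significant = benjamini_hochberg_sketch([0.01, 0.04, 0.03, 0.005])
```

In this example all four vertices remain significant at alpha = 0.05 after adjustment, whereas a Bonferroni correction (multiplying each p-value by 4) would reject only the two smallest.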