spectralbrain.statistics.eda
Exploratory data analysis and quality control for spectral morphometry.
Five diagnostic blocks plus the descriptor recommendation engine:
Spectral QC — validate eigendecomposition quality.
Optimal k — how many eigenpairs are enough?
Descriptor profiling — summary statistics, normality, outliers.
Reliability — ICC test-retest, batch-effect detection.
Report — integrated markdown/Rich output.
recommend_descriptor() — surrogate-based descriptor selection.
Functions
batch_effect_scan() | Scan for batch/site effects in spectral descriptors. |
compute_icc() | Intraclass Correlation Coefficient for test-retest. |
descriptor_correlation() | Correlation matrix between descriptors (redundancy check). |
descriptor_profile() | Summary statistics for each descriptor. |
eigenvalue_stability() | Cross-subject eigenvalue stability analysis. |
optimal_k() | Determine optimal number of eigenpairs. |
recommend_descriptor() | Recommend the best spectral descriptor for an analysis objective. |
spectral_qc() | Run quality-control diagnostics on a spectral decomposition. |
Classes
DescriptorRecommendation | Output of recommend_descriptor(). |
OptimalKResult | Recommended number of eigenpairs by multiple criteria. |
SpectralQCReport | Quality-control diagnostics for a spectral decomposition. |
- class spectralbrain.statistics.eda.DescriptorRecommendation(recommended, objective, ranking, surrogate_details)
Bases: object
Output of recommend_descriptor().
- Parameters:
- class spectralbrain.statistics.eda.OptimalKResult(k_elbow=0, k_energy_95=0, k_energy_99=0, k_gap=0, k_recommended=0, eigenvalues=None, cumulative_energy=None)
Bases: object
Recommended number of eigenpairs by multiple criteria.
- Parameters:
- class spectralbrain.statistics.eda.SpectralQCReport(n_vertices=0, n_eigenvalues=0, lambda_0=0.0, lambda_0_ok=True, fiedler_value=0.0, spectral_gap=0.0, eigenvalues_nonneg=True, n_negative_eigenvalues=0, max_negative_eigenvalue=0.0, orthonormality_error=0.0, orthonormality_ok=True, laplacian_row_sum_max=0.0, laplacian_row_sum_ok=True, near_degenerate_pairs=0, recommended_k=None, warnings=<factory>, passed=True)
Bases: object
Quality-control diagnostics for a spectral decomposition.
All fields are populated by spectral_qc().
- Parameters:
n_vertices (int)
n_eigenvalues (int)
lambda_0 (float)
lambda_0_ok (bool)
fiedler_value (float)
spectral_gap (float)
eigenvalues_nonneg (bool)
n_negative_eigenvalues (int)
max_negative_eigenvalue (float)
orthonormality_error (float)
orthonormality_ok (bool)
laplacian_row_sum_max (float)
laplacian_row_sum_ok (bool)
near_degenerate_pairs (int)
recommended_k (int | None)
passed (bool)
- spectralbrain.statistics.eda.batch_effect_scan(descriptors, site_labels, *, alpha=0.05)
Scan for batch/site effects in spectral descriptors.
For each descriptor, tests whether distributions differ significantly across sites using Kruskal-Wallis.
- Parameters:
descriptors (dict of {name: ndarray}) – Per-subject descriptor values.
site_labels (ndarray, shape (n_subjects,)) – Site/scanner labels.
alpha (float) – Significance threshold.
- Returns:
dict of {name: {statistic, p_value, has_batch_effect, effect_size}}
- Return type:
dict
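As a rough illustration of the per-descriptor test described above, the sketch below applies scipy.stats.kruskal to each descriptor. It is not the library's implementation: the helper name kruskal_site_scan, the averaging of multi-column descriptors, and the epsilon-squared effect size H / (n - 1) are all assumptions made for this example.

```python
import numpy as np
from scipy.stats import kruskal


def kruskal_site_scan(descriptors, site_labels, alpha=0.05):
    """Hypothetical helper: Kruskal-Wallis test across sites, per descriptor."""
    site_labels = np.asarray(site_labels)
    sites = np.unique(site_labels)
    n_subjects = site_labels.shape[0]
    results = {}
    for name, values in descriptors.items():
        values = np.asarray(values, dtype=float)
        # Assumption: reduce multi-column descriptors to one value per subject.
        if values.ndim > 1:
            values = values.reshape(n_subjects, -1).mean(axis=1)
        # One group of subject values per site.
        groups = [values[site_labels == site] for site in sites]
        statistic, p_value = kruskal(*groups)
        results[name] = {
            "statistic": float(statistic),
            "p_value": float(p_value),
            "has_batch_effect": bool(p_value < alpha),
            # Assumption: epsilon-squared effect size, H / (n - 1).
            "effect_size": float(statistic / (n_subjects - 1)),
        }
    return results
```

Because the test is rank-based, it makes no normality assumption about the descriptor values, which suits skewed spectral descriptors.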
- spectralbrain.statistics.eda.compute_icc(test, retest, *, icc_type='ICC3,1')
Intraclass Correlation Coefficient for test-retest.
- Parameters:
test (ndarray, shape (N,) or (N, T)) – Descriptor values at time 1.
retest (ndarray, shape (N,) or (N, T)) – Descriptor values at time 2.
icc_type (str) – ICC form to compute:
"ICC2,1": two-way random, single measures.
"ICC3,1": two-way mixed, single measures (recommended for neuroimaging).
- Returns:
float – ICC value in [-1, 1]. Interpretation: > 0.75 excellent, 0.60–0.75 good, 0.40–0.60 fair, < 0.40 poor.
- Return type:
float
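The two ICC forms differ only in how between-session variance enters the denominator. The following sketch computes both from the standard two-way ANOVA mean squares for 1-D test and retest vectors. The helper name icc_single_measure is hypothetical, and the handling of (N, T) inputs is left out because the source does not describe it.

```python
import numpy as np


def icc_single_measure(test, retest, icc_type="ICC3,1"):
    """Hypothetical helper: single-measure ICC from two-way ANOVA mean squares."""
    y = np.column_stack([np.asarray(test, float), np.asarray(retest, float)])
    n, k = y.shape  # n subjects, k = 2 sessions
    grand = y.mean()
    ss_rows = k * np.sum((y.mean(axis=1) - grand) ** 2)    # between subjects
    ss_cols = n * np.sum((y.mean(axis=0) - grand) ** 2)    # between sessions
    ss_err = np.sum((y - grand) ** 2) - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    if icc_type == "ICC3,1":
        # Consistency: session offsets do not count against reliability.
        return float((ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err))
    if icc_type == "ICC2,1":
        # Absolute agreement: session offsets inflate the denominator.
        denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
        return float((ms_rows - ms_err) / denom)
    raise ValueError(f"unsupported icc_type: {icc_type!r}")
```

A constant offset between sessions leaves ICC3,1 at 1 but lowers ICC2,1, which is the practical difference between the two options.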
- spectralbrain.statistics.eda.descriptor_correlation(descriptors, *, method='pearson')
Correlation matrix between descriptors (redundancy check).
For multi-column descriptors, uses the mean across columns.
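A minimal sketch of the redundancy check, assuming Pearson correlation only and a dict of per-subject arrays as input. The helper name correlation_between and its return format are illustrative and may differ from what the library returns.

```python
import numpy as np


def correlation_between(descriptors):
    """Hypothetical helper: Pearson correlation between per-subject descriptors."""
    names = list(descriptors)
    rows = []
    for name in names:
        values = np.asarray(descriptors[name], dtype=float)
        # Multi-column descriptors are reduced to their mean across columns.
        if values.ndim > 1:
            values = values.reshape(values.shape[0], -1).mean(axis=1)
        rows.append(values)
    # np.corrcoef treats each row as one variable.
    return names, np.corrcoef(np.vstack(rows))
```

Off-diagonal entries close to ±1 indicate descriptors that carry nearly the same information across subjects.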
- spectralbrain.statistics.eda.descriptor_profile(descriptors, *, normality_samples=500, seed=None)
Summary statistics for each descriptor.
- Parameters:
- Returns:
dict of {name: {stat: value}} – Keys per descriptor: mean, std, min, max, skew, kurtosis, q25, q50, q75, shapiro_p, n_outliers_3sigma, shape.
- Return type:
dict
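The keys listed above can be produced for a single descriptor roughly as follows. This is a sketch under assumptions: the helper name profile_one is hypothetical, normality_samples is taken to cap the random subsample passed to the Shapiro-Wilk test, the standard deviation is the population form, and kurtosis is the excess (Fisher) definition.

```python
import numpy as np
from scipy import stats


def profile_one(values, normality_samples=500, seed=None):
    """Hypothetical helper: summary statistics for one descriptor array."""
    values = np.asarray(values, dtype=float)
    flat = values.ravel()
    rng = np.random.default_rng(seed)
    # Assumption: Shapiro-Wilk runs on a random subsample of large inputs.
    if flat.size > normality_samples:
        sample = rng.choice(flat, size=normality_samples, replace=False)
    else:
        sample = flat
    _, shapiro_p = stats.shapiro(sample)
    mean, std = flat.mean(), flat.std()
    q25, q50, q75 = np.percentile(flat, [25, 50, 75])
    return {
        "mean": float(mean),
        "std": float(std),
        "min": float(flat.min()),
        "max": float(flat.max()),
        "skew": float(stats.skew(flat)),
        "kurtosis": float(stats.kurtosis(flat)),  # excess kurtosis
        "q25": float(q25),
        "q50": float(q50),
        "q75": float(q75),
        "shapiro_p": float(shapiro_p),
        # Values more than three standard deviations from the mean.
        "n_outliers_3sigma": int(np.sum(np.abs(flat - mean) > 3 * std)),
        "shape": values.shape,
    }
```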
- spectralbrain.statistics.eda.eigenvalue_stability(decomps, *, n_eigenvalues=None)
Cross-subject eigenvalue stability analysis.
- spectralbrain.statistics.eda.optimal_k(eigenvalues, *, energy_thresholds=(0.95, 0.99))
Determine optimal number of eigenpairs.
Three criteria:
1. Elbow: maximum curvature of log(λ) vs index.
2. Energy: cumulative energy Σ_{i≤k} λᵢ / Σⱼ λⱼ > threshold.
3. Max gap: largest relative gap between consecutive λ.
- Parameters:
- Returns:
OptimalKResult
- Return type:
OptimalKResult
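The three criteria can be sketched in a few lines of NumPy. The details below are assumptions, not the library's definitions: curvature is approximated by the absolute second difference of log(λ), the relative gap is measured against the larger eigenvalue of each pair, and non-positive eigenvalues (the zero mode) are dropped first, so the returned counts exclude them. The helper name sketch_optimal_k is hypothetical, and it needs at least three positive eigenvalues.

```python
import numpy as np


def sketch_optimal_k(eigenvalues, energy_thresholds=(0.95, 0.99)):
    """Hypothetical helper: three heuristics for choosing k."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))
    lam = lam[lam > 0]  # assumption: drop the zero mode so the log is defined
    # 1. Elbow: largest |second difference| of log(lambda), a curvature proxy.
    curvature = np.abs(np.diff(np.log(lam), n=2))
    k_elbow = int(np.argmax(curvature)) + 2  # curvature[i] is centred on index i + 1
    # 2. Energy: smallest k whose cumulative share exceeds each threshold.
    cumulative = np.cumsum(lam) / lam.sum()
    k_energy = {t: int(np.argmax(cumulative > t)) + 1 for t in energy_thresholds}
    # 3. Max gap: keep every eigenvalue below the largest relative jump.
    rel_gap = np.diff(lam) / lam[1:]
    k_gap = int(np.argmax(rel_gap)) + 1
    return {
        "k_elbow": k_elbow,
        "k_energy": k_energy,
        "k_gap": k_gap,
        "cumulative_energy": cumulative,
    }
```

The criteria often disagree by a few indices, which is presumably why the result object reports each one alongside a single recommended value.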
- spectralbrain.statistics.eda.recommend_descriptor(points, labels=None, objective='group_discrimination', *, n_surrogates=30, k_eigenpairs=30, n_jobs=1, seed=42)
Recommend the best spectral descriptor for an analysis objective.
Generates synthetic surrogates with controlled deformations, computes all eligible descriptors, evaluates each descriptor’s discriminative power, and ranks by consensus.
- Parameters:
points (ndarray, shape (N, 3)) – Representative geometry (e.g. mean mesh vertices, or one subject’s point cloud).
labels (ndarray, optional) – Not used by the surrogate engine (surrogates generate their own labels). Reserved for future data-driven evaluation.
objective (str or AnalysisObjective) – Analysis goal. Determines eligible descriptors and surrogate deformation type.
n_surrogates (int) – Number of synthetic shapes to generate.
k_eigenpairs (int) – Eigenpairs per surrogate decomposition.
n_jobs (int) – Number of parallel workers for surrogate decomposition. 1 = sequential (default), -1 = all cores. Requires joblib when > 1.
seed (int, optional) – RNG seed for reproducibility.
- Returns:
DescriptorRecommendation – Contains .recommended, .ranking (top descriptors with AUC, accuracy, effect size), and .surrogate_details.
- Return type:
DescriptorRecommendation
Notes
This function is computationally heavy (by default, 30 surrogates × 30 eigenpairs × all eligible descriptors). For large meshes, consider using n_jobs=-1 to parallelise the surrogate decomposition across CPU cores.
Examples
>>> import spectralbrain as sb
>>> # mesh is assumed to expose an (N, 3) vertex array as mesh.vertices
>>> rec = sb.statistics.recommend_descriptor(
...     mesh.vertices,
...     objective="group_discrimination",
...     n_jobs=-1,
... )
>>> print(rec.recommended)
wks
>>> print(rec.ranking[:3])
- spectralbrain.statistics.eda.spectral_qc(decomp, *, lambda_0_tol=0.0001, ortho_tol=0.001, row_sum_tol=0.01, degeneracy_tol=1e-06)
Run quality-control diagnostics on a spectral decomposition.
- Parameters:
decomp (SpectralDecomposition)
lambda_0_tol (float) – Tolerance for λ₀ ≈ 0.
ortho_tol (float) – Tolerance for M-orthonormality of eigenvectors.
row_sum_tol (float) – Tolerance for Laplacian row-sum ≈ 0.
degeneracy_tol (float) – Relative gap below which eigenvalue pairs are flagged as near-degenerate.
- Returns:
SpectralQCReport
- Return type:
SpectralQCReport
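To make the four tolerances concrete, the sketch below runs the corresponding checks on raw arrays instead of a SpectralDecomposition object, whose attributes are not documented here. The helper name sketch_spectral_qc, the exact pass rule, and the way negative eigenvalues and relative gaps are counted are assumptions. The toy input is a 4-vertex path graph with a lumped (diagonal) mass matrix.

```python
import numpy as np


def sketch_spectral_qc(evals, evecs, mass, laplacian, *, lambda_0_tol=1e-4,
                       ortho_tol=1e-3, row_sum_tol=1e-2, degeneracy_tol=1e-6):
    """Hypothetical helper mirroring a few SpectralQCReport fields."""
    evals = np.asarray(evals, dtype=float)
    # M-orthonormality: Phi^T M Phi should equal the identity.
    gram = evecs.T @ mass @ evecs
    ortho_err = float(np.max(np.abs(gram - np.eye(gram.shape[0]))))
    # Each row of a Laplacian should sum to approximately zero.
    row_sum_max = float(np.max(np.abs(laplacian.sum(axis=1))))
    # Assumption: relative gap measured against the larger eigenvalue of a pair.
    scale = np.maximum(np.abs(evals[1:]), np.finfo(float).tiny)
    rel_gaps = np.diff(evals) / scale
    report = {
        "lambda_0": float(evals[0]),
        "lambda_0_ok": bool(abs(evals[0]) < lambda_0_tol),
        "fiedler_value": float(evals[1]),
        # Assumption: values within the lambda_0 tolerance are not counted.
        "n_negative_eigenvalues": int(np.sum(evals < -lambda_0_tol)),
        "orthonormality_error": ortho_err,
        "orthonormality_ok": bool(ortho_err < ortho_tol),
        "laplacian_row_sum_max": row_sum_max,
        "laplacian_row_sum_ok": bool(row_sum_max < row_sum_tol),
        "near_degenerate_pairs": int(np.sum(rel_gaps < degeneracy_tol)),
    }
    # Assumption: the decomposition passes only if every individual check passes.
    report["passed"] = bool(
        report["lambda_0_ok"]
        and report["orthonormality_ok"]
        and report["laplacian_row_sum_ok"]
        and report["n_negative_eigenvalues"] == 0
    )
    return report


# Toy input: path graph on 4 vertices with a diagonal (lumped) mass matrix.
laplacian = np.array(
    [[1, -1, 0, 0], [-1, 2, -1, 0], [0, -1, 2, -1], [0, 0, -1, 1]], dtype=float
)
mass = np.diag([0.5, 1.0, 1.0, 0.5])
inv_sqrt_mass = np.diag(1.0 / np.sqrt(np.diag(mass)))
# Solve L phi = lambda M phi through the symmetric matrix M^(-1/2) L M^(-1/2).
evals, u = np.linalg.eigh(inv_sqrt_mass @ laplacian @ inv_sqrt_mass)
evecs = inv_sqrt_mass @ u
report = sketch_spectral_qc(evals, evecs, mass, laplacian)
```

Rescaling the eigenvectors (for example, passing 2 * evecs) leaves the eigenvalue checks untouched but fails the orthonormality check, which is the kind of silent normalisation error this diagnostic is meant to catch.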