Pywayne Statistics

Comprehensive statistical testing library with 37+ methods for normality tests, location tests, correlation tests, time series tests, and model diagnostics....

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
The name/description match the contents of SKILL.md: it documents many statistical tests and shows example usage. The declared capability aligns with the methods and examples shown.
Instruction Scope
The instructions are examples that import 'pywayne.statistics' and call API methods, but the skill provides no install steps or code files. The SKILL.md does not direct the agent to read files, environment variables, or send data to external endpoints — so there is no scope creep. However, runtime use will require the referenced Python package to actually exist in the environment; the skill does not provide provenance or installation guidance.
Install Mechanism
No install spec is provided (instruction-only), which minimizes installation risk. Because nothing is downloaded or written, there is no install-time code execution risk from the skill itself.
Credentials
No environment variables, secrets, or config paths are requested. The claimed functionality (statistical tests) does not require additional credentials, so the lack of requested secrets is appropriate.
Persistence & Privilege
The skill does not request 'always' presence and does not instruct modifying system or other skills' configurations. Default autonomous invocation is allowed but not accompanied by increased privileges.
Assessment
This SKILL.md is documentation-only for a Python package; it does not include the package source or installation steps. Before relying on it, verify that the 'pywayne' package (or equivalent) is available from a trusted source (PyPI/GitHub) and inspect its source code or provenance. If you plan to execute the example code, pip-install only from known repositories and run in an isolated environment. There are no direct signs of data-exfiltration or credential requests in the skill itself, but lack of provenance lowers confidence — confirm the package origin before installing or running it.

Current version: v0.1.0

SKILL.md

Comprehensive statistical testing library for hypothesis testing, A/B testing, and data analysis.

Quick Start

from pywayne.statistics import NormalityTests, LocationTests
import numpy as np

# Test data normality
nt = NormalityTests()
data = np.random.normal(0, 1, 100)
result = nt.shapiro_wilk(data)
print(f"p-value: {result.p_value:.4f}, is_normal: {not result.reject_null}")

# Compare two groups
lt = LocationTests()
group_a = np.random.normal(100, 15, 50)
group_b = np.random.normal(105, 15, 50)
result = lt.two_sample_ttest(group_a, group_b)
print(f"Significant difference: {result.reject_null}")

Test Categories

NormalityTests

Test if data follows a normal distribution or other specified distributions.

| Method | Description | Use Case |
|---|---|---|
| shapiro_wilk | Shapiro-Wilk test | Small-medium samples (n ≤ 5000) |
| ks_test_normal | K-S normality test | Medium-large samples |
| ks_test_two_sample | Two-sample K-S test | Compare two sample distributions |
| anderson_darling | Anderson-Darling test | Tail-sensitive normality test |
| dagostino_pearson | D'Agostino-Pearson K² | Based on skewness and kurtosis |
| jarque_bera | Jarque-Bera test | Large samples, regression residuals |
| chi_square_goodness_of_fit | Chi-square goodness-of-fit | Categorical data |
| lilliefors_test | Lilliefors test | K-S test with unknown parameters |

Example:

from pywayne.statistics import NormalityTests
import numpy as np

nt = NormalityTests()
data = np.random.normal(0, 1, 100)  # illustrative sample
result = nt.shapiro_wilk(data)
if result.p_value < 0.05:
    print("Normality rejected at the 5% level")
else:
    print("No evidence against normality at the 5% level")

LocationTests

Compare means or medians across groups (parametric and non-parametric).

| Method | Description | Use Case |
|---|---|---|
| one_sample_ttest | One-sample t-test | Compare sample mean to a value |
| two_sample_ttest | Two-sample t-test | Compare two independent group means |
| paired_ttest | Paired t-test | Compare before/after measurements |
| one_way_anova | One-way ANOVA | Compare 3+ group means |
| mann_whitney_u | Mann-Whitney U test | Non-parametric two-sample test |
| wilcoxon_signed_rank | Wilcoxon signed-rank | Non-parametric paired test |
| kruskal_wallis | Kruskal-Wallis H test | Non-parametric multi-group test |

Example (A/B Testing):

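from pywayne.statistics import LocationTests, NormalityTests
import numpy as np

lt = LocationTests()
nt = NormalityTests()

# Illustrative data; replace with the observed metric for each group
control = np.random.normal(100, 15, 50)
treatment = np.random.normal(105, 15, 50)

# Check normality of both groups first
if nt.shapiro_wilk(control).p_value > 0.05 and nt.shapiro_wilk(treatment).p_value > 0.05:
    result = lt.two_sample_ttest(control, treatment)
else:
    result = lt.mann_whitney_u(control, treatment)

print(f"Effect significant: {result.reject_null}")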

CorrelationTests

Test correlation between variables and independence of categorical variables.

| Method | Description | Use Case |
|---|---|---|
| pearson_correlation | Pearson correlation | Linear relationship |
| spearman_correlation | Spearman's rank | Monotonic relationship |
| kendall_tau | Kendall's tau | Rank correlation, small samples |
| chi_square_independence | Chi-square independence | Categorical variables |
| fisher_exact_test | Fisher's exact test | 2×2 contingency table |
| mcnemar_test | McNemar's test | Paired categorical data |

Example:

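from pywayne.statistics import CorrelationTests
import numpy as np

# Illustrative data: y depends linearly on x plus noise
x = np.random.normal(0, 1, 100)
y = 2 * x + np.random.normal(0, 1, 100)

ct = CorrelationTests()
result = ct.pearson_correlation(x, y)
print(f"Correlation: {result.statistic:.3f}, p-value: {result.p_value:.4f}")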
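
To make the reported statistic concrete, here is a minimal standard-library sketch of the Pearson coefficient itself. It is not pywayne's implementation; the function name `pearson_r` is hypothetical, and the p-value computation is left out.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample Pearson correlation: covariance scaled by both standard deviations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 6))
```

A perfectly linear increasing pair gives a coefficient of 1 (up to floating-point rounding), and a perfectly decreasing pair gives -1.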

TimeSeriesTests

Test time series properties: stationarity, autocorrelation, cointegration.

| Method | Description | Use Case |
|---|---|---|
| adf_test | Augmented Dickey-Fuller | Unit root test for stationarity |
| kpss_test | KPSS test | Stationarity test (complements ADF) |
| ljung_box_test | Ljung-Box Q test | Overall autocorrelation |
| runs_test | Runs test | Randomness testing |
| arch_test | ARCH effect test | Heteroscedasticity |
| granger_causality | Granger causality | Causal relationship |
| engle_granger_cointegration | Engle-Granger cointegration | Long-term equilibrium |
| breusch_godfrey_test | Breusch-Godfrey | Higher-order autocorrelation |

Example:

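from pywayne.statistics import TimeSeriesTests
import numpy as np

tst = TimeSeriesTests()
time_series_data = np.cumsum(np.random.normal(0, 1, 200))  # illustrative random walk
adf_result = tst.adf_test(time_series_data)    # H0: series has a unit root
kpss_result = tst.kpss_test(time_series_data)  # H0: series is stationary

if adf_result.reject_null and not kpss_result.reject_null:
    print("Series is stationary")
elif not adf_result.reject_null and kpss_result.reject_null:
    print("Series has unit root (non-stationary)")
else:
    print("ADF and KPSS disagree; check for a trend or try differencing")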
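
Because ADF and KPSS have opposite null hypotheses (unit root versus stationarity), their results are usually read together. The sketch below encodes one conventional reading of the four outcomes. The function name `interpret_stationarity` is hypothetical and is not part of pywayne; it only takes the two `reject_null` flags.

```python
def interpret_stationarity(adf_reject: bool, kpss_reject: bool) -> str:
    """Combine ADF (H0: unit root) and KPSS (H0: stationary) decisions."""
    if adf_reject and not kpss_reject:
        # Both tests point to stationarity
        return "stationary"
    if not adf_reject and kpss_reject:
        # Both tests point to non-stationarity
        return "non-stationary"
    if not adf_reject and not kpss_reject:
        # KPSS suggests stationarity, ADF does not: often read as trend-stationary
        return "trend-stationary (consider detrending)"
    # ADF suggests stationarity, KPSS does not: often read as difference-stationary
    return "difference-stationary (consider differencing)"


print(interpret_stationarity(adf_reject=True, kpss_reject=False))
```

With the pywayne objects this would be called as `interpret_stationarity(adf_result.reject_null, kpss_result.reject_null)`.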

ModelDiagnostics

Regression model diagnostics: heteroscedasticity, autocorrelation, multicollinearity.

| Method | Description | Use Case |
|---|---|---|
| breusch_pagan_test | Breusch-Pagan | Heteroscedasticity test |
| white_test | White's test | General heteroscedasticity |
| goldfeld_quandt_test | Goldfeld-Quandt | Structural break heteroscedasticity |
| durbin_watson_test | Durbin-Watson | First-order autocorrelation |
| variance_inflation_factor | VIF | Multicollinearity diagnosis |
| levene_test | Levene's test | Homogeneity of variance |
| bartlett_test | Bartlett's test | Homogeneity (normal assumption) |
| residual_normality_test | Residual normality | Regression assumption check |

Example:

from pywayne.statistics import ModelDiagnostics

# Assumes y, X, and a fitted model exposing predict() are already defined
md = ModelDiagnostics()
residuals = y - model.predict(X)

# Check assumptions
bp_result = md.breusch_pagan_test(residuals, X)
dw_result = md.durbin_watson_test(residuals)

if bp_result.reject_null:
    print("Warning: Heteroscedasticity detected")
print(f"Durbin-Watson statistic: {dw_result.statistic:.2f}")
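
For readers unfamiliar with the Durbin-Watson statistic, the following standard-library sketch computes it directly from a residual sequence. It is an illustration of the textbook formula, not pywayne's code; the function name `durbin_watson` is hypothetical.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), which lies between 0 and 4."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("statistic is undefined when all residuals are zero")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numerator / denominator


alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips each step: negative autocorrelation
smooth = [1.0, 1.1, 1.2, 1.3]         # slowly drifting: positive autocorrelation

print(durbin_watson(alternating))  # 3.0
print(durbin_watson(smooth) < 0.1)  # True
```

Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.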

TestResult Object

All test methods return a unified TestResult object:

result = nt.shapiro_wilk(data)

# Access results
result.test_name        # Test method name
result.statistic        # Test statistic value
result.p_value          # P-value
result.reject_null      # True if null hypothesis is rejected
result.critical_value   # Critical value (if applicable)
result.confidence_interval # Tuple (lower, upper) if applicable
result.effect_size      # Effect size if applicable
result.additional_info  # Dict with additional information
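
The attribute list above can be pictured as a small record type. The dataclass below is a hypothetical stand-in (named `ResultSketch`) that mirrors the documented field names; pywayne's actual class definition, its field types, and whether unset optional fields are `None` are assumptions here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class ResultSketch:
    """Hypothetical record mirroring the documented TestResult attributes."""
    test_name: str
    statistic: float
    p_value: float
    reject_null: bool
    critical_value: Optional[float] = None                      # assumed None when not applicable
    confidence_interval: Optional[Tuple[float, float]] = None   # (lower, upper)
    effect_size: Optional[float] = None
    additional_info: Dict[str, Any] = field(default_factory=dict)


def summarize(result: ResultSketch) -> str:
    """Build a one-line summary, skipping the effect size when it is unset."""
    parts = [
        f"{result.test_name}: statistic={result.statistic:.3f}",
        f"p={result.p_value:.4f}",
        "reject H0" if result.reject_null else "fail to reject H0",
    ]
    if result.effect_size is not None:
        parts.append(f"effect size={result.effect_size:.3f}")
    return ", ".join(parts)


example = ResultSketch("shapiro_wilk", 0.987, 0.4321, reject_null=False)
print(summarize(example))
```

Because every test exposes the same attribute names, a helper like `summarize` can be written once and reused across test categories.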

Utility Functions

list_all_tests()

List all available test methods across all modules.

from pywayne.statistics import list_all_tests
print(list_all_tests())

show_test_usage(method_name)

Display usage and documentation for a specific test.

from pywayne.statistics import show_test_usage
show_test_usage('shapiro_wilk')

Method Selection Guide

Normality Tests

| Sample Size | Recommended Method |
|---|---|
| n < 30 | Shapiro-Wilk |
| 30 ≤ n ≤ 300 | Shapiro-Wilk, D'Agostino-Pearson |
| n > 300 | Jarque-Bera, Kolmogorov-Smirnov |
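
The sample-size thresholds above translate directly into a lookup. The helper below is a hypothetical sketch (it is not a pywayne function); it returns the method names from the NormalityTests table that correspond to each row.

```python
from typing import List


def recommend_normality_tests(n: int) -> List[str]:
    """Map a sample size to the method names suggested by the guide."""
    if n < 1:
        raise ValueError("sample size must be positive")
    if n < 30:
        return ["shapiro_wilk"]
    if n <= 300:
        return ["shapiro_wilk", "dagostino_pearson"]
    return ["jarque_bera", "ks_test_normal"]


for size in (20, 150, 1000):
    print(size, recommend_normality_tests(size))
```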

Location Tests

| Condition | Parametric | Non-parametric |
|---|---|---|
| Normal data | t-test, ANOVA | - |
| Non-normal data | - | Mann-Whitney U, Kruskal-Wallis |
| Paired data | Paired t-test | Wilcoxon signed-rank |
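
The same guide can be written as a small decision function. This is a hypothetical helper, not part of pywayne; it returns method names from the LocationTests table, and for paired designs `is_normal` is assumed to describe the paired differences.

```python
def choose_location_test(is_normal: bool, paired: bool = False, n_groups: int = 2) -> str:
    """Pick a LocationTests method name from normality, pairing, and group count."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if paired:
        if n_groups != 2:
            raise ValueError("paired tests in this guide cover exactly two groups")
        return "paired_ttest" if is_normal else "wilcoxon_signed_rank"
    if n_groups == 2:
        return "two_sample_ttest" if is_normal else "mann_whitney_u"
    # Three or more independent groups
    return "one_way_anova" if is_normal else "kruskal_wallis"


print(choose_location_test(is_normal=False, n_groups=3))  # kruskal_wallis
```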

Multiple Testing Correction

When performing multiple tests, apply p-value correction:

from statsmodels.stats.multitest import multipletests

# results: list of TestResult objects collected from earlier tests
p_values = [r.p_value for r in results]
rejected, p_corrected, _, _ = multipletests(
    p_values, alpha=0.05, method='fdr_bh'
)
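
To show what the `fdr_bh` option is doing, here is a standard-library sketch of the Benjamini-Hochberg step-up adjustment. It is written for illustration only and is not the statsmodels source; the function name `benjamini_hochberg` is hypothetical.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(
    p_values: Sequence[float], alpha: float = 0.05
) -> Tuple[List[bool], List[float]]:
    """Return (rejected flags, adjusted p-values) in the original input order."""
    m = len(p_values)
    if m == 0:
        return [], []
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [p <= alpha for p in adjusted]
    return rejected, adjusted


flags, adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(flags)                       # [True, False, False, False]
print([round(p, 4) for p in adj])  # [0.04, 0.0533, 0.0533, 0.2]
```

In this example three raw p-values fall below 0.05, but only one survives the correction, which is the reason to correct when several tests are run on the same data.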

Common Applications

Data Quality Check

import numpy as np
from pywayne.statistics import NormalityTests

def data_quality_check(data):
    # Lists are accepted as input, so convert before boolean masking
    data = np.asarray(data, dtype=float)
    nt = NormalityTests()

    normality = nt.shapiro_wilk(data)

    # Outlier detection (IQR)
    Q1, Q3 = np.percentile(data, [25, 75])
    IQR = Q3 - Q1
    outliers = data[(data < Q1 - 1.5*IQR) | (data > Q3 + 1.5*IQR)]

    return {
        'size': len(data),
        'is_normal': not normality.reject_null,
        'p_value': normality.p_value,
        'outliers': len(outliers)
    }

A/B Testing Workflow

from pywayne.statistics import LocationTests, NormalityTests

def ab_test_analysis(control, treatment):
    nt = NormalityTests()
    lt = LocationTests()

    # Check normality on the first 100 observations of each group
    norm_c = nt.shapiro_wilk(control[:100])
    norm_t = nt.shapiro_wilk(treatment[:100])

    # Choose appropriate test
    if norm_c.p_value > 0.05 and norm_t.p_value > 0.05:
        result = lt.two_sample_ttest(control, treatment)
    else:
        result = lt.mann_whitney_u(control, treatment)

    return {
        'test_used': result.test_name,
        'p_value': result.p_value,
        'significant': result.reject_null,
        'effect_size': result.effect_size
    }

Regression Model Diagnostics

from pywayne.statistics import ModelDiagnostics

def diagnose_model(y, X, model):
    # model is any fitted estimator exposing predict(X)
    md = ModelDiagnostics()
    residuals = y - model.predict(X)

    return {
        'heteroscedasticity_bp': md.breusch_pagan_test(residuals, X).reject_null,
        'autocorrelation_dw': md.durbin_watson_test(residuals).statistic,
        'residual_normality_p': md.residual_normality_test(residuals).p_value,
        'vif_max': max(md.variance_inflation_factor(X))
    }

Notes

  • All methods accept np.ndarray or list as input
  • All methods return TestResult with consistent interface
  • Always validate test assumptions before applying parametric tests
  • Apply multiple testing correction when performing several tests
  • Report effect sizes alongside p-values for complete interpretation
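
As an illustration of the last note, the sketch below computes Cohen's d with a pooled standard deviation using only the standard library. Whether pywayne's `effect_size` attribute uses this measure is not stated in this document, so treat the function name `cohens_d` and the choice of measure as assumptions.

```python
import math
import statistics
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Standardized mean difference (b minus a) using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    if pooled_sd == 0:
        raise ValueError("effect size is undefined when both groups are constant")
    return (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd


print(cohens_d([1, 2, 3], [2, 3, 4]))  # 1.0
```

A p-value only says whether a difference is distinguishable from noise; a standardized difference like this one says how large it is relative to the spread of the data.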
