# Stats Engine
Statistical metrics and confidence interval computations for Monte Carlo simulation results.
## Quick Start
```python
from mcframework.stats_engine import DEFAULT_ENGINE, StatsContext
import numpy as np

# Your simulation results
data = np.random.normal(100, 15, size=10_000)

# Configure and compute
ctx = StatsContext(n=len(data), confidence=0.95)
result = DEFAULT_ENGINE.compute(data, ctx)

print(f"Mean: {result.metrics['mean']:.2f}")
print(f"95% CI: [{result.metrics['ci_mean']['low']:.2f}, {result.metrics['ci_mean']['high']:.2f}]")
```
## Configuration

### StatsContext
The `StatsContext` dataclass configures all metric computations:

```python
ctx = StatsContext(
    n=10_000,                # Sample size (required)
    confidence=0.95,         # CI confidence level
    ci_method="auto",        # "auto", "z", "t", or "bootstrap"
    percentiles=(5, 50, 95),
    nan_policy="propagate",  # or "omit"
)
```
Shared, explicit configuration for statistic and CI computations.
See StatsContext for full attribute documentation with examples.
Quick Reference:
| Field | Default | Description |
|---|---|---|
| `n` | (required) | Declared sample size |
| `confidence` | `0.95` | Confidence level ∈ (0, 1) |
| `ci_method` | `"auto"` | CI method: `"auto"`, `"z"`, `"t"`, `"bootstrap"` |
| `percentiles` | `(5, 25, 50, 75, 95)` | Quantiles to compute |
| `nan_policy` | `"propagate"` | `"propagate"` or `"omit"` non-finite values |
| `ddof` | `1` | Degrees of freedom for std |
| `target` | `None` | Target value for bias/MSE metrics |
| `eps` | `None` | Error tolerance for Chebyshev/Markov |
|  | `10_000` | Bootstrap resamples |
| `bootstrap` | `"percentile"` | Bootstrap method: `"percentile"` or `"bca"` |
| `rng` | `None` | Seed or random generator |
Helper Properties:
```python
ctx.alpha                            # Tail probability: 1 - confidence
ctx.q_bound()                        # Percentile bounds: (2.5, 97.5) for 95% CI
ctx.eff_n(len(x))                    # Effective sample size
ctx.with_overrides(confidence=0.99)  # Create modified copy
```
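The arithmetic behind `alpha` and `q_bound` is simple enough to verify by hand. A standalone sketch in plain Python (illustrative function names, not mcframework's code), assuming `q_bound` returns symmetric two-sided percentile bounds:

```python
def alpha(confidence: float) -> float:
    # Total tail probability outside the interval
    return 1.0 - confidence

def q_bound(confidence: float) -> tuple[float, float]:
    # Symmetric lower/upper percentile bounds for a two-sided CI
    a = alpha(confidence)
    return (100 * a / 2, 100 * (1 - a / 2))

# q_bound(0.95) is approximately (2.5, 97.5)
```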
## Metrics

### Descriptive Statistics
| Function | Description |
|---|---|
| `mean` | Sample mean \(\bar X = \frac{1}{n}\sum_{i=1}^n x_i\) |
| `std` | Sample standard deviation with Bessel correction |
| `percentiles` | Empirical percentiles evaluated on the cleaned sample |
| `skew` | Unbiased sample skewness (Fisher–Pearson standardized third central moment) |
| `kurtosis` | Unbiased sample excess kurtosis (Fisher definition) |
Usage:
```python
from mcframework.stats_engine import mean, std, percentiles

m = mean(data, ctx)            # Sample mean
s = std(data, ctx)             # Sample std (ddof=1)
pcts = percentiles(data, ctx)  # {5: ..., 25: ..., 50: ..., 75: ..., 95: ...}
```
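For reference, the unbiased estimators behind `skew` and `kurtosis` can be written out directly. A standalone numpy sketch of the standard adjusted Fisher–Pearson \(G_1\) and Fisher \(G_2\) formulas (these are the textbook definitions, assumed here rather than taken from mcframework's source):

```python
import numpy as np

def sample_skewness(x):
    # Adjusted Fisher-Pearson standardized third moment:
    # G1 = sqrt(n(n-1)) / (n-2) * m3 / m2^(3/2)
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    m3 = np.mean(d ** 3)
    g1 = m3 / m2 ** 1.5
    return np.sqrt(n * (n - 1)) / (n - 2) * g1

def sample_excess_kurtosis(x):
    # Unbiased excess kurtosis, Fisher definition:
    # G2 = (n-1) / ((n-2)(n-3)) * ((n+1) * g2 + 6), with g2 = m4/m2^2 - 3
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    m4 = np.mean(d ** 4)
    g2 = m4 / m2 ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
```

On a large normal sample both estimators should be close to zero; a right-skewed sample (e.g. exponential draws) yields a clearly positive skewness.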
### Confidence Intervals
Three methods for computing CIs on the mean:
| Method | Function | Best For |
|---|---|---|
| Parametric | `ci_mean` | Large samples, normal-ish data |
| Bootstrap | `ci_mean_bootstrap` | Non-normal distributions |
| Chebyshev | `ci_mean_chebyshev` | Distribution-free guarantees |
| Function | Description |
|---|---|
| `ci_mean` | Parametric CI for \(\mathbb{E}[X]\) using z/t critical values |
| `ci_mean_bootstrap` | Bootstrap confidence interval for \(\mathbb{E}[X]\) via resampling |
| `ci_mean_chebyshev` | Distribution-free CI for \(\mathbb{E}[X]\) via Chebyshev's inequality |
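The Chebyshev method earns its "distribution-free" label from Chebyshev's inequality alone: \(P(|\bar X - \mu| \ge k\,\mathrm{se}) \le 1/k^2\), so setting \(1/k^2 = \alpha\) gives an interval valid for any finite-variance distribution. A standalone sketch of that construction (numpy only, using the sample std as a plug-in for \(\sigma\); this is an assumption, not mcframework's exact recipe):

```python
import numpy as np

def chebyshev_ci(x, confidence=0.95):
    # Chebyshev: P(|Xbar - mu| >= k * se) <= 1/k^2.  Setting 1/k^2 = alpha
    # gives k = 1/sqrt(alpha); the interval holds for ANY distribution with
    # finite variance, at the cost of being much wider than a z interval.
    x = np.asarray(x, dtype=float)
    n = x.size
    alpha = 1.0 - confidence
    se = np.std(x, ddof=1) / np.sqrt(n)
    k = 1.0 / np.sqrt(alpha)
    m = x.mean()
    return m - k * se, m + k * se
```

At 95% confidence, \(k = 1/\sqrt{0.05} \approx 4.47\), versus 1.96 for the z interval, which is the price of dropping all distributional assumptions.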
Parametric CI (z or t):
```python
from mcframework.stats_engine import ci_mean

result = ci_mean(data, ctx)
# {'confidence': 0.95, 'method': 'z', 'low': 99.5, 'high': 100.5, 'se': 0.15, 'crit': 1.96}
```
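The z branch is ordinary normal-theory arithmetic: mean ± critical value × standard error. A standalone sketch of the same computation using only the stdlib and numpy (mirroring the result keys above, but not mcframework's implementation):

```python
import numpy as np
from statistics import NormalDist

def z_ci(x, confidence=0.95):
    # Large-sample normal-theory CI: mean +/- z_{1-alpha/2} * s / sqrt(n)
    x = np.asarray(x, dtype=float)
    n = x.size
    se = np.std(x, ddof=1) / np.sqrt(n)
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # ~1.96 for 95%
    m = x.mean()
    return {"low": m - crit * se, "high": m + crit * se, "se": se, "crit": crit}
```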
Bootstrap CI:
```python
from mcframework.stats_engine import StatsContext, ci_mean_bootstrap

ctx = StatsContext(n=len(data), bootstrap="bca", rng=42)
result = ci_mean_bootstrap(data, ctx)
# {'confidence': 0.95, 'method': 'bootstrap-bca', 'low': 99.4, 'high': 100.6}
```
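Percentile bootstrapping itself is straightforward: resample with replacement, recompute the mean each time, and take symmetric percentiles of the resampled means. A minimal standalone sketch (numpy only; `n_resamples` and `seed` are illustrative names, not mcframework parameters):

```python
import numpy as np

def percentile_bootstrap_ci(x, confidence=0.95, n_resamples=2_000, seed=0):
    # Draw n_resamples bootstrap samples of the same size as x,
    # compute each sample's mean, then take percentile bounds.
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, x.size, size=(n_resamples, x.size))
    means = x[idx].mean(axis=1)
    alpha = 1.0 - confidence
    low, high = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(low), float(high)
```

The BCa variant additionally corrects these percentiles for bias and skewness of the bootstrap distribution, which is why it is preferred for asymmetric data.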
### Target-Based Metrics
Metrics that compare results to a known target value:
| Function | Description |
|---|---|
| `bias_to_target` | Bias of the sample mean relative to a target \(\theta\) |
| `mse_to_target` | Mean squared error of \(\bar X\) relative to a target \(\theta\) |
| `markov_error_prob` | Markov bound on error probability for target \(\theta\) |
| `chebyshev_required_n` | Required \(n\) to achieve Chebyshev CI half-width \(\le \varepsilon\) |
Usage (requires target and/or eps in context):
```python
from mcframework.stats_engine import (
    StatsContext, bias_to_target, mse_to_target,
    markov_error_prob, chebyshev_required_n,
)

ctx = StatsContext(n=len(data), target=100.0, eps=0.5)

bias = bias_to_target(data, ctx)         # Mean - target
mse = mse_to_target(data, ctx)           # Mean squared error
prob = markov_error_prob(data, ctx)      # P(|mean - target| >= eps)
req_n = chebyshev_required_n(data, ctx)  # Required n for precision
```
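The required-n computation follows directly from Chebyshev's inequality: with \(\mathrm{Var}(\bar X) = \sigma^2/n\), requiring \(\sigma^2/(n\varepsilon^2) \le \alpha\) gives \(n \ge \sigma^2/(\alpha\varepsilon^2)\). A standalone sketch using the sample variance as the plug-in estimate (an assumption; mcframework's exact estimator may differ):

```python
import math
import numpy as np

def chebyshev_required_n(x, eps, confidence=0.95):
    # Chebyshev: P(|Xbar - mu| >= eps) <= var / (n * eps^2).
    # Forcing that bound down to alpha = 1 - confidence gives
    # n >= var / (alpha * eps^2).
    var = float(np.var(x, ddof=1))
    alpha = 1.0 - confidence
    return math.ceil(var / (alpha * eps ** 2))
```

For data with variance around 225 (std 15), a 95% guarantee of half-width 0.5 needs roughly 225 / (0.05 × 0.25) = 18,000 samples, which shows how conservative the distribution-free bound is.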
## Engine
The `StatsEngine` orchestrates multiple metrics at once:

```python
from mcframework.stats_engine import StatsEngine, FnMetric, mean, std, ci_mean

engine = StatsEngine([
    FnMetric("mean", mean),
    FnMetric("std", std),
    FnMetric("ci_mean", ci_mean),
])

result = engine.compute(data, ctx)
print(result.metrics)  # {'mean': 100.1, 'std': 15.2, 'ci_mean': {...}}
print(result.skipped)  # Metrics skipped due to missing context
print(result.errors)   # Metrics that raised errors
```
| Class | Description |
|---|---|
| `StatsEngine` | Orchestrator that evaluates a set of metrics over an input array |
| `FnMetric` | Lightweight adapter that binds a human-readable name to a metric function |
|  | Result container returned by `StatsEngine.compute`, exposing `metrics`, `skipped`, and `errors` |
Default Engine:
A pre-configured engine with all standard metrics:
```python
from mcframework.stats_engine import DEFAULT_ENGINE

result = DEFAULT_ENGINE.compute(data, ctx)
```
Includes: mean, std, percentiles, skew, kurtosis, ci_mean, ci_mean_bootstrap, ci_mean_chebyshev, chebyshev_required_n, markov_error_prob, bias_to_target, mse_to_target.
## Enumerations
| Enum | Description |
|---|---|
| `CIMethod` | Parametric strategies for selecting confidence-interval critical values |
| `NanPolicy` | Strategies for handling the propagation of non-finite values |
| `BootstrapMethod` | Supported bootstrap confidence-interval flavors |
```python
from mcframework.stats_engine import CIMethod, NanPolicy, BootstrapMethod

ctx = StatsContext(
    n=100,
    ci_method=CIMethod.auto,       # or "auto"
    nan_policy=NanPolicy.omit,     # or "omit"
    bootstrap=BootstrapMethod.bca, # or "bca"
)
```
## Exceptions
| Exception | Description |
|---|---|
| `MissingContextError` | Raised when a required context field is missing |
|  | Raised when insufficient data is available for computation |
```python
from mcframework.stats_engine import MissingContextError, StatsContext, bias_to_target

try:
    bias_to_target(data, StatsContext(n=100))  # Missing 'target'
except MissingContextError as e:
    print(f"Missing field: {e}")
```
## Custom Metrics
Create custom metrics with `FnMetric`:

```python
import numpy as np
from mcframework.stats_engine import FnMetric, StatsEngine, StatsContext

def coefficient_of_variation(x, ctx):
    """CV = std / mean"""
    m, s = float(np.mean(x)), float(np.std(x, ddof=1))
    return s / m if m != 0 else float('nan')

def interquartile_range(x, ctx):
    """IQR = Q3 - Q1"""
    q1, q3 = np.percentile(x, [25, 75])
    return float(q3 - q1)

# Build custom engine
engine = StatsEngine([
    FnMetric("cv", coefficient_of_variation, doc="Coefficient of variation"),
    FnMetric("iqr", interquartile_range, doc="Interquartile range"),
])

result = engine.compute(data, StatsContext(n=len(data)))
```
## See Also
Core Module — Simulation classes that use this engine
Utilities Module — Critical value utilities (z_crit, t_crit)