Utilities Module
================

.. currentmodule:: mcframework.utils

The utilities module provides critical value functions for constructing confidence
intervals. These are the building blocks used by the :doc:`stats_engine`.

Overview
--------

When constructing a two-sided confidence interval for the mean:

.. math::

   \bar{X} \pm c \cdot \frac{s}{\sqrt{n}}

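the critical value :math:`c` scales the standard error :math:`s / \sqrt{n}` (sample standard deviation :math:`s`, sample size :math:`n`) and so determines the interval width. This module provides: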

- :func:`z_crit` — Normal distribution critical values
- :func:`t_crit` — Student's t-distribution critical values  
- :func:`autocrit` — Automatic selection based on sample size
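
As a minimal sketch of the formula above, the interval can be computed by hand with the standard library. The helper name ``mean_ci`` and the sample data are hypothetical and not part of ``mcframework``; the critical value is passed in explicitly, so any of the functions listed above could supply it.

.. code-block:: python

   import math

   def mean_ci(data, crit):
       """Return (low, high) for mean +/- crit * s / sqrt(n). Hypothetical helper."""
       n = len(data)
       mean = sum(data) / n
       # Sample variance with the n - 1 denominator, matching s in the formula.
       variance = sum((x - mean) ** 2 for x in data) / (n - 1)
       half_width = crit * math.sqrt(variance) / math.sqrt(n)
       return mean - half_width, mean + half_width

   # Illustrative data with n = 10, so df = 9 and the 95% t value is 2.262.
   data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 5.1, 4.7, 5.2]
   low, high = mean_ci(data, crit=2.262)
   print(f"95% CI: [{low:.3f}, {high:.3f}]")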


Critical Values
---------------

z Critical Value (Normal)
~~~~~~~~~~~~~~~~~~~~~~~~~

For large samples (:math:`n \ge 30`), use the normal approximation:

.. math::

   z_{\alpha/2} = \Phi^{-1}\left(1 - \frac{\alpha}{2}\right)

where :math:`\Phi^{-1}` is the inverse standard normal CDF and :math:`\alpha = 1 - \text{confidence}`.

.. code-block:: python

   from mcframework.utils import z_crit

   z_crit(0.95)   # 1.96 (95% CI)
   z_crit(0.99)   # 2.576 (99% CI)
   z_crit(0.90)   # 1.645 (90% CI)

**Common Values:**

.. list-table::
   :header-rows: 1
   :widths: 30 30 40

   * - Confidence
     - α
     - z-critical
   * - 90%
     - 0.10
     - 1.645
   * - 95%
     - 0.05
     - 1.960
   * - 99%
     - 0.01
     - 2.576
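
The tabulated values can be cross-checked without ``mcframework``. The sketch below evaluates :math:`\Phi^{-1}(1 - \alpha/2)` with Python's ``statistics.NormalDist``; the function name ``z_from_confidence`` is hypothetical, and this is not necessarily how :func:`z_crit` is implemented.

.. code-block:: python

   from statistics import NormalDist

   def z_from_confidence(confidence):
       """Two-sided normal critical value for a confidence level in (0, 1)."""
       alpha = 1.0 - confidence
       return NormalDist().inv_cdf(1.0 - alpha / 2.0)

   # Print the critical value for each confidence level in the table.
   for level in (0.90, 0.95, 0.99):
       print(f"{level:.0%}: {z_from_confidence(level):.3f}")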


t Critical Value (Student's t)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For small samples, where the population variance is unknown and must be estimated from the data, use the t-distribution:

.. math::

   t_{\alpha/2, \text{df}} = T_{\text{df}}^{-1}\left(1 - \frac{\alpha}{2}\right)

where :math:`T_{\text{df}}^{-1}` is the inverse CDF of Student's t-distribution with :math:`\text{df} = n - 1` degrees of freedom.

.. code-block:: python

   from mcframework.utils import t_crit

   t_crit(0.95, df=9)    # 2.262 (n=10)
   t_crit(0.95, df=29)   # 2.045 (n=30)
   t_crit(0.95, df=99)   # 1.984 (n=100, approaches z)

At the same confidence level, the t critical value is always larger than z for finite df, yielding wider (more conservative) intervals.
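
To see the convergence toward z without any third-party dependency, the sketch below inverts the t CDF numerically, using Simpson's rule for the integral and bisection for the inverse. The names ``t_cdf`` and ``t_quantile`` are hypothetical and are for illustration only; this is not a description of how :func:`t_crit` works, and a statistics library would normally be the better choice in real code.

.. code-block:: python

   import math
   from statistics import NormalDist

   def t_cdf(x, df, steps=2000):
       """CDF of Student's t at x >= 0, via Simpson's rule on [0, x]."""
       log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                   - 0.5 * math.log(df * math.pi))

       def pdf(u):
           return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

       h = x / steps
       total = pdf(0.0) + pdf(x)
       for i in range(1, steps):
           total += (4 if i % 2 else 2) * pdf(i * h)
       # The density is symmetric about 0, so CDF(x) = 0.5 + integral from 0 to x.
       return 0.5 + total * h / 3

   def t_quantile(p, df, upper=100.0, iterations=40):
       """Invert t_cdf by bisection; assumes 0.5 < p < 1 and a quantile below upper."""
       lo, hi = 0.0, upper
       for _ in range(iterations):
           mid = (lo + hi) / 2
           if t_cdf(mid, df) < p:
               lo = mid
           else:
               hi = mid
       return (lo + hi) / 2

   z = NormalDist().inv_cdf(0.975)
   for df in (9, 29, 99):
       t = t_quantile(0.975, df)
       print(f"df={df:>2}: t={t:.3f}, z={z:.3f}, excess over z={t / z - 1:.1%}")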


Automatic Selection
~~~~~~~~~~~~~~~~~~~

The :func:`autocrit` function chooses between z and t based on sample size:

.. code-block:: python

   from mcframework.utils import autocrit

   # Small sample → use t
   crit, method = autocrit(0.95, n=15)
   print(f"{method}: {crit:.3f}")  # t: 2.145

   # Large sample → use z
   crit, method = autocrit(0.95, n=100)
   print(f"{method}: {crit:.3f}")  # z: 1.960

   # Force specific method
   crit, method = autocrit(0.95, n=100, method="t")
   print(f"{method}: {crit:.3f}")  # t: 1.984

**Selection Rules:**

- ``method="auto"`` (default): Use t if :math:`n < 30`, otherwise z
- ``method="z"``: Always use normal critical value
- ``method="t"``: Always use t with :math:`\text{df} = \max(1, n-1)`
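
The rules above can be sketched as a small pure function. The name ``choose_critical_method``, the ``threshold`` parameter, and the ``ValueError`` on an unrecognized method are assumptions made for this illustration; :func:`autocrit` itself may differ in signature and error handling.

.. code-block:: python

   def choose_critical_method(n, method="auto", threshold=30):
       """Return (name, df) following the selection rules; df is None for z."""
       if method not in ("auto", "z", "t"):
           raise ValueError(f"unknown method: {method!r}")
       if method == "auto":
           # Small samples fall back to t, larger ones use z.
           method = "t" if n < threshold else "z"
       if method == "t":
           # Clamp df to at least 1 so n = 1 does not produce df = 0.
           return "t", max(1, n - 1)
       return "z", None

   print(choose_critical_method(15))               # ('t', 14)
   print(choose_critical_method(100))              # ('z', None)
   print(choose_critical_method(100, method="t"))  # ('t', 99)
   print(choose_critical_method(1, method="t"))    # ('t', 1)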


Usage with Stats Engine
-----------------------

The stats engine uses these utilities internally:

.. code-block:: python

   from mcframework.stats_engine import ci_mean

   # data is assumed to be a 1-D array of simulation outputs defined earlier.
   # ci_method controls which critical value is used
   ci_mean(data, {"n": 25, "ci_method": "auto"})   # Uses t (n < 30)
   ci_mean(data, {"n": 25, "ci_method": "z"})      # Forces z
   ci_mean(data, {"n": 100, "ci_method": "auto"})  # Uses z (n ≥ 30)


Mathematical Background
-----------------------

**Why the n < 30 threshold?**

The threshold is a common rule of thumb based on how quickly the t-distribution converges to the normal:

- At df=29, the 97.5th percentile differs from z by only ~4%
- At df=9 (n=10), the difference is already about 15%, and it grows as df shrinks
- The t-distribution accounts for additional uncertainty in estimating variance
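
Using the rounded 95% values quoted earlier on this page (2.045 for df=29, 2.262 for df=9, and 1.960 for z), the relative gaps work out as:

.. math::

   \frac{t_{0.025,\,29}}{z_{0.025}} - 1 = \frac{2.045}{1.960} - 1 \approx 0.043,
   \qquad
   \frac{t_{0.025,\,9}}{z_{0.025}} - 1 = \frac{2.262}{1.960} - 1 \approx 0.154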

**Coverage Probability:**

A confidence interval has "coverage" :math:`1 - \alpha` if, with :math:`\text{SE} = s / \sqrt{n}`:

.. math::

   \Pr\left(\mu \in \left[\bar{X} - c \cdot \text{SE}, \bar{X} + c \cdot \text{SE}\right]\right) = 1 - \alpha

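Using t critical values with small samples gives the nominal coverage when the data are normally distributed and the population variance is estimated from the sample; substituting z in that setting produces intervals that are too narrow.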
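
Coverage can also be checked empirically. The following sketch, which assumes normally distributed data and uses only the standard library, simulates many samples of size 10 and counts how often the interval contains the true mean. The function name ``coverage`` and its parameters are hypothetical; the critical values 2.262 and 1.960 are the 95% values for df=9 and for z quoted earlier on this page.

.. code-block:: python

   import math
   import random

   def coverage(crit, n=10, trials=20_000, mu=0.0, sigma=1.0, seed=0):
       """Fraction of simulated intervals mean +/- crit * s / sqrt(n) containing mu."""
       rng = random.Random(seed)
       hits = 0
       for _ in range(trials):
           sample = [rng.gauss(mu, sigma) for _ in range(n)]
           mean = sum(sample) / n
           variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
           half_width = crit * math.sqrt(variance / n)
           hits += (mean - half_width <= mu <= mean + half_width)
       return hits / trials

   # The same seed is used for both calls, so both see identical samples.
   cover_t = coverage(crit=2.262)  # t critical value, df = 9
   cover_z = coverage(crit=1.960)  # z critical value
   print(f"t: {cover_t:.3f}  z: {cover_z:.3f}")

With these settings the t-based intervals should land close to the nominal 95%, while the z-based intervals should fall noticeably short.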


Module Reference
----------------

.. automodule:: mcframework.utils
   :no-members:
   :no-undoc-members:
   :no-inherited-members:

Functions
~~~~~~~~~

.. autosummary::
   :toctree: generated
   :recursive:
   :nosignatures:

   z_crit
   t_crit
   autocrit
   _validate_confidence


See Also
--------

- :doc:`stats_engine` — Uses these utilities for confidence intervals
- :func:`~mcframework.stats_engine.ci_mean` — Parametric CI for the mean
- :func:`~mcframework.stats_engine.ci_mean_chebyshev` — Distribution-free alternative