mcframework.sims.PiEstimationSimulation#

class mcframework.sims.PiEstimationSimulation[source]#

Bases: MonteCarloSimulation

Estimate \(\pi\) by geometric probability on the unit disk.

The simulation throws \(n\) i.i.d. points \((X_i, Y_i)\) uniformly on \([-1, 1]^2\) and uses the identity

\[\pi = 4 \,\Pr\!\left(X^2 + Y^2 \le 1\right),\]

to form the Monte Carlo estimator

\[\widehat{\pi}_n = \frac{4}{n} \sum_{i=1}^n \mathbf{1}\{X_i^2 + Y_i^2 \le 1\}.\]
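The estimator above translates directly into NumPy; this standalone sketch (independent of the framework) is illustrative only:

```python
import numpy as np

def estimate_pi(n: int, seed: int = 42) -> float:
    """Monte Carlo estimate of pi via the disk-in-square indicator."""
    rng = np.random.default_rng(seed)
    # Draw n i.i.d. points uniformly on [-1, 1]^2.
    xy = rng.uniform(-1.0, 1.0, size=(n, 2))
    # Indicator of landing inside the unit disk; its mean estimates pi/4.
    inside = (xy ** 2).sum(axis=1) <= 1.0
    return 4.0 * inside.mean()
```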
Attributes:
name : str

Human-readable label registered with MonteCarloFramework.

supports_batch : bool

Whether this simulation supports Torch batch execution (True).

Notes

This simulation supports both scalar (NumPy) and vectorized (Torch) execution; see single_simulation() and torch_batch().

Examples

>>> sim = PiEstimationSimulation()
>>> sim.set_seed(42)
>>> result = sim.run(100_000, backend="torch")  # GPU-ready

Methods

cupy_batch

Optional vectorized cuRAND implementation using CuPy.

run

Run the Monte Carlo simulation.

set_seed

Set the random seed for reproducible experiments.

single_simulation

Throw \(n_{\text{points}}\) darts at \([-1, 1]^2\) and return the single-run estimator \(\widehat{\pi}\).

torch_batch

Vectorized Torch implementation for GPU-accelerated Pi estimation.

cupy_batch(n: int, *, device: torch.device, rng: cupy.random.RandomState) → cupy.ndarray#

Optional vectorized cuRAND implementation using CuPy.

Override this method in subclasses to enable GPU-accelerated batch execution. When implemented alongside supports_batch = True, the framework will use this method instead of repeated single_simulation calls.

Parameters:
n : int

Number of simulation draws.

device : torch.device

Device to use for the simulation ("cuda").

rng : cupy.random.RandomState

cuRAND generator for reproducible random sampling.

Returns:
cupy.ndarray

A 1D array of length n containing simulation results.

run(n_simulations: int, *, backend: str = 'auto', torch_device: str = 'cpu', cuda_device_id: int = 0, cuda_use_curand: bool = False, cuda_batch_size: int | None = None, cuda_use_streams: bool = True, parallel: bool | None = None, n_workers: int | None = None, progress_callback: Callable[[int, int], None] | None = None, percentiles: Iterable[int] | None = None, compute_stats: bool = True, stats_engine: StatsEngine | None = None, confidence: float = 0.95, ci_method: str = 'auto', extra_context: Mapping[str, Any] | None = None, **simulation_kwargs: Any) → SimulationResult#

Run the Monte Carlo simulation.

Parameters:
n_simulations : int

Number of simulation draws.

backend : {“auto”, “sequential”, “thread”, “process”, “torch”}, default "auto"

Execution backend to use:

  • "auto" — Sequential for small jobs, parallel (thread/process) for large jobs

  • "sequential" — Single-threaded execution

  • "thread" — Thread-based parallelism (best when NumPy releases GIL)

  • "process" — Process-based parallelism (required on Windows for true parallelism)

  • "torch" — Torch batch execution (requires supports_batch = True)

torch_device : {“cpu”, “mps”, “cuda”}, default "cpu"

Torch device for backend="torch". Ignored for other backends.

  • "cpu" — Safe default, works everywhere

  • "mps" — Apple Metal Performance Shaders (M1/M2/M3 Macs)

  • "cuda" — NVIDIA GPU acceleration

cuda_device_id : int, default 0

CUDA device index for multi-GPU systems. Only used when backend="torch" and torch_device="cuda".

cuda_use_curand : bool, default False

Use cuRAND (via CuPy) instead of torch.Generator for maximum GPU performance. Requires CuPy and a cupy_batch() implementation.

cuda_batch_size : int or None, default None

Fixed batch size for CUDA execution. If None, automatically estimates optimal batch size based on available GPU memory.

cuda_use_streams : bool, default True

Use CUDA streams for overlapped execution. Recommended for performance.

parallel : bool, optional

Deprecated. Use backend instead. If provided, parallel=True maps to backend="auto" with parallel preference, parallel=False maps to backend="sequential".

n_workers : int, optional

Worker count for parallel backends. Defaults to CPU count.

progress_callback : callable, optional

A function f(completed: int, total: int) called periodically.

percentiles : iterable of int, optional

Percentiles to compute from raw results. If None and compute_stats=True, the stats engine’s defaults (_PCTS) are used; if compute_stats=False, no percentiles are computed unless explicitly provided.

compute_stats : bool, default True

Compute additional metrics via a StatsEngine.

stats_engine : StatsEngine, optional

Custom engine (defaults to mcframework.stats_engine.DEFAULT_ENGINE).

confidence : float, default 0.95

Confidence level for CI-related metrics.

ci_method : {“auto”, “z”, “t”}, default "auto"

Which critical values the stats engine should use.

extra_context : mapping, optional

Extra context forwarded to the stats engine.

**simulation_kwargs : Any

Keyword arguments forwarded to single_simulation().

Returns:
SimulationResult

See SimulationResult.

See also

run_simulation()

Run a registered simulation by name.

Notes

MPS determinism caveat. When using torch_device="mps", the framework preserves RNG stream structure but does not guarantee bitwise reproducibility due to Metal backend scheduling and float32 arithmetic. Statistical properties (mean, variance, CI coverage) remain correct.

set_seed(seed: int | None) → None#

Set the random seed for reproducible experiments.

Parameters:
seed : int or None

Seed for numpy.random.SeedSequence. None chooses entropy from the OS.

Notes

The framework spawns independent child sequences per worker/chunk via numpy.random.SeedSequence.spawn(), ensuring deterministic parallel streams given the same seed and block layout.
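The spawning scheme can be demonstrated with plain NumPy; this is a sketch of the mechanism, not the framework's internal code:

```python
import numpy as np

def worker_streams(seed, n_workers: int) -> list:
    """Spawn one independent, reproducible RNG stream per worker."""
    root = np.random.SeedSequence(seed)  # seed=None pulls entropy from the OS
    # Children are statistically independent and deterministic given the root.
    return [np.random.default_rng(child) for child in root.spawn(n_workers)]

# Same seed and block layout -> identical per-worker streams across runs.
a = worker_streams(42, 4)
b = worker_streams(42, 4)
```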

single_simulation(n_points: int = 10000, antithetic: bool = False, _rng: Generator | None = None, **kwargs) → float[source]#

Throw \(n_{\text{points}}\) darts at \([-1, 1]^2\) and return the single-run estimator \(\widehat{\pi}\).

Parameters:
n_points : int, default 10_000

Number of uniformly distributed points to simulate. The Monte Carlo variance decays as \(\mathcal{O}(n_{\text{points}}^{-1})\).

antithetic : bool, default False

Whether to pair each point \((x, y)\) with its reflection \((-x, -y)\) to achieve first-order variance cancellation.

**kwargs : Any

Ignored. Reserved for framework compatibility.

Returns:
float

Estimate of \(\pi\) computed via \(\widehat{\pi} = 4 \,\widehat{p}\), where \(\widehat{p}\) is the observed fraction of darts that land inside the unit disk.
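A minimal NumPy sketch of this single-run estimator, including the antithetic pairing described above (illustrative; not the framework's actual implementation):

```python
import numpy as np

def single_pi_estimate(n_points: int = 10_000, antithetic: bool = False,
                       rng=None) -> float:
    """One dart-throwing run; returns 4 * (fraction inside the unit disk)."""
    if rng is None:
        rng = np.random.default_rng()
    if antithetic:
        # Draw half the points and pair each (x, y) with its reflection
        # (-x, -y); an odd trailing point is dropped for simplicity.
        half = rng.uniform(-1.0, 1.0, size=(n_points // 2, 2))
        pts = np.concatenate([half, -half])
    else:
        pts = rng.uniform(-1.0, 1.0, size=(n_points, 2))
    inside = (pts ** 2).sum(axis=1) <= 1.0
    return 4.0 * inside.mean()
```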

torch_batch(n: int, *, device: torch.device, generator: torch.Generator) → torch.Tensor[source]#

Vectorized Torch implementation for GPU-accelerated Pi estimation.

Each element of the returned tensor is an independent estimate of \(\pi\) using the standard Monte Carlo disk-in-square method. This is equivalent to calling single_simulation() with n_points=1 for each draw.

Parameters:
n : int

Number of \(\pi\) estimates to generate.

device : torch.device

Device for computation ("cpu", "mps", or "cuda").

generator : torch.Generator

Explicit Torch generator for reproducible random sampling. All random operations must use this generator—never rely on global Torch RNG.

Returns:
torch.Tensor

A 1D tensor of length n where each element is 4.0 (inside disk) or 0.0 (outside disk). Returns float32 for MPS compatibility; the framework promotes to float64 after moving to CPU.

Notes

Unlike single_simulation(), this method does not support the n_points parameter—each simulation is a single point evaluation. For high-precision estimates, use many simulations and let the framework compute the mean.

The expected value of each element is \(\pi\), so the sample mean converges to \(\pi\) as \(n \to \infty\).

MPS compatibility. This method returns float32 tensors to support Apple MPS backend (which doesn’t support float64). The framework handles promotion to float64 after moving results to CPU.
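The per-element logic can be mimicked in NumPy to see why averaging the batch recovers \(\pi\); this is a NumPy analogue of the Torch code, for illustration only:

```python
import numpy as np

def batch_pi_indicators(n: int, seed: int = 0) -> np.ndarray:
    """One point per element: 4.0 if inside the unit disk, else 0.0."""
    rng = np.random.default_rng(seed)
    xy = rng.uniform(-1.0, 1.0, size=(n, 2))
    inside = (xy ** 2).sum(axis=1) <= 1.0
    return np.where(inside, 4.0, 0.0)

# Each element has expectation pi; the batch mean is the final estimate.
vals = batch_pi_indicators(500_000)
```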

__init__()[source]#
classmethod __new__(*args, **kwargs)#