Project Plan

Project Selection and Project Plan

1) Team

Name	Email
Milan Fusco	mdfusco@student.ysu.edu

2) Project Description

McFramework is a Python library providing a robust, extensible foundation for building and running Monte Carlo simulations with rigorous statistical analysis.

Key Capabilities:

Abstract base class pattern for defining custom simulations
Deterministic parallel execution with reproducible RNG streams
Comprehensive statistics engine with 12+ metric functions
Multiple confidence interval methods (parametric, bootstrap, distribution-free)
Built-in simulations: Pi estimation, Portfolio wealth, Black-Scholes options

Note

For detailed architecture and UML diagrams, see System Design.

3) Software System Type

System for Modeling and Simulation

The framework is designed for computational experiments involving:

Stochastic process simulation (random sampling, GBM)
Statistical estimation (Monte Carlo integration)
Financial modeling (option pricing, portfolio analysis)
Uncertainty quantification (confidence intervals, error bounds)

Architecture Classification:

Library/Framework — Provides reusable abstractions for simulation development
Batch Processing — Executes thousands of independent simulation runs
Parallel System — Distributes work across threads or processes

4) Project Plan

Note

Development will follow a hybrid Agile-XP methodology with continuous practices applied throughout all phases of the project:

Continuous Testing — Tests written alongside implementation, not after
Continuous Documentation — Inline docstrings updated with each feature
Continuous Review/Refactor — Iterative PRs with code review and cleanup

Phase 1: Core Framework

Define abstract base class for custom simulation development
Design result encapsulation for simulation outputs
Create registry pattern for managing multiple simulations
Establish reproducible RNG seeding with independent streams
Build parallel execution backends (threads for POSIX, processes for Windows)
Implement platform-aware backend auto-selection
Design chunk-based work distribution for load balancing
Define public API and module organization
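
The seeding and chunking items above can be illustrated with a minimal sketch. It is not the framework's implementation: the helper names (`make_chunks`, `run_chunk`, `run_chunked`) and the default chunk size are hypothetical, and a standard normal draw stands in for a real simulation. Only `numpy.random.SeedSequence.spawn`, `numpy.random.Philox`, and `ThreadPoolExecutor` are real APIs.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def make_chunks(n_simulations, chunk_size):
    """Split n_simulations into consecutive chunk lengths (hypothetical helper)."""
    full, rest = divmod(n_simulations, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


def run_chunk(seed_seq, size):
    """Run one chunk with its own Philox-backed generator."""
    rng = np.random.Generator(np.random.Philox(seed_seq))
    # Stand-in for `size` calls to a simulation's single-run method.
    return rng.standard_normal(size)


def run_chunked(seed, n_simulations, chunk_size=1_000, max_workers=4):
    """Distribute chunks over threads; one spawned child seed per chunk."""
    sizes = make_chunks(n_simulations, chunk_size)
    children = np.random.SeedSequence(seed).spawn(len(sizes))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in submission order, so the output
        # does not depend on which worker finishes first.
        parts = list(pool.map(run_chunk, children, sizes))
    return np.concatenate(parts)
```

Because each chunk owns a child seed and results are concatenated in chunk order, the output in this sketch depends on the seed and the chunk size but not on the number of workers.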

Phase 2: Statistics Engine

Design configuration system for statistical parameters
Implement descriptive statistics (mean, std, percentiles, skew, kurtosis)
Implement parametric confidence intervals (z/t critical values)
Implement bootstrap confidence intervals (percentile and BCa methods)
Implement distribution-free bounds (Chebyshev, Markov)
Build orchestrator with result encapsulation and error tracking
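
As a rough illustration of the parametric and distribution-free intervals listed above, the sketch below computes a z/t interval and a Chebyshev interval for a sample mean. The function names and the `z_threshold` parameter are assumptions made for this example, as is the rule "z for n at or above the threshold, t otherwise". The Chebyshev half-width uses $s / \sqrt{n\alpha}$, which substitutes the sample standard deviation for the unknown true one.

```python
import numpy as np
from scipy import stats


def parametric_ci(x, confidence=0.95, method="auto", z_threshold=30):
    """CI for the mean using a z or t critical value (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")
    if method == "auto":
        # Assumed rule for this sketch: z for large samples, t otherwise.
        method = "z" if n >= z_threshold else "t"
    tail = 0.5 + confidence / 2.0
    crit = stats.norm.ppf(tail) if method == "z" else stats.t.ppf(tail, df=n - 1)
    half = crit * x.std(ddof=1) / np.sqrt(n)
    return float(x.mean() - half), float(x.mean() + half)


def chebyshev_ci(x, confidence=0.95):
    """Distribution-free CI with half-width s / sqrt(n * alpha)."""
    x = np.asarray(x, dtype=float)
    alpha = 1.0 - confidence
    half = x.std(ddof=1) / np.sqrt(x.size * alpha)
    return float(x.mean() - half), float(x.mean() + half)


def chebyshev_required_n(sigma, half_width, confidence=0.95):
    """Smallest n with sigma / sqrt(n * alpha) <= half_width."""
    alpha = 1.0 - confidence
    return int(np.ceil(sigma**2 / (alpha * half_width**2)))
```

The Chebyshev interval is always wider than the z interval at the same confidence level, which is the price of making no distributional assumption.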

Phase 3: Built-in Simulations

Pi estimation via geometric probability sampling
Portfolio wealth simulation with GBM dynamics
Black-Scholes European/American option pricing
Path generation for visualization and analysis
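
For the Pi estimation item, the underlying idea can be sketched in a few lines: sample points uniformly in the unit square and count the fraction that lands inside the quarter circle of radius 1, whose area is π/4. The function name `estimate_pi` is hypothetical and this is a standalone sketch, not the built-in simulation.

```python
import numpy as np


def estimate_pi(n_points, seed=None):
    """Estimate pi from the share of uniform points inside the quarter circle."""
    rng = np.random.default_rng(seed)
    xy = rng.random((n_points, 2))  # points in [0, 1) x [0, 1)
    inside = np.count_nonzero((xy**2).sum(axis=1) <= 1.0)
    return 4.0 * inside / n_points
```

The standard error of this estimator shrinks like $1/\sqrt{n}$, so each additional decimal digit of accuracy costs roughly 100 times more samples.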

Phase 4: CI/CD & Quality Assurance

GitHub Actions pipeline (linting, testing, type checking, building)
Cross-platform testing matrix (Linux, macOS, Windows; Python 3.10-3.12)
Code coverage reporting and quality metrics
Automated dependency management and security scanning
Release automation with changelog generation

Phase 5: GUI Application (MVP Demo)

Interactive Black-Scholes Monte Carlo simulator using PySide6
Real-time market data visualization with candlestick charts
Option pricing calculator with Greeks display
Monte Carlo path visualization with live statistics
3D volatility/time sensitivity surfaces
Scenario presets for different market conditions

Phase 6: Documentation

NumPy-style docstrings with LaTeX math notation
Sphinx API documentation with autosummary
Architecture diagrams (Mermaid UML)
User guides and getting started documentation
GitHub Pages deployment

Phase 7: Release & Deployment (GitHub Actions Workflow)

Package distribution via PyPI (trusted publishing with OIDC)
Documentation hosting on GitHub Pages
Semantic versioning and release automation with changelog generation

5) Requirements

User Requirements

ID	Requirement
UR-1	Users shall define custom simulations by subclassing `MonteCarloSimulation` and implementing `single_simulation()`
UR-2	Users shall obtain reproducible results by setting a seed via `set_seed()`
UR-3	Users shall run simulations in parallel by passing `parallel=True` to `run()`
UR-4	Users shall receive statistical summaries including mean, std, percentiles, and confidence intervals
UR-5	Users shall configure statistical behavior via `StatsContext` parameters
UR-6	Users shall register and compare multiple simulations using `MonteCarloFramework`
UR-7	Users shall extend the statistics engine by implementing the `Metric` protocol
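
The subclassing workflow described in UR-1 and UR-2 might look like the sketch below. `SimulationSketch` is a hypothetical stand-in written for this example, not the framework's `MonteCarloSimulation` class. Its `run()` returns a plain dictionary, and it mirrors only the `single_simulation()` and `set_seed()` names mentioned in the requirements.

```python
from abc import ABC, abstractmethod

import numpy as np


class SimulationSketch(ABC):
    """Hypothetical stand-in for an abstract simulation base class."""

    def __init__(self, name):
        self.name = name
        self.rng = np.random.default_rng()

    def set_seed(self, seed):
        """Reset the generator so later runs are reproducible."""
        self.rng = np.random.default_rng(seed)

    @abstractmethod
    def single_simulation(self) -> float:
        """Return the outcome of one independent run."""

    def run(self, n_simulations):
        draws = np.array([self.single_simulation() for _ in range(n_simulations)])
        return {
            "n": n_simulations,
            "mean": float(draws.mean()),
            "std": float(draws.std(ddof=1)),
        }


class DiceSum(SimulationSketch):
    """User-defined simulation: sum of two fair six-sided dice."""

    def single_simulation(self):
        return float(self.rng.integers(1, 7, size=2).sum())
```

In this sketch the subclass supplies only the single-run logic; seeding, repetition, and summarising stay in the base class.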

System Requirements

ID	Requirement
SR-1	The system shall require Python ≥ 3.10, NumPy ≥ 1.24, SciPy ≥ 1.10
SR-2	The system shall use `SeedSequence.spawn()` to create independent RNG streams per worker
SR-3	The system shall use `Philox` as the bit generator for parallel reproducibility
SR-4	The system shall select `ThreadPoolExecutor` on POSIX and `ProcessPoolExecutor` on Windows by default
SR-5	The system shall fall back to sequential execution for n < 20,000 simulations
SR-6	The system shall compute confidence intervals using `scipy.stats.norm` and `scipy.stats.t`
SR-7	The system shall support bootstrap resampling with configurable `n_bootstrap` (default 10,000)
SR-8	The system shall implement BCa bootstrap using jackknife acceleration
SR-9	The system shall handle NaN values according to `nan_policy` (`"propagate"` or `"omit"`)
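
To make SR-7 concrete, here is a minimal percentile bootstrap for the mean. The function name is hypothetical and the sketch covers only the percentile method; the BCa variant in SR-8 additionally needs a bias correction and a jackknife-based acceleration term, which are omitted here.

```python
import numpy as np


def bootstrap_percentile_ci(x, confidence=0.95, n_bootstrap=10_000, seed=None):
    """Percentile bootstrap CI for the sample mean (illustrative only)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row holds the indices of one resample drawn with replacement.
    idx = rng.integers(0, x.size, size=(n_bootstrap, x.size))
    means = x[idx].mean(axis=1)
    alpha = 1.0 - confidence
    lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)
```

This vectorised form allocates an `n_bootstrap × n` index array, so for large samples a loop over resamples would trade speed for memory.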

6) Stakeholders

Stakeholder Type	Role	Primary Concerns
Simulation Developers	Create custom `MonteCarloSimulation` subclasses	Clean API, extensibility, documentation
Researchers	Run experiments, analyze results	Reproducibility, statistical rigor, accuracy
Students	Learn Monte Carlo methods	Simple examples, clear explanations
Library Maintainers	Maintain codebase	Test coverage, code quality, CI/CD
Framework Integrators	Embed in larger systems	Modular design, minimal dependencies
FOSS Community	Contribute to the project, report issues, and request features	MIT license, open-source and free/libre availability

7) Development Methodology

Agile with XP (Extreme Programming) Practices

Continuous testing. Tests were written throughout development rather than after implementation.
Continuous code review. Features evolved through iterative PRs and reviews.
Continuous documentation. Documentation was written inline with the code and updated as features were added and refactored.

Development Practice	Evidence in Project Implementation
Test-Driven Development	16 test files covering unit, integration, edge cases, regression and performance.
Continuous Integration	GitHub Actions workflows (lint, test, build, docs, CodeQL, Dependabot, release-drafter, stale). CI runs on every push to main, on every PR, and on every release published to PyPI and GitHub Pages.
Refactoring	Continuous refactoring and improvement of the codebase, documentation, and test coverage.
Small Releases	Semantic versioning: dev-0.1.0 → dev-0.2.0 → dev-0.x.x → 0.1.0 (initial public release)
Coding Standards	PEP 8, PEP 585 type hints, NumPy docstring convention, consistent naming, modular architecture, and separation of concerns.

8) Functional Requirements

ID	Requirement	Module
FR-1	Provide abstract `MonteCarloSimulation` class with `single_simulation()` method	`core.py`
FR-2	Execute simulations sequentially via `_run_sequential()`	`core.py`
FR-3	Execute simulations in parallel via `_run_parallel()` with thread/process backends	`core.py`
FR-4	Spawn independent RNG streams per worker chunk using `SeedSequence`	`core.py`
FR-5	Return results in `SimulationResult` dataclass with mean, std, percentiles, stats, metadata	`core.py`
FR-6	Register simulations by name in `MonteCarloFramework`	`core.py`
FR-7	Compare metrics across simulations via `compare_results()`	`core.py`
FR-8	Compute sample mean with NaN handling	`stats_engine.py`
FR-9	Compute sample standard deviation with configurable ddof	`stats_engine.py`
FR-10	Compute arbitrary percentiles via `numpy.percentile`	`stats_engine.py`
FR-11	Compute skewness and kurtosis using `scipy.stats`	`stats_engine.py`
FR-12	Compute parametric CI using z or t critical values	`stats_engine.py`
FR-13	Compute bootstrap CI using percentile or BCa method	`stats_engine.py`
FR-14	Compute Chebyshev distribution-free CI	`stats_engine.py`
FR-15	Compute required n for target Chebyshev half-width	`stats_engine.py`
FR-16	Compute Markov error probability bound	`stats_engine.py`
FR-17	Track skipped and errored metrics in `ComputeResult`	`stats_engine.py`
FR-18	Provide z-critical and t-critical value functions	`utils.py`
FR-19	Auto-select z/t based on sample size threshold (n ≥ 30)	`utils.py`
FR-20	Estimate π via geometric probability sampling	`sims/pi.py`
FR-21	Simulate portfolio wealth under GBM dynamics	`sims/portfolio.py`
FR-22	Price European options with discounted payoff	`sims/black_scholes.py`
FR-23	Price American options using Longstaff-Schwartz LSM	`sims/black_scholes.py`
FR-24	Calculate Greeks via finite difference bumping	`sims/black_scholes.py`
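
FR-22 can be illustrated with a standalone sketch that prices a European call by averaging discounted payoffs over GBM terminal prices. The function name and signature are hypothetical and are not taken from `sims/black_scholes.py`. The terminal price uses the exact GBM solution $S_T = S_0 \exp\left((r - \tfrac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,Z\right)$ with $Z \sim N(0, 1)$.

```python
import numpy as np


def european_call_mc(s0, strike, rate, sigma, maturity, n_paths, seed=None):
    """Monte Carlo price and standard error of a European call (sketch)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Exact GBM terminal price under the risk-neutral measure.
    drift = (rate - 0.5 * sigma**2) * maturity
    s_t = s0 * np.exp(drift + sigma * np.sqrt(maturity) * z)
    discounted = np.exp(-rate * maturity) * np.maximum(s_t - strike, 0.0)
    price = discounted.mean()
    stderr = discounted.std(ddof=1) / np.sqrt(n_paths)
    return float(price), float(stderr)
```

For an at-the-money call with `s0=100`, `strike=100`, `rate=0.05`, `sigma=0.2`, and `maturity=1.0`, the closed-form Black-Scholes price is about 10.45, which gives a convenient check on the estimate.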

9) Non-Functional Requirements

ID	Category	Requirement
NFR-1	Performance	Parallel execution shall achieve near-linear speedup up to available CPU cores
NFR-2	Performance	Sequential fallback for n < `_PARALLEL_THRESHOLD` (20,000) to avoid overhead
NFR-3	Reliability	Identical results given the same seed, regardless of parallelism or platform
NFR-4	Reliability	Graceful handling of empty arrays, NaN, and infinite values
NFR-5	Reliability	`ComputeResult` tracks errors without crashing the engine
NFR-6	Portability	Support Python 3.10, 3.11, 3.12 on Linux, macOS, Windows
NFR-7	Portability	Auto-detect platform for thread vs. process backend
NFR-8	Maintainability	Test coverage ≥ 90%
NFR-9	Maintainability	All public functions have type hints and docstrings
NFR-10	Maintainability	Modular design: core, stats_engine, sims, utils as separate concerns
NFR-11	Extensibility	Custom simulations via subclassing `MonteCarloSimulation`
NFR-12	Extensibility	Custom metrics via `Metric` protocol and `FnMetric` adapter
NFR-13	Documentation	Sphinx-generated API docs with mathematical notation
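
The interaction between NFR-2 and NFR-7 can be sketched as a small decision function. The name `choose_backend`, its return strings, and the `os_name` parameter are assumptions made for this example. Only the 20,000 threshold and the thread-on-POSIX, process-on-Windows default come from the requirements above.

```python
import os

PARALLEL_THRESHOLD = 20_000  # value stated in NFR-2


def choose_backend(n_simulations, parallel, os_name=os.name):
    """Pick an execution mode from the run size and the platform (sketch)."""
    if not parallel or n_simulations < PARALLEL_THRESHOLD:
        # Small jobs run sequentially to avoid pool start-up overhead.
        return "sequential"
    # os.name is "nt" on Windows and "posix" on Linux and macOS.
    return "process" if os_name == "nt" else "thread"
```

Passing `os_name` explicitly makes the platform branch testable on any machine.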

10) Usability Requirements

ID	Requirement
USA-1	A minimal simulation shall require implementing only `single_simulation()` (< 10 LOC)
USA-2	The `run()` method shall use sensible defaults: `parallel=False`, `confidence=0.95`, `ci_method="auto"`
USA-3	`SimulationResult.result_to_string()` shall produce human-readable summaries
USA-4	Error messages shall specify invalid parameter values and valid ranges
USA-5	`StatsContext.with_overrides()` shall allow easy configuration modification
USA-6	The `Metric` protocol shall be simple: `name: str` + `__call__(x, ctx)`
USA-7	Built-in simulations shall demonstrate framework capabilities with realistic parameters
USA-8	Docstrings shall include Examples sections with executable code, verified with doctest
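
The shape described in USA-5 and USA-6 (a `name` plus `__call__(x, ctx)`, with a context that supports overrides) might be sketched as follows. `CtxSketch`, `MetricLike`, and `FnMetricSketch` are hypothetical names chosen to avoid implying that this is the library's actual `StatsContext`, `Metric`, or `FnMetric` code.

```python
from dataclasses import dataclass, replace
from typing import Callable, Protocol

import numpy as np


@dataclass(frozen=True)
class CtxSketch:
    """Hypothetical immutable configuration object."""

    confidence: float = 0.95
    ddof: int = 1
    nan_policy: str = "propagate"

    def with_overrides(self, **changes):
        """Return a modified copy; the original stays unchanged."""
        return replace(self, **changes)


class MetricLike(Protocol):
    """Structural type: anything with a name that is callable as (x, ctx)."""

    name: str

    def __call__(self, x: np.ndarray, ctx: CtxSketch) -> float: ...


@dataclass(frozen=True)
class FnMetricSketch:
    """Adapter that turns a plain function into a named metric."""

    name: str
    fn: Callable[[np.ndarray, CtxSketch], float]

    def __call__(self, x, ctx):
        return self.fn(x, ctx)


def mean_metric(x, ctx):
    """Sample mean that honours the context's NaN policy."""
    x = np.asarray(x, dtype=float)
    if ctx.nan_policy == "omit":
        x = x[~np.isnan(x)]
    return float(x.mean())
```

A metric built as `FnMetricSketch("mean", mean_metric)` returns NaN for data containing NaN under the default context and the mean of the remaining values under `CtxSketch().with_overrides(nan_policy="omit")`.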