Bias–Variance Tradeoff Explorer

Adjust the simulation parameters in the sidebar and click Run Simulation to generate results. Training MSE, test MSE, bias\(^2\), and variance are computed from independent Monte Carlo draws.

Bias–Variance Decomposition:

$$\mathbb{E}\left[(y - \hat{f}(x))^2\right] = \underbrace{\left[\text{Bias}(\hat{f}(x))\right]^2}_{\text{Bias}^2} + \underbrace{\text{Var}(\hat{f}(x))}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Irreducible}}$$
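The identity follows from splitting the prediction error at a fixed \( x \) into three pieces. The sketch below assumes \( y = f(x) + \varepsilon \) with \( \mathbb{E}[\varepsilon] = 0 \), \( \text{Var}(\varepsilon) = \sigma^2 \), and \( \varepsilon \) independent of the fitted \( \hat{f} \):

$$y - \hat{f}(x) = \underbrace{\left(f(x) - \mathbb{E}[\hat{f}(x)]\right)}_{-\text{Bias}(\hat{f}(x))} + \underbrace{\left(\mathbb{E}[\hat{f}(x)] - \hat{f}(x)\right)}_{\text{mean zero}} + \varepsilon$$

Squaring and taking expectations removes the three cross terms: the first piece is a constant, the second has mean zero, and \( \varepsilon \) has mean zero and is independent of \( \hat{f} \). What remains is

$$\mathbb{E}\left[(y - \hat{f}(x))^2\right] = \left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2 + \mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right] + \mathbb{E}[\varepsilon^2] = \left[\text{Bias}(\hat{f}(x))\right]^2 + \text{Var}(\hat{f}(x)) + \sigma^2$$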

MC Estimates (across B simulations):

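$$\widehat{\text{Bias}^2}(x_0) = \left(\frac{1}{B}\sum_{b=1}^{B} \hat{f}_b(x_0) - f(x_0)\right)^2$$

$$\widehat{\text{Var}}(x_0) = \frac{1}{B}\sum_{b=1}^{B} \left(\hat{f}_b(x_0) - \bar{\hat{f}}(x_0)\right)^2$$

where \( \hat{f}_b \) is the model fitted in repetition \( b \) and \( \bar{\hat{f}}(x_0) = \frac{1}{B}\sum_{b=1}^{B} \hat{f}_b(x_0) \) is the average prediction at \( x_0 \).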
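The two estimators can be written directly as code. This is a minimal sketch rather than the app's implementation: the function name `mc_bias_variance` and the toy inputs are hypothetical, and the fitted values are assumed to be already collected in a list.

```python
def mc_bias_variance(preds, f_x0):
    """Monte Carlo estimates of bias^2 and variance at a single point x0.

    preds : list of the B fitted values f_hat_b(x0), one per repetition
    f_x0  : the true value f(x0)
    """
    B = len(preds)
    mean_pred = sum(preds) / B                      # average prediction at x0
    bias2 = (mean_pred - f_x0) ** 2                 # squared bias estimate
    variance = sum((p - mean_pred) ** 2 for p in preds) / B  # 1/B, as in the formula
    return bias2, variance


# Toy (hypothetical) fitted values from B = 3 repetitions, with f(x0) = 1.5.
bias2, variance = mc_bias_variance([1.0, 2.0, 3.0], 1.5)
print(bias2, variance)  # 0.25 and roughly 0.667
```

With the 1/B divisor, the two estimates add up exactly to the average squared distance between the fitted values and \( f(x_0) \): here (0.25 + 0.25 + 2.25) / 3 = 0.25 + 2/3.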

Summary values: Train MSE, Test MSE, Bias², Variance, Best Complexity


Model Recommendation

Test MSE (red) decomposes as Bias² (green) + Variance (amber) + irreducible noise σ², up to Monte Carlo error. Training MSE (blue) typically decreases as model complexity increases.


Each grey curve is a fitted model from one MC repetition. The true function f(x) is shown in black.

Values shown are averaged over all Monte Carlo repetitions.

Monte Carlo Study of the Bias–Variance Tradeoff

This application accompanies the STA380 project by Jizheng Huang, Victor Jiang, Tianchen Xu, and Kai Rui Zhu (University of Toronto, Mississauga).

The goal of this project is to illustrate the bias–variance tradeoff in regression through repeated Monte Carlo simulation. Users can explore how model flexibility, sample size, and noise level affect training MSE, test MSE, bias², and variance.

Simulation Framework

Data are generated according to the model \( y = f(x) + \varepsilon \), where \( f(x) \) is a known nonlinear function and \( \varepsilon \sim \mathcal{N}(0, \sigma^2) \).

For each Monte Carlo repetition, the app generates a new training sample, fits the selected regression model, and evaluates predictive performance. Repeated fits are then used to estimate bias², variance, and mean squared error across model complexities.
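The loop described above can be sketched in a few dozen lines. Everything specific here is an assumption made for illustration, not a description of the app's code: the true function `sin(2πx)`, inputs drawn uniformly on [0, 1], polynomial regression as the model, a fixed grid of test inputs, and the helper names `fit_polynomial`, `predict`, and `simulate`.

```python
import math
import random


def true_f(x):
    # Hypothetical stand-in for the app's "known nonlinear function" f(x).
    return math.sin(2 * math.pi * x)


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients via the normal equations."""
    m = degree + 1
    # Augmented system [X^T X | X^T y] for the design matrix X[i][j] = x_i^j.
    aug = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    coefs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][j] * coefs[j] for j in range(i + 1, m))
        coefs[i] = (aug[i][m] - tail) / aug[i][i]
    return coefs


def predict(coefs, x):
    return sum(c * x ** k for k, c in enumerate(coefs))


def simulate(degree, n=50, sigma=0.3, reps=200, seed=1):
    rng = random.Random(seed)
    grid = [i / 20 for i in range(21)]  # fixed test inputs (assumption)
    truth = [true_f(x) for x in grid]
    preds = []  # preds[b][j] = fitted value of repetition b at grid[j]
    train_mse = test_mse = 0.0
    for _ in range(reps):
        # 1. Draw a new training sample from y = f(x) + eps.
        xs = [rng.random() for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0.0, sigma) for x in xs]
        # 2. Fit the model and record its predictions on the grid.
        coefs = fit_polynomial(xs, ys, degree)
        fitted = [predict(coefs, x) for x in grid]
        preds.append(fitted)
        # 3. Evaluate on the training data and on fresh noisy test responses.
        train_mse += sum((y - predict(coefs, x)) ** 2 for x, y in zip(xs, ys)) / n
        test_mse += sum(
            (t + rng.gauss(0.0, sigma) - p) ** 2 for t, p in zip(truth, fitted)
        ) / len(grid)
    # 4. Aggregate across repetitions, then average over the grid.
    bias2 = variance = 0.0
    for j, t in enumerate(truth):
        column = [row[j] for row in preds]
        mean_pred = sum(column) / reps
        bias2 += (mean_pred - t) ** 2 / len(grid)
        variance += sum((p - mean_pred) ** 2 for p in column) / reps / len(grid)
    return {
        "train_mse": train_mse / reps,
        "test_mse": test_mse / reps,
        "bias2": bias2,
        "variance": variance,
    }


print(simulate(degree=1))
print(simulate(degree=3))
```

The normal equations are used only to keep the sketch dependency-free; they become ill-conditioned at high polynomial degrees, so a QR-based solver would be the safer choice beyond a low degree.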

How to Use the App

Choose the simulation settings in the sidebar, including the random seed, sample size, noise level, model type, and model complexity.
Adjust the number of Monte Carlo repetitions to control the stability of the estimates.
Click Run Simulation to generate updated results.
Use the plot tabs to examine the bias-variance curves, prediction spread, and MSE decomposition table.
Download the summary results if desired.
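To see why the number of Monte Carlo repetitions controls stability, the following sketch repeats a small study many times and measures how much the variance estimate itself fluctuates. It is a simplified illustration under stated assumptions: each "fitted value" is drawn directly from a normal distribution instead of refitting a model, and the names `variance_estimate` and `stability` are hypothetical.

```python
import random
import statistics


def variance_estimate(B, rng, spread=0.2):
    # Stand-in for B refits (assumption): each fitted value at x0 is a
    # normal draw with standard deviation `spread`.
    preds = [rng.gauss(0.0, spread) for _ in range(B)]
    mean_pred = sum(preds) / B
    return sum((p - mean_pred) ** 2 for p in preds) / B


def stability(B, studies=200, seed=0):
    """Standard deviation of the variance estimate across repeated studies."""
    rng = random.Random(seed)
    estimates = [variance_estimate(B, rng) for _ in range(studies)]
    return statistics.pstdev(estimates)


# Fluctuation of the estimate with few versus many repetitions.
print(stability(25), stability(400))
```

For normally distributed fitted values the fluctuation shrinks roughly like 1/√B, so going from 25 to 400 repetitions should cut it by about a factor of four.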

Interpretation

As model complexity increases, bias² typically decreases while variance increases. This tradeoff often leads to a U-shaped test MSE curve, where overly simple models underfit and overly flexible models overfit.
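A small sweep over complexity makes the pattern concrete. The sketch below uses k-nearest-neighbour regression as a stand-in model, where a smaller k means a more flexible fit; the model choice, the true function `sin(2πx)`, and all parameter values are assumptions for illustration. The test MSE column is assembled from the decomposition (bias² + variance + σ²) rather than from separate test draws.

```python
import math
import random


def true_f(x):
    # Hypothetical true function; the app's f(x) is not specified here.
    return math.sin(2 * math.pi * x)


def knn_predict(xs, ys, x0, k):
    # Average the responses of the k training points closest to x0.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def sweep(ks=(30, 15, 5, 1), n=60, sigma=0.3, reps=100, seed=7):
    grid = [i / 20 for i in range(21)]
    rows = []
    for k in ks:  # ordered from least flexible (large k) to most flexible
        rng = random.Random(seed)  # reuse the same training samples for every k
        preds = []
        for _ in range(reps):
            xs = [rng.random() for _ in range(n)]
            ys = [true_f(x) + rng.gauss(0.0, sigma) for x in xs]
            preds.append([knn_predict(xs, ys, x0, k) for x0 in grid])
        bias2 = variance = 0.0
        for j, x0 in enumerate(grid):
            column = [row[j] for row in preds]
            mean_pred = sum(column) / reps
            bias2 += (mean_pred - true_f(x0)) ** 2 / len(grid)
            variance += sum((p - mean_pred) ** 2 for p in column) / reps / len(grid)
        # Expected test MSE assembled from the decomposition.
        rows.append({"k": k, "bias2": bias2, "variance": variance,
                     "test_mse": bias2 + variance + sigma ** 2})
    return rows


rows = sweep()
best = min(rows, key=lambda r: r["test_mse"])
for r in rows:
    print(r["k"], round(r["bias2"], 4), round(r["variance"], 4), round(r["test_mse"], 4))
print("best k:", best["k"])
```

Reading the printed rows from top to bottom moves from the stiffest model to the most flexible one, and the row with the smallest test MSE plays the role of the best complexity.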

The prediction spread plot shows how fitted models vary across Monte Carlo repetitions, while the decomposition table summarizes the average training MSE, test MSE, bias², and variance.
