Overview
Direct Answer
Bagging (Bootstrap Aggregating) is an ensemble method that reduces variance by training multiple models independently on random samples of the training data drawn with replacement, then combining their predictions through averaging or voting. This approach is particularly effective for high-variance algorithms such as decision trees.
How It Works
The method generates B bootstrap samples by randomly sampling the original dataset with replacement; each sample is typically the same size as the original. A base learner trains independently on each sample, producing B distinct models. For regression tasks, predictions are averaged across all models; for classification, a majority vote determines the final output. Because no model depends on any other, the B training runs can be executed in parallel.
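The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names (`bootstrap_sample`, `fit_bagged`, `fit_one_nn`) are hypothetical, and the 1-nearest-neighbour base learner is an assumption chosen only because it is a simple high-variance model.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]


def fit_one_nn(sample):
    """Hypothetical base learner: 1-nearest-neighbour on a scalar feature.

    `sample` is a list of (x, y) pairs; the returned function predicts y for x.
    """
    def predict(x):
        # Return the target of the training point closest to x.
        return min(sample, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def fit_bagged(data, fit_base, n_models, seed=0):
    """Train n_models base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_base(bootstrap_sample(data, rng)) for _ in range(n_models)]


def predict_regression(models, x):
    """Regression: average the predictions of all models."""
    return mean(m(x) for m in models)


def predict_classification(models, x):
    """Classification: return the label predicted by the most models."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

A call such as `models = fit_bagged(data, fit_one_nn, n_models=25)` followed by `predict_regression(models, 2.5)` would average 25 nearest-neighbour predictions. The list comprehension in `fit_bagged` runs sequentially here; since each iteration is independent, it could be distributed across workers.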
Why It Matters
Bagging improves model stability and generalisation performance, reducing overfitting without any change to the base learning algorithm. Teams benefit from lower prediction variance, meaning predictions change less when the training data changes, which is critical for high-stakes decisions in finance, healthcare, and risk management where model robustness directly affects business outcomes.
Common Applications
Random forests, which combine bagged decision trees with random feature selection at each split, are widely deployed in credit risk assessment, medical diagnosis support systems, and feature importance analysis. Bagging can also be applied to neural networks to improve robustness, and is used in manufacturing defect detection and customer churn prediction.
Key Considerations
Bagging reduces variance but provides minimal bias reduction; it works best with unstable learners prone to overfitting. Computational cost scales linearly with the number of models trained, and gains diminish as base learner stability increases, requiring practitioners to balance accuracy improvements against training overhead.
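A standard idealised calculation helps explain these trade-offs. Assume, as a simplification, that the B model predictions at a point x are identically distributed with variance \sigma^2 and pairwise correlation \rho (these symbols are introduced here for illustration and do not appear elsewhere in this entry).

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2,
\qquad
\mathbb{E}\!\left[\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right]
  = \mathbb{E}\!\left[\hat{f}_1(x)\right].
```

Under that assumption, the second variance term shrinks as B grows while the first does not, so adding models gives diminishing returns. Stable base learners tend to produce highly correlated models (\rho close to 1), which leaves little variance to remove. The expectation of the average equals that of a single model, which is why averaging alone does not reduce bias.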
More in Machine Learning
Active Learning
MLOps & Production: A machine learning approach where the algorithm interactively queries a user or oracle to label new data points.
Association Rule Learning
Unsupervised Learning: A method for discovering interesting relationships and patterns between variables in large datasets.
Gradient Boosting
Supervised Learning: An ensemble technique that builds models sequentially, with each new model correcting residual errors of the combined ensemble.
Matrix Factorisation
Unsupervised Learning: A technique that decomposes a matrix into constituent matrices, widely used in recommendation systems and dimensionality reduction.
Adam Optimiser
Training Techniques: An adaptive learning rate optimisation algorithm combining momentum and RMSProp for efficient deep learning training.
Naive Bayes
Supervised Learning: A probabilistic classifier based on applying Bayes' theorem with the assumption of independence between features.
Principal Component Analysis
Unsupervised Learning: A dimensionality reduction technique that transforms data into orthogonal components ordered by the amount of variance they explain.
SHAP Values
MLOps & Production: A game-theoretic approach to explaining individual model predictions by computing each feature's marginal contribution, based on Shapley values from cooperative game theory.