Overview
Direct Answer
Multi-task learning is a machine learning paradigm in which a single model is trained simultaneously on multiple related prediction tasks, leveraging shared representations to improve generalisation and reduce overfitting compared to training separate task-specific models.
How It Works
The model architecture contains shared hidden layers that extract common features, with task-specific output heads branching from these layers. During training, loss functions from all tasks are combined—typically through weighted summation—and gradients propagate through both task-specific and shared parameters, forcing the model to learn representations beneficial across tasks.
Why It Matters
Multi-task learning reduces data requirements per task, decreases training time and computational cost, and improves robustness by allowing models to exploit task relationships. Organisations benefit from fewer model deployments and improved performance in data-constrained domains such as rare disease diagnosis or low-resource languages.
Common Applications
Applications include natural language processing (simultaneous part-of-speech tagging, named entity recognition, and dependency parsing), computer vision (joint detection, segmentation, and classification), and autonomous systems (predicting vehicle trajectory alongside traffic sign recognition).
Key Considerations
Effective multi-task learning requires careful selection of related tasks; incompatible or weakly related tasks can degrade performance through negative transfer. Task weighting and architectural design significantly influence outcomes and often require domain expertise to optimise.
Cross-References(1)
Referenced By1 term mentions Multi-Task Learning
Other entries in the wiki whose definition references Multi-Task Learning — useful for understanding how this concept connects across Machine Learning and adjacent domains.
More in Machine Learning
Underfitting
Training TechniquesWhen a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data.
Lasso Regression
Feature Engineering & SelectionA regularised regression technique that adds an L1 penalty, enabling feature selection by driving some coefficients to zero.
Boosting
Supervised LearningAn ensemble technique that sequentially trains models, each focusing on correcting the errors of previous models.
Collaborative Filtering
Unsupervised LearningA recommendation technique that makes predictions based on the collective preferences and behaviour of many users.
UMAP
Unsupervised LearningUniform Manifold Approximation and Projection — a dimensionality reduction technique for visualisation and general non-linear reduction.
Regularisation
Training TechniquesTechniques that add constraints or penalties to a model to prevent overfitting and improve generalisation to new data.
Matrix Factorisation
Unsupervised LearningA technique that decomposes a matrix into constituent matrices, widely used in recommendation systems and dimensionality reduction.
Random Forest
Supervised LearningAn ensemble learning method that constructs multiple decision trees during training and outputs the mode of their predictions.