Overview
Direct Answer
Few-shot learning is a machine learning paradigm in which a model learns to perform a task from only a small number of labelled examples, typically between two and ten per class. It differs from conventional supervised learning, which usually needs hundreds or thousands of labelled examples per class, and relies on transfer learning, meta-learning, or in-context learning to generalise from minimal data.
How It Works
Few-shot methods rely on representations learned during pre-training and on the model's ability to generalise from a handful of exemplars. In large language models, few-shot capability arises through in-context learning: labelled examples are placed in the input prompt and the model's parameters are not updated. Meta-learning approaches train a model across many small tasks so that it adapts quickly to a new one, whilst metric-learning methods learn a similarity function and classify an unseen data point by its proximity to the labelled support examples.
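As a rough illustration of two of these mechanisms, the Python sketch below builds an in-context prompt from labelled examples and classifies a query by its distance to per-class average vectors (prototypes). Everything in it is an assumption made for illustration: the function names, the "Message:/Intent:" prompt layout, and the hand-written two-dimensional vectors, which stand in for embeddings that a pre-trained encoder would normally supply. It is one simple reading of each idea, not a reference implementation.

```python
from collections import defaultdict
from math import dist


def build_few_shot_prompt(examples, query,
                          instruction="Classify the intent of each message."):
    """In-context learning: labelled examples go into the prompt text.

    No model weights are touched; the prompt layout here is hypothetical.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Message: {text}", f"Intent: {label}", ""]
    # The last block is left open so that a model would complete the label.
    lines += [f"Message: {query}", "Intent:"]
    return "\n".join(lines)


def class_prototypes(support):
    """Average the support vectors of each class into one prototype."""
    grouped = defaultdict(list)
    for vector, label in support:
        grouped[label].append(vector)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify_by_prototype(support, query_vector):
    """Metric-based classification: return the label of the nearest prototype."""
    prototypes = class_prototypes(support)
    return min(prototypes,
               key=lambda label: dist(prototypes[label], query_vector))


if __name__ == "__main__":
    # Toy support set: two hand-made "embeddings" per class (assumed values).
    support = [
        ([0.9, 0.1], "refund"),
        ([1.0, 0.2], "refund"),
        ([0.1, 0.8], "cancel"),
        ([0.2, 1.0], "cancel"),
    ]
    print(classify_by_prototype(support, [0.8, 0.3]))
    print(build_few_shot_prompt(
        [("I want my money back", "refund"),
         ("Please close my account", "cancel")],
        "Can I return this order?",
    ))
```

In both functions the labelled examples are an explicit argument, so swapping or reordering exemplars requires no retraining. The prototype classifier uses plain Euclidean distance; a learned similarity function would take its place in a trained metric-learning model.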
Why It Matters
Organisations benefit from lower labelling costs, faster deployment, and the ability to address long-tail problems for which collecting abundant training data is infeasible. In regulated industries and specialised domains, such as medical imaging or legal document analysis, few-shot methods can speed up model development and reduce the amount of sensitive data that must be collected and labelled, which eases privacy and compliance obligations.
Common Applications
Applications include intent classification in customer service chatbots, rapid personalisation in recommendation systems, and medical diagnosis from limited patient records. Few-shot techniques are particularly valuable in rare disease detection, multilingual natural language processing, and content moderation where class distributions are highly imbalanced.
Key Considerations
Performance often remains below that of fully supervised baselines, and the quality of the selected examples has a disproportionate influence on outcomes. Practitioners must curate exemplars carefully and recognise that success depends heavily on the quality of the model's pre-training and on how closely the target task resembles the pre-training distribution.