Overview
Fine-tuning is the process of taking a pretrained model and further training it on a smaller, task-specific dataset.
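A minimal sketch of the idea in PyTorch, assuming torchvision's pretrained ResNet-18 as the backbone and a hypothetical 5-class downstream task: freeze the pretrained weights, swap in a new head, and train only that head on the task data.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pretrained on a large general dataset (ImageNet here).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained layers so only the new head is updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for a hypothetical 5-class task.
model.fc = nn.Linear(model.fc.in_features, 5)

# Fine-tune: optimise just the new head on the task-specific dataset.
optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)
```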
More in Deep Learning
Contrastive Learning (Architectures)
A self-supervised learning approach that trains models by comparing similar and dissimilar pairs of data representations.
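A minimal sketch of one common contrastive objective (an InfoNCE-style loss, chosen here as an illustration, not the only variant): matching rows of two embedding batches are treated as similar pairs, and all other rows as dissimilar.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    # Normalise embeddings so the dot product is a cosine similarity.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature     # all pairwise similarities
    targets = torch.arange(z1.size(0))   # positive pairs sit on the diagonal
    return F.cross_entropy(logits, targets)

# Two augmented views of the same batch: 8 samples, 32-dim embeddings.
z1, z2 = torch.randn(8, 32), torch.randn(8, 32)
loss = info_nce_loss(z1, z2)
```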
Embedding (Architectures)
A learned dense vector representation of discrete data (like words or categories) in a continuous vector space.
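A short PyTorch illustration: an embedding table maps integer token IDs to trainable dense vectors (the vocabulary size and dimension below are arbitrary).

```python
import torch
import torch.nn as nn

# Map a vocabulary of 10,000 discrete tokens to 64-dimensional vectors.
embedding = nn.Embedding(num_embeddings=10_000, embedding_dim=64)

token_ids = torch.tensor([[1, 5, 42]])  # a batch of token indices
vectors = embedding(token_ids)          # shape: (1, 3, 64)
```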
Sigmoid Function (Training & Optimisation)
An activation function that maps input values to a range between 0 and 1, useful for binary classification outputs.
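The function is sigmoid(x) = 1 / (1 + e^(-x)); a quick check of its squashing behaviour:

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])
print(torch.sigmoid(x))  # tensor([0.1192, 0.5000, 0.8808]), all in (0, 1)
```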
Weight Decay (Architectures)
A regularisation technique that penalises large model weights during training by adding a penalty proportional to the weights' magnitude to the loss function, which helps prevent overfitting.
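A minimal sketch of the classic L2 form, with an arbitrary decay coefficient: the penalty grows with the weights, so the optimiser is pushed toward smaller ones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(10, 1)
x, y = torch.randn(4, 10), torch.randn(4, 1)

lam = 0.01  # decay coefficient (hypothetical value)
task_loss = F.mse_loss(model(x), y)
l2_penalty = sum(p.pow(2).sum() for p in model.parameters())
loss = task_loss + lam * l2_penalty  # penalise large weights
```

In practice this is usually passed to the optimiser instead, e.g. `torch.optim.AdamW(model.parameters(), weight_decay=0.01)`, which applies a decoupled form of the same idea.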
Mixed Precision Training (Training & Optimisation)
Training neural networks using both 16-bit and 32-bit floating-point arithmetic to speed up computation while maintaining accuracy.
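A minimal PyTorch sketch, assuming a CUDA device is available: the forward pass runs in float16 under autocast, while the loss is scaled to keep the float16 gradients from underflowing.

```python
import torch

model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()  # guards against fp16 underflow

x = torch.randn(32, 512, device="cuda")
optimizer.zero_grad()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = model(x).square().mean()   # forward pass in 16-bit

scaler.scale(loss).backward()         # scaled backward pass
scaler.step(optimizer)                # unscales, then updates weights
scaler.update()
```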
Vanishing Gradient (Architectures)
A problem in deep networks where gradients become extremely small during backpropagation, preventing earlier layers from learning.
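A small demonstration of the effect, using a deliberately deep stack of sigmoid layers (the sigmoid's derivative is at most 0.25, so repeated multiplication shrinks the gradient reaching the input):

```python
import torch
import torch.nn as nn

# 30 sigmoid layers in a row: gradients must pass through all of them.
layers = nn.Sequential(
    *[nn.Sequential(nn.Linear(16, 16), nn.Sigmoid()) for _ in range(30)]
)

x = torch.randn(1, 16, requires_grad=True)
layers(x).sum().backward()
print(x.grad.abs().mean())  # typically an extremely small number
```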
Pretraining (Architectures)
Training a model on a large general dataset before fine-tuning it on a specific downstream task.
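A toy two-phase sketch, with a reconstruction objective standing in for a real pretraining task: the weights learned on generic data become the starting point for fine-tuning.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))
optimizer = torch.optim.AdamW(backbone.parameters(), lr=1e-3)

for _ in range(100):  # stands in for many steps over a large dataset
    x = torch.randn(64, 128)                # hypothetical generic data
    loss = (backbone(x) - x).pow(2).mean()  # toy reconstruction objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Hand the pretrained weights off to a downstream fine-tuning run.
torch.save(backbone.state_dict(), "pretrained.pt")
```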
Skip Connection (Architectures)
A neural network shortcut that allows the output of one layer to bypass intermediate layers and be added to a later layer's output.
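A minimal residual block in PyTorch: the input takes an identity shortcut around the inner layers and is added to their output, so the shapes must match.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.inner = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, x):
        return x + self.inner(x)  # identity path plus learned residual

block = ResidualBlock(64)
out = block(torch.randn(8, 64))  # output shape matches input: (8, 64)
```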