Overview
A parameter-efficient fine-tuning method that prepends trainable continuous vectors (a "prefix") to the input of each transformer layer, steering the model's behaviour while the base parameters remain frozen.
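The idea above can be sketched in a few lines of PyTorch. This is a minimal illustration, not a reference implementation: the class name, dimensions, and prefix length are invented for the example, and for simplicity the prefix is prepended to the layer's input sequence rather than to the attention keys and values as in the original formulation.

```python
import torch
import torch.nn as nn

class PrefixTunedLayer(nn.Module):
    """Wraps a frozen transformer layer, prepending trainable prefix vectors."""

    def __init__(self, base_layer, d_model, prefix_len=10):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False  # base parameters stay frozen
        # The only trainable parameters: continuous prefix vectors.
        self.prefix = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        batch = x.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        h = self.base_layer(torch.cat([prefix, x], dim=1))
        return h[:, self.prefix.size(0):]  # drop the prefix positions

# Usage with a stock encoder layer (dimensions are illustrative).
base = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
layer = PrefixTunedLayer(base, d_model=64, prefix_len=10)
out = layer(torch.randn(2, 16, 64))
print(out.shape)  # torch.Size([2, 16, 64])
```

During training, an optimiser is given only `layer.prefix`, so gradient updates never touch the base layer's weights.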
More in Deep Learning
Fine-Tuning (Architectures): The process of taking a pretrained model and further training it on a smaller, task-specific dataset.
Activation Function (Training & Optimisation): A mathematical function applied to neural network outputs to introduce non-linearity, enabling the learning of complex patterns.
Attention Head (Training & Optimisation): An individual attention computation within a multi-head attention layer that learns to focus on different aspects of the input, with outputs concatenated for richer representations.
Batch Normalisation (Architectures): A technique that normalises layer inputs during training to stabilise and accelerate deep neural network learning.
Residual Network (Training & Optimisation): A deep neural network architecture using skip connections that allow gradients to flow directly through layers, enabling very deep networks.
Deep Learning (Architectures): A subset of machine learning using neural networks with multiple layers to learn hierarchical representations of data.
Autoencoder (Architectures): A neural network trained to encode input data into a compressed representation and then decode it back to reconstruct the original.
Generative Adversarial Network (Generative Models): A framework where two neural networks compete — a generator creates synthetic data while a discriminator evaluates its authenticity.