Overview
A computing system inspired by biological neural networks, consisting of interconnected nodes that process information in layers.
More in Deep Learning
Word Embedding
Language ModelsDense vector representations of words where semantically similar words are mapped to nearby points in vector space.
Fully Connected Layer
ArchitecturesA neural network layer where every neuron is connected to every neuron in the adjacent layers.
Fine-Tuning
Language ModelsThe process of adapting a pre-trained model to a specific task by continuing training on a smaller task-specific dataset, transferring learned representations to new domains.
Adapter Layers
Language ModelsSmall trainable modules inserted between frozen transformer layers that enable task-specific adaptation without modifying the original model weights.
Exploding Gradient
ArchitecturesA problem where gradients grow exponentially during backpropagation, causing unstable weight updates and training failure.
Self-Attention
Training & OptimisationAn attention mechanism where each element in a sequence attends to all other elements to compute its representation.
Representation Learning
ArchitecturesThe automatic discovery of data representations needed for feature detection or classification from raw data.
Weight Decay
ArchitecturesA regularisation technique that penalises large model weights during training by adding a fraction of the weight magnitude to the loss function, preventing overfitting.