Overview
Dense vector representations of words in which semantically similar words are mapped to nearby points in vector space.
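A minimal sketch of this idea, using invented 4-dimensional vectors for a toy vocabulary (real embeddings are learned and typically have hundreds of dimensions): cosine similarity in the embedding space reflects semantic relatedness.

```python
import numpy as np

# Hypothetical embeddings for a toy vocabulary (invented for illustration).
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.7, 0.2, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically related words lie closer together in the vector space:
assert cosine_similarity(embeddings["king"], embeddings["queen"]) > \
       cosine_similarity(embeddings["king"], embeddings["apple"])
```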
More in Deep Learning
Positional Encoding
Training & Optimisation: A technique that injects information about the position of tokens in a sequence into transformer architectures.
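One common realisation is the fixed sinusoidal encoding from the original transformer paper; a minimal numpy sketch (function name and parameters chosen here for illustration):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sin on even dimensions, cos on odd dimensions,
    with geometrically spaced frequencies across dimension pairs."""
    positions = np.arange(seq_len)[:, None]        # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # (1, d_model/2)
    angles = positions / (10000 ** (dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Each row is a unique position signature, added to the token embedding.
pe = sinusoidal_positional_encoding(seq_len=16, d_model=8)
```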
Rotary Positional Encoding
Training & Optimisation: A position encoding method that encodes absolute position with a rotation matrix and naturally incorporates relative position information into attention computations.
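A sketch of the rotation and its key property, assuming the "half-split" pairing of feature dimensions (implementations also use an interleaved layout; details vary): after rotating queries and keys by their absolute positions, their dot product depends only on the relative offset.

```python
import numpy as np

def rotary_embed(x, position, base=10000.0):
    """Rotate feature pairs (x[i], x[i + d/2]) by position-dependent angles."""
    d = x.shape[-1]
    half = d // 2
    freqs = base ** (-np.arange(half) * 2.0 / d)
    theta = position * freqs
    x1, x2 = x[..., :half], x[..., half:]
    # Standard 2D rotation applied independently to each feature pair.
    return np.concatenate([x1 * np.cos(theta) - x2 * np.sin(theta),
                           x1 * np.sin(theta) + x2 * np.cos(theta)], axis=-1)

q = np.random.default_rng(0).normal(size=8)
k = np.random.default_rng(1).normal(size=8)

# Relative-position property: same offset (2) gives the same attention score.
a = rotary_embed(q, 3) @ rotary_embed(k, 5)
b = rotary_embed(q, 10) @ rotary_embed(k, 12)
assert np.isclose(a, b)
```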
Convolutional Neural Network
Architectures: A deep learning architecture designed for processing structured grid data like images, using convolutional filters to detect features.
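The core operation can be sketched as a plain 2D cross-correlation (what deep learning libraries call "convolution"); here a hand-built vertical-edge filter responds only at the boundary in a toy image:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image and
    take a weighted sum at each location."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark on the left, bright on the right.
image = np.zeros((5, 5))
image[:, 2:] = 1.0

# A simple vertical-edge detector fires only where intensity changes.
edge_kernel = np.array([[-1.0, 1.0]])
response = conv2d(image, edge_kernel)
```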
Representation Learning
Architectures: The automatic discovery of data representations needed for feature detection or classification from raw data.
Capsule Network
Architectures: A neural network architecture that groups neurons into capsules to better capture spatial hierarchies and part-whole relationships.
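A small piece of the machinery, sketched under the formulation in the original dynamic-routing paper: the "squash" nonlinearity keeps a capsule's output vector pointing in the same direction while mapping its length into (0, 1), so the length can act as a probability that the entity is present.

```python
import numpy as np

def squash(s, eps=1e-9):
    """Capsule squashing: v = (|s|^2 / (1 + |s|^2)) * (s / |s|)."""
    norm_sq = np.sum(s ** 2, axis=-1, keepdims=True)
    norm = np.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)

s = np.array([3.0, 4.0])   # length 5
v = squash(s)
# Same direction, length squashed to 25/26 (just under 1).
```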
Diffusion Model
Generative Models: A generative model that learns to reverse a gradual noising process, generating high-quality samples from random noise.
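The forward (noising) half of the process has a convenient closed form; a sketch in the DDPM style, with an illustrative linear beta schedule (real schedules and step counts vary by model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative linear variance schedule over T noising steps.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)

def q_sample(x0, t):
    """Jump directly to noise level t:
    x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    noise = rng.normal(size=x0.shape)
    return (np.sqrt(alphas_cumprod[t]) * x0
            + np.sqrt(1.0 - alphas_cumprod[t]) * noise)

x0 = np.ones(4)
x_late = q_sample(x0, T - 1)  # by the last step, almost pure noise
```

The model is then trained to invert these steps, so sampling starts from pure noise and denoises back to data.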
Gradient Checkpointing
Architectures: A memory optimisation that trades computation for memory by recomputing intermediate activations during the backward pass instead of storing them all during the forward pass.
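A toy sketch of the idea (frameworks such as PyTorch provide this as a built-in utility; the functions and the every-4-layers policy here are invented for illustration): the forward pass keeps only periodic checkpoints, and the segment between two checkpoints is recomputed when the backward pass needs it.

```python
import numpy as np

def layer(x, i):
    """Stand-in for an expensive layer; deterministic, so it can be recomputed."""
    return np.tanh(x + i)

def forward_with_checkpoints(x, n_layers, every=4):
    """Run the forward pass, storing activations only at checkpoint layers."""
    checkpoints = {0: x}
    for i in range(n_layers):
        x = layer(x, i)
        if (i + 1) % every == 0:
            checkpoints[i + 1] = x
    return x, checkpoints

def recompute_segment(checkpoints, start, end):
    """During backward, rebuild the activations between two checkpoints."""
    x = checkpoints[start]
    acts = []
    for i in range(start, end):
        x = layer(x, i)
        acts.append(x)
    return acts

out, cps = forward_with_checkpoints(np.zeros(2), n_layers=8, every=4)
# Only 3 tensors stored (after layers 0, 4, 8) instead of all 9.
acts = recompute_segment(cps, 4, 8)
assert np.allclose(acts[-1], out)
```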
Autoencoder
Architectures: A neural network trained to encode input data into a compressed representation and then decode it back to reconstruct the original.
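A minimal sketch, assuming the simplest possible case: a linear autoencoder with a 1-dimensional bottleneck, trained by plain gradient descent on toy 2D data that actually lies near a line (so one latent dimension suffices to reconstruct it).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2D points on the line y = 2x, plus a little noise.
t = rng.uniform(-1, 1, size=(200, 1))
X = np.hstack([t, 2 * t]) + 0.01 * rng.normal(size=(200, 2))

# Encoder compresses 2 -> 1, decoder reconstructs 1 -> 2.
W_enc = rng.normal(scale=0.1, size=(2, 1))
W_dec = rng.normal(scale=0.1, size=(1, 2))
lr = 0.1

for _ in range(500):
    Z = X @ W_enc              # compressed code (bottleneck)
    X_hat = Z @ W_dec          # reconstruction
    err = X_hat - X
    # Gradients of the mean squared reconstruction error.
    grad_dec = Z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))  # small after training
```

With a linear network this recovers the principal direction of the data; nonlinear activations and deeper stacks extend the same encode-compress-decode pattern.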