Overview
A neural network architecture that groups neurons into capsules to better capture spatial hierarchies and part-whole relationships.
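As an illustration of what a capsule outputs, the sketch below assumes the dynamic-routing formulation of capsule networks, in which each capsule emits a vector whose direction encodes an entity's pose and whose length, after a "squash" nonlinearity, can be read as the probability that the entity is present. The names `squash` and `length` and the `eps` parameter are illustrative choices, not part of any particular library.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction.

    Assumed formulation: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    eps is a hypothetical small constant that avoids division by zero.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def length(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


# A capsule's raw output: a long vector squashes to a length close to 1,
# a short one to a length close to 0.
strong = squash([3.0, 4.0])
weak = squash([0.03, 0.04])
```

Here `length(strong)` comes out near 1 and `length(weak)` near 0, while both outputs keep the 3:4 direction of their inputs, which is how a capsule can signal presence and pose at once.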
More in Deep Learning
Generative Adversarial Network
Generative Models: A framework where two neural networks compete: a generator creates synthetic data while a discriminator evaluates its authenticity.
Multi-Head Attention
Training & Optimisation: An attention mechanism that runs multiple attention operations in parallel, capturing different types of relationships.
Key-Value Cache
Architectures: An optimisation in autoregressive transformer inference that stores previously computed key and value tensors to avoid redundant computation during sequential token generation.
Residual Connection
Training & Optimisation: A skip connection that adds a layer's input directly to its output, enabling gradient flow through deep networks and allowing training of architectures with hundreds of layers.
Mixed Precision Training
Training & Optimisation: Training neural networks using both 16-bit and 32-bit floating-point arithmetic to speed up computation while maintaining accuracy.
Gradient Checkpointing
Architectures: A memory optimisation that trades computation for memory by recomputing intermediate activations during the backward pass instead of storing them all during the forward pass.
Skip Connection
Architectures: A neural network shortcut that allows the output of one layer to bypass intermediate layers and be added to a later layer's output.
Convolutional Layer
Architectures: A neural network layer that applies learnable filters across input data to detect local patterns and features.