Overview
A neural network component that learns to focus on relevant parts of the input when producing each element of the output.
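To make the idea of "focusing on relevant parts of the input" concrete, here is a minimal sketch in plain Python. It assumes the scaled dot-product form of attention, which the definition above does not specify; the function names `softmax` and `attend` and the toy vectors are illustrative, not taken from any library.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query (assumed form).

    query:  list of floats, length d
    keys:   list of key vectors, each of length d
    values: list of value vectors, one per key
    Returns (output, weights): output is the weighted average of values.
    """
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    # Each output component is a weighted sum over the value vectors.
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return output, weights


# Toy example: the query points in the same direction as the first key,
# so the first input position receives the larger weight.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
output, weights = attend(query, keys, values)
```

In a trained network the queries, keys, and values would typically be produced by learned projections of the input; here they are fixed by hand so that the weighting step is easy to follow.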
More in Deep Learning
Parameter-Efficient Fine-Tuning
Language Models: Methods for adapting large pretrained models to new tasks by updating only a small fraction of their parameters.
Diffusion Model
Generative Models: A generative model that learns to reverse a gradual noising process, generating high-quality samples from random noise.
Convolutional Layer
Architectures: A neural network layer that applies learnable filters across input data to detect local patterns and features.
Adapter Layers
Language Models: Small trainable modules inserted between frozen transformer layers that enable task-specific adaptation without modifying the original model weights.
Skip Connection
Architectures: A neural network shortcut that allows the output of one layer to bypass intermediate layers and be added to a later layer's output.
Tensor Parallelism
Architectures: A distributed computing strategy that splits individual layer computations across multiple devices by partitioning weight matrices along specific dimensions.
Flash Attention
Architectures: An IO-aware attention algorithm that reduces memory reads and writes by tiling the attention computation, enabling faster training of long-context transformer models.
Generative Adversarial Network
Generative Models: A framework where two neural networks compete: a generator creates synthetic data while a discriminator evaluates its authenticity.