Attention Head

Overview

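An attention head is one of several parallel attention computations inside a multi-head attention layer. Each head applies its own learned query, key, and value projections to the input, so different heads can learn to focus on different aspects of it. The outputs of all heads are concatenated, and typically passed through a final linear projection, to give a richer representation than a single attention computation would.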
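As a rough illustration of the definition above, the following NumPy sketch computes a single head with scaled dot-product attention and then concatenates several heads. The function names (`attention_head`, `multi_head_attention`), the sizes, and the random weights are illustrative assumptions, not taken from any particular library; masking, batching, and biases are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_head(x, w_q, w_k, w_v):
    """One head: project the input, score queries against keys, mix values."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

def multi_head_attention(x, heads, w_o):
    """Run every head on the same input, concatenate, then project."""
    outputs = [attention_head(x, *params)[0] for params in heads]
    return np.concatenate(outputs, axis=-1) @ w_o

# Hypothetical sizes chosen only for the example.
rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 4, 8, 2
d_head = d_model // n_heads

x = rng.normal(size=(seq_len, d_model))
# Each head gets its own (w_q, w_k, w_v) triple.
heads = [
    tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
    for _ in range(n_heads)
]
w_o = rng.normal(size=(n_heads * d_head, d_model))

out = multi_head_attention(x, heads, w_o)
print(out.shape)
```

Because each head has its own projection matrices, the attention weights returned by `attention_head` generally differ from head to head, which is what lets heads specialise on different aspects of the input.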
