Sparse Attention

Overview

An attention mechanism that computes attention scores for only a selected subset of token pairs rather than for all pairs, reducing the compute and memory cost of full self-attention in transformer models, which otherwise grows quadratically with sequence length.
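As a rough illustration of the definition above, the sketch below implements one common sparsity pattern, sliding-window (local) attention, in plain Python. The entry does not name a specific pattern, so the choice of a window is an assumption, and the function name `sliding_window_attention` and its `window` parameter are hypothetical names used only for this example.

```python
import math


def sliding_window_attention(queries, keys, values, window):
    """Each query i attends only to keys j with |i - j| <= window.

    Illustrative sketch of one sparsity pattern, not a reference implementation.
    Inputs are lists of equal-length float vectors.
    """
    n = len(queries)
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    pairs_scored = 0  # number of query-key scores actually computed

    for i in range(n):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)

        # Scaled dot-product scores for positions inside the window only.
        scores = [
            scale * sum(q * k for q, k in zip(queries[i], keys[j]))
            for j in range(lo, hi)
        ]
        pairs_scored += len(scores)

        # Numerically stable softmax over the window.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]

        # Weighted sum of the value vectors inside the window.
        outputs.append([
            sum(w * values[j][d] for w, j in zip(weights, range(lo, hi)))
            for d in range(len(values[0]))
        ])

    return outputs, pairs_scored


# Toy input: 8 tokens, 2 dimensions each.
tokens = [[float(i), 1.0] for i in range(8)]
_, sparse_pairs = sliding_window_attention(tokens, tokens, tokens, window=1)
_, full_pairs = sliding_window_attention(tokens, tokens, tokens, window=8)
print(sparse_pairs, full_pairs)
```

With 8 tokens and a window of 1, the sketch scores 22 token pairs instead of the 64 that full attention would score. For a fixed window the number of scored pairs grows linearly with sequence length rather than quadratically.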

Cross-References (2)

Deep Learning

Attention Mechanism

Transformer

Related in Models & Architecture

Tensor Processing Unit

Google's custom-designed application-specific integrated circuit for accelerating machine learning workloads.

Neural Processing Unit

A specialised processor designed to accelerate neural network computations in edge devices and mobile platforms.

Model Distillation

A technique where a smaller, simpler model is trained to replicate the behaviour of a larger, more complex model.

Model Pruning

The process of removing redundant or less important parameters from a neural network to reduce its size and computational cost.

Neural Architecture Search

An automated technique that uses search algorithms to find high-performing neural network architectures for a given task.

Model Quantisation

The process of reducing the numerical precision of a model's weights and activations from floating-point to lower-bit representations, decreasing memory usage and inference latency.

Model Collapse

A degradation phenomenon where AI models trained on AI-generated data progressively lose diversity and accuracy, converging toward a narrow distribution of outputs.

Neural Scaling Laws

Empirical relationships describing how AI model performance improves predictably with increases in model size, training data volume, and computational resources.

Speculative Decoding

An inference acceleration technique where a small draft model generates candidate token sequences that are verified in parallel by the larger target model.

More in Artificial Intelligence

AI Safety

Safety & Governance

The interdisciplinary field dedicated to making AI systems safe, robust, and beneficial while minimising the risk of unintended consequences.

Strong AI

Foundations & Theory

A theoretical form of AI that would have consciousness, self-awareness, and the ability to truly understand rather than simulate understanding.

Artificial Intelligence

Foundations & Theory

The simulation of human intelligence processes by computer systems, including learning, reasoning, and self-correction.

Direct Preference Optimisation

Training & Inference

A simplified alternative to RLHF that directly optimises language model policies using preference data without requiring a separate reward model.

AI Interpretability

Safety & Governance

The degree to which humans can understand the internal mechanics and reasoning of an AI model's predictions and decisions.

Commonsense Reasoning

Foundations & Theory

The AI capability to make inferences based on everyday knowledge that humans typically take for granted.

Constraint Satisfaction

Reasoning & Planning

A computational approach where problems are defined as a set of variables, domains, and constraints that must all be simultaneously satisfied.

Semantic Web

Foundations & Theory

An extension of the World Wide Web that enables machines to interpret and process web content through standardised semantic metadata.

See Also

Transformer

Attention Mechanism