Overview
A neural network trained to convert text passages into fixed-dimensional vectors that capture semantic meaning, enabling similarity search, clustering, and retrieval applications.
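To make the similarity-search use case concrete, the following minimal sketch ranks a few passages against a query by cosine similarity. The vectors are hand-written, hypothetical 4-dimensional stand-ins for what an embedding model would produce (real models typically output hundreds or thousands of dimensions), and the names `cosine_similarity`, `rank`, `corpus`, and `query_vector` are illustrative assumptions, not part of any particular library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: in practice these come from an embedding model.
corpus = {
    "How do I reset my password?": [0.9, 0.1, 0.0, 0.2],
    "Steps to recover account access": [0.8, 0.2, 0.1, 0.3],
    "Best pasta recipes for dinner": [0.0, 0.9, 0.7, 0.1],
}

# Hypothetical embedding of a query about account access.
query_vector = [0.85, 0.15, 0.05, 0.25]


def rank(query, passages):
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in passages.items()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranked = rank(query_vector, corpus)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

With these toy vectors, the two account-related passages score close to 1 and the unrelated passage ranks last. A production system would usually replace the linear scan with an approximate nearest-neighbour index.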
Cross-References (2)
More in Natural Language Processing
Abstractive Summarisation
Text Analysis
A text summarisation approach that generates novel sentences to capture the essential meaning of a document, rather than simply extracting and rearranging existing sentences.
BERT
Semantics & Representation
Bidirectional Encoder Representations from Transformers: a language model that builds contextual representations by reading the text on both sides of each token.
Semantic Similarity
Semantics & Representation
A measure of how closely the meanings of two text passages align, computed through embedding comparison and used in duplicate detection, search, and recommendation systems.
Chunking Strategy
Core NLP
The method of dividing long documents into smaller segments for embedding and retrieval, balancing context preservation against chunk sizes small enough for accurate vector search.
Byte-Pair Encoding
Parsing & Structure
A subword tokenisation algorithm that iteratively merges the most frequent pair of adjacent symbols (initially single characters) to build a vocabulary.
Instruction Following
Semantics & Representation
The capability of language models to accurately interpret and execute natural language instructions, a core skill developed through instruction tuning and alignment training.
Topic Modelling
Text Analysis
An unsupervised technique for discovering abstract topics that occur in a collection of documents.
Named Entity Recognition
Parsing & Structure
An NLP task that identifies and classifies named entities in text into categories such as person, organisation, and location.
See Also
Clustering
An unsupervised learning technique that groups similar data points together based on inherent patterns, without predefined labels.
Machine Learning
Neural Network
A computing system inspired by biological neural networks, consisting of interconnected nodes that process information in layers.
Deep Learning