Overview
Direct Answer
A neural network architecture built on an encoder–decoder framework that transforms an input sequence into an output sequence whose length and structure may differ from the input. Originally developed for machine translation, this design pattern has become foundational for sequence-to-sequence tasks across natural language processing.
How It Works
The encoder reads the input sequence and encodes it either into a single fixed-size context vector or, in attention-based variants, into a sequence of hidden states, one per input token. The decoder then consumes this representation to generate the output sequence autoregressively, predicting one token at a time conditioned on the encoder output and the previously generated tokens. Attention lets the decoder weight the most relevant input positions at each generation step, which substantially improves output quality, particularly on long sequences.
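The two ideas above, attention over encoder states and token-by-token decoding, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than a trained model: `attend`, `greedy_decode`, and `toy_step` are hypothetical names, the vectors are made up, and `toy_step` stands in for a real decoder network.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the decoder
    state, normalise the scores, and return the weighted average (context)."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

def greedy_decode(step_fn, encoder_states, bos, eos, max_len=20):
    """Autoregressive loop: feed the tokens generated so far back into the
    decoder step until it emits the end-of-sequence token or max_len is hit."""
    tokens = [bos]
    while len(tokens) < max_len:
        next_token = step_fn(tokens, encoder_states)
        tokens.append(next_token)
        if next_token == eos:
            break
    return tokens[1:]  # drop the start-of-sequence marker

# Made-up encoder outputs for a three-token input (assumption for illustration).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([0.0, 2.0], encoder_states)

def toy_step(tokens, encoder_states):
    """Hypothetical stand-in for a decoder: emits one placeholder token per
    input position, then the end-of-sequence token."""
    position = len(tokens) - 1
    return f"tok{position}" if position < len(encoder_states) else "</s>"

decoded = greedy_decode(toy_step, encoder_states, bos="<s>", eos="</s>")
```

In this toy run the second and third encoder states receive equal, larger attention weights than the first, because they align with the decoder state. A real decoder would use the resulting context vector, together with its own hidden state, to score the next token.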
Why It Matters
Organisations deploy this architecture to automate labour-intensive language tasks, which can reduce operational costs and response times in customer-facing systems. The pattern's flexibility enables handling variable-length inputs and outputs, which is essential for real-world applications where sentence structure and length differ between source and target languages.
Common Applications
Primary use cases include machine translation services, automated summarisation of documents and customer feedback, dialogue systems, and code generation. Question-answering systems also use this architecture, as does image captioning, which replaces the text encoder with one that processes images.
Key Considerations
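The fixed context vector can become a bottleneck for very long sequences, though attention mechanisms mitigate this problem. Two trade-offs remain important. Inference is computationally costly, particularly with beam search decoding, which expands several candidate sequences at every step. Training suffers from exposure bias: the decoder is trained on ground-truth prefixes but must condition on its own, possibly erroneous, predictions at inference time. Both call for careful tuning and regularisation.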
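To make the inference cost of beam search concrete, the sketch below keeps the `beam_width` best partial sequences at each step, so every step calls the model once per surviving hypothesis. It is a simplified illustration, not a production decoder: `beam_search`, `toy_next_log_probs`, and the `TOY_MODEL` probability table are hypothetical, and the scoring omits the length normalisation that practical systems usually add.

```python
import math

def beam_search(next_log_probs, bos, eos, beam_width=3, max_len=10):
    """Return (score, sequence) for the highest-scoring hypothesis found.
    next_log_probs(seq) must return a dict mapping token -> log-probability."""
    beams = [(0.0, [bos])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            # One model call per live hypothesis: cost grows with beam_width.
            for token, logp in next_log_probs(seq).items():
                candidates.append((score + logp, seq + [token]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was reached
    return max(finished, key=lambda c: c[0])

# Made-up next-token probabilities keyed by the last token (assumption).
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "w": 0.1},
    "x": {"</s>": 1.0}, "y": {"</s>": 1.0},
    "z": {"</s>": 1.0}, "w": {"</s>": 1.0},
}

def toy_next_log_probs(seq):
    """Hypothetical stand-in for a decoder's next-token distribution."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[seq[-1]].items()}

best_score, best_seq = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=2)
greedy_score, greedy_seq = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=1)
```

With a beam width of 1 the search is greedy: it commits to "a" as the locally most likely first token and ends with a sequence of probability 0.30. With a width of 2 it also keeps "b" alive and finds the sequence ending in "z", whose overall probability is 0.36, at roughly twice the number of model calls.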
Cross-References
More in Natural Language Processing
Text Summarisation
Text Analysis: The process of creating a concise and coherent summary of a longer text document while preserving key information.
Instruction Following
Semantics & Representation: The capability of language models to accurately interpret and execute natural language instructions, a core skill developed through instruction tuning and alignment training.
Text-to-Speech
Speech & Audio: Technology that converts written text into natural-sounding spoken audio using neural networks, enabling voice interfaces, accessibility tools, and content narration.
Chunking Strategy
Core NLP: The method of dividing long documents into smaller segments for embedding and retrieval, balancing context preservation with optimal chunk sizes for vector search accuracy.
Text-to-SQL
Generation & Translation: The task of automatically converting natural language questions into executable SQL queries, enabling non-technical users to interrogate databases through conversational interfaces.
GloVe
Semantics & Representation: Global Vectors for Word Representation, an unsupervised learning algorithm for obtaining word vector representations from aggregated word co-occurrence statistics.
RLHF
Semantics & Representation: Reinforcement Learning from Human Feedback, a technique for aligning language models with human preferences through reward modelling.
Context Window
Semantics & Representation: The maximum amount of text a language model can consider at once when generating a response.