Overview
Direct Preference Optimization (DPO) is a simplified alternative to RLHF that directly optimises a language model policy on preference data, without requiring a separate reward model.
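The core idea can be sketched as a loss over a single preference pair: the policy is pushed to raise the log-probability of the preferred response relative to a frozen reference model, and lower it for the rejected one. A minimal illustration (function and argument names are ours, not from any particular library; `beta` controls how far the policy may drift from the reference):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the total log-probability a model assigns to a
    full response given the prompt; *policy* is the model being
    trained, *ref* is the frozen reference (e.g. the SFT model).
    """
    # Implicit rewards: log-ratio of policy to reference probabilities.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_reward - rejected_reward)
    # Loss is -log(sigmoid(logits)), computed stably via log1p(exp(-x)).
    return math.log1p(math.exp(-logits))
```

When the policy matches the reference exactly, the loss is log 2; it falls below that as the policy learns to prefer the chosen response more strongly than the reference does. No reward model appears anywhere: preferences shape the policy directly.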
See Also
Language Model
Natural Language Processing
A probabilistic model that assigns probabilities to sequences of words, enabling prediction of the next word in a sequence.
RLHF
Natural Language Processing
Reinforcement Learning from Human Feedback, a technique for aligning language models with human preferences through reward modelling.