Reinforcement Learning from Human Feedback

Overview

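A training paradigm in which an AI model is refined using human preference signals. Human judgements of model outputs are typically used to fit a reward model, and the model is then optimised against that learned reward so that its outputs better match human values and quality expectations.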
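
The reward modelling step mentioned above can be illustrated with a small sketch. It assumes a pairwise (Bradley-Terry style) preference loss, which is one common choice and is not specified by this entry; the function name `preference_loss` and the scalar reward inputs are hypothetical, chosen only for illustration.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Assumption: a Bradley-Terry style model, where the probability that a
    human prefers the chosen response is sigmoid(reward difference).
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() is never called on a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the reward model already ranks the
# human-preferred response higher, and large when it ranks it lower.
agrees = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees < disagrees)
```

Minimising a loss of this kind over many human-labelled comparisons pushes the reward model to score preferred outputs higher; the resulting scores can then serve as the training signal for the model being refined.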
