
RLHF

Overview

Reinforcement Learning from Human Feedback — a technique for aligning language models with human preferences through reward modelling. Typically, human annotators rank candidate model outputs, a reward model is trained to predict those rankings, and the language model is then fine-tuned with reinforcement learning (commonly PPO) to maximise the learned reward while staying close to its original behaviour.
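The reward model is usually trained on pairwise preferences with a Bradley-Terry style loss: minimise the negative log-sigmoid of the score gap between the preferred and rejected response. A minimal sketch (the function name and toy scores are illustrative, not from any specific library):

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood for one preference pair:
    -log(sigmoid(r_chosen - r_rejected))."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# The loss shrinks as the reward model scores the preferred response higher.
print(round(preference_loss(2.0, 0.0), 4))  # small loss: model agrees with the human label
print(round(preference_loss(0.0, 2.0), 4))  # large loss: model disagrees
```

Summing this loss over many labelled pairs and minimising it by gradient descent yields the scalar reward used in the RL fine-tuning stage.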
