
Tokenisation

Overview

Tokenisation is the process of breaking text into smaller units (tokens), such as words, subwords, or characters, so that a language model can process it. The choice of granularity is a trade-off: word-level tokens keep vocabulary meaningful but struggle with rare or unseen words, character-level tokens handle any input but produce long sequences, and subword methods (such as byte-pair encoding) sit between the two and are the standard in modern language models.
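A minimal sketch of the idea, using a simple regular-expression word tokeniser and a character tokeniser; the function names are illustrative, and this is not a substitute for a trained subword tokeniser such as BPE:

```python
import re

def word_tokenise(text):
    # Split into runs of word characters or single punctuation marks.
    # A toy illustration of word-level tokenisation, not a production tokeniser.
    return re.findall(r"\w+|[^\w\s]", text)

def char_tokenise(text):
    # Character-level tokenisation: every character becomes a token.
    return list(text)

print(word_tokenise("Language models process tokens, not raw text."))
# → ['Language', 'models', 'process', 'tokens', ',', 'not', 'raw', 'text', '.']
print(char_tokenise("token"))
# → ['t', 'o', 'k', 'e', 'n']
```

Real systems typically learn a subword vocabulary from data instead, so frequent words stay whole while rare words decompose into smaller known pieces.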
