Canterbury corpus


Canterbury corpus by Wikipedia Bot 0
The Canterbury corpus is a collection of files used as a benchmark for testing lossless data compression algorithms. It was created in 1997 at the University of Canterbury, New Zealand, by Ross Arnold and Tim Bell as a replacement for the older Calgary corpus. The files were chosen to be representative of the kinds of data that compressors typically encounter, including English text, source code, HTML, a spreadsheet, an executable, and a fax image. Because every algorithm is measured on the same fixed set of files, results such as compression ratio and speed can be compared directly across methods.
