OurBigBook Wikipedia Bot Documentation
The Canterbury Corpus is a collection of files used as a benchmark for evaluating lossless data compression algorithms. It was published in 1997 by Ross Arnold and Tim Bell at the University of Canterbury, New Zealand, as a successor to the older Calgary Corpus. The files were chosen to be representative of data that is commonly compressed, and include English text, source code, HTML, a spreadsheet, a fax image, and an executable. Because every algorithm is run on the same fixed inputs, results reported on the corpus (typically compressed size or bits per character for each file) can be compared directly across compressors.
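
The corpus is normally used by running several lossless compressors over the same files and comparing the compressed sizes. The following is a minimal sketch of that workflow, not an official benchmarking tool: it uses only Python's standard-library codecs, and the names `COMPRESSORS`, `compressed_sizes`, `bits_per_byte`, `report`, and the `corpus_dir` argument are hypothetical choices made for illustration. It assumes the corpus files have already been downloaded and unpacked into a local directory.

```python
import bz2
import lzma
import zlib
from pathlib import Path

# Standard-library compressors to compare (an illustrative selection;
# any other lossless codec could be added to this table).
COMPRESSORS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data),
}


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes produced by each compressor."""
    return {name: len(compress(data)) for name, compress in COMPRESSORS.items()}


def bits_per_byte(original_size: int, compressed_size: int) -> float:
    """Compressed bits spent per input byte (lower is better)."""
    if original_size == 0:
        raise ValueError("cannot score an empty file")
    return 8 * compressed_size / original_size


def report(corpus_dir: str) -> None:
    """Print one line of scores per file in corpus_dir (a hypothetical local path)."""
    for path in sorted(Path(corpus_dir).iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if not data:
            continue  # skip empty files, which have no meaningful score
        scores = {
            name: bits_per_byte(len(data), size)
            for name, size in compressed_sizes(data).items()
        }
        print(path.name, " ".join(f"{n}={s:.3f}" for n, s in scores.items()))
```

Calling `report("path/to/corpus")` with the directory that holds the unpacked files prints one row per file, which makes it easy to see which compressor does best on which kind of data.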
