Heaps' law

In linguistics, Heaps' law is an empirical law that describes the number of distinct words in a document (or set of documents) as a function of the document's length.
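
As a rough illustration of the quantity the law describes, the following minimal Python sketch counts the distinct words seen after each token of a text. The function name vocabulary_growth, the whitespace tokenization, and the case folding are assumptions made for this example, not part of the law itself.

```python
def vocabulary_growth(tokens):
    """Return the number of distinct tokens seen after each position."""
    seen = set()
    growth = []
    for token in tokens:
        # Assumption: words are compared case-insensitively.
        seen.add(token.lower())
        growth.append(len(seen))
    return growth


# Toy input; a real study would use a large corpus and a proper tokenizer.
sample = "the cat sat on the mat and the dog sat on the cat".split()
growth = vocabulary_growth(sample)
for length, distinct in enumerate(growth, start=1):
    print(length, distinct)
```

Plotting the distinct-word count against the text length for a large corpus would give the kind of curve that Heaps' law is meant to describe.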

Heaps' law implies that as more text is gathered, each additional portion contributes fewer previously unseen words, so there are diminishing returns in discovering the full vocabulary from which the distinct terms are drawn.
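
The diminishing returns can be made explicit using the form in which the law is commonly stated. The symbols below are not defined in the text above and are introduced here as assumptions: V_R(n) is the number of distinct words in a text of n words, and K and \beta are free parameters fitted empirically, with 0 < \beta < 1.

```latex
V_R(n) = K n^{\beta}, \qquad 0 < \beta < 1
% Rate at which new distinct words appear as the text grows:
\frac{dV_R}{dn} = K \beta \, n^{\beta - 1}
% Since \beta - 1 < 0, this rate decreases toward 0 as n increases,
% while V_R(n) itself keeps growing without bound.
```

Under this form, the vocabulary keeps growing as the text grows, but each additional word of text is less likely to be new than the one before it.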


