Incidence matrices for -ness and -ity

The following data files are available for download.

CEEC, 17th century

Incidence matrices for -ness and -ity in the 17th-century part of the Corpus of Early English Correspondence (1998).

Browse the directory with all files. Some files are relatively large; compressed versions of those files are also available (with the suffix .zip or .gz).

incidence-ness-all.txt (0.5 MB) and incidence-ity-all.txt (0.3 MB): Incidence matrices for -ness and -ity in plain text format. These files can be used directly with our software; the file format is explained in the documentation of the software.

list-ness-all.txt and list-ity-all.txt: Plain text files that give the following information for each sample: the name of the sample, the number of running words, and a list of the -ness or -ity types that occur in the sample. Only distinct types are listed; the number of occurrences is omitted.

incidence.xls (3 MB): Incidence matrices in Microsoft Excel format. The first worksheet contains the incidence matrix for -ity and the second worksheet for -ness. Rows are samples, columns are types, and the last column is the number of running words.
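As a rough illustration of the layout described above (rows are samples, columns are types, last column is the number of running words), the sketch below computes the number of distinct types and the total running words for a chosen set of samples. The sample names, type names, and numbers are invented for illustration and are not taken from the data files; the function `summarize` is hypothetical and is not part of our software. The sketch assumes that any nonzero entry means the type occurs in the sample.

```python
# Toy incidence matrix in the layout described for incidence.xls:
# one row per sample, one column per type, last column = running words.
# All names and numbers here are invented for illustration only.
types = ["ability", "necessity", "quality"]
rows = {
    "sample_a": [1, 0, 1, 1200],
    "sample_b": [0, 0, 1, 800],
    "sample_c": [1, 1, 0, 500],
}


def summarize(sample_names):
    """Return (distinct type count, total running words) for the given samples."""
    seen = set()
    words = 0
    for name in sample_names:
        # Split the row into the per-type entries and the final word count.
        *incidence, running_words = rows[name]
        words += running_words
        # Assumption: a nonzero entry means the type occurs in the sample.
        seen.update(t for t, present in zip(types, incidence) if present)
    return len(seen), words


print(summarize(["sample_a", "sample_b"]))  # (2, 2000)
```

Because the type count of a set of samples is the size of a union rather than a sum, it cannot be obtained by adding up per-sample type counts; the matrix layout keeps enough information to recompute it for any subset of samples.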

summary-short.txt: Summary information on subcorpora. Plain text file, tab-separated values.

person-summary-short.txt: Summary information on samples. Plain text file, tab-separated values.


Tanja Säily and Jukka Suomela. “Comparing type counts: The case of women, men and -ity in early English letters.” In Corpus Linguistics: Refinements and Reassessments, edited by Antoinette Renouf and Andrew Kehoe, 87–109. Amsterdam: Rodopi, 2009.

Contact information

For further information, please contact: