The TLIO Corpus is a lemmatized corpus that includes the reference texts for the TLIO. 156 previously absent texts have been added, although 2 of them were already in the OVI Corpus (see list here).
The new version of the TLIO Corpus published online today includes 3,210 texts (37 more than the October 2, 2022 version), for a total of 23,814,549 occurrences (an increase of 128,915), 494,385 distinct graphic forms, 126,208 lemmas, and 4,622,327 lemmatized occurrences (an increase of 96,236).
The OVI Corpus is a non-lemmatized corpus (though searchable with the “lemmi muti” GATTOWeb function) that includes the TLIO Corpus and extends it to all published texts dating from before the end of the 14th century. It aims to allow the entire textual heritage of early Italian to be queried.
The new version of the OVI Corpus published online today includes 3,447 texts (35 more than the October 2, 2022 version, the same texts added to the TLIO Corpus; see list here), for a total of 30,245,108 occurrences (an increase of 68,480) and 553,868 distinct graphic forms.