It is a lemmatized corpus that includes the reference texts for the TLIO.
The editions of 8 texts previously included in the TLIO Corpus in outdated editions have been updated (see list here), and 95 texts not previously included have been added (see list here).
The new version of the TLIO Corpus published online today includes 3,452 texts (an increase of 99 over the June 16, 2025 version), for a total of 24,285,827 occurrences (an increase of 125,630), 503,539 distinct graphic forms, 128,196 lemmas, and 4,904,409 lemmatized occurrences (an increase of 47,745).
It is a non-lemmatized corpus (though searchable with the “lemmi muti” GattoWeb function) that includes the TLIO Corpus and extends it to all published texts dating from before the end of the 14th century. It is the corpus intended to allow the entire textual heritage of early Italian to be queried.
16 texts not previously included, which do not meet the inclusion criteria of the TLIO Corpus, have been added (see list here).
The new version of the OVI Corpus published online today includes 3,840 texts (an increase of 115 over the June 16, 2025 version), for a total of 31,222,521 occurrences (an increase of 397,235) and 566,393 distinct graphic forms.