CoDiAJe - Ladino Corpus
Welcome to CoDiAJe - the Annotated Diachronic Corpus of Judeo-Spanish.
CoDiAJe is a structured, multi-genre diachronic corpus of text samples from the 16th to the 21st century, classified by text type, period, and geographical origin, and enriched automatically or semi-automatically with several kinds of linguistic annotation.
CoDiAJe is also accompanied by metadata providing information on the authors (birthplace, place of residence, social status, etc.) and the documents (text type, date, location, alphabet, print/manuscript, original/translation).
The digital edition workflow in CoDiAJe consists of two main tasks: (1) linguistic processing and annotation of the documents with NLP tools (Freeling: http://nlp.lsi.upc.edu/freeling/ and Neotag: http://www.lrec-conf.org/proceedings/lrec2012/summaries/1098.html), and (2) encoding of the metadata and the linguistic annotation in XML within the texts, so that they can be visualized and searched via TEITOK.
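As a rough illustration of the second task, the sketch below wraps already-annotated tokens in XML using Python's standard library. It is not the project's actual pipeline: the function name `encode_tokens`, the element layout, the attribute names (`lemma`, `pos`), the tag labels, and the sample tokens and metadata are all assumptions invented for this example. The tagger output is supplied by hand rather than produced by Freeling or Neotag.

```python
import xml.etree.ElementTree as ET


def encode_tokens(tokens, metadata):
    """Build a small TEI-like XML tree from (form, lemma, pos) triples.

    The element and attribute names are illustrative assumptions,
    not the schema used by CoDiAJe.
    """
    tei = ET.Element("TEI")

    # Document-level metadata goes in the header (hypothetical layout).
    header = ET.SubElement(tei, "teiHeader")
    for key, value in metadata.items():
        note = ET.SubElement(header, "note", {"type": key})
        note.text = value

    # Each token becomes one element carrying its annotations as attributes.
    text = ET.SubElement(tei, "text")
    sentence = ET.SubElement(text, "s")
    for index, (form, lemma, pos) in enumerate(tokens, start=1):
        tok = ET.SubElement(
            sentence, "tok", {"id": f"w-{index}", "lemma": lemma, "pos": pos}
        )
        tok.text = form
        tok.tail = " "  # keep a space between tokens in the running text
    return tei


# Invented sample data standing in for tagger output and catalogue metadata.
sample_tokens = [("la", "el", "DET"), ("kaza", "kaza", "NOUN")]
sample_metadata = {"text_type": "letter", "alphabet": "Latin"}

root = encode_tokens(sample_tokens, sample_metadata)
print(ET.tostring(root, encoding="unicode"))
```

Keeping the annotations as attributes on each token element means the running text stays readable in the XML, while a search interface can still query the lemma or part of speech directly.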
How to cite this corpus
CoDiAJe - The Annotated Diachronic Corpus of Judeo-Spanish. Director: Aldina Quintana. Available online at http://corptedig-glif.upf.edu/teitok/codiaje/ Accessed on [date].
CoDiAJe is part of two research projects supported by the Israel Science Foundation (ISF) (grants No. 473/11 and No. 486/19).