Powered by TEITOK
© Maarten Janssen, 2014-

CoDiAJe - Ladino Corpus


 

Welcome to CoDiAJe - the Annotated Diachronic Corpus of Judeo-Spanish.

CoDiAJe is a structured, multi-genre diachronic corpus of text samples from the 16th century to the 21st century, classified by type, period, and geographical origin, and enriched automatically or semi-automatically with different types of linguistic annotations.

CoDiAJe is also accompanied by metadata providing information on the authors (birth place, place of residence, social status, etc.) and on the documents (text type, date, place, alphabet, print/manuscript, original/translation).

The digital edition workflow in CoDiAJe consists of two main tasks: (1) linguistic processing and annotation of the documents with NLP tools (Freeling: http://nlp.lsi.upc.edu/freeling/ and Neotag: http://www.lrec-conf.org/proceedings/lrec2012/summaries/1098.html), and (2) encoding of the metadata and linguistic annotations in XML within the texts, so that they can be visualized and searched via TEITOK.
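As a rough illustration of the second task, the sketch below reads token-level annotations from a small XML fragment and filters them by lemma. The fragment, the element name `tok`, the attribute names `lemma` and `pos`, the tag values, and the helper names `read_tokens` and `find_by_lemma` are all assumptions made for this example; they are not taken from the CoDiAJe files, whose actual encoding may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names, attribute names, and tag values
# are illustrative assumptions, not the actual CoDiAJe encoding.
SAMPLE = """<text>
  <s id="s-1">
    <tok id="w-1" lemma="el" pos="DA">el</tok>
    <tok id="w-2" lemma="livro" pos="NC">livro</tok>
  </s>
</text>"""


def read_tokens(xml_string):
    """Return one dict per <tok> element with its form and annotations."""
    root = ET.fromstring(xml_string)
    return [
        {
            "id": tok.get("id"),
            "form": (tok.text or "").strip(),
            "lemma": tok.get("lemma"),
            "pos": tok.get("pos"),
        }
        for tok in root.iter("tok")
    ]


def find_by_lemma(tokens, lemma):
    """Return the tokens whose lemma attribute equals the given lemma."""
    return [t for t in tokens if t["lemma"] == lemma]


if __name__ == "__main__":
    tokens = read_tokens(SAMPLE)
    for hit in find_by_lemma(tokens, "livro"):
        print(hit["id"], hit["form"], hit["pos"])
```

Storing annotations as attributes on each token keeps the running text and its linguistic information in a single file, which is what makes combined display and search possible in a tool such as TEITOK.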

 

How to cite this corpus

CoDiAJe - The Annotated Diachronic Corpus of Judeo-Spanish. Director: Aldina Quintana. Available online at http://corptedig-glif.upf.edu/teitok/codiaje/ Accessed on [date].

2015-12-09