Garabík, R., Caravolas, M., Kessler, B., Hoeflerová, E., Masterson, J., Mikulajová, M., Szczerbiński, M., & Wierzchoń, P.
Slovko 2007: Fourth International Seminar in Natural Language Processing, Computational Lexicography and Terminology, 25-27 October 2007, Bratislava, Slovakia
A cross linguistic database of children's printed words in three Slavic languages
Levická, J. & Garabík, R.

We describe a lexical database consisting of morphologically and phonetically tagged words that occur in the texts primarily used for language arts instruction in the Czech Republic, Poland and Slovakia in the initial period of primary education (up to grade 4 or 5). The database aims to parallel the contents and usage of the British English Children's Printed Word Database. It contains words from the texts of the most widely used Czech, Polish and Slovak textbooks. The corpus is accessible via a simple WWW interface that allows regular expression searches and Boolean expressions across word forms, lemmas, morphology tags and phonemic transcriptions, and that provides useful statistics on the text words included. We anticipate extensive use of the database as a reference in the development of psychodiagnostic batteries for literacy impairments in the three languages, as well as for the creation of experimental materials in psycholinguistic research.
