Abstract
In this article, we introduce HelexKids, an online written-word database for Greek-speaking children in primary education (Grades 1 to 6). The database is organized on a grade-by-grade basis, and on a cumulative basis by combining Grade 1 with Grades 2 to 6. It provides values for Zipf, frequency per million, dispersion, estimated word frequency per million, standard word frequency, contextual diversity, orthographic Levenshtein distance, and lemma frequency. These values are derived from 116 textbooks used in primary education in Greece and Cyprus, producing a total of 68,692 different word types. HelexKids was developed to assist researchers in studying language development, educators in selecting age-appropriate items for teaching, as well as writers and authors of educational books for Greek/Cypriot children. The database is open access and can be searched online at www.helexkids.org.
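As a rough illustration of how three of the measures listed above relate to raw counts, the sketch below computes frequency per million, a Zipf value, and contextual diversity for a toy corpus. It rests on assumptions that are not taken from the article: the Zipf value uses the common definition log10(frequency per million) + 3 with no smoothing, contextual diversity is taken as the number of documents containing a word, and the function name `word_stats` and the sample tokens are hypothetical. The formulas HelexKids actually uses may differ.

```python
import math
from collections import Counter


def word_stats(documents):
    """Compute simple lexical statistics for a toy corpus.

    documents: list of token lists, one per document (a hypothetical
    stand-in for tokenized textbooks; not the HelexKids pipeline).
    """
    total_tokens = sum(len(doc) for doc in documents)
    # Raw token counts across the whole corpus.
    counts = Counter(tok for doc in documents for tok in doc)
    # Number of documents in which each word occurs at least once.
    doc_counts = Counter(tok for doc in documents for tok in set(doc))

    stats = {}
    for word, n in counts.items():
        fpm = n / total_tokens * 1_000_000  # frequency per million tokens
        stats[word] = {
            "count": n,
            "fpm": fpm,
            # Assumed definition: Zipf = log10(frequency per million) + 3.
            "zipf": math.log10(fpm) + 3,
            # Assumed definition: number of documents containing the word.
            "contextual_diversity": doc_counts[word],
        }
    return stats


docs = [
    ["το", "παιδί", "το", "σχολείο"],
    ["το", "βιβλίο"],
    ["παιδί", "βιβλίο", "το", "το"],
]
stats = word_stats(docs)
print(stats["το"])
```

With a corpus this small the per-million and Zipf values are inflated far beyond anything seen in a real corpus. The example only shows how the measures are derived from counts.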
Original language | English |
---|---|
Pages (from-to) | 83-96 |
Number of pages | 14 |
Journal | Behavior Research Methods |
Volume | 49 |
Issue number | 1 |
DOIs | |
Publication status | Published (VoR) - 1 Feb 2017 |
Funding
This research was supported by a University of Dundee PhD studentship, 2014–2016. Many thanks are due to Nikos Glaros and the Institute of Language and Speech Processing (Athens, Greece) for providing us with the Symfonia software. We also express our gratitude to Athanasios Protopapas (University of Athens), Antonis Kyparissiadis (University of Nottingham), Emanuel Keuleers (Ghent University), and Aggelos Papaloudis for their valuable advice and assistance.
Funders | Funder number |
---|---|
University of Dundee | |
Keywords
- Children
- Contextual diversity
- Frequency
- Greek language
- Word database