HelexKids: A word frequency database for Greek and Cypriot primary school children

Aris R. Terzopoulos*, Lynne G. Duncan, Mark A.J. Wilson, Georgia Z. Niolaki, Jackie Masterson

*Corresponding author for this work

    Research output: Contribution to journal, Article, peer-review

    14 Citations (SciVal)


    In this article, we introduce HelexKids, an online written-word database for Greek-speaking children in primary education (Grades 1 to 6). The database is organized on a grade-by-grade basis, and on a cumulative basis by combining Grade 1 with Grades 2 to 6. It provides values for Zipf, frequency per million, dispersion, estimated word frequency per million, standard word frequency, contextual diversity, orthographic Levenshtein distance, and lemma frequency. These values are derived from 116 textbooks used in primary education in Greece and Cyprus, producing a total of 68,692 different word types. HelexKids was developed to assist researchers in studying language development, educators in selecting age-appropriate items for teaching, as well as writers and authors of educational books for Greek/Cypriot children. The database is open access and can be searched online at www.helexkids.org.
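    As an illustration of three of the measures listed above, the following is a minimal sketch and not the HelexKids implementation. It assumes the commonly used definition of the Zipf scale as log10(frequency per million) + 3 and the plain Levenshtein edit distance between two spellings. Whether HelexKids applies smoothing or other corrections is not stated in this record, and the function names and the example counts are hypothetical.

```python
import math


def frequency_per_million(count: int, corpus_tokens: int) -> float:
    """Raw occurrences of a word scaled to a corpus of one million tokens."""
    return count * 1_000_000 / corpus_tokens


def zipf(count: int, corpus_tokens: int) -> float:
    """Zipf value, assuming the definition log10(frequency per million) + 3.

    No smoothing is applied here, so count must be at least 1.
    """
    return math.log10(frequency_per_million(count, corpus_tokens)) + 3


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-letter insertions, deletions, or substitutions
    needed to turn spelling a into spelling b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a letter from a
                            curr[j - 1] + 1,      # insert a letter into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical example: a word seen 50 times in a corpus of 2,000,000 tokens.
fpm = frequency_per_million(50, 2_000_000)
z = zipf(50, 2_000_000)
d = levenshtein("καλός", "κακός")
```

    On this scale a word occurring once per million tokens has a Zipf value of 3, and each additional unit corresponds to a tenfold increase in frequency.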
    Original language: English
    Pages (from-to): 83-96
    Number of pages: 14
    Journal: Behavior Research Methods
    Issue number: 1
    Publication status: Published (VoR) - 1 Feb 2017


    This research was supported by a University of Dundee PhD studentship, 2014–2016. Many thanks are due to Nikos Glaros and the Institute of Language and Speech Processing (Athens, Greece) for providing us with the Symfonia software. We also express our gratitude to Athanasios Protopapas (University of Athens), Antonis Kyparissiadis (University of Nottingham), Emanuel Keuleers (Ghent University), and Aggelos Papaloudis for their valuable advice and assistance.

    Funders: University of Dundee


      • Children
      • Contextual diversity
      • Frequency
      • Greek language
      • Word database

