Study on Difficulty-Level Classification for English Writings

Authors

  • Hiromi Ban, Faculty of Engineering, Sanjo City University, Niigata, Japan.
  • Rei Oguri, Graduate School of Natural Science and Technology, Kanazawa University, Ishikawa, Japan.
  • Haruhiko Kimura, Faculty of Production Systems Engineering and Sciences, Komatsu University, Ishikawa, Japan.

DOI:

https://doi.org/10.9734/bpi/crlle/v4/3409E

Keywords:

Accuracy, difficulty-level, F-measure, machine learning

Abstract

E-books have recently gained in popularity, and as their number grows, categorizing all of them manually becomes very time-consuming. If English sentences are classified according to their difficulty level, a foreign-language book appropriate for the reader's level of English proficiency can be recommended. With the aim of classifying English text by difficulty level through machine learning, this study extracts eleven types of attributes from English text data. Learning and categorization are evaluated using leave-one-out cross-validation. To improve accuracy, a further experiment varies the size of the text data and applies an attribute selection method. As a result, accuracy improves to 77.04% and F-measure to 63.96%. Erroneous identification resulting from the impact of columns between sentences is also noted.
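
For readers unfamiliar with the evaluation protocol named in the abstract, the following is a minimal Python sketch of leave-one-out cross-validation scored with accuracy and F-measure. It is not the authors' implementation. The two features (average word length and average sentence length), the toy samples, the 1-nearest-neighbour classifier, and the macro-averaged F-measure are all assumptions chosen for illustration, since the abstract does not name the eleven attributes, the learning algorithm, or the averaging scheme.

```python
# Hypothetical toy data: (avg. word length, avg. sentence length) -> difficulty label.
# These features and values are illustrative assumptions, not the paper's attributes.
SAMPLES = [
    ((3.8, 8.0), "easy"),
    ((4.0, 9.5), "easy"),
    ((4.1, 10.0), "easy"),
    ((4.9, 18.0), "hard"),
    ((5.2, 21.0), "hard"),
    ((4.3, 11.0), "hard"),  # deliberately close to the "easy" samples
]


def predict_1nn(train, query):
    """Return the label of the training sample nearest to `query`
    (squared Euclidean distance). A stand-in classifier, assumed here."""
    nearest = min(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )
    return nearest[1]


def leave_one_out(samples):
    """Hold out each sample once, train on the rest,
    and collect (true_label, predicted_label) pairs."""
    pairs = []
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        pairs.append((label, predict_1nn(train, features)))
    return pairs


def accuracy(pairs):
    """Fraction of held-out samples whose prediction matches the true label."""
    return sum(t == p for t, p in pairs) / len(pairs)


def macro_f_measure(pairs):
    """Unweighted mean of the per-class F-measure (harmonic mean of
    precision and recall). Macro averaging is an assumption."""
    scores = []
    for c in sorted({t for t, _ in pairs}):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    results = leave_one_out(SAMPLES)
    print(f"accuracy  = {accuracy(results):.4f}")
    print(f"F-measure = {macro_f_measure(results):.4f}")
```

Because each text serves as the test item exactly once, this protocol makes full use of a small corpus, at the cost of training one model per sample. Accuracy and F-measure can diverge, as they do in the reported results, when errors are spread unevenly across the difficulty classes.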

Published

2022-03-09

How to Cite

Hiromi Ban, Rei Oguri, & Haruhiko Kimura. (2022). Study on Difficulty-Level Classification for English Writings. Current Research in Language, Literature and Education Vol. 4, 56–64. https://doi.org/10.9734/bpi/crlle/v4/3409E