Semi-Automated Text Categorization Using Demonstration and Integration Based Term Set

Authors

  • M. Pushpa, Department of Computer Science, Quaid-e-Millath Government Arts College for Women, Chennai, India.
  • K. Nirmala, Department of Computer Science, Quaid-e-Millath Government Arts College for Women, Chennai, India.
  • J. Vijayalakshmi, Department of Information Technology, Sri Sai Ram Engineering College, West Tambaram, Chennai, India.

DOI:

https://doi.org/10.9734/bpi/naer/v1/8766D

Keywords:

Text mining, text characterization, feature selection, text tokenization, FPI and instructional phase

Abstract

Manual analysis of massive amounts of textual data requires an enormous amount of processing time and effort to interpret the text and organize it in the required format. A major problem in text or document categorization is the high dimensionality of the feature space. Many methods are now available for text feature selection. This paper presents one such semi-automated feature selection methodology for text categorization, designed to handle large volumes of data using two phases of David Merrill’s First Principles of Instruction (FPI). It uses a predefined group of categories, each provided with a suitable training set based on the demonstration and integration phases of FPI. The methodology involves text tokenization, text categorization, and text analysis.

Published

2021-06-24

How to Cite

M. Pushpa, K. Nirmala, & J. Vijayalakshmi. (2021). Semi-Automated Text Categorization Using Demonstration and Integration Based Term Set. New Approaches in Engineering Research Vol. 1, 11–19. https://doi.org/10.9734/bpi/naer/v1/8766D