Volume 2 Number 3 (May 2013)
IJCCE 2013 Vol.2(3): 236-240 ISSN: 2010-3743
DOI: 10.7763/IJCCE.2013.V2.179

Using Unlabeled Data to Improve Author Identification

R. Guzmán Cabrera, J. R. Guzmán Sepúlveda, J. A. Gordillo Sosa, M. Torres Cisneros, and J. Herrera Cabral
Abstract—Authorship attribution can be treated as a text categorization problem. Text categorization requires a large number of training examples, which are particularly difficult to obtain for the authorship attribution task. In this paper, we investigate the use of Web-based text-mining methods to identify the author of a given poem. In particular, we propose a semi-supervised method that is especially suited to working with just a few training examples, in order to address the lack of data written in the same style. The results obtained on poem categorization show that this method may significantly improve classification accuracy and that it is appropriate for attributing short documents.

Index Terms—Authorship attribution, text classification, machine learning.
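
The abstract does not spell out the algorithm, so the following is only a generic self-training sketch of how unlabeled text might be used to enlarge a very small labeled set; it should not be read as the authors' method. The classifier (a word-count Naive Bayes), the function names `train_nb`, `predict`, and `self_train`, and the `min_margin` and `max_rounds` parameters are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real stylistic features.
    return text.lower().split()


def train_nb(labeled):
    """Fit per-author word counts from a list of (text, author) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text, author in labeled:
        tokens = tokenize(text)
        word_counts[author].update(tokens)
        doc_counts[author] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return (best_author, margin); margin is the log-score gap to the runner-up."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for author in doc_counts:
        total_words = sum(word_counts[author].values())
        score = math.log(doc_counts[author] / total_docs)
        for tok in tokenize(text):
            # Add-one smoothing, with one extra slot reserved for unseen words.
            count = word_counts[author][tok]
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        scores[author] = score
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    margin = ranked[0][1] - ranked[1][1] if len(ranked) > 1 else float("inf")
    return ranked[0][0], margin


def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=5):
    """Iteratively move confidently classified unlabeled texts into the training set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(labeled)
        confident, rest = [], []
        for text in pool:
            author, margin = predict(model, text)
            if margin >= min_margin:
                confident.append((text, author))
            else:
                rest.append(text)
        if not confident:
            break  # nothing new was learned, so stop early
        labeled.extend(confident)
        pool = rest
    return train_nb(labeled), labeled
```

Calling `self_train(seed_examples, web_snippets)` with a handful of `(text, author)` pairs and a list of unlabeled strings returns the final model and the enlarged training set. The confidence margin is what keeps ambiguous snippets out of the training data, which matters when the seed set is only a few short poems per author.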

R. Guzmán-Cabrera and M. Torres Cisneros are with the Grupo de NanoBioFotónica, DICIS, Universidad de Guanajuato, Salamanca, Gto., México (e-mail: guzmanc@ugto.mx, mtorres@ugto.mx).
J. R. Guzmán-Sepúlveda is with the Departamento de Electrónica, UAM Reynosa-Rodhe, Universidad Autónoma de Tamaulipas, Carr. Reynosa-San Fernando S/N, Reynosa, Tamaulipas 88779, México (e-mail: jrafael_guzmans@yahoo.com.mx).
J. A. Gordillo-Sosa and Joel Herrera Cabral are with the Depto. de TIC. Univ. Tecnológica del Suroeste de Gto. Carr. Valle-Huanímaro km.1.2, Valle de Santiago, Gto. México (e-mail: antgor@antoniogordillo.com, jherrera@utsoe.edu.mx).
A. González Parada is with Universidad de Guanajuato, DICIS, Salamanca, Gto., México (e-mail: gonzaleza@ugto.mx).

Cite: R. Guzmán Cabrera, J. R. Guzmán Sepúlveda, J. A. Gordillo Sosa, M. Torres Cisneros, and J. Herrera Cabral, "Using Unlabeled Data to Improve Author Identification," International Journal of Computer and Communication Engineering, vol. 2, no. 3, pp. 236-240, 2013.

General Information

ISSN: 2010-3743 (Online)
Abbreviated Title: Int. J. Comput. Commun. Eng.
Frequency: Quarterly
Editor-in-Chief: Dr. Maode Ma
Abstracting/Indexing: INSPEC, CNKI, Google Scholar, Crossref, EBSCO, ProQuest, and Electronic Journals Library
E-mail: ijcce@iap.org