Machine-learning methods for text named entity recognition

O.O. Marchenko

Abstract


The article describes machine learning methods for named entity recognition. Two basic machine learning models, Naïve Bayes and Conditional Random Fields, were used to build named entity classifiers. A model for multi-class classification of named entities using Error-Correcting Output Codes was also studied. The paper describes the method used to train the classifiers and the results of test experiments. Conditional Random Fields outperform the other models in both precision and recall.
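As a rough illustration of the Error-Correcting Output Codes idea mentioned above, the sketch below shows only the decoding step: each entity class is assigned a binary codeword, one binary classifier predicts each bit, and the class whose codeword is closest in Hamming distance to the predicted bits is chosen. The class labels, the code matrix, and the function names are assumptions made for this example and are not taken from the article.

```python
# Hypothetical code matrix: one codeword per entity class, one bit per
# binary classifier. Labels and bit patterns are illustrative assumptions.
CODE_MATRIX = {
    "PER":  (1, 1, 1, 1, 1, 1, 1),
    "ORG":  (0, 0, 0, 0, 1, 1, 1),
    "LOC":  (0, 0, 1, 1, 0, 0, 1),
    "MISC": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def min_code_distance(code_matrix):
    """Smallest Hamming distance between any two distinct codewords."""
    words = list(code_matrix.values())
    return min(
        hamming(words[i], words[j])
        for i in range(len(words))
        for j in range(i + 1, len(words))
    )


def decode(predicted_bits, code_matrix=CODE_MATRIX):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], predicted_bits))


if __name__ == "__main__":
    # Suppose the seven binary classifiers output the LOC codeword
    # with one wrong bit (the sixth bit is flipped).
    noisy = (0, 0, 1, 1, 0, 1, 1)
    print(min_code_distance(CODE_MATRIX))
    print(decode(noisy))
```

With a minimum distance of d between codewords, up to floor((d - 1) / 2) wrong binary decisions can still be decoded to the correct class, which is the motivation for using such codes in multi-class settings.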

Problems in programming 2016; 2-3: 150-157


Keywords


machine learning; natural language processing; named entity recognition

DOI: https://doi.org/10.15407/pp2016.02-03.150
