Use of domain ontology for homonymy clarification in natural language texts

O.N. Lesko, J.V. Rogushina

Abstract


The article analyses which types of homonymy can be clarified without semantic information, on the basis of syntactic rules alone. The analysis shows how features of the syntactic structure of legislative and academic texts make it possible to reduce the number of formal rules required for parsing. A minimal set of syntactic rules necessary for the automatic analysis of such texts is proposed. A method of homonymy clarification for business, scientific and legal natural language documents is developed. The proposed method does not require a large number of syntactic rules or marked-up texts. This greatly simplifies implementation and reduces the time needed to create and mark up text corpora. The result is achieved through the use of a domain ontology and through the specific syntactic structures of business, scientific and legal documents. In addition, we demonstrate how the use of a domain ontology simplifies the analysis of the test documents. In contrast to other systems for automatic processing of natural language texts, which use a domain ontology for semantic analysis, here the domain ontology is used to identify terms in the text and to refine the morphological information of each word in multi-word terms.
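The role the abstract assigns to the domain ontology (identifying terms and refining the morphological information of words inside multi-word terms) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy lexicon, the ontology fragment, the English example words, the tag names and the function `disambiguate` are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy lexicon: each word form maps to all tags it may carry.
LEXICON: Dict[str, Set[str]] = {
    "the": {"DET"},
    "court": {"NOUN", "VERB"},
    "order": {"NOUN", "VERB"},
    "rules": {"NOUN", "VERB"},
}

# Hypothetical ontology fragment: multi-word terms and the tag that
# each word has when it occurs inside the term.
ONTOLOGY_TERMS: Dict[Tuple[str, ...], Tuple[str, ...]] = {
    ("court", "order"): ("NOUN", "NOUN"),
}


def disambiguate(tokens: List[str]) -> List[Tuple[str, Set[str], Optional[str]]]:
    """Return (word, candidate tags, matched term or None) for each token.

    Words inside an ontology term keep only the tag recorded for the term;
    all other words keep every candidate tag from the lexicon.
    """
    max_len = max(len(term) for term in ONTOLOGY_TERMS)
    result: List[Tuple[str, Set[str], Optional[str]]] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest possible term first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            window = tuple(tokens[i:i + n])
            if window in ONTOLOGY_TERMS:
                for word, tag in zip(window, ONTOLOGY_TERMS[window]):
                    candidates = LEXICON.get(word, set())
                    # Narrow to the ontology tag only if the lexicon allows it.
                    narrowed = {tag} if tag in candidates else set(candidates)
                    result.append((word, narrowed, " ".join(window)))
                i += n
                matched = True
                break
        if not matched:
            word = tokens[i]
            result.append((word, set(LEXICON.get(word, set())), None))
            i += 1
    return result
```

In this sketch, calling `disambiguate("the court order rules".split())` narrows "court" and "order" to a single tag because they form an ontology term, while "rules" lies outside any term and keeps both candidate tags, so it would still need syntactic rules.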

Problems in programming 2017; 2: 61-71


Keywords


homonymy; morphological analysis; syntactic analysis; natural language processing; ontology

Full Text:

PDF (Russian)

