Method of information obtaining from ontology on the basis of a natural language phrase analysis

A.A. Litvin, V.Yu. Velychko, V.V. Kaverynskyi

Abstract


A method for analyzing phrases in inflected natural languages (Ukrainian and Russian) has been developed. The method identifies the main ideas expressed in a text and the groups of words that state them. The resulting semantic trees of propositions, each of which expresses one specific idea, are convenient source material for constructing SPARQL queries to an ontology. The analysis algorithm consists of the following basic steps: word tokenization, identification of marker words and phrases, identification of the types of proposition present, identification of noun groups, construction of a syntactic graph of the sentence, construction of semantic trees of propositions according to the proposition types found, and substitution of parameters from the semantic trees into the corresponding SPARQL query templates. The choice of template depends on the type of proposition expressed by a given semantic tree. The sets of concepts returned by a query are attached as answers to the corresponding semantic tree of a proposition. If the ontology returns no information, the noun groups are reduced so that they express more general concepts, and new queries are built from them. This yields an answer, although a less precise one than the full noun group would give. The use of SPARQL query templates requires an ontology structure known a priori; such a structure is also proposed in this paper. The system is applicable to chatbot dialogue and to automatically obtaining answers to questions from text.
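
The last two steps described in the abstract, template substitution and noun-group reduction when the ontology returns nothing, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `ex:` namespace, the `ex:definition` and `ex:locatedIn` predicates, the proposition type names, and the `run_query` callable that stands in for a real SPARQL endpoint. The reduction rule (drop the leftmost modifier, keep the head noun last) is also a simplification; the paper's own reduction over Ukrainian and Russian noun groups may differ.

```python
from typing import Callable, Iterator, List, Tuple

# Assumed prefixes; the real ontology namespace is not given in the abstract.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX ex: <http://example.org/ontology#>\n"
)

# Hypothetical query templates, one per proposition type.
# Doubled braces are literal braces for str.format.
TEMPLATES = {
    "definition": (
        "SELECT ?answer WHERE {{\n"
        '  ?concept rdfs:label "{noun_group}" .\n'
        "  ?concept ex:definition ?answer .\n"
        "}}"
    ),
    "location": (
        "SELECT ?answer WHERE {{\n"
        '  ?concept rdfs:label "{noun_group}" .\n'
        "  ?concept ex:locatedIn ?answer .\n"
        "}}"
    ),
}


def build_query(proposition_type: str, noun_group: List[str]) -> str:
    """Substitute a noun group into the template for the proposition type."""
    text = " ".join(noun_group)
    # Escape backslashes and quotes so the label stays a valid string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return PREFIXES + TEMPLATES[proposition_type].format(noun_group=escaped)


def reductions(noun_group: List[str]) -> Iterator[List[str]]:
    """Yield the full noun group, then ever more general ones.

    Simplifying assumption: the head noun is the last word, so dropping
    words from the left generalizes the concept.
    """
    for start in range(len(noun_group)):
        yield noun_group[start:]


def answer(
    proposition_type: str,
    noun_group: List[str],
    run_query: Callable[[str], List[str]],
) -> Tuple[List[str], List[str]]:
    """Query with the full noun group, falling back to reduced groups."""
    for candidate in reductions(noun_group):
        results = run_query(build_query(proposition_type, candidate))
        if results:
            return candidate, results
    return [], []
```

In use, `run_query` would wrap a call to an actual SPARQL endpoint and return the bound values of `?answer`. The returned noun group shows which level of generalization produced the answer, which lets a caller signal that the reply is less precise than the question asked.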

Problems in programming 2020; 2-3: 322-330


Keywords


ontology; SPARQL; text analysis; noun group; syntactic graph; semantic tree of a proposition; NLP; NLU


DOI: https://doi.org/10.15407/pp2020.02-03.322
