Defining degree of semantic similarity using description logic tools

O.V. Zakharova

Abstract


Establishing the semantic similarity of information is an integral part of solving any information retrieval task, including tasks related to big data processing, discovery of semantic web services, and categorization and classification of information. Special functions that quantify the degree of semantic similarity make it possible to rank the retrieved information by its semantic proximity to the purpose or to the search request or template. Constructing such measures should take into account many aspects, from the meanings of the matched concepts to the specifics of the business task in which the matching is performed. Usually, to construct such similarity functions, semantic approaches are combined with structural ones, which provide a syntactic comparison of concept descriptions. This allows concepts to be described in more detail, and the impact of syntactic matching can be significantly reduced by using more expressive description logics to represent information and by shifting the focus to semantic properties. Today, DL ontologies are the most developed tools for representing semantics, and the reasoning mechanisms of description logics (DLs) provide logical inference. Most of the estimates presented in this paper are based on basic DLs that support only the intersection constructor, but the described approaches can be applied to any DL that provides basic reasoning services.

This article analyzes existing approaches, models and measures based on description logics. A classification of the estimation methods is proposed, both by the level at which similarity is defined and by the matching type. The main attention is paid to establishing the similarity between concepts (conceptual level models). Establishing the similarity between instances, or between a concept and an instance, consists of finding the most specific concept for the instance or instances and then evaluating the similarity between the resulting concepts. The term existential similarity is introduced. Examples of applying certain types of measures to evaluate the degree of semantic similarity of notions and/or knowledge are demonstrated on a geometry ontology.
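The reduction of instance-level similarity to concept-level similarity can be illustrated with a minimal sketch. It assumes a DL with only the intersection constructor and no background terminology, so a concept is simply a set of atomic concept names, the least common subsumer is the set of shared atoms, and the most specific concept of an instance is the conjunction of the atoms asserted for it. The Jaccard-style ratio, the function names, and the small geometry fragment are illustrative assumptions, not the measures or the ontology used in the paper.

```python
from typing import Dict, FrozenSet

# Assumption: a concept is a conjunction of atomic concept names,
# represented as a frozenset; the empty set stands for the top concept.
Concept = FrozenSet[str]


def lcs(c: Concept, d: Concept) -> Concept:
    """Least common subsumer in a conjunction-only DL without a TBox:
    the conjunction of the atoms shared by both concepts."""
    return c & d


def similarity(c: Concept, d: Concept) -> float:
    """Illustrative Jaccard-style measure: shared atoms over all atoms."""
    union = c | d
    return len(c & d) / len(union) if union else 1.0


def msc(instance: str, abox: Dict[str, Concept]) -> Concept:
    """Most specific concept of an instance: all atoms asserted for it."""
    return abox[instance]


def instance_similarity(a: str, b: str, abox: Dict[str, Concept]) -> float:
    """Instance similarity reduced to similarity of most specific concepts."""
    return similarity(msc(a, abox), msc(b, abox))


# Hypothetical geometry fragment (names invented for illustration).
SQUARE = frozenset({"Polygon", "Quadrilateral", "Equilateral", "Equiangular"})
RHOMBUS = frozenset({"Polygon", "Quadrilateral", "Equilateral"})
TRIANGLE = frozenset({"Polygon", "Triangle"})
ABOX = {"s1": SQUARE, "r1": RHOMBUS, "t1": TRIANGLE}

if __name__ == "__main__":
    print(sorted(lcs(SQUARE, TRIANGLE)))
    print(instance_similarity("s1", "r1", ABOX))
    print(instance_similarity("s1", "t1", ABOX))
```

In this toy setting a square is ranked closer to a rhombus than to a triangle because it shares more atoms with the former. With a richer DL, the set operations would have to be replaced by calls to a reasoner for subsumption and least common subsumer computation.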

Problems in programming 2021; 3: 16-26


Keywords


semantic similarity of information; value of similarity of concepts; least common subsumer; measures for evaluating similarity; most specific concept; most specific is-a ancestor; similarity function; similarity measure; information content





DOI: https://doi.org/10.15407/pp2021.03.016
