Reverse synthesis of natural language phrases grounding on their ontological representation using a large language model

V.V. Kaverynskyi, A.A. Litvin, O.V. Palagin

Abstract


This article presents an approach that uses a specially developed structured prompt for a large language model (ChatGPT). A series of experiments was carried out on synthesising natural language phrases from their ontological representations. These representations were constructed automatically from sentences of scientific and technical texts using previously developed software tools. Each representation contains the entities found in the text and the typed semantic relationships between them that are realised in the phrases of the analysed text. A system of relationships, specified by a set of concepts, is linked to the entity of the corresponding part of the sentence, which may be a simple sentence or part of a complex sentence. The structured prompt for the large language model includes explanations of the semantic relationships between concepts in the context of sentence synthesis from an ontological representation, together with a set of concept pairs connected by semantic relationships, which serve as the material for sentence creation. The synthesised natural language sentences were compared with the originals using the cosine similarity measure across different vectorisation methods. With the xx_ent_wiki_sm model, the similarity scores ranged from 0.8193 to 0.9722, although stylistic distortions of the generated sentences were observed in some cases. The research has practical significance for the development of dialogue information systems that combine the ontological approach with large language models.
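The two steps the abstract describes, assembling a structured prompt and scoring a synthesised sentence against its original, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the relation names, their explanations, the prompt wording, the function names, and the example sentences are all hypothetical, and a plain bag-of-words count stands in for the vectorisation methods used in the study (including xx_ent_wiki_sm).

```python
import math
import re
from collections import Counter

# Hypothetical relation types and explanations. The abstract does not list
# the actual relation inventory or prompt wording used in the study.
RELATION_NOTES = {
    "object_action": "the first concept performs the action named by the second",
    "object_property": "the second concept is a property of the first",
}


def build_prompt(relation_notes, triples, sentence_id="sentence_1"):
    """Assemble a structured prompt: relation explanations, then concept pairs."""
    lines = [
        "Compose one natural language sentence from the concept pairs below.",
        "",
        "Relation types:",
    ]
    for name, note in relation_notes.items():
        lines.append(f"- {name}: {note}")
    lines += ["", f"Concept pairs for {sentence_id}:"]
    for left, relation, right in triples:
        lines.append(f"- ({left}) --{relation}--> ({right})")
    return "\n".join(lines)


def bag_of_words(text):
    """Stand-in vectorisation: lowercase word counts (not xx_ent_wiki_sm)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a, b = bag_of_words(text_a), bag_of_words(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Made-up concept pairs for a single sentence.
    triples = [
        ("alloy", "object_action", "harden"),
        ("alloy", "object_property", "heat-resistant"),
    ]
    print(build_prompt(RELATION_NOTES, triples))

    original = "The heat-resistant alloy hardens."
    synthesised = "The alloy, which is heat-resistant, hardens."
    print(round(cosine_similarity(original, synthesised), 4))
```

A word-count vector only rewards shared vocabulary, so it will not reproduce the scores reported above; swapping `bag_of_words` for an embedding-based vectoriser would bring the sketch closer to the comparison the abstract describes.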

Problems in programming 2024; 2-3: 359-368

 


Keywords


large language model; ontology; natural language text synthesis; natural language text analysis; cosine similarity; text vectorization

