Practical implementation of the information technology for automated determination of semantic term sets in the content of educational materials

Yu.V. Krak, O.V. Barmak, O.V. Mazurets


This paper investigates the problem of automating the search for key terms in the content of educational materials. It considers an information technology for the automated determination of a set of key semantic terms in such content, based on searching for the phrases used in the text and on the disperse evaluation of word importance. Under this technology, the structure of a digital document is formed automatically from the input educational material file so that an element can be selected for analysis; the selected element is then segmented into phrases and terms, the terms are lemmatized, and the resulting set of terms is compacted. On the basis of the automatically lemmatized text, a search and a disperse evaluation of the importance of the words in the chosen fragment are performed, after which the importance of the terms is calculated and their number is limited by the value of the keyword density ratio. The input of the technology is a digital document of educational material; the output is the corresponding set of key semantic terms. The results of an analysis of regularities in existing sets of key semantic terms are also described.
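The sequence of steps described above (segmentation, lemmatization, importance evaluation, limiting by a ratio) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the stop-word list, the toy lemmatizer, and in particular the importance score (term frequency weighted by the share of fragments that contain the term) are assumptions chosen for illustration and stand in for the disperse evaluation, whose formula is not given here. Treating the keyword density ratio as a fraction of the distinct terms to keep is likewise an assumption.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset({"the", "a", "of", "is", "and", "in", "to"})

def segment(text):
    """Split text into sentence-like fragments, each a list of lowercase words."""
    fragments = [f for f in re.split(r"[.!?]+", text) if f.strip()]
    return [re.findall(r"[a-z]+", f.lower()) for f in fragments]

def lemmatize(word):
    """Toy stand-in for a real lemmatizer: strips a trailing plural 's'."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word

def key_terms(text, density_ratio=0.2):
    """Return the top-scoring terms, limited to a share of the distinct terms."""
    fragments = [
        [lemmatize(w) for w in frag if w not in STOP_WORDS]
        for frag in segment(text)
    ]
    total = Counter(w for frag in fragments for w in frag)
    spread = Counter(w for frag in fragments for w in set(frag))
    # Assumed placeholder score: frequency times the share of fragments
    # in which the term occurs (NOT the paper's disperse evaluation).
    scores = {w: total[w] * spread[w] / len(fragments) for w in total}
    # Assumed reading of the ratio: keep this fraction of distinct terms.
    limit = max(1, math.ceil(density_ratio * len(scores)))
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:limit]

sample = "Graphs model networks. A graph has nodes. Nodes link in the graph. Trees are graphs."
print(key_terms(sample))
```

In this sketch the ratio controls how many distinct terms survive, so raising it widens the resulting set at the likely cost of precision.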
Test software that automates the determination of sets of key semantic terms using this information technology is considered. The investigations conducted confirmed that the set of key semantic terms of educational materials can be formed effectively, with measured search precision of up to 92.9 % and search recall of up to 100.0 %. The practical features of using a specialized extension for working with electronic documents are considered, and the factors that complicate the effective search for semantic terms in educational materials are described. The established effectiveness of the proposed technology allows it to be used to solve a number of pressing tasks, such as determining whether educational materials conform to content requirements, determining whether sets of test tasks conform to educational materials, providing semantic assistance in creating tests, and automating the creation of abstracts and annotations for elements of educational materials. Further research is aimed at analyzing how the relationship between the number of key semantic terms in the resulting set and the value of the keyword density ratio affects the effectiveness of the technology, and at improving the information technology to obtain better results.
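For reference, search precision and recall of a term set can be computed as in the following minimal sketch. The function name and the sample term lists are hypothetical and are not data from the study; the counts are chosen only to show how the arithmetic works.

```python
def precision_recall(found, reference):
    """Compare an extracted term set against an expert reference set.

    precision = correct found terms / all found terms
    recall    = correct found terms / all reference terms
    """
    found, reference = set(found), set(reference)
    hits = len(found & reference)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical example: 14 extracted terms, 13 of which are in the reference set.
found = [f"term{i}" for i in range(14)]
reference = [f"term{i}" for i in range(13)]
p, r = precision_recall(found, reference)
print(f"precision = {p:.1%}, recall = {r:.1%}")
```

Precision falls when the extracted set contains terms absent from the reference set, while recall falls when reference terms are missed, which is why the two are reported together.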

Problems in programming 2018; 2-3: 245-254


digital document; key terms; educational materials; disperse evaluation


