Ontology-based semantic similarity to metadata analysis in the information security domain

A.Y. Gladun, K.A. Khala


As cybersecurity threats grow more complex, it is becoming clear that one of the most important resources for combating cyberattacks is the processing of large amounts of data in the cyber environment. To process such volumes of data and make decisions, the tasks of searching, selecting and interpreting Big Data need to be automated in order to solve operational information security problems. Big Data analytics complemented by semantic technologies can improve cybersecurity and makes it possible to process and interpret large amounts of information in the cyber environment. The use of semantic modeling methods in Big Data analytics is necessary for selecting and combining heterogeneous Big Data sources and for recognizing the patterns of network attacks and other cyber threats, which must happen quickly so that countermeasures can be implemented. Therefore, to analyze Big Data metadata, the authors propose pre-processing the metadata at the semantic level. As an analysis tool, it is proposed to create a thesaurus of the problem based on the domain ontology, which should provide a terminological basis for integrating ontologies of different levels. To build the thesaurus of the problem, it is proposed to use the standards of open information resources, dictionaries and encyclopedias. The development of an ontology hierarchy formalizes the relationships between data elements, which machine learning and artificial intelligence algorithms will later use to adapt to changes in the environment; this in turn will increase the efficiency of Big Data analytics for the cybersecurity domain.

Problems in programming 2021; 2: 34-41


big data analytics; information security; cyber security; ontology; thesaurus; unstructured data; metadata; semantic similarity

