Big data metadata classification

O.V. Zakharova


Data of diverse structure (or with no structure at all) and origin are now abundant, and their volume is growing exponentially. Existing software and hardware cannot cope with so many different types of data arriving at such speed. Big data has become too complex and dynamic to process, store, analyze, and manage with traditional tools. This has led to new platforms and approaches for working with data and, at the same time, to the understanding that solving big data problems requires supplementing raw data with metadata. Metadata here is a means of classifying, organizing, and characterizing data and its content. Its main advantage is an ordered structure, which makes metadata readable not only by a person but also by a computer. It can therefore be processed automatically and used for indexing, searching, combining, automated processing, and classification of big data, among other tasks. Creating effective metadata management systems requires, first of all, a coordinated general classification of metadata that takes into account the types of data sources (the methods by which data are obtained) that form the content, the tasks solved at different stages of the life cycle, the existing data presentation formats, and the principle of reasonable efficiency, since the size of metadata often significantly exceeds the amount of data described (even big data). The aim of this work is therefore to analyze existing sources of big data, the methods for creating and processing the corresponding metadata, and the software products that process them, and to build a classification of metadata on the basis of this analysis.

Problems in programming 2019; 4: 53-74


big data source; metadata management; Hadoop; metadata classification; metadata analysis; services for processing metadata; creation; reviewing; editing of metadata; metadata of images; metadata of audio files; metadata of video files; Data Warehouse metad




