Using metadata to resolve big data problems
Abstract
The volumes of data used by application systems are growing exponentially and have reached sizes that traditional systems cannot process, which gave rise to the term "big data". The main problems of such data sets stem not only from their volume but also from the variety and complexity of the information they contain. As data volumes and the number of big data initiatives grow, metadata therefore becomes a top priority for the success of big data projects. Enterprises understand that realizing the full operational potential of machine learning, deep learning, and artificial intelligence requires raw data to be supplemented with metadata. The purpose of this work is to analyze how metadata helps solve big data problems and to determine the main categories of data to be annotated with metadata and the main types of metadata used for this. Metadata is a means of classifying, organizing, and characterizing data or its content. According to the role metadata plays, NISO identifies four main types: administrative, descriptive, structural, and markup languages. Each type can be used to address particular problems such as data management, search, and data integration. A separate issue is how metadata is created or automatically generated, since manual creation is laborious and the volume of metadata is often several times larger than the volume of the data itself.
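As a minimal illustration of the typology named in the abstract, the sketch below attaches descriptive, administrative, and structural metadata to a data object and uses the descriptive part for search. The class names, the `find` helper, and all keys and values are hypothetical and are not taken from the paper. The fourth NISO type, markup languages, is left out because it embeds metadata in the content itself rather than in a separate record.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    # Field names follow the NISO types; the keys stored inside are illustrative only.
    descriptive: dict = field(default_factory=dict)     # finding and understanding a resource
    administrative: dict = field(default_factory=dict)  # managing it: format, ownership, rights
    structural: dict = field(default_factory=dict)      # relating its parts to one another


@dataclass
class DataObject:
    payload: bytes
    metadata: Metadata


def find(objects, **criteria):
    """Return the objects whose descriptive metadata equals every given criterion."""
    return [
        obj for obj in objects
        if all(obj.metadata.descriptive.get(k) == v for k, v in criteria.items())
    ]


# Hypothetical catalog of two annotated objects.
catalog = [
    DataObject(b"...", Metadata(
        descriptive={"title": "Sensor log", "subject": "temperature"},
        administrative={"format": "text/csv", "owner": "lab-a"},
        structural={"part": 1, "of": 3},
    )),
    DataObject(b"...", Metadata(
        descriptive={"title": "Survey scan", "subject": "census"},
        administrative={"format": "image/tiff", "owner": "lab-b"},
    )),
]

matches = find(catalog, subject="temperature")
print([obj.metadata.descriptive["title"] for obj in matches])  # prints ['Sensor log']
```

The search never reads `payload`, which is the point of the sketch: queries run against the annotations, so the raw data does not have to be parsed to be found.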
Full Text:
PDF (Ukrainian)
References
https://whatis.techtarget.com/definition/metadata
https://www.gartner.com/doc/3075917/reasons-big-data-needs-metadata
https://www.datasciencecentral.com/profiles/blogs/why-you-need-metadata-for-big-data-success
https://hbr.org/2013/05/little-data-makes-big-data-mor
https://blogs.loc.gov/loc/2010/04/how-tweet-it-is-library-acquires-entire-twitter-archive/
https://www.datasciencecentral.com/profiles/blogs/importance-of-metadata-in-a-big-data-world
http://framework.niso.org/24.html
https://groups.niso.org/apps/group_public/download.php/17443/understanding-metadata
https://www.i-scoop.eu/big-data-action-value-context/data-lakes/
https://groups.niso.org/apps/group_public/download.php/17443/understanding-metadata
"OWL Web Ontology Language Overview". W3C Recommendation. 2004. http://www.w3.org/TR/owl-features/
http://www.w3.org/TR/rdf-schema/
Blake J. A., Bult C. J. "Beyond the data deluge: data integration and bio-ontologies". Journal of Biomedical Informatics. 2006. Vol. 39, N 3. P. 314-320.
Viti F., Merelli I., Calabria A. et al. "Ontology-based resources for bioinformatics analysis". International Journal of Metadata, Semantics and Ontologies. 2011. Vol. 6, N 1. P. 35-45.
Osborne J. D., Flatow J., Holko M. et al. "Annotating the human genome with disease ontology". BMC Genomics. 2009. Vol. 10, supplement 1, article S6.
https://www.w3.org/DesignIssues/LinkedData.html
DOI: https://doi.org/10.15407/pp2019.02.081