Main Aspects of Big Data Semantic Annotation
Abstract
Semantic annotations, owing to their structure, are integral to solving big data problems effectively. However, defining semantic annotations is not trivial. Manual annotation is not feasible for big data because of its volume and heterogeneity and because of the complexity and cost of the annotation process, while the task of automatic annotation of big data has not yet been solved. Solving the semantic annotation problem therefore requires modern hybrid approaches that build on the existing theoretical apparatus, namely methods and models of machine learning, statistical learning, processing of content of different types and formats, natural language processing, and so on. Such approaches should also provide solutions for the main annotation tasks: discovering and extracting entities and relationships from content of any type, and defining semantic annotations based on existing sources of knowledge (dictionaries, ontologies, etc.). The resulting annotations must be accurate and must make it possible to solve applied problems with the annotated data. Big data content is highly varied, and so are the properties that need to be annotated. Describing such data therefore requires different metadata, which has led to a large number of metadata standards for data of different types and formats. To solve the annotation problem effectively, however, a generalized description of metadata types is needed, and the specifics of particular metadata must be taken into account within that description. The purpose of this work is to define a general classification of metadata and to determine common aspects of, and approaches to, the semantic annotation of big data.
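As an illustration of the simplest of the annotation tasks named above, defining annotations from an existing source of knowledge, the following is a minimal sketch of dictionary-based entity annotation. It is not the method of this work: the vocabulary, the `ex:` concept identifiers, and the names `Annotation` and `annotate` are all hypothetical, and relationship extraction is not covered.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-vocabulary: lowercase surface form -> concept identifier.
# The "ex:" identifiers are illustrative placeholders, not real ontology IRIs.
VOCABULARY = {
    "aspirin": "ex:Drug/Aspirin",
    "headache": "ex:Symptom/Headache",
    "kyiv": "ex:Place/Kyiv",
}


@dataclass(frozen=True)
class Annotation:
    start: int    # character offset where the mention begins
    end: int      # character offset just past the mention
    text: str     # the mention as it appears in the source text
    concept: str  # identifier of the concept the mention is linked to


def annotate(text, vocabulary=VOCABULARY):
    """Return one Annotation per vocabulary term found in the text."""
    if not vocabulary:
        return []
    # Try longer terms first so a multi-word term wins over its prefix.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        Annotation(m.start(), m.end(), m.group(0), vocabulary[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for ann in annotate("Aspirin relieves headache."):
        print(ann)
```

A lookup of this kind is cheap but matches only exact surface forms, which is one reason the abstract calls for hybrid approaches that also draw on machine learning and natural language processing.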
Problems in programming 2020; 4: 22-33
DOI: https://doi.org/10.15407/pp2020.04.022