Big data platforms. Main objectives, features and advantages

O.V. Zakharova

Abstract


This paper presents an overview of existing big data platforms. The goal is to identify the main problems and solutions in this area, as well as the properties of big data platforms that determine their capabilities, advantages, or weaknesses in solving these problems. The topic is relevant because of the rapid evolution of mobile devices and application systems, the corresponding growth in the volume of information, and the inability of traditional systems to process such amounts of data in a reasonable time. A big data platform is an enterprise-class information technology platform that combines, in a single solution, the properties and functionality needed for developing, deploying, processing, and managing big data. The goal of creating and using such platforms is to improve the scalability, availability, performance, and security of organizations working with big data. Big data platforms make it possible to process multi-structured data in real time and allow different users to apply them to various big data tasks. The paper discusses frameworks developed for solving big data problems and analyzes their characteristics, operating principles, and capabilities in the context of the problems they are able to solve. It also identifies existing “gaps” and directions for further development. Solving the problems of big data, namely ensuring the effective storage, processing, and analysis of data, will make information more useful and make companies that work with big data more competitive.

Problems in programming 2019; 3: 101-115


Keywords


big data platform; machine learning; Apache Hadoop; document-oriented storage; «key-value» storage; column storage; graph storage; data management; distributed storage; stream computing; NoSQL databases; distributed file system; MapReduce; TaskTracker

DOI: https://doi.org/10.15407/pp2019.03.101
