The means of Apache Hadoop for parallel and distributed programs

K.A. Rukhlis; A.Yu. Doroshenko

Abstract

Grid-based scientific projects are becoming increasingly widespread. However, an essential obstacle is choosing an appropriate platform for solving specific tasks. This paper considers the distributed platform Apache Hadoop. This modern Grid platform, based on current scientific paradigms, is suited to processing huge amounts of data reliably and efficiently. The paper also presents a functional comparison of Hadoop with other modern Grid platforms.

Problems in programming 2010; 4: 3-10

Full Text:

PDF (Russian)
