Horizontal and Vertical Scalability of Machine Learning Methods

B.O. Biletskyy


This paper considers the main stages of machine learning pipelines: training data collection and storage, training, and scoring. The effect of the Big Data phenomenon on each stage is discussed, and different approaches to organizing computation efficiently at each stage are evaluated. In the first part of the paper we introduce the notions of horizontal and vertical scalability, together with their pros and cons. We consider some limitations of scaling, such as Amdahl's law. In the second part we consider the scalability of data storage. First we discuss relational databases and the scalability limitations that follow from the ACID guarantees such databases provide. Then we consider horizontally scalable non-relational databases, the so-called NoSQL databases. We formulate the CAP theorem as a fundamental limitation of horizontally scalable databases. The third part is dedicated to the scalability of computation based on the MapReduce programming model. We discuss some implementations of this model, such as Hadoop and Spark, together with the basic principles on which they are based. In the fourth part we consider various approaches to scaling machine learning methods. We give a general statement of the machine learning problem. Then, using a Bayesian pattern recognition procedure as an example, we show how the MapReduce programming model can be applied to horizontal scaling of machine learning methods. Using deep neural networks as an example, we discuss machine learning methods that are not horizontally scalable. Finally, we consider some approaches to vertical scaling of such methods based on GPUs and the TensorFlow programming model.
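To make the MapReduce idea concrete, the following is a minimal single-process sketch of how the counting behind a Bayesian classifier might be split into a map step per data partition and an order-independent reduce step. It is not the procedure from the paper: it assumes a naive Bayes model with binary features and add-one smoothing, and the names `map_partition`, `reduce_counts`, `train`, `predict`, and the toy data are all hypothetical.

```python
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: count labels and (label, feature index, value) events in one partition."""
    counts = Counter()
    for features, label in records:
        counts[("class", label)] += 1
        for i, value in enumerate(features):
            counts[("feature", label, i, value)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: counter addition is associative and commutative,
    so partial results can be merged in any order."""
    return a + b


def train(partitions):
    """Run the map step on every partition, then fold the partial counts together."""
    return reduce(reduce_counts, map(map_partition, partitions), Counter())


def predict(counts, features, labels):
    """Pick the label with the highest smoothed naive Bayes score."""
    total = sum(counts[("class", c)] for c in labels)

    def score(c):
        n_c = counts[("class", c)]
        p = n_c / total
        for i, value in enumerate(features):
            # Add-one smoothing; the "+ 2" assumes each feature is binary.
            p *= (counts[("feature", c, i, value)] + 1) / (n_c + 2)
        return p

    return max(labels, key=score)


# Toy data split into two partitions, standing in for two worker nodes.
partitions = [
    [((1, 0), "spam"), ((1, 1), "spam")],
    [((0, 0), "ham"), ((0, 1), "ham"), ((1, 0), "spam")],
]
model = train(partitions)
```

Because the reduce step only adds counters, training on the partitions separately gives the same model as training on the concatenated data. That property is what would allow the map step to run on separate machines in a framework such as Hadoop or Spark.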




machine learning; pattern recognition; horizontal scalability; vertical scalability; relational databases; ACID; NoSQL; CAP-theorem




