Parallel software auto-tuning using statistical modeling and machine learning

А.Yu. Doroshenko, P.A. Ivanenko, O.S. Novak, O.A. Yatsenko


Auto-tuning for complex and nontrivial parallel systems is usually time-consuming because of empirical evaluation of huge amount of combinations of parameter values of an initial parallel program in a target execution environment. This paper proposes the improvement of the auto-tuning method using statistical modeling and neural network algorithms that allow to reduce significantly the space of possible combinations of parameters values to analyse. The resulting optimization is illustrated by an example of tuning of parallel sorting program, that combines several sorting methods, by means of automatic training of a neural network model on results of “traditional” tuning cycles with subsequent replacement of some auto-tuner calls with an evaluation from the statistical model.

Problems in programming 2018; 2-3: 046-053


auto-tuning; software development automation; machine learning; neural network; parallel computing; statistical modeling


Naono, K., Teranishi, K., Cavazos, J. & Suda, R. (2010) Software automatic tuning: from concepts to state-of-the-art results. Berlin: Springer.

Durillo, J. & Fahringer, T. (2014) From single- to multi-objective auto-tuning of programs: advantages and implications. Scientific programming. 22 (4). P. 285-297.

Doroshenko, A. & Shevchenko, R. (2006) A rewriting framework for rule-based programming dynamic applications. Fundamenta informaticae. 72 (1-3). P. 95-108.

Yatsenko, O.A. (2013) Integration of tools of algebra of algorithms and term rewriting system for developing efficient parallel programs. Problems in programming. (2). P. 62-70. (in Russian)

Ivanenko, P.A. & Doroshenko, A.Yu. (2014) Method of automated generation of autotuners for parallel programs. Cybernetics and systems analysis. 50 (3). P. 465-475.

Ivanenko, P., Doroshenko, A. & Zhereb, K. (2014) TuningGenie: auto-tuning framework based on rewriting rules. In Proc. 10th International Conference "ICT in Education, Research, and Industrial Applications" (ICTERI 2014), Revised Selected Papers. Kherson, Ukraine, 9-12 June 2014. Berlin: Springer. 469. P. 139-158.

Doroshenko, А.Yu., Ivanenko, P.A. & Novak, O.S. (2016) Hybrid autotuning model with statistic modelling. Problems in programming. (4). P. 27-32. (in Ukrainian)

Andon, P.I., Doroshenko, A.Yu., Zhereb, K.A., Shevchenko, R.S. & Yatsenko, O.A. (2017) Methods of algebraic programming: formal methods of parallel program development. Кyiv: Naukova dumka. (in Ukrainian)

Whaley, R., Petitet, A. & Dongarra, J.J. (2001) Automated empirical optimizations of software and the ATLAS Project. Parallel computing. 27 (1-2). P. 3-35.

Frigo, M. & Johnson, S. (1998) FFTW: an adaptive software architecture for the FF. Acoustics, speech and signal processing. 3. pp. 1381-1384.

Schaefer, C.A., Pankratius, V. & Tichy, W.F. (2009) Atune-IL: an instrumentation language for auto-tuning parallel applications. In Proc. 15th International Euro-Par Conference (Euro-Par 2009). Delft, The Netherlands, 25-28 August 2009. LNCS. 5704. P. 9-20.

Mitchell, T.M. (1997) Machine learning. 1st edn. New York: McGraw-Hill Education.

Givens, G.H. & Hoeting, J.A. (2012) Computational statistics. 2nd edn. Chichester: Wiley.

Fursin, G., Kashnikov, Y., Memon, A.W., Chamski, Z. et al. (2011) Milepost GCC: machine learning enabled self-tuning compiler. International journal of parallel programming. 39 (3). P. 296-327.

Rahman, M., Pouchet, L.-N. & Sadayappan, P. (2010) Neural network assisted tile size selection. In Proc. 5th International Workshop on Automatic Performance Tuning (IWAPT'2010). USA, Berkeley, CA, 22 June 2010. Berkeley, CA: Springer.

Kofler, K., Grasso, I., Cosenza, B. & Fahringer, T. (2013) An automatic input-sensitive approach for heterogeneous task partitioning. In Proc. 27th ACM International Conference on Supercomputing (ICS'13). USA, Eugene, Oregon, 10-14 June 2013. New York: ACM. P. 149-160.

ORACLE HELP CENTER. (2018) Class RecursiveAction (Java SE 9 & JDK 9) [Online]. Available from: [Accessed: 24 January 2018]

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V. et al. (2011) Scikit-learn: machine learning in Python. Journal of machine learning research. 12. P. 2825-2830.

Crawley, M.J. (2012) The R book. 2nd edn. Chichester: Wiley.

Fletcher, R. (2000) Practical methods of optimization. 2nd edn. Chichester: Wiley.

Fawcett, T. (2006) An introduction to ROC analysis. Pattern recognition letters. 27 (8). P. 861-874.



  • There are currently no refbacks.