Parallel software auto-tuning using statistical modeling and machine learning
Abstract
Auto-tuning for complex and nontrivial parallel systems is usually time-consuming because of empirical evaluation of huge amount of combinations of parameter values of an initial parallel program in a target execution environment. This paper proposes the improvement of the auto-tuning method using statistical modeling and neural network algorithms that allow to reduce significantly the space of possible combinations of parameters values to analyse. The resulting optimization is illustrated by an example of tuning of parallel sorting program, that combines several sorting methods, by means of automatic training of a neural network model on results of “traditional” tuning cycles with subsequent replacement of some auto-tuner calls with an evaluation from the statistical model.
Problems in programming 2018; 2-3: 046-053
Keywords
Full Text:
PDF (Українська)References
Naono, K., Teranishi, K., Cavazos, J. & Suda, R. (2010) Software automatic tuning: from concepts to state-of-the-art results. Berlin: Springer.
https://doi.org/10.1007/978-1-4419-6935-4
Durillo, J. & Fahringer, T. (2014) From single- to multi-objective auto-tuning of programs: advantages and implications. Scientific programming. 22 (4). P. 285-297.
https://doi.org/10.1155/2014/818579
Doroshenko, A. & Shevchenko, R. (2006) A rewriting framework for rule-based programming dynamic applications. Fundamenta informaticae. 72 (1-3). P. 95-108.
Yatsenko, O.A. (2013) Integration of tools of algebra of algorithms and term rewriting system for developing efficient parallel programs. Problems in programming. (2). P. 62-70. (in Russian)
Ivanenko, P.A. & Doroshenko, A.Yu. (2014) Method of automated generation of autotuners for parallel programs. Cybernetics and systems analysis. 50 (3). P. 465-475.
https://doi.org/10.1007/s10559-014-9635-3
Ivanenko, P., Doroshenko, A. & Zhereb, K. (2014) TuningGenie: auto-tuning framework based on rewriting rules. In Proc. 10th International Conference "ICT in Education, Research, and Industrial Applications" (ICTERI 2014), Revised Selected Papers. Kherson, Ukraine, 9-12 June 2014. Berlin: Springer. 469. P. 139-158.
https://doi.org/10.1007/978-3-319-13206-8_7
Doroshenko, А.Yu., Ivanenko, P.A. & Novak, O.S. (2016) Hybrid autotuning model with statistic modelling. Problems in programming. (4). P. 27-32. (in Ukrainian)
Andon, P.I., Doroshenko, A.Yu., Zhereb, K.A., Shevchenko, R.S. & Yatsenko, O.A. (2017) Methods of algebraic programming: formal methods of parallel program development. Кyiv: Naukova dumka. (in Ukrainian)
Whaley, R., Petitet, A. & Dongarra, J.J. (2001) Automated empirical optimizations of software and the ATLAS Project. Parallel computing. 27 (1-2). P. 3-35.
https://doi.org/10.1016/S0167-8191(00)00087-9
Frigo, M. & Johnson, S. (1998) FFTW: an adaptive software architecture for the FF. Acoustics, speech and signal processing. 3. pp. 1381-1384.
https://doi.org/10.1109/ICASSP.1998.681704
Schaefer, C.A., Pankratius, V. & Tichy, W.F. (2009) Atune-IL: an instrumentation language for auto-tuning parallel applications. In Proc. 15th International Euro-Par Conference (Euro-Par 2009). Delft, The Netherlands, 25-28 August 2009. LNCS. 5704. P. 9-20.
https://doi.org/10.1007/978-3-642-03869-3_5
Mitchell, T.M. (1997) Machine learning. 1st edn. New York: McGraw-Hill Education.
Givens, G.H. & Hoeting, J.A. (2012) Computational statistics. 2nd edn. Chichester: Wiley.
https://doi.org/10.1002/9781118555552
Fursin, G., Kashnikov, Y., Memon, A.W., Chamski, Z. et al. (2011) Milepost GCC: machine learning enabled self-tuning compiler. International journal of parallel programming. 39 (3). P. 296-327.
https://doi.org/10.1007/s10766-010-0161-2
Rahman, M., Pouchet, L.-N. & Sadayappan, P. (2010) Neural network assisted tile size selection. In Proc. 5th International Workshop on Automatic Performance Tuning (IWAPT'2010). USA, Berkeley, CA, 22 June 2010. Berkeley, CA: Springer.
Kofler, K., Grasso, I., Cosenza, B. & Fahringer, T. (2013) An automatic input-sensitive approach for heterogeneous task partitioning. In Proc. 27th ACM International Conference on Supercomputing (ICS'13). USA, Eugene, Oregon, 10-14 June 2013. New York: ACM. P. 149-160.
https://doi.org/10.1145/2464996.2465007
ORACLE HELP CENTER. (2018) Class RecursiveAction (Java SE 9 & JDK 9) [Online]. Available from: https://docs.oracle.com/javase/9/docs/api/java/util/concurrent/RecursiveAction.html [Accessed: 24 January 2018]
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V. et al. (2011) Scikit-learn: machine learning in Python. Journal of machine learning research. 12. P. 2825-2830.
Crawley, M.J. (2012) The R book. 2nd edn. Chichester: Wiley.
https://doi.org/10.1002/9781118448908
Fletcher, R. (2000) Practical methods of optimization. 2nd edn. Chichester: Wiley.
https://doi.org/10.1002/9781118723203
Fawcett, T. (2006) An introduction to ROC analysis. Pattern recognition letters. 27 (8). P. 861-874.
https://doi.org/10.1016/j.patrec.2005.10.010
DOI: https://doi.org/10.15407/pp2018.02.046
Refbacks
- There are currently no refbacks.