Development of highly-parallel applications for graphical processing units using rewriting rules

A.Yu. Doroshenko, K.A. Zhereb

Abstract


Graphics processing units (GPUs) can significantly increase application performance, but using them requires low-level programming and a detailed understanding of the underlying hardware and software platform. The paper presents a technique for automating GPU application development based on the rewriting rules paradigm.

Problems in programming 2009; 3: 3-18


