HP UPC – IMPROVING UPC USABILITY AND PERFORMANCE
Period: 2008-05-01 – 2011-04-30 (36 months).
Funded by: Company
It is well known that hardware optimizations become harder to exploit effectively as the complexity of computers increases. As a result, there is growing interest in improving the productivity of programmers of parallel applications. Research in this area has produced a family of languages based on the Partitioned Global Address Space (PGAS) paradigm, of which Unified Parallel C (UPC) is one of the most successful. This project aims to advance research in this field, using the Finis Terrae supercomputer as its main tool.
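To make the PGAS idea more concrete, the following minimal sketch emulates in plain C how UPC distributes the elements of a shared array: blocks of a fixed size are dealt round-robin to the threads, so every element has affinity to exactly one thread. The function names `owner_thread` and `local_index` are hypothetical and chosen only for illustration. They are not part of UPC, where a program would instead query affinity with `upc_threadof`.

```c
#include <assert.h>
#include <stddef.h>

/* Thread with affinity to element i of a shared array whose blocks of
 * `block` elements are dealt round-robin over `threads` threads.
 * Hypothetical helper: it emulates the layout rule, not a UPC runtime call. */
size_t owner_thread(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    return (i / block) % threads;
}

/* Position of element i within the portion stored on its owner thread. */
size_t local_index(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    size_t round = i / (block * threads); /* full round-robin passes before i */
    return round * block + i % block;
}
```

For example, with a block size of 2 and 3 threads, thread 0 holds elements 0, 1, 6 and 7, so element 7 is the fourth element (local index 3) on thread 0.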
The architecture of the CESGA Finis Terrae supercomputer is flexible enough to cover many different scenarios, from tightly coupled ones, such as many cores in a single node, to highly distributed ones, such as one core per node across many nodes. This broad spectrum of configurations allows the project's researchers to compare different parallel programming paradigms, namely the traditional ones based on message passing (MPI) or shared memory (OpenMP) and the emerging ones based on PGAS (UPC), in terms of both programmability and performance. Based on this evaluation, new libraries will be proposed and implemented to enhance UPC programmability and performance. The technical objectives of the project are therefore:
1) to evaluate the programmability of applications using UPC.
2) to evaluate the performance of UPC using a set of benchmarks (microbenchmarks and kernel benchmarks). This performance study can reveal bottlenecks in UPC and thereby identify the best UPC programming practices, which will guide the work on the third objective.
3) to develop new libraries that improve the programmability and performance of UPC applications. The analysis of the benchmarks developed under the previous objectives, together with a detailed study of the functionality provided by the UPC language constructs and libraries, will allow the project's researchers to identify features not covered by the standard that would improve UPC usability. Such features could be implemented as libraries, for example new collective operations (mainly relocalization and computational operations) and higher-level UPC libraries that support both regular and irregular/sparse computations.
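As an illustration of the microbenchmarking mentioned in the second objective, the sketch below times repeated copies of a buffer and converts the result into a bandwidth figure. It is only a stand-in under stated assumptions: it uses a local `memcpy` rather than a UPC operation, the helper names (`now_seconds`, `time_copy`, `bandwidth_mib_s`) are hypothetical, and it does not reproduce the method of any benchmark suite produced by the project.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* CPU time in seconds, taken from the standard C clock(). */
double now_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Mean time in seconds for one copy of `bytes` bytes, averaged over `reps`
 * repetitions. The copy is a local memcpy standing in for the operation
 * under test; an optimizing compiler may elide copies whose result is unused. */
double time_copy(char *dst, const char *src, size_t bytes, int reps)
{
    assert(reps > 0);
    double start = now_seconds();
    for (int r = 0; r < reps; r++)
        memcpy(dst, src, bytes);
    return (now_seconds() - start) / reps;
}

/* Bandwidth in MiB/s for `bytes` bytes moved in `seconds` seconds. */
double bandwidth_mib_s(size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / (1024.0 * 1024.0) / seconds;
}
```

A typical driver would call `time_copy` for a range of message sizes and report `bandwidth_mib_s` for each, skipping sizes whose measured time is too small for the clock resolution.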
C. Teijeiro, G. L. Taboada, J. Touriño, R. Doallo, J. C. Mouriño, D. A. Mallón, B. Wibecan. “Design and Implementation of an Extended Collectives Library for Unified Parallel C”. Journal of Computer Science and Technology (in press). 2012
González-Domínguez, J., Martín, M. J., Taboada, G. L., Touriño, J., Doallo, R., Mallón, D. A. and Wibecan, B. (2012), UPCBLAS: a library for parallel matrix computations in Unified Parallel C. Concurrency Computat.: Pract. Exper., 24: 1645–1667. doi: 10.1002/cpe.1914
C. Teijeiro, G. L. Taboada, J. Touriño, B. B. Fraguela, R. Doallo, D. A. Mallón, A. Gómez, J. C. Mouriño, and B. Wibecan, “Evaluation of UPC programmability using classroom studies,” in Proceedings of the Third Conference on Partitioned Global Address Space Programing Models – PGAS ’09, 2009, p. 10:1-10:7.
D. A. Mallón, A. Gómez, J. C. Mouriño, G. L. Taboada, C. Teijeiro, J. Touriño, B. B. Fraguela, R. Doallo, and B. Wibecan, “UPC performance evaluation on a multicore system,” in Proceedings of the Third Conference on Partitioned Global Address Space Programing Models – PGAS ’09, 2009, p. 9:1-9:7.
D. A. Mallón, G. L. Taboada, C. Teijeiro, J. Touriño, B. B. Fraguela, A. Gómez, and J. C. Mouriño, “Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures,” in 16th European PVM/MPI Users’ Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009, pp. 174–184.
J. González-Domínguez, M. J. Martín, G. L. Taboada, J. Touriño, R. Doallo, and A. Gómez, “A parallel numerical library for UPC,” in Euro-Par 2009 Parallel Processing, 2009, pp. 630–641.
J. González-Domínguez, M. J. Martín, G. L. Taboada, and J. Touriño, “Dense Triangular Solvers on Multicore Clusters using UPC,” in Proceedings of the International Conference on Computational Science (ICCS), 2011, vol. 4, pp. 231–240.
G. L. Taboada, C. Teijeiro, J. Touriño, B. B. Fraguela, R. Doallo, J. C. Mouriño, D. A. Mallón, and A. Gómez, “Performance Evaluation of Unified Parallel C Collective Communications,” 2009 11th IEEE International Conference on High Performance Computing and Communications, pp. 69–78, Jun. 2009.
UPC Operations Microbenchmarking Suite (UOMS)
UPCBLAS: A numerical library for UPC