TY - GEN
T1 - Parallelization hints via code skeletonization
AU - Aguston, Cfir
AU - Ben Asher, Yosi
AU - Haber, Gadi
PY - 2014
Y1 - 2014
AB - Tools that provide optimization hints to program developers face severe obstacles and are often unable to provide meaningful guidance on how to parallelize real-life applications. The main reason is the high complexity and large size of commercially valuable code. Such code is often rich with pointers, heavily nested conditional statements, nested while-based loops, function calls, and so on. These constructs prevent existing compiler analyses from extracting the full parallelization potential. We propose a new paradigm to overcome this issue by automatically transforming the code into a much simpler skeleton-like form that is more conducive to auto-parallelization. We then apply existing source-level automatic parallelization tools to the skeletonized code in order to expose possible parallelization patterns. The skeleton code, along with its parallelized version, is then provided to the programmer in the form of an IDE (Integrated Development Environment) recommendation. The proposed skeletonization algorithm replaces pointers with integer indexes and C-struct references with references to multi-dimensional arrays, because automatic parallelizers cannot handle pointer expressions. For example, while (p != NULL) { p->val++; p = p->next; } will be skeletonized to the parallelizable for (Ip = 0; Ip < N; Ip++) { Aval[Ip]++; }, where Aval[] holds the embedding of the original list. It follows that the main goal of the skeletonization process is to embed pointer-based data structures into arrays. Though the skeletonized code is not semantically equivalent to the original code, it points out a possible parallelization pattern for this code segment and can be used as an effective parallelization hint for the programmer. We applied the method to several representative benchmarks from SPEC CPU 2000 and reached up to an 80% performance gain after several sequential code segments had been manually parallelized based on the parallelization patterns of the generated skeletons. In a separate set of experiments we estimated the potential of skeletonization for a larger set of SPEC 2000 programs and found that an additional 27% of loops could be parallelized/vectorized due to skeletonization.
KW - Parallelization
KW - Skeletonization
KW - Vectorization
UR - http://www.scopus.com/inward/record.url?scp=84896868339&partnerID=8YFLogxK
U2 - 10.1145/2555243.2555275
DO - 10.1145/2555243.2555275
M3 - Conference contribution
SN - 9781450326568
T3 - Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP
SP - 373
EP - 374
BT - PPoPP 2014 - Proceedings of the 2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
T2 - 2014 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2014
Y2 - 15 February 2014 through 19 February 2014
ER -
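
Below the record, a minimal C sketch of the before/after skeletonization transformation described in the abstract. The struct and function names (node, increment_list, increment_skeleton) and the fixed size N are illustrative assumptions; Ip and Aval[] mirror the example quoted in the abstract. This is a sketch of the idea only, not output of the authors' tool.

#include <stddef.h>

struct node { int val; struct node *next; };

enum { N = 1000 };            /* illustrative size of the embedded list */

/* Original pointer-chasing loop: opaque to automatic parallelizers. */
void increment_list(struct node *p) {
    while (p != NULL) {
        p->val++;
        p = p->next;
    }
}

/* Skeletonized form: the list values are embedded into the array Aval[],
   turning the traversal into a countable loop that auto-parallelizers
   and vectorizers can analyze. */
void increment_skeleton(int Aval[]) {
    for (int Ip = 0; Ip < N; Ip++) {
        Aval[Ip]++;
    }
}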