Peter Vanbroekhoven             


 

Dynamic single assignment and pointer conversion

Although classic compiler optimizations like constant propagation, common subexpression elimination, and dead code elimination certainly have their place in a compiler, they are not sufficient for optimizing code for modern processors. In a multiprocessor environment, one may want to parallelize code so that parts of it run concurrently and the program possibly executes faster. This, however, requires techniques quite different from those used in classic compiler optimizations. On vector processors, the same operation can be executed on a whole vector of operands at once. Newer processors also include special instructions that execute an operation on multiple operand sets at once (typically 4, depending on the size of the individual operands), such as the MMX and SSE instructions on x86. To get the maximum out of these facilities, we need techniques that are substantially different from classic compiler techniques. Finally, there is the growing performance gap between processors and main memory, and bridging this gap again requires more advanced techniques than classic compiler optimizations provide.

If we look at each of the examples above, i.e., parallelization, compiling for vector processors or for the MMX and SSE instruction sets, and optimizing memory use, a central theme is that we need to know how operations can be reordered without changing the functionality of the code. In the case of parallelization, code executed in parallel with other code can be interleaved in any way, so we can only parallelize if reordering is allowed. On vector machines we want to execute multiple operations in parallel; besides verifying that this is allowed, one needs to reorder the code to group operations that can be executed together. When optimizing memory use, we want to reorder operations, e.g., to reduce the set of data that is live at any point so that less memory is needed; the sketch below illustrates this. One can see that each of these optimizations wants the maximum opportunity for reordering to be present, together with information about that opportunity that is as accurate as possible. And this is where dynamic single assignment and pointer conversion come in.
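As a minimal, made-up sketch of the memory-use case (not taken from any of our applications), consider a fragment that first produces a whole temporary array and only then consumes it; reordering the operations so that each value is consumed right after it is produced shrinks the live data from an array to a single scalar:

    #include <stdio.h>

    #define N 4

    int main(void)
    {
        int a[N] = {1, 2, 3, 4};
        int tmp[N], b[N], c[N];
        int i;

        /* As written, the first loop produces all of tmp before the second
         * loop consumes it, so all N elements of tmp are live in between. */
        for (i = 0; i < N; i++)
            tmp[i] = a[i] * a[i];
        for (i = 0; i < N; i++)
            b[i] = tmp[i] + 1;

        /* After reordering (fusing the loops), each value is consumed right
         * after it is produced, and a single scalar t replaces tmp. */
        for (i = 0; i < N; i++) {
            int t = a[i] * a[i];
            c[i] = t + 1;
        }

        for (i = 0; i < N; i++)
            printf("%d %d\n", b[i], c[i]);  /* both columns are identical */
        return 0;
    }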

Code in dynamic single assignment (or DSA) form is such that during execution there is only a single assignment to each memory element, be it a scalar variable or an element of an array. Another way to formulate this is that no value is ever overwritten. In multiple assignment code, we have to make sure that all reads of a certain value happen before that value is overwritten. This clearly limits the reordering we can do on the code; such restrictions are called dependencies. Transforming the code to DSA removes those dependencies by removing the overwriting of values. The only remaining restriction on reordering is that we cannot read a value before it is written. As such, DSA provides the maximum opportunity for reordering, short of changing the algorithm itself (i.e., changing the data flow). In the field of parallelization, dynamic single assignment has been used by Feautrier [1], but that work is limited in both applicability and scalability. The work of Kienhuis [2] extends Feautrier's, but still has many limitations. In our work we are trying to devise a new automatic method for transformation to DSA that both works for real-size applications and is extensible. A small sketch of what such a transformation does is shown below.
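As a minimal, hand-made illustration (not the output of our method, and with names like sum_ma and sum_dsa invented for this sketch only), consider a loop that accumulates a sum. In the original code the scalar sum is overwritten in every iteration; in the DSA version every dynamic write gets its own location by expanding sum into an array indexed by the iteration that produces the value:

    #include <stdio.h>

    #define N 8

    /* Multiple-assignment version: sum is overwritten in every iteration, so
     * every read of sum must stay between the two writes that surround it. */
    int sum_ma(const int a[N])
    {
        int sum = 0;
        int i;
        for (i = 0; i < N; i++)
            sum = sum + a[i];
        return sum;
    }

    /* DSA version: each dynamic assignment has its own location, so the only
     * ordering constraint left is that sum[i] is written before it is read. */
    int sum_dsa(const int a[N])
    {
        int sum[N + 1];
        int i;
        sum[0] = 0;
        for (i = 0; i < N; i++)
            sum[i + 1] = sum[i] + a[i];
        return sum[N];
    }

    int main(void)
    {
        int a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
        printf("%d %d\n", sum_ma(a), sum_dsa(a));  /* both print 36 */
        return 0;
    }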

One of the constructs that currently cannot be transformed to DSA automatically is the use of pointers. The problems with pointers are that they render the memory locations they point to anonymous in a certain way, and that pointer arithmetic is allowed instead of direct indexing of memory. One approach is to handle pointers conservatively in the analyses, treating them as either possibly pointing to anything or pointing to a certain set of data as determined by a pointer analysis. Another approach is to convert pointer references into direct references to the arrays those pointers point to. In the context of hardware synthesis, this approach has been used [3], but the resulting program, though nicely mappable onto hardware, is not interesting for analysis since it reveals essentially nothing about the data flow. We are working on extending this method such that the data flow is exposed. The sketch below shows the kind of rewrite pointer conversion performs.
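Again a minimal, hand-made sketch (our actual conversion has to deal with more general pointer arithmetic; the function names are invented for this example): a traversal through a moving pointer is rewritten into explicit indexing of the array the pointer is derived from, so that the accessed element becomes a simple function of the loop counter:

    #include <stdio.h>

    #define N 8

    /* Pointer-based version: the moving pointer p hides which element of a
     * each access touches. */
    int sum_ptr(const int *a, int n)
    {
        const int *p = a;
        int sum = 0;
        while (n-- > 0)
            sum += *p++;
        return sum;
    }

    /* After pointer conversion: the same traversal with direct indexing, so
     * the accessed element is simply a[i]. */
    int sum_idx(const int a[], int n)
    {
        int sum = 0;
        int i;
        for (i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }

    int main(void)
    {
        int a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
        printf("%d %d\n", sum_ptr(a, N), sum_idx(a, N));  /* both print 36 */
        return 0;
    }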

For more details on dynamic single assignment, see [4]. Pointer conversion is part of future work.

[1] P. Feautrier. Dataflow analysis of array and scalar references. International Journal of Parallel Programming, 20(1):23-53, 1991.
[2] B. Kienhuis. MatParser: An array dataflow analysis compiler. Technical report, University of California, Berkeley, February 2000.
[3] L. Séméria and G. De Micheli. SpC: Synthesis of pointers in C: Application of pointer analysis to the behavioral synthesis from C. In Proceedings of the International Conference on Computer-Aided Design (ICCAD'98), pages 321-326, November 1998.
[4] P. Vanbroekhoven, G. Janssens, M. Bruynooghe, H. Corporaal, and F. Catthoor. A step towards a scalable dynamic single assignment conversion. Technical Report CW 360, KULeuven, April 2003.

