Jones N.D., Partial Evaluation and Automatic Program Generation (PDF, 1999)


4.7 Generating a compiler generator: mix3

We have seen how to use mix to generate a stand-alone compiler. In this section we shall demonstrate how a stand-alone compiler generator, cogen, is generated. The third Futamura projection is

cogen = [[mix]]L [mix, mix]

We claim that cogen is a compiler generator, that is, cogen applied to an interpreter yields a compiler. The claim is verified by

[[cogen]]L int = [[([[mix]]L [mix, mix])]]L int    (by definition of cogen)
               = [[mix]]L [mix, int]               (by the mix equation)

since we already know that [[mix]]L [mix, int] yields a compiler. Furthermore, cogen has the interesting property that it is self-generating:

[[cogen]]L mix = [[([[mix]]L [mix, mix])]]L mix    (by definition of cogen)
               = [[mix]]L [mix, mix]               (by the mix equation)
               = cogen                             (by definition of cogen)

We shall not describe cogen here, but its size and speed measures are given in Section 4.10 at the end of the chapter.
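The equational reasoning above can be checked concretely in a toy Python model in which mix performs no real specialization at all: it simply freezes the static input in a closure. Everything here (the list encoding of inputs, the toy interpreter and its source programs) is invented for the sketch; it models only the mix equation, not an actual specializer.

```python
# Programs are modelled as functions of a single input; mix "specializes"
# by mere closure formation, which suffices to check the projections.
def mix(inp):
    p, s = inp
    return lambda d: p([s, d])          # the residual program p_s

def interp(inp):                        # a toy interpreter (invented here)
    (op, n), data = inp
    return data + n if op == 'add' else data * n

source = ('mul', 3)

target   = mix([interp, source])        # first Futamura projection
compiler = mix([mix, interp])           # second Futamura projection
cogen    = mix([mix, mix])              # third Futamura projection

assert interp([source, 14]) == target(14) == 42
assert compiler(source)(14) == 42       # compiler(source) behaves like target
assert cogen(interp)(source)(14) == 42  # cogen(interp) is a compiler
assert cogen(mix)(interp)(source)(14) == 42  # cogen is self-generating
```

Of course a real mix produces a program text rather than a closure; the closure model merely makes the equalities between the projections executable.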

4.8 The tricks under the carpet

4.8.1 Successful self-application: binding-time analysis

For self-application to be successful it is essential to use a prephase called binding-time analysis. Its output is a division: a classification of each program variable as static or dynamic. The program specializer uses the division to determine the static parts of the subject program in advance, instead of analysing the subject program on-line at program specialization time. As a consequence the results of self-application are much smaller and more efficient.

The important point is that the static parts of the subject program are determined prior to the specialization phase, and that the program specializer can use this information when it is run. We have found that supplying a division of the subject program's variables was the simplest way of communicating the necessary insight to mix.

Experiments have shown that specialization can be made still more efficient by annotating the subject program: all assignments and conditionals are marked as either eliminable or residual. This allows the program specializer to determine its actions at very low cost. The subject will not be pursued here, but we will return to it in Chapter 7.
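The annotation idea can be sketched in Python. The command encoding and the `vars_of` helper are hypothetical stand-ins, and the marking criterion used here (every variable the command touches is static) is a deliberate simplification of the treatment in Chapter 7.

```python
# Mark each command 'eliminable' when everything it touches is static,
# else 'residual'. `vars_of` returns the variables a command reads/writes.
def annotate(block, static_vars, vars_of):
    return [('eliminable' if vars_of(cmd) <= static_vars else 'residual', cmd)
            for cmd in block]

# Hypothetical encoding: an assignment is (target, ('var', source)).
block = [('X', ('var', 'X')), ('Y', ('var', 'Q'))]   # X := X; Y := Q
vars_of = lambda cmd: {cmd[0], cmd[1][1]}

assert annotate(block, {'Q', 'Y'}, vars_of) == \
       [('residual', ('X', ('var', 'X'))),
        ('eliminable', ('Y', ('var', 'Q')))]
```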

92 Partial Evaluation for a Flow Chart Language

4.8.2 The use of base functions

We included plenty of base functions to make mix more readable and to keep the focus on the principles of program specialization. Base functions have been used to perform computation in places where it did not matter how the job was done; examples include lookup of variables, store updates, etc. Using base functions has made programming in L less cumbersome and program execution more efficient, since a base function written in the underlying implementation language, Chez Scheme, runs faster than the corresponding interpreted L-instructions.

Partial evaluation of base function calls is done by another base function, reduce, which uses a trivial strategy: if all arguments in a base function call are static, evaluate the call completely. If some are dynamic then leave the base function call untouched, and replace the static argument expressions, if any, by their values. This approach works well when a base function is usually called with all arguments either static or dynamic, but when both static and dynamic arguments are present, useful information is likely to be wasted. One situation where it would not be beneficial to use this simple strategy is the lookup of a dynamic label in a static program, as described in the next section.
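The reduce strategy can be sketched in Python under an assumed representation in which each argument carries a static/dynamic tag. The names and tuple shapes are illustrative, not the book's actual representation.

```python
# reduce strategy for a base-function call: fold if fully static,
# otherwise residualize with static arguments inlined as literals.
def reduce_call(name, fn, args):
    if all(tag == 'S' for tag, _ in args):
        # fully static: evaluate the call at specialization time
        return ('val', fn(*(v for _, v in args)))
    # mixed or dynamic: leave a residual call behind
    return ('call', name,
            [('lit', v) if tag == 'S' else v for tag, v in args])

# a fully static call is folded away
assert reduce_call('add', lambda x, y: x + y,
                   [('S', 2), ('S', 3)]) == ('val', 5)
# a mixed call stays residual; the static argument becomes a literal
assert reduce_call('add', lambda x, y: x + y,
                   [('S', 2), ('D', ('var', 'n'))]) == \
       ('call', 'add', [('lit', 2), ('var', 'n')])
```

The second assertion shows exactly the wasted-information case the text describes: the residual call remembers that one argument is 2, but nothing about add is exploited.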

4.8.3 Variables of bounded static variation

It often happens in partial evaluation that a variable seems dynamic, since it depends on dynamic input, but in fact only takes on finitely many values. In such cases a bit of reprogramming can yield much better results from partial evaluation. This kind of reprogramming, or program transformation, which does not alter the standard meaning of the program but leads to better residual programs, is called a binding-time improvement. The term `improvement' of binding times refers to the goal of the transformation: that more computations can be classified as static and thus be reduced by the partial evaluator. The following shows a classical example from mix itself, and Chapter 12 gives a survey of the most common binding-time improvement techniques.

Consider the following line from the mix algorithm, which finds the basic block labelled by pp:

bb := lookup (pp, program);

Recall that pp is taken from pending and hence is dynamic. If lookup were implemented by a base function then bb would also be dynamic. However, since pp can only assume one of finitely many values (the labels that appear in the program), there is an alternative. We implement the lookup by the following loop:


pp' := pp0;                      (* first label (static) *)
while pp ≠ pp' do
    pp' := next label after pp'; (* pp' remains static *)
bb := basic block at label pp';  (* bb is static too *)
<computations involving bb>

Intuitively, mix compares the dynamic pp to all possible values it can assume (pp0, pp1, . . . ), and specializes code with respect to the corresponding outcome of the lookup (bb0, bb1, . . . ). The point is that the choice between the labels is made at run-time, but their range of values can be known at compile-time.

In the residual program of mix (e.g. the compiler), the flow chart equivalent of this (a lot of nested conditionals) appears:

case pp of
  pp0: <code specialized with respect to bb0>
  pp1: <code specialized with respect to bb1>
  . . .
  ppn: <code specialized with respect to bbn>
end case

In fact the trick of exploiting `statically bounded values' is necessary to avoid trivial self-application in the mix described above. In partial evaluators for more complex languages it is common for this to occur in several places, although here we apply the technique only to labels.

This trick is so common that it has been named The Trick.
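The Trick can be sketched in Python. The block representation and the `specialize` callback are hypothetical stand-ins for mix's code generation; the point is only that the loop over labels is static even though pp itself is dynamic.

```python
# pp is dynamic, but its range of values (the program's labels) is static,
# so we can emit one specialized branch per label: a residual case-statement.
def specialize_dispatch(labels, blocks, specialize):
    return [(lab, specialize(blocks[lab])) for lab in labels]

blocks = {'init': ['X := X + 1'], 'cont': ['Y := 3']}
residual = specialize_dispatch(['init', 'cont'], blocks,
                               specialize=lambda bb: f"<code for {bb}>")
assert residual == [('init', "<code for ['X := X + 1']>"),
                    ('cont', "<code for ['Y := 3']>")]
```

Each branch pairs a possible run-time value of pp with code specialized to the statically known block at that label, exactly the shape of the case-statement above.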

4.8.4 Transition compression on the fly revisited

The size of the generated case-statement naturally depends on how well the value of pp can be approximated, that is, on the size of the set of possible values {pp0, . . . , ppn}. A program point ppi should only be in the set, and thus contribute to the case-statement, if pp can actually assume the value ppi. We now address an important question: which values can pp (which is taken from pending) assume? A first answer is: any label that appears in the subject program. This answer is definitely safe, but a smaller set also suffices. Since we initialize pending to {(pp0, vs0)}, the approximating set must contain pp0, which is assumed to be the first label in the program. When mix encounters a residual if-statement, if exp goto pp' else pp'', the algorithm adds specialized program points containing pp' and pp'', respectively. This implies that all program points pp' and pp'' appearing in the branches of a conditional should also be in the set approximating pp. Thanks to our simple transition compression strategy, which compresses all other gotos, no other labels need appear in the approximating set.

As a consequence we use a base function find-blocks-in-pending to compute the set of basic blocks labelled by those program points pp that can appear in pending. The result is stored in blocks-in-pending. When mix takes a program point pp from pending and looks it up, it does not scan the whole program, only blocks-in-pending. (Romanenko gives the first description of this idea [227].)

The reader might want to re-examine the interpreter in Figure 4.4 and the structure of the generated compiler in Figure 4.8 to see that the generated case-statement contains one branch for each program point among init, cont, jump that is either initial (init) or the target of a conditional jump with a dynamic condition (cont, jump).

The underlying reason why only three of the interpreter's 15 labels contribute to the case-statement is that transition compression is done on the fly. If mix had generated all the trivial gotos and compressed them afterwards, many more specialized program points, namely the targets of all gotos (and not only the targets of residual conditionals), would be added to pending during program specialization. This would mean that every program point pp', being the target of a goto in the subject program, would be in the set {pp0, . . . , ppn}. This advantage of compressing on the fly is certainly not evident at first sight: smaller residual programs are produced when mix is self-applied.
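What find-blocks-in-pending computes can be sketched as follows, under an assumed toy program representation (the real base function of course works on L-programs, not this encoding): the initial label plus the targets of conditionals whose test is dynamic.

```python
# program: {label: (commands, jump)}, where a jump is ('goto', l),
# ('if', expr, l1, l2) or ('return', expr)  -- an invented encoding.
def labels_in_pending(program, first, is_dynamic):
    live = {first}                       # pending starts as {(pp0, vs0)}
    for commands, jump in program.values():
        if jump[0] == 'if' and is_dynamic(jump[1]):
            # only targets of residual (dynamic) conditionals survive
            # transition compression on the fly
            live |= {jump[2], jump[3]}
    return live

program = {
    'init': ([], ('goto', 'loop')),
    'loop': ([], ('if', ('dyn', 'X > 0'), 'body', 'stop')),
    'body': ([], ('goto', 'loop')),
    'stop': ([], ('return', 'X')),
}
assert labels_in_pending(program, 'init',
                         lambda e: e[0] == 'dyn') == {'init', 'body', 'stop'}
```

Note that the label loop, reached only by unconditional gotos, drops out, just as 12 of the interpreter's 15 labels do.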

4.9 The granularity of binding-time analysis

Until now we have assumed that the task of binding-time analysis is to compute one division, valid at all program points. For most small programs this has turned out to be a very reasonable assumption: one division can be used to classify the status of a variable throughout the whole program. There are, however, several objections to this simplified view of BTA.

4.9.1 Pointwise divisions

Consider the following program fragment, where the initial division is (S, D):

read (X Y);
init: X := X + 1;
      Y := Y - 1;
      goto cont;
cont: Y := 3;
next: . . .

Obviously a congruent, uniform division would have to be (S, D), but (judging from the fragment shown alone) a more detailed division init: (S, D), cont: (S, D), next: (S, S) would be safe. We shall call such a division pointwise.

As opposed to the simplest possible binding-time analysis, which computes uniform divisions (Section 4.4.6), an analysis that computes pointwise divisions must take the control flow of the program into account. Constructing the algorithm is not hard and is left as an exercise.

The mix algorithm (Figure 4.7) needs a slight extension to handle pointwise divisions. The division argument should be replaced by a division-table, and each time a specialized program point (pp, vs) has been fetched from pending, the relevant division should be computed by division = lookup(pp, division-table). We have claimed that keeping the division static is crucial for getting good results from self-application. Since pp is static (Section 4.8.3), the use of pointwise divisions does not destroy this property.

4.9.2 Live and dead static variables

The use of an imperative language introduces a serious problem: specialization with respect to dead static data. This problem is not directly caused by self-application, but it appears in every specialization of `non-toy programs', that is, programs beyond a certain size. Consider a specialized program point (pp, vs) where, as usual, vs contains the values of those of the program's variables that are static. Some of these static values might be completely irrelevant to the computations to be performed at the program point pp. For a simple example, consider the program fragment:

start: if <dynamic condition>
       then a := 1; <commands using a>; goto next
       else a := 2; <commands using a>; goto next
next:  <commands not referring to a>;

As far as this program fragment is concerned the variable a is static, hence its value is in vs, even though that value is completely irrelevant to further computations. The unfortunate effect is that the specialized program point (start, . . . ) has two successors: (next, ..1..) and (next, ..2..), even though the code pieces generated for these two successors are clearly identical.

Fortunately, this problem can be avoided by doing a live variable analysis [4] to recognize which static variables can affect future computation or control flow, and by specializing program points only with respect to such variables. This analysis is performed by the base function find-projections which, given the program text and the division, computes the set of live static variables for each program point that can appear in pending. When find-projections is applied to the interpreter text and the corresponding division, the result is

(jump Nextlabel Q)

(cont Qtail Q)

showing that when the interpreter's control is at point jump, the only static variables that will be referenced (before they are redefined) are Nextlabel and Q.
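The analysis behind find-projections can be sketched as ordinary backward liveness, intersected with the statically classified variables. The block-level `succ`/`use`/`defs` encoding is an assumption of this sketch, with `use` holding the upward-exposed reads of a block.

```python
# Backward dataflow: live-in(l) = (use(l) ∪ (live-out(l) - defs(l))),
# restricted to static variables; iterate to a fixpoint.
def live_static(succ, use, defs, static_vars):
    live = {l: set() for l in succ}
    changed = True
    while changed:
        changed = False
        for l in succ:
            out = set().union(*(live[s] for s in succ[l]))
            new = ((out - defs[l]) | use[l]) & static_vars
            if new != live[l]:
                live[l], changed = new, True
    return live

# the start/next example: a is assigned before any use in start,
# and never referred to in next, so the static a is dead everywhere
succ = {'start': {'next'}, 'next': set()}
use  = {'start': set(), 'next': set()}
defs = {'start': {'a'}, 'next': set()}
result = live_static(succ, use, defs, {'a'})
assert result == {'start': set(), 'next': set()}
```

Specializing next only with respect to its live static variables (here: none) merges the two identical successors (next, ..1..) and (next, ..2..) into one.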

In the programs used as examples in this chapter we did not need to compute pointwise divisions as discussed in Section 4.9.1, but it was essential to the successful specialization of the larger programs (such as mix itself) to reclassify dead static variables as dynamic. We started out by computing a uniform division, and the subsequent reclassification of the dead variables transformed the uniform division into a pointwise one.

4.9.3 Polyvariant divisions

Even pointwise divisions can be accused of being overly restrictive, namely when the S/D-status of a variable depends not only on the program point but also on how the program point was reached. An example (assume initial division (S, D)):

read (X Y);
init: if Y > 42 goto xsd else dyn
dyn:  X := Y;
      goto xsd;
xsd:  X := X + 17;
      . . .

A congruent, pointwise division would have to rule xsd: (D, D). A polyvariant division assigns to each label a set of divisions. For the above program, a congruent, polyvariant division would be: init: {(S, D)}, dyn: {(D, D)}, xsd: {(S, D), (D, D)}. A division which is not polyvariant is called monovariant.

Computing a polyvariant division is not hard, but how should mix exploit the more detailed information? When generating code for (pp, vs), there is a set of possible divisions for pp to choose from. Unless (pp, vs) is the initial specialized program point, there exists a (pp1, vs1) such that (pp, vs) ∈ successors((pp1, vs1)), and the proper division for vs depends on the division used for vs1. A way of keeping track of the divisions would be to extend the definition of a specialized program point to include a division component. Then, by employing the trick from Section 4.8.3, the division would remain static under self-application (ensuring good results).

A tempting alternative is to transform the source program prior to partial evaluation by duplicating the blocks that have more than one possible division. The above example is transformed into:

read (X Y);
init:  if Y > 42 goto xsd-s else dyn
dyn:   X := Y;
       goto xsd-d;
xsd-s: X := X + 17;
       . . .
xsd-d: X := X + 17;
       . . .

The duplication would otherwise have been done during specialization by mix itself, so the pre-transformation introduces no `extra' code duplication. The transformation can be regarded as specialization of the source program with respect to binding-time information, and its chief advantage is that the mix algorithm is kept simple (that is, no modification is needed).
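The duplication step can be sketched as follows. Retargeting the jumps inside the copied blocks (init to xsd-s, dyn to xsd-d, following the division in force at each jump site) is the subtle part and is deliberately omitted from this sketch; the `suffix` naming helper is likewise invented for illustration.

```python
# program: {label: block-as-text}; divisions[label]: the set of divisions
# with which that label may be reached.
def duplicate(program, divisions, suffix):
    out = {}
    for label, block in program.items():
        if len(divisions[label]) == 1:
            out[label] = block                    # monovariant: one copy
        else:
            for d in divisions[label]:
                out[label + suffix(d)] = block    # one copy per division
    return out

program = {'init': 'if Y > 42 goto xsd else dyn',
           'dyn':  'X := Y; goto xsd',
           'xsd':  'X := X + 17; ...'}
divisions = {'init': ['SD'], 'dyn': ['DD'], 'xsd': ['SD', 'DD']}
out = duplicate(program, divisions, lambda d: '-s' if d == 'SD' else '-d')
assert set(out) == {'init', 'dyn', 'xsd-s', 'xsd-d'}
```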

In the programs in this chapter (including mix itself) we have not needed polyvariant divisions. The absence of functions, procedures, and subroutines does not encourage the modular programming style where one piece of code can be used in many different contexts, and that eliminates much of the need for polyvariance. The later chapters contain examples where the need for a polyvariant division arises naturally.

4.10 Overview of mix performance

Following are some program sizes and running times for a preliminary version of mix. Int is the Turing interpreter from Figure 4.4, source is the Turing program from Figure 4.3, and target is the result of compiling source from the Turing language to our language L (Figure 4.5). The run times are measured in Sun 3/50 cpu seconds using Chez Scheme, and include garbage collection.

Program              Size (#lines)   Ratio   Size (bytes)   Ratio
source                      4                      57
target                      7        1.75         265         4.6
int                        31                     1.1 K
compiler                   60        1.94         4.5 K       4.1
mix                        65                     2.8 K
compiler generator        126        1.94        15.0 K       5.4

 

                                      Run time   Ratio
output   = [[int]]L [source, data]      0.085
         = [[target]]L data             0.010     8.5
target   = [[mix]]L [int, source]       2.63
         = [[compiler]]L source         0.27      9.7
compiler = [[mix]]L [mix, int]         28.90
         = [[cogen]]L int               3.37      8.6
cogen    = [[mix]]L [mix, mix]         59.30
         = [[cogen]]L mix               7.13      8.3


4.11 Summary and a more abstract perspective

This chapter discussed self-applicable partial evaluation of a flow chart language. In spite of the simplicity of this language, we shall see that almost all the concepts introduced here are also central to partial evaluation of more complex languages: functional languages, logic programming languages, imperative languages with recursive procedures, etc. Furthermore, essentially the same methods suffice for partial evaluation of such seemingly rather different languages.

The essence of program specialization

The previous sections were intentionally concrete and detailed. We now reduce the ideas involved to their essential core, as a preliminary to partial evaluation of more complex languages.

Almost all programming languages involve some form of state, which may be a pair (pp, store) as in the flow chart language; or (fname, environment) in a functional language, where fname is the name of the function currently being evaluated and environment binds actual parameters to their values; or (pname, argumentlist) in Prolog, where pname is the name of the current procedure (predicate) and argumentlist is a list of terms, perhaps containing uninstantiated (free) variables.

Equally central is the concept of a state transition, effected by a goto, a function call, or a predicate call. Each changes the state and control point, described symbolically by

(pp, v) ⇒ (pp', v')

where pp, pp' are control points and v, v' are data such as stores or argument lists.

Suppose there is a way to decompose or factor a data value v into static and dynamic parts without loss of information. Such a data division can be thought of as a triple of functions (stat, dyn, pair), each mapping the set V of data values to itself. The ability to decompose and recompose without information loss can be expressed by three equations:

pair(stat(v), dyn(v)) = v
stat(pair(vs, vd))    = vs
dyn(pair(vs, vd))     = vd

Remarks. In this chapter a division was specified by an S/D vector; for instance SDSD specified the division of V = D^4 where pair((a, c), (b, d)) = (a, b, c, d), stat((a, b, c, d)) = (a, c), and dyn((a, b, c, d)) = (b, d). More generally, offline specializers as used in this chapter will use a single, predefined division, whereas an online specializer will decide static and dynamic projections more dynamically.
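For the SDSD instance just mentioned, the three division equations can be checked directly:

```python
# The SDSD division of V = D^4: static projection keeps positions 1 and 3,
# dynamic projection keeps positions 2 and 4, pair interleaves them back.
def stat(v):
    a, b, c, d = v
    return (a, c)

def dyn(v):
    a, b, c, d = v
    return (b, d)

def pair(vs, vd):
    (a, c), (b, d) = vs, vd
    return (a, b, c, d)

v = (1, 2, 3, 4)
assert pair(stat(v), dyn(v)) == v            # recomposition loses nothing
assert stat(pair((1, 3), (2, 4))) == (1, 3)
assert dyn(pair((1, 3), (2, 4))) == (2, 4)
```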

Using the division, the transition can be decomposed into:

(pp, v) ⇒ (pp', v') = (pp, pair(vs, vd)) ⇒ (pp', pair(vs', vd'))

This transition can be specialized by reassociating to get


((pp, vs), vd) ⇒ ((pp', vs'), vd')

This is also a transition, but one with specialized control points (pp, vs) and (pp', vs'), each incorporating some static data. The runtime data are vd and vd', the results of the dynamic projections.

Fundamental concepts revisited

Residual code generation amounts to finding commands or a function or procedure call which syntactically specifies the transition from vd to vd'. If vd = vd' then transition compression may be possible, since no residual code beyond at most a control transfer need be generated. (For flow charts, this happens if the basic block begun by pp contains no dynamic expressions or commands.) Finally, we have seen that the congruence condition is needed for code generation. In the current context this becomes: vs' must be functionally determined by vs in every transition.

4.12 Exercises

Exercise 4.1 Write a program and choose a division such that partial evaluation without transition compression terminates, and partial evaluation with transition compression on the fly (as described in this chapter) loops. □

Exercise 4.2 The purpose of this exercise is to investigate how much certain extensions to the flow chart language would complicate partial evaluation. For each construction, analyse possible problems and show the specialization time computations and the code generation.

1. for loop,
2. while loop,
3. case/switch conditional,
4. computed goto, cgoto <Expr>, where Expr evaluates to a natural number, i.e. a label,
5. gosub ... return.

Do any of the above constructions complicate the binding-time analysis? □

Exercise 4.3 At the end of Section 4.2.2 it is mentioned that mix could generate residual programs in a low-level language, e.g. machine code.

1. Write the mix equation for a mix that generates machine code.
2. Do the Futamura projections still hold?
3. What are the consequences for compiler generation? □

Exercise 4.4 Consider the mix-generated program in Figure 4.5. The program is suboptimal in two ways; discuss how to revise the partial evaluation strategy to obtain an optimal residual program in this particular example.

1. The same conditional statement appears twice.
2. The assignments to the variable Left do not contribute to the final answer.

Would the proposed revisions have any adverse effects? □

Exercise 4.5 Specialize the Turing interpreter with respect to the following program:

0: if B goto 3
1: right
2: goto 0
3: write 1
4: if B goto 7
5: left
6: goto 4
7: write 1

□

Exercise 4.6 The mix equation and the Futamura projections as presented in this chapter gloss over the fact that partial evaluation (here) consists of a binding-time analysis phase and a specialization phase. Refine the mix equation and the Futamura projections to reflect this two-phase approach. □

Exercise 4.7 Will specialization of the Turing interpreter (Figure 4.4) with respect to a program p terminate for all Turing programs p? □

Exercise 4.8 Use the algorithm in Section 4.4.6 to determine a congruent division for the Turing interpreter when the division for the input variables (Q, Right) is

1. (S, D),
2. (D, S). □

Exercise 4.9 Write a binding-time analysis algorithm that computes a pointwise division. □

Exercise 4.10 Write a binding-time analysis algorithm that computes a polyvariant division. □

Exercise 4.11 The binding-time analysis algorithm from Section 4.4.6 is not likely to be very efficient in practice. Construct an efficient algorithm to do the same job. □