In this lab you extend your ChocoPy code generator with translation of function definitions and function calls.
See the slides of Lecture 13 on code generation mechanics and the ChocoPy v2.2: RISC-V Implementation Guide for an explanation of the calling convention for ChocoPy.
In the previous lab you used rewrite rules to transform source ASTs to target ASTs. The transformations were context-free. That is, an addition operation can be translated without any information about the context in which it appears. This is not the case for transformations in a compiler. For example, compiling a variable in an expression requires knowing where it is stored on the stack, i.e. its offset from the frame pointer.
The slides of Lecture 13 introduce the use of dynamic rewrite rules in Stratego to implement such context-sensitive transformations. See the paper by Bravenboer et al. (2006) referenced on the lecture page for an overview, including applications of dynamic rules. We discuss some examples here that may be useful for your compiler.
A dynamic rewrite rule in Stratego is like a normal rewrite rule, but it is generated at run-time (dynamically) and some of its variables are bound in the context of its definition. For example, consider the following definition of a constant propagation transformation strategy that propagates assignments of constants to variables to uses of those variables (in straight-line code):
rules
prop-const =
PropConst
<+ prop-const-assign
<+ (all(prop-const); try(eval))
prop-const-assign =
Assign([Target(?x)], prop-const => e)
; if <is-value> e
then rules( PropConst : Var(x) -> e )
else rules( PropConst :- Var(x) )
end
eval : Add(Int(i), Int(j)) -> Int(<addS>(i, j))
is-value = ?Int(_)
The construct rules( PropConst : Var(x) -> e )
generates a new dynamic rule for the specific values of x
and e
found in the context.
Thus, if PropConst
is later applied to a concrete variable for which a rule has been defined, it is replaced with the corresponding right-hand side.
If a different rule had been defined previously for Var(x)
, the new rule overwrites the old one.
The construct rules( PropConst :- Var(x) )
undefines all dynamic PropConst
rules for Var(x)
.
We discuss some examples of applications of dynamic rules that can be useful in your compiler.
Dynamic rules can be used to implement a counter. For example, the following rules keep track of stack offsets:
rules
stack-set(|n) =
rules(Stack : () -> n); !n
stack-get =
<Stack>() <+ !0
stack-inc(|n) =
stack-set(|<add>(<stack-get>, n))
Dynamic rules can be scoped using the construct {| L : s}
, where L
is the label of the dynamic rule. For example, the stack rules above can be used in a transformation to generate subsequent stack offset and at end retrieve the total offset needed:
{| Stack
: stack-set(|0)
; <some-transformation> t => instrs
; size := <stack-get>
|}
After leaving the scope, all definitions for Stack
in the scope are eliminated and the value before the scope is available again. For example,
the code
stack-set(|10)
; debug(!"stack: ")
; {| Stack
: stack-set(|20)
; debug(!"stack: ")
|}
; stack-get
; debug(!"stack: ")
will print
stack: 10
stack: 20
stack: 10
Another application of dynamic rules is to distribute contex-sensitive information about bindings. For example, the following rules bind an offset with respect to the framepointer to a variable name:
rules
var-offset-set(|x, n) =
rules(VarOffset : x -> n)
var-offset-get :
x -> n
with <VarOffset> x => n
This can be used in handling the formal parameters of a function to associate with parameter the next offset in the stack:
rules
fun-arg :
TypedVar(x, t) -> offset
with var-offset-set(|x, <stack-get => offset>)
with stack-inc(|4)
Then, when translating a variable, the dynamic rule can be used to lookup its offset:
rules
exp-to-instrs-(|r, regs) :
Var(x) -> [Lw(r, <int-to-string>offset, "fp")]
with <var-offset-get>x => offset
The main objective of this lab is to implement the compilation of function definitions and function calls. We illustrate the process with the following example ChocoPy program:
def callee(x : int, y : int, z: int) -> int:
a : int = 1
b : int = 2
return x + y + z + a + b
def caller():
d : int = 0
d = callee(345, 4357, 235)
Compiling a function definition consists of generating code for setting up and breaking down an activation record (aka call frame).
This process follows a calling convention such that caller and callee agree about how to pass arguments to and return results from function calls.
The process starts with advancing the stack pointer register sp
so that the call frame has enough space for the local variables and tempories in the call.
Then, the return address (ra
) and frame pointer (fp
) registers are saved in the call frame so that they are not overwritten on subsequent calls.
Next, the local variables are initialized.
This is followed by the computation of the body of the function, leaving the return value in the a0
register.
Finally, the return address and frame pointer registers are restored, and the function jumps to the return address.
The function callee
above is translated as follows:
.globl $callee
$callee:
addi sp, sp, -@callee.size # reserve space for stack frame
sw ra, @callee.size-4(sp) # save return address
sw fp, @callee.size-8(sp) # save control link (fp)
sw fp, @callee.size(sp) # new fp is at old SP
li a0, 1 # initialize local variable a
sw a0, -16(fp)
li a0, 2 # initialize local variable b
sw a0, -12(fp)
lw a0, 8(fp) # load argument x
lw t0, 4(fp) # load argument y
add a0, a0, t0 # x + y
lw t0, 0(fp) # load argument z
add a0, a0, t0 # (x + y) + z
lw t0, -16(fp) # load local variable a
add a0, a0, t0 # (x + y + z) + a
lw t0, -12(fp) # load local variable b
add a0, a0, t0 # (x + y + z + a) + b
j label_47
mv a0, zero
j label_47
label_47:
.equiv @callee.size, 16
lw ra, -4(fp) # restore return address
lw fp, -8(fp) # restore frame pointer
addi sp, sp, @callee.size # restore stack pointer
jr ra # return to caller
In this translation you should compute the size of the stack frame needed to hold all local variables, temporaries, return address, and frame pointer
To compile variables bind the stack offsets of formal parameters and local variables and transfer these offsets from definition site to use site. (See the discussion about dynamic rewrite rules above.)
Given these offsets using a variable and updating a variable is a matter of loading the variable from the stack into a register (see the lw
instructions in the code above), and storing the value a register into the stack (see the sw
instructions in the code above).
Executing a function call requires computing the values of the actual parameters and transferring these to the code of the function following the calling convention. The default ChocoPy calling convention is to pass arguments on the stack in the same order as the function expects them.
For example, the following ChocoPy function call
def caller():
d : int = 0
d = callee(345, 4357, 235)
is translated to the following RISC-V code:
.globl $caller
$caller:
addi sp, sp, -@caller.size
sw ra, @caller.size-4(sp)
sw fp, @caller.size-8(sp)
sw fp, @caller.size(sp)
li a0, 0 # initialize local variable d
sw a0, -12(fp)
sw sp, -12(sp) # reserve space for arguments
li a0, 345 # evaluate first argument
sw a0, 12(sp) # push on stack
li a0, 4357 # evaluate second argument
sw a0, 8(sp) # push on stack
li a0, 235 # evaluate third argument
sw a0, 4(sp) # push on stack
jal $callee # call function
sw sp, 12(sp) # clean up stack
sw a0, -12(fp) # return value in a0
j label_48
mv a0, zero
j label_48
label_48:
.equiv @caller.size, 12
lw ra, -4(fp)
lw fp, -8(fp)
addi sp, sp, @caller.size
jr ra
When using registers to store the intermediate values of an expression, a function call in the middle of such an expression may overwrite such registers.
One approach for avoiding this is to lift function calls to the top-level statements in the context.
For example, the nested call to inc
in the following function
def caller():
d : int = 0
d = callee(345 + 81 + inc(13), 4357, 235)
can be lifted to top-level using an extra local variable:
def caller( ) :
d : int = 0
temp_2 : int = 0
temp_2 = inc(13)
d = callee(345 + 81 + temp_2, 4357, 235)
Define a transformation that lifts nested functions to top-level.
You can use <newname> "temp_"
to generate a unique new name.
The ChocoPy language includes an execution envionrment consisting of a suite of standard functions such as print
.
We provide implementations of these functions as a set of Stratego strategies in the project template.
Integrate this execution environment in the code your compiler generates.
Generating code in a systematic way will generate combinations of instructions that are not very efficient. For example, storing a register on the stack only to retrieve it from the stack in the next instruction. As a challenge, consider avoiding to generate such instructions or to transform the generated code
The ChocoPy calling convention uses the stack to pass all arguments to functions. Design an alternative calling convention that passes the first n
arguments via registers (for some small value of n
). For functions with few arguments or that doesn’t call other functions, this may be quite efficient. (But if a function calls another function, it may end saving the registers on the stack after all.)