Lab 11: Compiling Statements & Nested Functions

Project
December 11, 2020

In this lab you develop code generation for statements and extend your implementation of functions to deal with nested functions.

See the slides of Lecture 14 for a discussion of static links and several critical example ChocoPy programs.

Objectives

  1. Compiling statements
  2. Compiling string constants
  3. Execution environment
  4. Compiling nested functions
  5. Challenges

Miscellaneous

Compiling Statements

Develop transformation rules for assignments and control-flow statements. Here is a sketch for a rule transforming the if statement.

rules

  stat-to-instrs-(|r, regs) :
    IfElse(e, Block(stats1), Else(Block(stats2))) -> <concat>[
      …
    ]    
    with <exp-to-instrs(|r, regs)> e => instrs0
    with <stats-to-instrs(|r, regs)> stats1 => instrs1
    with <stats-to-instrs(|r, regs)> stats2 => instrs2

String Constants

String constants in ChocoPy can be encoded as constant objects. We will discuss objects in more detail next week, but you can already translate string constants. For example

message : str = "hello"
target : str = "world"

print(message + " " + target)

A constant can be translated to a constant object. For example, the string constant "hello" becomes:

.globl const_286
const_286:
.word 3
.word 6
.word $str$dispatchTable
.word 5
.string "hello"
.align 2

You can then use the label (const_286 in this case) to load the string into a register.

  la a0, const_286  # load string constant

Execution Environment

From the reference compiler at you can obtain the implementation of the built-in functions of ChocoPy.

Compiling Nested Functions

Handling Shadowing: Making Names Unique

In ChocoPy definitions can shadow (make invisible) the definition of other names. You will have encountered this in your definition of a type system for Statix. For example, the following program defines two functions named foo and several variables named a:

a : int = 10
def foo(a: int) !- int:
 def foo(b : int) !- int:
 a : int = 20
 return a + b
 return foo(a + 10)

print(foo(a))

In order to make code generation unambiguous you can transform programs to make names unique. For example, the program above could be compiled to the following program, where the name of the enclosing function is used to disambiguate names:

$a : int = 10
def $foo($foo.a: int) !- int:
 def $foo.foo($foo.foo.b : int) !- int:
 $foo.foo.a : int = 20
 return $foo.foo.a + $foo.foo.b
 return $foo($foo.a + 10)

print($foo($a))

Note that this program does not parse, since $ is not part of the syntax of identifiers. But the approach can be used in abstract syntax, and the names can be used in the target RISC-V code.

Nested Functions

ChocoPy supports the definition of nested functions. That is, a function definition can be nested in another function defintion. This entails that they body of a nested function can access the parameters and local variables of the enclosing function(s). For example, here is a (contrived) program with nested functions:

a : int = 10

def foo(x : int) -> int:
  b : int = 0

  def aux(i : int) -> int:
    return b + i

  def bar(y : int) -> int:
    c : int = 0

    def baz(z : int) -> int:
      d : int = 0
      d = aux(c + 1)
      return a + x + y + z

    return baz(a + b + x)

  b = aux(x)
  return bar(b + 10)

print(foo(a))

Global variables can be accessed directly using their named label. Formal parameters and local variables can be accessed via their offset from the frame pointer. The parameters and local variables of enclosing functions are accessible to nested functions. They are stored in the call frame of the corresponding function, which is on the stack when a nested function is being executed. Thus, to access those variables we need to find the corresponding call frame. However, the call frame of the enclosing function is not necessarily the call frame of the caller of a function. For example, when a function is recursive, there may be multiple call frames separating the call frame of a nested function and that of its (statically) enclosing function.

A static link is an additional implicit argument of a function that points to the call frame of its directly enclosing function. Via the chain of static links, the call frames of all enclosing functions can be reached. See the slides of Lecture 14 for example ChocoPy programs with nested functions and the use of static links to implement them.

Making Nesting Explicit

Once you have added static links to your code generator, you can implement the calling of nested functions and access to local variables and formal parameters of nested functions. To do that correctly, you need to know how many static links to follow. One approach is to compute before code generation as a program transformation, annotating all function calls with the number of links to follow. For example, for the program above that would lead to:

a : int = 10

def foo(x : int) -> int:
  b : int = 0

  def aux(i : int) -> int:
    return b/1 + i/0

  def bar(y : int) -> int:
    c : int = 0

    def baz(z : int) -> int:
      d : int = 0
      d = aux/2(c/1 + 1)
      return a/0 + x/2 + y/1 + z/0

    return baz/0(a/0 + b/1 + x/1)

  b = aux/0(x/0)
  return bar/0(b/0 + 10)

print(foo/0(a/0))

Evaluation Order

A question to consider when compiling function calls is whether the order of evaluation of function arguments is relevant. For a language such as C, the order of evaluation is not defined, and up to the compiler to chose an order. This means that programmers cannot count on the order being deterministic. Is the order of evaluation of function call arguments defined for ChocoPy? Does that correspond to the semantics of Python?

Challenges

Boxed vs Unboxed Representation of Primitive Values

The ChocoPy reference implementation uses a boxed representation of strings, integers, and booleans. Is that necessary? Is it possible to use an unboxed representation?

Functions as First-Class Citizens

Functions in ChocoPy can be nested. This means that they can have ‘free variables’ that are defined in the context of the function definitions. But ChocoPy functions are not first-class citizens. Thus, one cannot pass a function to another function or store it in a data structure. Can you extend ChocoPy with functions as first-class citizens, i.e. add anonymous function literals to the language? What syntax would you give to such function literals? What would be your strategy for implementing this feature?