Lab 12: Compiling Objects

Project
December 18, 2020

Develop code generation for ChocoPy classes and methods.

Objectives

  1. Compiling objects
  2. Compiling lists
  3. Challenges
    • Garbage collection
    • Register allocation
    • Peephole optimizations

Compiling Objects

The main objective for today’s lab is to extend your compiler to support the compilation of classes. See also the slides of Lecture 15 and the ChocoPy implementation guide.

Objects in ChocoPy

ChocoPy supports object-oriented programming through classes with attributes and methods. The following example program (from the reference manual) illustrates many of the features of classes in ChocoPy:

class animal(object):
  makes_noise:bool = False

  def make_noise(self: "animal") -> object:
    if (self.makes_noise):
      print(self.sound())

  def sound(self: "animal") -> str:
    return "???"

class cow(animal):
  def __init__(self: "cow"):
    self.makes_noise = True

  def sound(self: "cow") -> str:
    return "moo"

c:animal = None
c = cow()
c.make_noise()

A class defines a type and inherits from a superclass of which it is a subtype.

A class has attributes that are initialized with a literal value, which is None in the case of object types.

A class has method defintions, which are regular ChocoPy function definitions, except that methods always have a first argument (self) of the enclosing class type, which represents the objects to which the method is applied. Methods can have nested functions.

A method call e0.f(e1, ...) invokes a method on the object. The actual implementation called corresponds to the dynamic type of the object value. That is, when the object is of a subtype t of the static type of e0, and the method f has been overridden in t or a supertype of t, the method lowest in the class hierarchy is called. This is known as dynamic dispatch.

A function with the same name as the class (e.g. cow() in the example above) is used to create objects. When a class or one of its superclasses has an __init__ method, that method is called implicitly on construction of the object.

Object Layout and Prototypes

An object value is represented by a chunk of memory in the heap, consisting of a type tag, the size of the object, a pointer to the dispatch table (see below) of the class, and the values of the attributes of the class.

Objects are allocated by the alloc function in the execution environment, by copying a pointer to a prototype object. The prototype object for a class is an object following the layout described above with attribute values set to their initial values in the corresponding class. Thus, there is not need to perform the default initialization programmatically. After copying the prototype object, the __init__ method is invoked to apply further custom initialization.

Dispatch Tables

Since all object instance of a class share the same set of methods, there is no need to copy the addresses of the corresponding functions into each object. Instead, an object contains a pointer to a dispatch table, which contains the addresses (labels) of the functions implementing the methods. Note that the methods of inherited method are also included in the dispatch table.

Compiling Methods

Methods are standard top-level functions and can be compiled using the same approach as used for those. The first argument of a function is a pointer to the object to which the method is applied. Since methods and attributes are accessed through this self argument, no special measures are needed.

Calling Methods

Method calls are similar to top-level function calles, but the first argument is passed as a receiver object. That is, e0.f(e1, ...) is essentially a function call f(e0, e1, ...). However, the actual function to be called should be obtained through the dispatch table to which the value of e0 points. In order to access that object it should not be None.

Accessing Attributes

An attribute access e.x is implemented by accessing the object at the offset corresponding to the attribute x. Also for attribute accesses, the object should not be None.

Compiling Lists

ChocoPy has lists, which are mutable sequences of a fixed length. So, lists behave like arrays in C. Operations on lists are length len, indexing lst[i], and concatenation lst1 + lst2. Lists are represented as objects with len attributes to represent the elements of the list. Concatenation constructs a new list copying the original lists.

Challenges

Garbage Collection

Objects are allocated on the heap. What happens when your program runs out of heap memory to allocate objects? Is compiled code memory safe? Develop a garbage collection algorithm for the execution environment of ChocoPy progams in order to support automatic memory management.

Register Allocation

In Lecture 15 we discussed register allocation using graph coloring. Use a separate register allocation phase to assign temporary values to registers in order to avoid saving values to memory.

Peephole Optimization

A peephole optimization can be applied to generated code, i.e. to the AST of the generated program. A peephole optimization recognizes a couple of adjacent instructions and replaces them with something more efficient. Extend your compiler with a peephole optimization phase after code generation that optimizes generated code. Make sure that optimizations do preserve the semantics of the code.