Lecture 13: Functions, Calling Conventions, and Code Generation Mechanics

Eelco Visser
Lecture | PDF
December 03, 2020

In this lecture we study further code generation techniques.

Calling Conventions

We first study functions in ChocoPy and the operational semantics rules that define their meaning. Then we look at activation records, what motivates their existence, and how they are used to implement function calls. We reconstruct the RISC-V calling convention described in the ChocoPy language implementation guide, looking in detail at the implementation of an example caller and callee.

Dynamic Rewrite Rules

In the second part of the lecture, we study dynamic rewrite rules in Stratego, which can be used to define context-sensitive transformations. We look at examples for keeping track of stack offsets, and mapping variables to their offsets. See the paper by Bravenboer et al. (2006) for more information about dynamic rules.

Code Generation Mechanics

In the third part of the lecture, we take a broader look at mechanics for code generation, the properties we would like compilers to adhere to, and to what extend existing mechanisms support the verification of those properties out of the box.

  • Code generation by string manipulation
    • Stratego does support string templates, which can be useful to do quick code generation
  • Code generation by term transformation
    • This is what we do in this course. A
  • Guaranteeing syntactic correct target code
    • By means of type checking terms against signature; coming up for Stratego [SLE 2020]
  • Program transformation with concrete object syntax
    • Quoting concrete syntax, while transforming the underlying abstract syntax; see [GPCE 2002, OOPSLA 2004, SCP 2010]
  • Hygienic transformations (avoiding name capture)
    • Hygienic macros in Scheme/Racket ensure capture avoidance in binding constructs introduced by macros
    • The namefix approach guarantees capture avoidance by checking after the fact
    • We are working on avoiding capture in rename refactoring based on scope graphs
  • Guaranteeing type correct target code
    • In intrinsically-typed interpreters/compilers, the type safety of the transformation is guaranteed by construction; see [POPL 2018, 2021] for explorations of this approach in Agda
  • Preservation of dynamic semantics
    • The CompCert certified compiler proves that its code generation preserves the semantics of the source language in target programs; a challenge is how to realize such certified compilation with minimal effort

We only briefly touch on these topics. See the references for further information.

References

  • Intrinsically-Typed Compilation with Nameless Labels
    Proceedings of the ACM on Programming Languages 5(POPL) 2021 [bib, researchr]
  • PACMPL 2(POPL) 2018 [doi, bib, researchr, ]
    A definitional interpreter defines the semantics of an object language in terms of the (well-known) semantics of a host language, enabling understanding and validation of the semantics through execution. Combining a definitional interpreter with a separate type system requires a separate type safety proof. An alternative approach, at least for pure object languages, is to use a dependently-typed language to encode the object language type system in the definition of the abstract syntax. Using such intrinsically-typed abstract syntax definitions allows the host language type checker to verify automatically that the interpreter satisfies type safety. Does this approach scale to larger and more realistic object languages, and in particular to languages with mutable state and objects? In this paper, we describe and demonstrate techniques and libraries in Agda that successfully scale up intrinsically-typed definitional interpreters to handle rich object languages with non-trivial binding structures and mutable state. While the resulting interpreters are certainly more complex than the simply-typed λ-calculus interpreter we start with, we claim that they still meet the goals of being concise, comprehensible, and executable, while guaranteeing type safety for more elaborate object languages. We make the following contributions: (1) A dependent-passing style technique for hiding the weakening of indexed values as they propagate through monadic code. (2) An Agda library for programming with scope graphs and frames, which provides a uniform approach to dealing with name binding in intrinsically-typed interpreters. (3) Case studies of intrinsically-typed definitional interpreters for the simply-typed λ-calculus with references (STLC+Ref) and for a large subset of Middleweight Java (MJ).
  • SCP 75(7) 2010 [doi, bib, researchr, ]
    Software written in one language often needs to construct sentences in another language, such as SQL queries, XML output, or shell command invocations. This is almost always done using unhygienic string manipulation, the concatenation of constants and client-supplied strings. A client can then supply specially crafted input that causes the constructed sentence to be interpreted in an unintended way, leading to an injection attack. We describe a more natural style of programming that yields code that is impervious to injections by construction. Our approach embeds the grammars of the guest languages (e.g. SQL) into that of the host language (e.g. Java) and automatically generates code that maps the embedded language to constructs in the host language that reconstruct the embedded sentences, adding escaping functions where appropriate. This approach is generic, meaning that it can be applied with relative ease to any combination of context-free host and guest languages.
  • FUIN 69(1-2) 2006 [pdf, doi, bib, researchr, ]
    The applicability of term rewriting to program transformation is limited by the lack of control over rule application and by the context-free nature of rewrite rules. The first problem is addressed by languages supporting user-definable rewriting strategies. The second problem is addressed by the extension of rewriting strategies with scoped dynamic rewrite rules. Dynamic rules are defined at run-time and can access variables available from their definition context. Rules defined within a rule scope are automatically retracted at the end of that scope. In this paper, we explore the design space of dynamic rules, and their application to transformation problems. The technique is formally defined by extending the operational semantics underlying the program transformation language Stratego, and illustrated by means of several program transformations in Stratego, including constant propagation, bound variable renaming, dead code elimination, function inlining, and function specialization.
  • OOPSLA 2004 [doi, bib, researchr, ]
    Application programmer's interfaces give access to domain knowledge encapsulated in class libraries without providing the appropriate notation for expressing domain composition. Since object-oriented languages are designed for extensibility and reuse, the language constructs are often sufficient for expressing domain abstractions at the semantic level. However, they do not provide the right abstractions at the syntactic level. In this paper we describe MetaBorg, a method for providing concrete syntax for domain abstractions to application programmers. The method consists of embedding domain-specific languages in a general purpose host language and assimilating the embedded domain code into the surrounding host code. Instead of extending the implementation of the host language, the assimilation phase implements domain abstractions in terms of existing APIs leaving the host language undisturbed. Indeed, MetaBorg can be considered a method for promoting APIs to the language level. The method is supported by proven and available technology, i.e. the syntax definition formalism SDF and the program transformation language and toolset Stratego/XT. We illustrate the method with applications in three domains: code generation, XML generation, and user-interface construction.
  • GPCE 2002 [doi, bib, researchr, ]
    Meta programs manipulate structured representations, i.e., abstract syntax trees, of programs. The conceptual distance between the concrete syntax meta-programmers use to reason about programs and the notation for abstract syntax manipulation provided by general purpose (meta-) programming languages is too great for many applications. In this paper it is shown how the syntax definition formalism SDF can be employed to fit any meta-programming language with concrete syntax notation for composing and analyzing object programs. As a case study, the addition of concrete syntax to the program transformation language Stratego is presented. The approach is then generalized to arbitrary meta-languages.