Lecture 16: Data-Flow Analysis

Eelco Visser
Lecture | PDF
January 07, 2021

In this lecture we study data-flow analysis and the mathmetical framework of monotone frameworks, a foundation for the definition of data-flow analyses.

We also introduce the FlowSpec language for declarative definition of intra-procedural, flow-sensitive data-flow analyses. FlowSpec provides rules for mapping an AST to a control-flow graph and for defining the transfer functions associated with nodes in the graph.

References

  • Static program analysis is the art of reasoning about the behavior of computer programs without actually running them. This is useful not only in optimizing compilers for producing efficient code but also for automatic error detection and other tools that can help programmers. As known from Turing and Rice, all interesting properties of the behavior of programs written in common programming languages are mathematically undecidable. This means that automated reasoning of software generally must involve approximation. It is also well known that testing may reveal errors but not show their absence. In contrast, static program analysis can - with the right kind of approximations - check all possible executions of the programs and provide guarantees about their properties. The challenge when developing such analyses is how to ensure high precision and efficiency to be practically useful. This teaching material concisely presents the essential principles and algorithms for static program analysis. We emphasize a constraint-based approach where suitable constraint systems conceptually divide analysis into a front-end that generates constraints from program code and a back-end that solves the constraints to produce the analysis results. The style of presentation is intended to be precise but not overly formal. The readers are assumed to be familiar with advanced programming language concepts and the basics of compiler construction. The concepts are explained using a tiny imperative programming language, TIP, which suffices to illustrate the main challenges that arise with mainstream languages.
  • SLE 2017 [doi, bib, researchr, ]
    We present FlowSpec, a declarative specification language for the domain of dataflow analysis. FlowSpec has declarative support for the specification of control flow graphs of programming languages, and dataflow analyses on these control flow graphs. We define the formal semantics of FlowSpec, which is rooted in Monotone Frameworks. We also discuss a prototype implementation of the language, built in the Spoofax Language Workbench. Finally, we evaluate the expressiveness and conciseness of the language with two case studies. These case studies are analyses for Green-Marl, an industrial, domain-specific language for graph processing. The first case study is a classical dataflow analysis, scaled to this full language. The second case study is a domain-specific analysis of Green-Marl.