[162]  Solving quantified linear arithmetic by counterexample-guided instantiation. Formal Methods in System Design (FMSD). 2017. [ bib  list ] 
[161]  Refutation-based synthesis in SMT. Formal Methods in System Design (FMSD). 2017. [ bib  list ] 
[160]  Proactive Synthesis of Recursive Tree-to-String Functions from Examples. 31st European Conference on Object-Oriented Programming (ECOOP 2017). 2017. [ bib  DOI  http  list ] 
[159]  Towards a Compiler for Reals. ACM Trans. Program. Lang. Syst. (TOPLAS). 2017. [ bib  DOI  list ] 
[158]  Contract-Based Resource Verification for Higher-order Functions with Memoization. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2017. [ bib  list ] 
[157]  SMT-Based Checking of Predicate-Qualified Types for Scala. Scala Symposium. 2016. [ bib  list ] 
[156]  An Update on Deductive Synthesis and Repair in the Leon Tool. 5th Workshop on Synthesis. 2016. [ bib  list ] 
[155]  Translating Scala Programs to Isabelle/HOL (System Description). International Joint Conference on Automated Reasoning (IJCAR). 2016. [ps] [ bib  list ] 
[154]  Programming with Enumerable Sets of Structures. OOPSLA. 2015. [ bib  list ] 
[153]  Synthesizing Java Expressions from Free-Form Queries. OOPSLA. 2015. [ bib  list ] 
[152]  Automating Grammar Comparison. OOPSLA. 2015. [ bib  list ] 
[151]  Deductive Program Repair. Computer Aided Verification (CAV). 2015. [ bib  list ] 
[150]  Counterexample-Guided Quantifier Instantiation for Synthesis in SMT. Computer Aided Verification (CAV). 2015. [ bib  list ] 
[149]  Counter-Example Complete Verification for Higher-Order Functions. Scala Symposium. 2015. [ bib  list ] 
[148]  Sound Reasoning about Integral Data Types with a Reusable SMT Solver Interface. Scala Symposium. 2015. [ bib  list ] 
[147]  Interactive Synthesis Using Free-Form Queries (Tool Demonstration). International Conference on Software Engineering (ICSE). 2015. [ bib  list ] 
[146]  Developing Verified Software Using Leon (Invited Contribution). NASA Formal Methods (NFM). 2015. [ bib  list ] 
[145]  Induction for SMT Solvers. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2015. [ bib  list ] 
[144]  On Induction for SMT Solvers. EPFL Technical Report EPFL-REPORT-201755. 2014. [ bib  list ] 
[143]  On Synthesizing Code from Free-Form Queries. EPFL Technical Report EPFL-REPORT-201606. 2014. [ bib  list ] 
[142]  Synthesizing Functions from Relations in Leon (Invited Contribution). Logic-Based Program Synthesis and Transformation (LOPSTR). 2014. [ bib  list ] 
[141]  SciFe: Scala Framework for Efficient Enumeration of Data Structures with Invariants. Scala Workshop. 2014. [ bib  list ] 
[140]  Checking Data Structure Properties Orders of Magnitude Faster. Runtime Verification (RV). 2014. [ bib  list ] 
[139]  Verifying and Synthesizing Software with Recursive Functions (Invited Contribution). 41st International Colloquium on Automata, Languages, and Programming (ICALP). 2014. [ps] [ bib  list ] 
[138]  Symbolic Resource Bound Inference for Functional Programs. Computer Aided Verification (CAV). 2014. [ps] [ bib  list ] 
[137]  On Template-Based Inference of Rich Invariants in Leon. EPFL Technical Report EPFL-REPORT-190578. 2013. [ bib  list ] 
[136]  Sound Compilation of Reals. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2014. [ bib  list ] 
[135]  Synthesis Modulo Recursive Functions. OOPSLA. 2013. [ps] [ bib  list ] 
[134]  Game Programming by Demonstration. SPLASH Onward!. 2013. [ bib  list ] 
[133]  Interpolation for Synthesis on Unbounded Domains. Formal Methods in Computer-Aided Design (FMCAD). 2013. [ bib  list ] 
[132]  Synthesis of Fixed-Point Programs. Embedded Software (EMSOFT). 2013. [ps] [ bib  list ] 
[131]  Effect Analysis for Programs with Callbacks. Fifth Working Conference on Verified Software: Theories, Tools and Experiments. 2013. [ bib  list ] 
[130]  Classifying and Solving Horn Clauses for Verification. Fifth Working Conference on Verified Software: Theories, Tools and Experiments. 2013. [ bib  list ] 
[129]  Executing Specifications using Synthesis and Constraint Solving (Invited Talk). Runtime Verification (RV). 2013. [ps] [ bib  list ] 
[128]  An Overview of the Leon Verification System: Verification by Translation to Recursive Functions. Scala Workshop. 2013. [ps] [ bib  list ] 
[127]  Automatic Synthesis of Out-of-Core Algorithms. SIGMOD. 2013. [ps] [ bib  list ] 
[126]  On Verification by Translation to Recursive Functions. EPFL Technical Report EPFL-REPORT-186233. 2013. [ bib  list ] 
[125]  On Integrating Deductive Synthesis and Verification Systems. EPFL Technical Report EPFL-REPORT-186043. 2013. [ bib  list ] 
[124]  Disjunctive Interpolants for Horn-Clause Verification. Computer Aided Verification (CAV). 2013. [ bib  list ] 
[123]  Complete Completion using Types and Weights. PLDI. 2013. [ bib  list ] 
[122]  Software Verification and Graph Similarity for Automated Evaluation of Students' Assignments. Information and Software Technology. 2013. [ bib  list ] 
[121] 
Reductions for Synthesis Procedures. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2013.
[ bib 
list ]
A synthesis procedure acts as a compiler for declarative specifications. It accepts a formula describing a relation between inputs and outputs, and generates a function implementing this relation. This paper presents synthesis procedures for data structures. Our procedures are reductions that lift a synthesis procedure for the elements into synthesis procedures for containers storing these elements. We introduce a framework to describe synthesis procedures as systematic applications of inference rules. We show that, by interpreting both synthesis problems and programs as relations, we can derive and modularly prove transformation rules that are widely applicable, thus simplifying both the presentation and the correctness argument.

[120] 
Certifying Solutions for Numerical Constraints. Runtime Verification (RV). 2012.
[ bib 
list ]
A large portion of software is used for numerical computation in mathematics, physics and engineering. Among the aspects that make verification in this domain difficult is the need to quantify numerical errors, such as round-off errors and errors due to the use of approximate numerical methods. Much of numerical software uses self-stabilizing iterative algorithms, for example, to find solutions of nonlinear equations. To support such algorithms, we present a runtime verification technique that checks, given a nonlinear equation and a tentative solution, whether this value is indeed a solution to within a specified precision. Our technique combines runtime verification approaches with information about the analytical equation being solved. It is independent of the algorithm used for finding the solution and is therefore applicable to a wide range of problems. We have implemented our technique for the Scala programming language using our affine arithmetic library and the macro facility of Scala 2.10.
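The certification idea can be illustrated with a minimal sketch (not the paper's implementation, which uses affine arithmetic to also bound the round-off incurred when evaluating the equation itself; the sketch below assumes exact evaluation): for a continuous function, a sign change across a small interval around the tentative solution certifies, by the intermediate value theorem, that a true root lies within the specified precision.

```python
def certify_root(f, x, eps):
    """Certify that the continuous function f has a root within eps of
    the tentative solution x: a sign change of f over [x - eps, x + eps]
    guarantees a root in that interval by the intermediate value theorem."""
    lo, hi = f(x - eps), f(x + eps)
    return lo == 0.0 or hi == 0.0 or (lo < 0.0) != (hi < 0.0)

# certify a Newton-iteration result for y*y - 2 == 0 (i.e., sqrt(2))
print(certify_root(lambda y: y * y - 2.0, 1.41421356, 1e-6))  # True
```

Note that the check is independent of how the tentative solution was obtained, which mirrors the paper's algorithm-agnostic design.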

[119] 
Accelerating Interpolants. Automated Technology for Verification and Analysis (ATVA). 2012.
[ps]
[ bib 
list ]
We present Counterexample-Guided Accelerated Abstraction Refinement (CEGAAR), a new algorithm for verifying infinite-state transition systems. CEGAAR combines interpolation-based predicate discovery in counterexample-guided predicate abstraction with an acceleration technique for computing the transitive closure of loops. CEGAAR applies acceleration to dynamically discovered looping patterns in the unfolding of the transition system, and combines over-approximation with under-approximation. It constructs inductive invariants that rule out an infinite family of spurious counterexamples, alleviating the problem of divergence in predicate abstraction without losing its adaptive nature. We present theoretical and experimental justification for the effectiveness of CEGAAR, showing that inductive interpolants can be computed from classical Craig interpolants and transitive closures of loops. We present an implementation of CEGAAR that verifies integer transition systems. We show that the resulting implementation robustly handles a number of difficult transition systems that cannot be handled using interpolation-based predicate abstraction or acceleration alone.

[118] 
A Verification Toolkit for Numerical Transition Systems (Tool Paper). 16th International Symposium on Formal Methods (FM). 2012.
[ bib 
list ]
This paper reports on an effort to create benchmarks and a toolkit for rigorous verification problems, simplifying tool integration and eliminating ambiguities of complex programming language constructs. We focus on Integer Numerical Transition Systems (INTS), which can be viewed as control-flow graphs whose edges are annotated by Presburger arithmetic formulas. We describe the syntax, semantics, a front-end, and a first release of benchmarks for such transition systems. Furthermore, we present Flata and Eldarica, two new verification tools for INTS. The Flata system is based on precise acceleration of the transition relation, while the Eldarica system is based on predicate abstraction with interpolation-based counterexample-driven refinement. The Eldarica verifier uses the Princess theorem prover as a sound and complete interpolating prover for Presburger arithmetic. Both systems can solve several examples for which previous approaches failed and present a useful baseline for verifying integer programs. Our infrastructure is publicly available; we hope that it will spur further research, benchmarking, competitions, and synergistic communication between verification tools.

[117] 
Synthesis for Unbounded Bit-Vector Arithmetic. International Joint Conference on Automated Reasoning (IJCAR). 2012.
[ bib 
list ]
We propose to describe computations using QFPAbit, a language of quantifier-free linear arithmetic on unbounded integers with bitvector operations. We describe an algorithm that, given a QFPAbit formula with input and output variables denoting integers, generates an efficient function from a sequence of inputs to a sequence of outputs, whenever such a function on integers exists. The starting point for our method is a polynomial-time translation mapping a QFPAbit formula into the sequential circuit that checks the correctness of the input/output relation. From such a circuit, our synthesis algorithm produces solved circuits from inputs to outputs that are no more than singly exponential in the size of the original formula. In addition to the general synthesis algorithm, we present techniques that ensure that, for example, multiplication and division with large constants do not lead to an exponential blowup, addressing a practical problem with a previous approach that used the MONA tool to generate the specification automata.

[116] 
Speculative Linearizability. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2012.
[ps]
[ bib 
list ]
Linearizability is a key design methodology for reasoning about implementations of concurrent abstract data types in both shared memory and message passing systems. It provides the illusion that operations execute sequentially and fault-free, despite the asynchrony and faults inherent to a concurrent system, especially a distributed one. A key property of linearizability is inter-object composability: a system composed of linearizable objects is itself linearizable. However, devising linearizable objects is very difficult, requiring complex algorithms to work correctly under general circumstances, and often resulting in bad average-case behavior. Concurrent algorithm designers therefore resort to speculation: optimizing algorithms to handle common scenarios more efficiently. The outcome is even more complex protocols, whose correctness is no longer tractable to prove. To simplify the design of efficient yet robust linearizable protocols, we propose a new notion: speculative linearizability. This property is as general as linearizability, yet it allows intra-object composability: the correctness of independent protocol phases implies the correctness of their composition. In particular, it allows the designer to focus solely on the proof of an optimization and derive the correctness of the overall protocol from the correctness of the existing, non-optimized one. Our notion of protocol phases allows processes to independently switch from one phase to another, without requiring them to reach agreement to determine the change of a phase. To illustrate the applicability of our methodology, we show how examples of speculative algorithms for shared memory and asynchronous message passing naturally fit into our framework. We rigorously define speculative linearizability and prove our intra-object composition theorem in a trace-based as well as an automaton-based model. To obtain a further degree of confidence, we also formalize and mechanically check the theorem in the automaton-based model, using the I/O automata framework within the Isabelle interactive proof assistant.

[115] 
Software Synthesis Procedures. Communications of the ACM. 2012.
[ps]
[ bib 
list ]
Automated synthesis of program fragments from specifications can make programs easier to write and easier to reason about. To integrate synthesis into programming languages, software synthesis algorithms should behave in a predictable way: they should succeed for a well-defined class of specifications. We propose to systematically generalize decision procedures into synthesis procedures, and use them to compile implicitly specified computations embedded inside functional and imperative programs. Synthesis procedures are predictable, because they are guaranteed to find code that satisfies the specification whenever such code exists. To illustrate our method, we derive synthesis procedures by extending quantifier elimination algorithms for integer arithmetic and set data structures. We then show that an implementation of such synthesis procedures can extend a compiler to support implicit value definitions and advanced pattern matching.
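To make the idea concrete, here is the kind of code a synthesis procedure for integer linear arithmetic can emit for an implicitly specified computation (the hours/minutes/seconds decomposition is a common example in this line of work; the sketch below is illustrative, not output of an actual tool):

```python
def decompose(total_seconds):
    """Code a linear-arithmetic synthesis procedure could emit for the
    implicit specification
        h * 3600 + m * 60 + s == total_seconds, 0 <= m < 60, 0 <= s < 60.
    The synthesized witness terms compute the outputs directly, so no
    search happens at run time.  (Hypothetical illustration.)"""
    h = total_seconds // 3600
    m = (total_seconds % 3600) // 60
    s = total_seconds % 60
    return h, m, s

print(decompose(3787))  # (1, 3, 7)
```

The predictability claim in the abstract corresponds to the fact that this emitted code satisfies the specification for every input, not just for tested cases.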

[114] 
Functional Synthesis for Linear Arithmetic and Sets. Software Tools for Technology Transfer (STTT). 2012.
[ bib 
list ]
Synthesis of program fragments from specifications can make programs easier to write and easier to reason about. To integrate synthesis into programming languages, synthesis algorithms should behave in a predictable way: they should succeed for a well-defined class of specifications. To guarantee correctness and applicability to software (and not just hardware), these algorithms should also support unbounded data types, such as numbers and data structures. To obtain appropriate synthesis algorithms, we propose to generalize decision procedures into predictable and complete synthesis procedures. Such procedures are guaranteed to find code that satisfies the specification if such code exists. Moreover, we identify conditions under which synthesis will statically decide whether the solution is guaranteed to exist, and whether it is unique. We demonstrate our approach by starting from a quantifier elimination decision procedure for Boolean Algebra of sets with Presburger Arithmetic (BAPA) and transforming it into a synthesis procedure. Our procedure also works in the presence of parametric coefficients. We establish results on the size and the efficiency of the synthesized code. We show that such procedures are useful as a language extension with implicit value definitions, and we show how to extend a compiler to support such definitions. Our constructs provide the benefits of synthesis to programmers, without requiring them to learn new concepts, give up a deterministic execution model, or provide code skeletons.

[113] 
Deciding Functional Lists with Sublist Sets. Verified Software: Theories, Tools and Experiments (VSTTE). 2012.
[ps]
[ bib 
list ]
Motivated by the problem of deciding verification conditions for the verification of functional programs, we present new decision procedures for automated reasoning about functional lists. We first show how to decide in NP the satisfiability problem for logical constraints containing equality, constructors, selectors, as well as the transitive sublist relation. We then extend this class of constraints with operators to compute the set of all sublists, and the set of objects stored in a list. Finally, we support constraints on sizes of sets, which gives us the ability to compute list length as well as the number of distinct list elements. We show that the extended theory is reducible to the theory of sets with linear cardinality constraints, and therefore still in NP. This reduction enables us to combine our theory with other decidable theories that impose constraints on sets of objects, which further increases the potential of our decidability result in verification of functional and imperative software.

[112] 
Development and Evaluation of LAV: an SMT-Based Error Finding Platform. Verified Software: Theories, Tools and Experiments (VSTTE). 2012.
[ps]
[ bib 
list ]
We present the design and evaluation of LAV, a new open-source tool for statically checking program assertions and errors. LAV integrates into the popular LLVM infrastructure for compilation and analysis. LAV uses symbolic execution to construct a first-order logic formula that models the behavior of each basic block. It models the relationships between basic blocks using propositional formulas. By combining these two kinds of formulas LAV generates polynomial-sized verification conditions for loop-free code. It uses under-approximating or over-approximating unrolling to handle loops. LAV can pass generated verification conditions to one of several SMT solvers: Boolector, MathSAT, Yices, and Z3. Our experiments with 200 small benchmarks suggest that LAV is competitive with related tools, so it can be used as an effective alternative for certain verification tasks. The experience also shows that LAV provides significant help in analyzing student programs and providing feedback to students in everyday university practice.

[111] 
Constraints as Control. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2012.
[ps]
[ bib 
list ]
We present an extension of Scala that supports constraint programming over bounded and unbounded domains. The resulting language, Kaplan, provides the benefits of constraint programming while preserving the existing features of Scala. Kaplan integrates constraint and imperative programming by using constraints as an advanced control structure; the developers use the monadic 'for' construct to iterate over the solutions of constraints or branch on the existence of a solution. The constructs we introduce have simple semantics that can be understood as explicit enumeration of values, but are implemented more efficiently using symbolic reasoning. Kaplan programs can manipulate constraints at runtime, with the combined benefits of type-safe syntax trees and first-class functions. The language of constraints is a functional subset of Scala, supporting arbitrary recursive function definitions over algebraic data types, sets, maps, and integers. Our implementation runs on a platform combining a constraint solver with a standard virtual machine. For constraint solving we use an algorithm that handles recursive function definitions through fair function unrolling and builds upon the state-of-the-art SMT solver Z3. We evaluate Kaplan on examples ranging from enumeration of data structures to execution of declarative specifications. We found Kaplan promising because it is expressive, supporting a range of problem domains, while enabling full-speed execution of programs that do not rely on constraint programming.
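The "explicit enumeration" semantics mentioned in the abstract can be sketched in a few lines (a hypothetical Python stand-in; Kaplan itself solves constraints symbolically with Z3 rather than enumerating candidate values):

```python
def solutions(constraint, bound):
    """Explicit-enumeration reading of a Kaplan-style 'for' over the
    solutions of a constraint on two integers.  Illustrative only:
    the actual system replaces this brute-force search with symbolic
    reasoning, so the observable behavior matches this semantics."""
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if constraint(x, y):
                yield (x, y)

# iterate over the solutions of x*x + y*y == 25 with x, y positive
for x, y in solutions(lambda x, y: x * x + y * y == 25 and x > 0 and y > 0, 10):
    print(x, y)
```

Branching on the existence of a solution then amounts to asking whether the generator yields anything at all.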

[110] 
Trustworthy Numerical Computation in Scala. OOPSLA. 2011.
[ps]
[ bib 
list ]
Modern computing has adopted the floating point type as a default way to describe computations with real numbers. Thanks to dedicated hardware support, such computations are efficient on modern architectures, even in double precision. However, rigorous reasoning about the resulting programs remains difficult. This is in part due to a large gap between the finite floating point representation and the infinite-precision real-number semantics that serves as the developers' mental model. Because programming languages do not provide support for estimating errors, in practice some computations are performed more precisely, and some less precisely, than needed. We present a library solution for rigorous arithmetic computation. Our numerical data type library tracks a (double) floating point value, but also a guaranteed upper bound on the error between this value and the ideal value that would be computed in the real-value semantics. Our implementation involves a set of linear approximations based on an extension of affine arithmetic. The derived approximations cover most of the standard mathematical operations, including trigonometric functions, and are more comprehensive than any publicly available ones. Moreover, while interval arithmetic rapidly yields overly pessimistic estimates, our approach remains precise for several computational tasks of interest. We evaluate the library on a number of examples from numerical analysis and physical simulations. We found it to be a useful tool for gaining confidence in the correctness of the computation.
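A much-simplified sketch of such an error-tracking data type follows (plain worst-case propagation of error bounds with one unit round-off per operation; the paper's library instead uses affine arithmetic, which tracks correlations between error terms and gives much tighter bounds):

```python
class ErrNum:
    """A floating-point value paired with a guaranteed upper bound on its
    distance to the ideal real result.  Deliberately simplified sketch,
    not the paper's library."""
    EPS = 2.0 ** -52  # unit round-off of IEEE 754 double precision

    def __init__(self, value, err=0.0):
        self.value, self.err = value, err

    def __add__(self, other):
        v = self.value + other.value
        # propagate both input errors, plus one rounding error of the sum
        return ErrNum(v, self.err + other.err + abs(v) * ErrNum.EPS)

    def __mul__(self, other):
        v = self.value * other.value
        err = (abs(self.value) * other.err + abs(other.value) * self.err
               + self.err * other.err + abs(v) * ErrNum.EPS)
        return ErrNum(v, err)

x = ErrNum(0.1, 0.05)      # an input known only to within +/- 0.05
y = (x + ErrNum(0.2)) * x  # computed value plus a sound error bound
print(round(y.value, 3), y.err < 0.1)  # 0.03 True
```

The key property is that the bound is sound: the ideal real-valued result always lies within `err` of `value`, whatever the inputs' uncertainties.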

[109] 
Satisfiability Modulo Recursive Programs. Static Analysis Symposium (SAS). 2011.
[ps]
[ bib 
list ]
We present a semi-decision procedure for checking expressive correctness properties of recursive first-order functional programs. In our approach, both properties and programs are expressed in the same language, which is a subset of Scala. We have implemented our procedure and integrated it with the Z3 SMT solver and the Scala compiler. Our procedure is sound for proofs and counterexamples. It is terminating and thus complete for many important classes of specifications, including 1) all verification problems containing counterexamples, 2) functions annotated with inductive postconditions, and 3) abstraction functions and invariants of recursive data types that commonly arise in functional programs. Using our system, we verified detailed correctness properties for functional data structure implementations, as well as Scala syntax tree manipulations. We have found our system to be fast for both finding counterexamples and finding correctness proofs, and to scale to larger programs than alternative techniques.

[108] 
An Efficient Decision Procedure for Imperative Tree Data Structures. Conference on Automated Deduction (CADE). 2011.
[ bib 
list ]
We present a new decidable logic called TREX for expressing constraints about imperative tree data structures. In particular, TREX supports a transitive closure operator that can express reachability constraints, which often appear in data structure invariants. We show that our logic is closed under weakest precondition computation, which enables its use for automated software verification. We further show that satisfiability of formulas in TREX is decidable in NP. The low complexity makes it an attractive alternative to more expensive logics such as monadic second-order logic (MSOL) over trees, which have been traditionally used for reasoning about tree data structures.

[107] 
Scala to the Power of Z3: Integrating SMT and Programming. Conference on Automated Deduction (CADE) Tool Demo. 2011.
[ bib 
list ]
We describe a system that integrates the SMT solver Z3 with the Scala programming language. The system supports the use of the SMT solver for checking satisfiability, unsatisfiability, as well as solution enumeration. The embedding of formula trees into Scala uses the host type system of Scala to prevent the construction of certain ill-typed constraints. The solution enumeration feature integrates into the iteration constructions of Scala and supports writing nondeterministic programs. Using Z3's mechanism of theory extensions, our system also helps users construct custom constraint solvers where the interpretation of predicates and functions is given as Scala code. The resulting system preserves the productivity advantages of Scala while simplifying tasks such as combinatorial search.

[106] 
Interactive Synthesis of Code Snippets. Computer Aided Verification (CAV) Tool Demo. 2011.
[ bib 
list ]
We describe a tool that applies theorem proving technology to synthesize code fragments that use given library functions. To determine candidate code fragments, our approach takes into account polymorphic type constraints as well as test cases. Our tool interactively displays a ranked list of suggested code fragments that are appropriate for the current program point. We have found our system to be useful for synthesizing code fragments for common programming tasks, and we believe it is a good platform for exploring software synthesis techniques.

[105] 
Sets with Cardinality Constraints in Satisfiability Modulo Theories. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2011.
[ps]
[ bib 
list ]
Boolean Algebra with Presburger Arithmetic (BAPA) is a decidable logic that can express constraints on sets of elements and their cardinalities. Problems from verification of complex properties of software often contain fragments that belong to quantifier-free BAPA (QFBAPA). In contrast to many other NP-complete problems (such as quantifier-free first-order logic or linear arithmetic), the applications of QFBAPA to a broader set of problems have so far been hindered by the lack of an efficient implementation that can be used alongside other efficient decision procedures. We overcome these limitations by extending the efficient SMT solver Z3 with the ability to reason about cardinality (QFBAPA) constraints. Our implementation uses the DPLL(T) mechanism of Z3 to reason about the top-level propositional structure of a QFBAPA formula, improving the efficiency compared to previous implementations. Moreover, we present a new algorithm for automatically decomposing QFBAPA formulas. Our algorithm alleviates the exponential explosion of considering all Venn regions, significantly improving the tractability of formulas with many set variables. Because it is implemented as a theory plugin, our implementation enables Z3 to prove formulas that use QFBAPA constructs with constructs from other theories that Z3 supports, as well as with quantifiers. We have applied our implementation to the verification of functional programs; we show it can automatically prove formulas that no automated approach was reported to be able to prove before.
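The Venn-region reduction underlying QFBAPA can be sketched on a two-set toy instance (brute force over region cardinalities, purely for illustration; the implementation described above instead hands the resulting integer constraints to Z3, and its decomposition algorithm avoids considering all regions):

```python
from itertools import product

def qfbapa_satisfiable(constraints, bound=4):
    """Brute-force check of a toy QFBAPA instance over two sets A, B by
    assigning a cardinality to each Venn region: (a, b, ab) stands for
    (|A minus B|, |B minus A|, |A intersect B|).  Every set-cardinality
    expression becomes a sum of region cardinalities."""
    for a, b, ab in product(range(bound + 1), repeat=3):
        cards = {"A": a + ab, "B": b + ab, "A&B": ab, "A|B": a + b + ab}
        if all(c(cards) for c in constraints):
            return (a, b, ab)   # a satisfying region assignment
    return None                 # unsatisfiable within the bound

# |A| = 3, |B| = 2, |A & B| = 1  is satisfiable, e.g. with regions (2, 1, 1)
print(qfbapa_satisfiable([
    lambda c: c["A"] == 3,
    lambda c: c["B"] == 2,
    lambda c: c["A&B"] == 1,
]))  # (2, 1, 1)
```

With n set variables there are 2^n - 1 nonempty regions, which is exactly the exponential explosion the paper's decomposition algorithm mitigates.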

[104] 
Towards Complete Reasoning about Axiomatic Specifications. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2011.
[ps]
[ bib 
list ]
To support verification of expressive properties of functional programs, we consider algebraic-style specifications that may relate multiple user-defined functions, and compare multiple invocations of a function for different arguments. We present decision procedures for reasoning about such universally quantified properties of functional programs, using local theory extension methodology. We establish new classes of universally quantified formulas whose satisfiability can be checked in a complete way by finite quantifier instantiation. These classes include single-invocation axioms that generalize standard function contracts, but also certain many-invocation axioms, specifying that functions satisfy congruence, injectivity, or monotonicity with respect to abstraction functions, as well as conjunctions of some of these properties. These many-invocation axioms can specify correctness of abstract data type implementations as well as certain information-flow properties. We also present a decidability-preserving construction that enables the same function to be specified using different classes of decidable specifications on different partitions of its domain.

[103]  On Satisfiability Modulo Computable Functions. EPFL Technical Report EPFL-REPORT. 2010. [ bib  list ] 
[102]  On Rigorous Numerical Computation as a Scala Library. EPFL Technical Report EPFL-REPORT-158754. 2010. [ bib  list ] 
[101] 
Phantm: PHP Analyzer for Type Mismatch (Research Demonstration). ACM SIGSOFT Conference on Foundations of Software Engineering (FSE). 2010.
[ps]
[ bib 
list ]
We present Phantm, a static analyzer that uses a flow-sensitive analysis to detect type errors in PHP applications. Phantm can infer types for nested arrays, and can leverage runtime information and procedure summaries for more precise results. Phantm found over 200 true problems when applied to three applications with over 50'000 lines of code, including the popular DokuWiki code base.

[100] 
Runtime Instrumentation for Precise Flow-Sensitive Type Analysis. International Conference on Runtime Verification. 2010.
[ps]
[ bib 
list ]
We describe a combination of runtime information and static analysis for checking properties of complex and configurable systems. The basic idea of our approach is to 1) let the program execute and thereby read the important dynamic configuration data, then 2) invoke static analysis from this runtime state to detect possible errors that can happen in the continued execution. This approach improves analysis precision, particularly with respect to types of global variables and nested data structures. It also enables the resolution of modules that are loaded based on dynamically computed information. We describe an implementation of this approach in a tool that statically computes possible types of variables in PHP applications, including detailed types of nested maps (arrays). PHP is a dynamically typed language; PHP programs extensively use nested value maps, as well as 'include' directives whose arguments are dynamically computed file names. We have applied our analysis tool to over 50'000 lines of PHP code, including the popular DokuWiki software, which has a plugin architecture. The analysis identified 200 problems in the code and in the type hints of the original source code base. Some of these problems can cause exploits, infinite loops, and crashes. Our experiments show that dynamic information simplifies the development of the analysis and decreases the number of false alarms compared to a purely static analysis approach.

[99] 
Synthesis for Regular Specifications over Unbounded Domains. FMCAD. 2010.
[ps]
[ bib 
list ]
Synthesis from declarative specifications is an ambitious automated method for obtaining systems that are correct by construction. Previous work includes synthesis of reactive finite-state systems from linear temporal logic and its fragments. Further recent work focuses on a different application area by doing functional synthesis over unbounded domains, using a modified Presburger arithmetic quantifier elimination algorithm. We present new algorithms for functional synthesis over unbounded domains based on automata-theoretic methods, with advantages in the expressive power and in the efficiency of synthesized code. Our approach synthesizes functions that meet given regular specifications defined over unbounded sequences of input and output bits. Thanks to the translation from weak monadic second-order logic to automata, this approach supports full Presburger arithmetic as well as bitwise operations on arbitrary length integers. The presence of quantifiers enables finding solutions that optimize a given criterion. Unlike synthesis of reactive systems, our notion of realizability allows functions that require examining the entire input to compute the output. Regardless of the complexity of the specification, our algorithm synthesizes linear-time functions that read the input and directly produce the output. We also describe a technique to synthesize functions with bounded lookahead when possible, which is appropriate for streaming implementations. We implemented our synthesis algorithm and show that it synthesizes efficient functions on a number of examples.

[98]  Ordered Sets in the Calculus of Data Structures (Invited Paper). CSL. 2010. [ps] [ bib  list ]
Our goal is to identify families of relations that are useful for reasoning about software. We describe such families using decidable quantifier-free classes of logical constraints with a rich set of operations. A key challenge is to define such classes of constraints in a modular way, by combining multiple decidable classes. Working with quantifier-free combinations of constraints makes the combination agenda more realistic and the resulting logics more likely to be tractable than in the presence of quantifiers. Our approach to combination is based on reducing decidable fragments to a common class, Boolean Algebra with Presburger Arithmetic (BAPA). This logic was introduced by Feferman and Vaught in 1959 and can express properties of uninterpreted sets of elements, with set algebra operations and the equicardinality relation (consequently, it can also express Presburger arithmetic constraints on cardinalities of sets). Combination by reduction to BAPA allows us to obtain decidable quantifier-free combinations of decidable logics that share BAPA operations. We use the term Calculus of Data Structures to denote a family of decidable constraints that reduce to BAPA. This class includes, for example, combinations of formulas in BAPA, weak monadic second-order logic of k successors, two-variable logic with counting, and term algebras with certain homomorphisms. The approach of reduction to BAPA generalizes the Nelson-Oppen combination that forms the foundation of constraint solvers used in software verification. BAPA is convenient as a target for reductions because it admits quantifier elimination and its quantifier-free fragment is NP-complete. 
We describe a new member of the Calculus of Data Structures: a quantifier-free fragment that supports 1) boolean algebra of finite and infinite sets of real numbers, 2) linear arithmetic over real numbers, 3) formulas that can restrict chosen set or element variables to range over integers (providing, among others, the power of mixed integer arithmetic and sets of integers), 4) the cardinality operators, stating whether a given set has a given finite cardinality or is infinite, 5) infimum and supremum operators on sets. Among the applications of this logic are reasoning about the externally observable behavior of data structures such as sorted lists and priority queues, and specifying witness functions for the BAPA synthesis problem. We describe an abstract reduction to BAPA for our logic, proving that the satisfiability of the logic is in NP and that it can be combined with the other fragments of the Calculus of Data Structures.

[97]  On Deciding Functional Lists with Sublist Sets. EPFL Technical Report EPFL-REPORT-148361. 2010. [ps] [ bib  list ]
Motivated by the problem of deciding verification conditions for the verification of functional programs, we present new decision procedures for automated reasoning about functional lists. We first show how to decide in NP the satisfiability problem for logical constraints containing equality, constructors, selectors, as well as the transitive sublist relation. We then extend this class of constraints with operators to compute the set of all sublists, and the set of objects stored in a list. Finally, we support constraints on sizes of sets, which gives us the ability to compute list length as well as the number of distinct list elements. We show that the extended theory is reducible to the theory of sets with linear cardinality constraints, and therefore still in NP. This reduction enables us to combine our theory with other decidable theories that impose constraints on sets of objects, which further increases the potential of our decidability result in verification of functional and imperative software.
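As a hedged illustration (the concrete notation here is ours, not quoted from the report), a constraint in the extended theory can mix the sublist relation with content sets and their cardinalities:

```latex
y \preceq x \;\wedge\; \mathit{content}(x) \subseteq \mathit{content}(y) \;\wedge\; |\mathit{content}(x)| = \mathit{length}(x)
```

stating that y is a sublist of x, that every element of x already occurs in y, and that the elements of x are pairwise distinct.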

[96]  On Locality of One-Variable Axioms and Piecewise Combinations. EPFL Technical Report EPFL-REPORT-148180. 2010. [ps] [ bib  list ]
Local theory extensions provide a complete and efficient way for reasoning about satisfiability of certain kinds of universally quantified formulas modulo a background theory. They are therefore useful in automated reasoning, software verification, and synthesis. In this paper, we 1) provide a sufficient condition for locality of axioms with one variable specifying functions and relations, 2) show how to obtain local axiomatizations from non-local ones in some cases, and 3) show how to obtain piecewise combinations of different local theory extensions.

[95]  MUNCH - Automated Reasoner for Sets and Multisets (System Description). IJCAR. 2010. [ps] [ bib  list ]
This system description provides an overview of the MUNCH reasoner for sets and multisets. MUNCH takes as input a formula in a logic that supports expressions about sets, multisets, and integers. Constraints over collections and integers are connected using the cardinality operator. Our logic is a fragment of the logics of popular interactive theorem provers, and MUNCH is the first fully automated reasoner for this logic. MUNCH reduces input formulas to equisatisfiable linear integer arithmetic formulas. The MUNCH reasoner is publicly available. It is implemented in the Scala programming language and currently uses the SMT solver Z3 to solve the generated integer linear arithmetic constraints.
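For instance (the formula is invented here for illustration), an input connecting multisets and integers through cardinalities might look like:

```latex
M \subseteq N \;\wedge\; |N \setminus M| \le 1 \;\wedge\; |N| = |M| + 1
```

which the reduction turns into a linear integer arithmetic formula over element multiplicities and cardinality variables.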

[94]  Comfusy: Complete Functional Synthesis (Tool Presentation). CAV. 2010. [ps] [ bib  list ]
Synthesis of program fragments from specifications can make programs easier to write and easier to reason about. We present Comfusy, a tool that extends the compiler for the general-purpose programming language Scala with (non-reactive) functional synthesis over unbounded domains. Comfusy accepts expressions with input and output variables specifying relations on integers and sets. Comfusy symbolically computes the precise domain for the given relation and generates the function from inputs to outputs. The outputs are guaranteed to satisfy the relation whenever the inputs belong to the relation domain. The core of our synthesis algorithm is an extension of quantifier elimination that generates programs to compute witnesses for eliminated variables. We present examples that demonstrate software synthesis using Comfusy, and illustrate how synthesis simplifies software development.
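As a sketch of the idea, suppose a specification relates outputs (h, m, s) to an input total by h * 3600 + m * 60 + s == total with 0 <= m < 60 and 0 <= s < 60. The straight-line witness code that such a synthesis procedure emits can be rendered as follows (Python is used purely for illustration; the function name is hypothetical):

```python
# Hypothetical sketch of synthesized output: straight-line witness terms
# for (h, m, s) satisfying h * 3600 + m * 60 + s == total,
# with 0 <= m < 60 and 0 <= s < 60.
def decompose(total: int) -> tuple[int, int, int]:
    # Precondition (the computed relation domain): a solution exists iff total >= 0.
    assert total >= 0
    h = total // 3600          # witness term for h
    m = (total % 3600) // 60   # witness term for m
    s = total % 60             # witness term for s
    return h, m, s
```

The point of the approach is that such code, together with the precondition characterizing exactly when a solution exists, is derived automatically from the relation by witness-generating quantifier elimination.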

[93]  Complete Functional Synthesis. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2010. [ps] [ bib  list ]
Synthesis of program fragments from specifications can make programs easier to write and easier to reason about. To integrate synthesis into programming languages, synthesis algorithms should behave in a predictable way: they should succeed for a well-defined class of specifications. They should also support unbounded data types such as numbers and data structures. We propose to generalize decision procedures into predictable and complete synthesis procedures. Such procedures are guaranteed to find code that satisfies the specification if such code exists. Moreover, we identify conditions under which synthesis will statically decide whether the solution is guaranteed to exist, and whether it is unique. We demonstrate our approach by starting from decision procedures for linear arithmetic and data structures and transforming them into synthesis procedures. We establish results on the size and the efficiency of the synthesized code. We show that such procedures are useful as a language extension with implicit value definitions, and we show how to extend a compiler to support such definitions. Our constructs provide the benefits of synthesis to programmers, without requiring them to learn new concepts or give up a deterministic execution model.

[92]  On Decision Procedures for Ordered Collections. EPFL Technical Report LARA-REPORT-2010-001. 2010. [ps] [ bib  list ]
We describe a decision procedure for a logic that supports 1) finite collections of elements (sets or multisets), 2) the cardinality operator, 3) a total order relation on elements, and 4) min and max operators on entire collections. Among the applications of this logic are 1) reasoning about the externally observable behavior of data structures such as random access priority queues, 2) specifying witness functions for the synthesis problem of set algebra, and 3) reasoning about constraints on orderings arising in termination proofs.

[91]  Test Generation through Programming in UDITA. International Conference on Software Engineering (ICSE). 2010. [ps] [ bib  http  list ]
We present an approach for describing tests using nondeterministic test generation programs. To write such programs, we introduce UDITA, a Java-based language with nondeterministic choice operators and an interface for generating linked structures. We also describe new algorithms that generate concrete tests by efficiently exploring the space of all executions of nondeterministic UDITA programs. We implemented our approach and incorporated it into the official, publicly available repository of Java PathFinder (JPF), a popular tool for verifying Java programs. We evaluate our technique by generating tests for data structures, refactoring engines, and JPF itself. Our experiments show that test generation using UDITA is faster and leads to test descriptions that are easier to write than in previous frameworks. Moreover, the novel execution mechanism of UDITA is essential for making test generation feasible. Using UDITA, we have discovered a number of bugs in Eclipse, NetBeans, Sun javac, and JPF.

[90]  Predicting and Preventing Inconsistencies in Deployed Distributed Systems. ACM Transactions on Computer Systems. 2010. [ps] [ bib  list ]
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This article describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

[89]  Building a Calculus of Data Structures (invited paper). Verification, Model Checking, and Abstract Interpretation (VMCAI). 2010. [ps] [ bib  list ]
Techniques such as verification condition generation, predicate abstraction, and expressive type systems reduce software verification to proving formulas in expressive logics. Programs and their specifications often make use of data structures such as sets, multisets, algebraic data types, or graphs. Consequently, formulas generated from verification also involve such data structures. To automate the proofs of such formulas we propose a logic (a 'calculus') of such data structures. We build the calculus by starting from decidable logics of individual data structures, and connecting them through functions and sets, in ways that go beyond frameworks such as Nelson-Oppen. The result is a family of new decidable logics that can simultaneously specify properties of different kinds of data structures and overcome the limitations of the individual logics. Several of our decidable logics include abstraction functions that map a data structure into its more abstract view (a tree into a multiset, a multiset into a set), into a numerical quantity (the size or the height), or into the truth value of a candidate data structure invariant (sortedness, or the heap property). For algebraic data types, we identify an asymptotic many-to-one condition on the abstraction function that guarantees the existence of a decision procedure. In addition to the combination based on abstraction functions, we can combine multiple data structure theories if they all reduce to the same data structure logic. As an instance of this approach, we describe a decidable logic whose formulas are propositional combinations of formulas in: weak monadic second-order logic of two successors, two-variable logic with counting, multiset algebra with Presburger arithmetic, the Bernays-Schönfinkel-Ramsey class of first-order logic, and the logic of algebraic data types with the set content function. 
The subformulas in this combination can share common variables that refer to sets of objects along with the common set algebra operations. Such a sound and complete combination is possible because the relations on sets definable in the component logics are all expressible in Boolean Algebra with Presburger Arithmetic. Presburger arithmetic and its new extensions play an important role in our decidability results. In several cases, when we combine logics that belong to NP, we can prove that satisfiability for the combined logic is still in NP.

[88]  Collections, Cardinalities, and Relations. Verification, Model Checking, and Abstract Interpretation (VMCAI). 2010. [ps] [ bib  list ]
Logics that involve collections (sets, multisets) and cardinality constraints are useful for reasoning about unbounded data structures and concurrent processes. To make such logics more useful in verification, this paper extends them with the ability to compute direct and inverse relation and function images. We establish decidability and complexity bounds for the extended logics.

[87]  Decision Procedures for Algebraic Data Types with Abstractions. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2010. [ps] [ bib  list ]
We describe a family of decision procedures that extend the decision procedure for quantifier-free constraints on recursive algebraic data types (term algebras) to support recursive abstraction functions. Our abstraction functions are catamorphisms (term algebra homomorphisms) mapping algebraic data type values into values in other decidable theories (e.g. sets, multisets, lists, integers, booleans). Each instance of our decision procedure family is sound; we identify a widely applicable many-to-one condition on abstraction functions that implies completeness. Complete instances of our decision procedure include the following correctness statements: 1) a functional data structure implementation satisfies a recursively specified invariant, 2) such a data structure conforms to a contract given in terms of sets, multisets, lists, sizes, or heights, 3) a transformation of a formula (or lambda term) abstract syntax tree changes the set of free variables in the specified way.
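To make the notion of catamorphism concrete, here is a minimal sketch (Python is used purely for illustration; the types and names are ours, not from the paper): an abstraction function folding a binary-tree algebraic data type into the decidable theory of sets.

```python
from dataclasses import dataclass

# A binary tree as an algebraic data type (term algebra).
class Tree: ...

@dataclass
class Leaf(Tree):
    pass

@dataclass
class Node(Tree):
    left: Tree
    value: int
    right: Tree

def content(t: Tree) -> frozenset[int]:
    """A catamorphism: folds the term structure into the set of stored elements."""
    if isinstance(t, Leaf):
        return frozenset()
    assert isinstance(t, Node)
    return content(t.left) | frozenset({t.value}) | content(t.right)
```

A decision procedure of this family can then check contracts phrased over the abstract view, for example that an insertion routine satisfies content(insert(x, t)) == content(t) | {x}.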

[86]  On Complete Functional Synthesis. EPFL Technical Report LARA-REPORT-2009-006. 2009. [ps] [ bib  list ] 
[85]  On Test Generation through Programming in UDITA. EPFL, IC Technical Report LARA-REPORT-2009-05. 2009. [ps] [ bib  http  list ]
We present an approach for describing tests using nondeterministic test generation programs. To write test generation programs, we introduce UDITA, a Java-based language with nondeterministic choice operators and an interface for generating linked structures. We also describe new algorithms that generate concrete tests by efficiently exploring the space of all executions of nondeterministic UDITA programs. We implemented our approach and incorporated it into the official, publicly available repository of Java PathFinder (JPF), a popular tool for verifying Java programs. We evaluate our technique by generating tests for data structures, refactoring engines, and JPF itself. Our experiments show that test generation using UDITA is faster and leads to test descriptions that are easier to write than in previous frameworks. Moreover, the novel execution mechanism of UDITA is essential for making test generation feasible. Using UDITA, we have discovered a number of previously unknown bugs in Eclipse, NetBeans, Sun javac, and JPF.

[84]  On Decision Procedures for Collections, Cardinalities, and Relations. EPFL Technical Report LARA-REPORT-2009-004. 2009. [ps] [ bib  list ]
Logics that involve collections (sets, multisets) and cardinality constraints are useful for reasoning about unbounded data structures and concurrent processes. To make such logics more useful in verification, this paper extends them with the ability to compute direct and inverse relation and function images. We establish decidability and complexity bounds for the extended logics.

[83]  On Decision Procedures for Algebraic Data Types with Abstractions. EPFL Technical Report LARA-REPORT-2009-003. 2009. [ps] [ bib  list ]
We describe a parameterized decision procedure that extends the decision procedure for functional recursive algebraic data types (trees) with the ability to specify and reason about abstractions of data structures. The abstract values are specified using recursive abstraction functions that map trees into other data types that have decidable theories. Our result yields a decidable logic which can be used to prove that implementations of functional data structures satisfy recursively specified invariants and conform to interfaces given in terms of sets, multisets, or lists, or to increase the automation in proof assistants.

[82]  Combining Theories with Shared Set Operations. FroCoS: Frontiers in Combining Systems. 2009. [ps] [ bib  list ]
Motivated by applications in software verification, we explore automated reasoning about the non-disjoint combination of theories of infinitely many finite structures, where the theories share set variables and set operations. We prove a combination theorem and apply it to show the decidability of the satisfiability problem for a class of formulas obtained by applying propositional connectives to formulas belonging to: 1) Boolean Algebra with Presburger Arithmetic (with quantifiers over sets and integers), 2) weak monadic second-order logic over trees (with monadic second-order quantifiers), 3) two-variable logic with counting quantifiers (ranging over elements), 4) the EPR class of first-order logic with equality (with exists*forall* quantifier prefix), and 5) the quantifier-free logic of multisets with cardinality constraints.

[81]  On Combining Theories with Shared Set Operations. EPFL Technical Report LARA-REPORT-2009-002. 2009. [ bib  list ] 
[80]  Simplifying Distributed System Development. 12th Workshop on Hot Topics in Operating Systems. 2009. [ps] [ bib  list ]
Distributed systems are difficult to design and develop. The difficulties arise both in basic safety correctness properties, and in achieving high performance. As a result of this complexity, the implementation of a distributed system often contains the basic algorithm coupled with an embedded strategy for making choices, such as the choice of a node to interact with. This paper proposes a programming model for distributed systems where 1) the application explicitly exposes the choices (decisions) that it needs to make as well as the objectives that it needs to maximize; 2) the application and the runtime system cooperate to maintain a predictive model of the distributed system and its environment; and 3) the runtime uses the predictive model to resolve the choices so as to maximize the objectives. We claim that this programming model results in simpler source code and lower development effort, and can lead to increased performance and robustness to various deployment settings. Our initial results of applying this model to a sample application are encouraging.

[79]  An Integrated Proof Language for Imperative Programs. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2009. [ps] [ bib  list ]
We present an integrated proof language for guiding the actions of multiple reasoning systems as they work together to prove complex correctness properties of imperative programs. The language operates in the context of a program verification system that uses multiple reasoning systems to discharge generated proof obligations. It is designed to 1) enable developers to resolve key choice points in complex program correctness proofs, thereby enabling automated reasoning systems to successfully prove the desired correctness properties; 2) allow developers to identify key lemmas for the reasoning systems to prove, thereby guiding the reasoning systems to find an effective proof decomposition; 3) enable multiple reasoning systems to work together productively to prove a single correctness property by providing a mechanism that developers can use to divide the property into lemmas, each of which is suitable for a different reasoning system; and 4) enable developers to identify specific lemmas that the reasoning systems should use when attempting to prove other lemmas or correctness properties, thereby appropriately confining the search space so that the reasoning systems can find a proof in an acceptable amount of time. The language includes a rich set of declarative proof constructs that enables developers to direct the reasoning systems as little or as much as they desire. Because the declarative proof statements are embedded into the program as specialized comments, they also serve as verified documentation and are a natural extension of the assertion mechanism found in most program verification systems. We have implemented our integrated proof language in the context of a program verification system for Java and used the resulting system to verify a collection of linked data structure implementations. 
Our experience indicates that our proof language makes it possible to successfully prove complex program correctness properties that are otherwise beyond the reach of automated reasoning systems.

[78]  On Set-Driven Combination of Logics and Verifiers. EPFL Technical Report LARA-REPORT-2009-001. 2009. [ bib  list ]
We explore the problem of automated reasoning about the non-disjoint combination of logics that share set variables and operations. We prove a general combination theorem, and apply it to show the decidability of the quantifier-free combination of formulas in WS2S, two-variable logic with counting, and Boolean Algebra with Presburger Arithmetic. Furthermore, we present an over-approximating algorithm that uses such combined logics to synthesize universally quantified invariants of infinite-state systems. The algorithm simultaneously synthesizes loop invariants of interest, and infers the relationships between sets to exchange information between the logics. We have implemented this algorithm and used it to prove detailed correctness properties of operations of linked data structure implementations.

[77]  Simplifying Distributed System Development. EPFL Technical Report NSL-REPORT-2009-001. 2009. [ bib  list ]
Distributed systems are difficult to design and develop. The difficulties arise both in basic safety correctness properties, and in achieving high performance. As a result of this complexity, the implementation of a distributed system often contains the basic algorithm coupled with an embedded strategy for making choices, such as the choice of a node to interact with. This paper proposes a programming model for distributed systems where 1) the application explicitly exposes the choices (decisions) that it needs to make as well as the objectives that it needs to maximize; 2) the application and the runtime system cooperate to maintain a predictive model of the distributed system and its environment; and 3) the runtime uses the predictive model to resolve the choices so as to maximize the objectives. We claim that this programming model results in simpler source code and lower development effort, and can lead to increased performance and robustness to various deployment settings. Our initial results of applying this model to a sample application are encouraging.

[76]  CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems. 6th USENIX Symp. Networked Systems Design and Implementation (NSDI '09). 2009. [ps] [ bib  .html  list ]
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

[75]  Opis: Reliable Distributed Systems in OCaml. ACM SIGPLAN TLDI. 2009. [ bib  list ]
Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Several specialized domain-specific languages and extensions of memory-unsafe languages were proposed to aid distributed system development. We present an alternative to these approaches, showing that modern, higher-order, strongly typed, memory-safe languages provide an excellent vehicle for developing and debugging distributed systems. We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. An Opis protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. Such architecture aids reasoning about event functions both informally and using interactive theorem provers. For example, it facilitates simple termination arguments. Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system, detecting undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols, including the Chord distributed hash table and the Cyclon random gossip protocol. We found that using Opis results in high programmer productivity and leads to easily composable protocol descriptions. Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.

[74]  On Delayed Choice Execution for Falsification. EPFL, IC Technical Report LARA-REPORT-2008-08. 2008. [ bib  list ] 
[73]  Constraint Solving for Software Reliability and Text Analysis (Research Plan). EPFL Technical Report LARA-REPORT-2008-007. 2008. [ bib  list ]
We will develop and implement new algorithms for constraint solving and apply them to construct two classes of tools: 1) bug finding and verification tools building on tools such as Java PathFinder and Jahob; 2) tools for deep semantic analysis of texts containing a mix of English, source code, and mathematical formulas. The starting point for our techniques is constraint solving algorithms developed in the rapidly expanding field of satisfiability modulo theories (SMT). We will use state-of-the-art techniques to implement an SMT constraint solver in the Scala programming language running on JVM platforms. One of the distinguishing features of our constraint solver will be the ability to analyze rich constraints that include not only quantifiers, numerical domains and data structures, but also lambda expressions and recursive definitions. To be effective for program analysis, the constraint solver will have native support for transition systems describing program semantics. This will enable it to tackle sources of exponential explosion in context-sensitive and path-sensitive analyses such as symbolic execution. Such analyses can identify a wide range of bugs, many of which can cause crashes and security problems. In the area of text processing, we expect our constraint solver to enable efficient reasoning about rich semantic domains arising in computational semantics of natural language. This capability will make the solver a useful component of tools for creating semantically annotated text and postprocessing search results in scientific and engineering domains. To evaluate this hypothesis, we will develop a tool for analyzing text whose subject is explanation of source code, programming language semantics, compilation, and program analysis.

[72]  Opis: Reliable Distributed Systems in OCaml. EPFL-IC-NSL Technical Report NSL-REPORT-2008-001. 2008. [ bib  list ]
The importance of distributed systems is growing as computing devices become ubiquitous and bandwidth becomes plentiful. Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Recently, several specialized domain-specific languages and extensions of memory-unsafe languages have been proposed to aid distributed system development. In this paper we propose an alternative to these approaches, arguing that modern, higher-order, strongly typed, memory-safe languages provide an excellent vehicle for developing and debugging distributed systems. We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. In Opis, a protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. This facilitates reasoning about event functions both informally and using interactive provers. For example, this approach leads to simple termination arguments. Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system and detect any undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols including the Chord distributed hash table and the Cyclon random gossip protocol. We have found that using Opis results in high programmer productivity and leads to concise and easily composable protocol descriptions. 
Moreover, Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.

[71]  CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems. EPFL-IC-LARA Technical Report LARA-REPORT-2008-006. 2008. [ bib  list ]
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We present a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and the implementation of this approach, termed CrystalBall. Using CrystalBall we identified new bugs in mature Mace implementations of a random overlay tree and the Chord distributed hash table implementation. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

[70] 
Fractional Collections with Cardinality Bounds,
and Mixed Integer Linear Arithmetic with Stars. Computer Science Logic (CSL). 2008.
[ps]
[ bib 
list ]
We present decision procedures for logical constraints that support reasoning about collections of elements such as sets, multisets, and fuzzy sets. Element membership in such collections is given by a characteristic function from a finite universe (of unknown size) to a subset of rational numbers specified by user-defined constraints in mixed linear integer-rational arithmetic. Our logic supports standard operators such as union, intersection, difference, or any operation defined pointwise using mixed linear integer-rational arithmetic. Moreover, it supports the notion of cardinality of the collection. Deciding formulas in such logic has application in verification of data structures. Our decision procedure reduces satisfiability of formulas with collections to satisfiability of formulas in an extension of mixed linear integer-rational arithmetic with an “unbounded sum” operator. Such extension is also interesting in its own right because it can encode reachability problems for a simple class of transition systems. We present a decision procedure for the resulting extension, using a satisfiability-preserving transformation that eliminates the unbounded sum operator. Our decidability result subsumes previous special cases for sets and multisets.

[69] 
Linear Arithmetic with Stars. Computer-Aided Verification (CAV). 2008.
[ps]
[ bib 
list ]
We consider an extension of integer linear arithmetic with a “star” operator that takes the closure under vector addition of the solution set of a linear arithmetic subformula. We show that the satisfiability problem for this extended language remains in NP (and is therefore NP-complete). Our proof uses a semilinear-set characterization of solutions of integer linear arithmetic formulas, as well as a generalization of a recent result on sparse solutions of integer linear programming problems. As a consequence of our result, we present worst-case optimal decision procedures for two NP-hard problems that were previously not known to be in NP. The first is the satisfiability problem for a logic of sets, multisets (bags), and cardinality constraints, which has applications in verification, interactive theorem proving, and description logics. The second is the reachability problem for a class of transition systems whose transitions increment the state vector by solutions of integer linear arithmetic formulas.
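The star operator described above can be illustrated with a small sketch (this is a naive brute-force illustration over a toy constraint, not the paper's NP decision procedure): a vector is in the starred solution set if it is a sum of finitely many solution vectors of the subformula.

```python
from itertools import product

def solutions(bound):
    """Enumerate solutions (x, y) of a toy linear constraint
    x + 2*y == 4 with 0 <= x, y <= bound."""
    return [(x, y) for x, y in product(range(bound + 1), repeat=2)
            if x + 2 * y == 4]

def in_star(target, sols, k):
    """Naive check: is `target` a sum of at most k solution vectors,
    i.e., in the additive closure ('star') of the solution set?
    The zero vector (empty sum) is always in the closure."""
    if target == (0, 0):
        return True
    if k == 0:
        return False
    for (x, y) in sols:
        tx, ty = target[0] - x, target[1] - y
        if tx >= 0 and ty >= 0 and in_star((tx, ty), sols, k - 1):
            return True
    return False

sols = solutions(4)           # [(0, 2), (2, 1), (4, 0)]
# (4, 4) = (4, 0) + (0, 2) + (0, 2): a sum of solutions of x + 2y = 4
print(in_star((4, 4), sols, 3))   # True
print(in_star((1, 0), sols, 3))   # False
```

The paper's contribution is precisely that such membership questions need not be decided by exponential enumeration: satisfiability of formulas with star stays in NP.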

[68] 
Full Functional Verification of Linked Data Structures. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI). 2008.
[ps]
[ bib 
list ]
We present the first verification of full functional correctness for a range of linked data structure implementations, including mutable lists, trees, graphs, and hash tables. Specifically, we present the use of the Jahob verification system to verify formal specifications, written in classical higher-order logic, that completely capture the desired behavior of the Java data structure implementations (with the exception of properties involving execution time and/or memory consumption). Given that the desired correctness properties include intractable constructs such as quantifiers, transitive closure, and lambda abstraction, it is a challenge to successfully prove the generated verification conditions. Our verification system uses 'integrated reasoning' to split each verification condition into a conjunction of simpler subformulas, and then applies a diverse collection of specialized decision procedures, first-order theorem provers, and, in the worst case, interactive theorem provers to prove each subformula. Techniques such as replacing complex subformulas with stronger but simpler alternatives, exploiting structure inherently present in the verification conditions, and, when necessary, inserting verified lemmas and proof hints into the imperative source code make it possible to seamlessly integrate all of the specialized decision procedures and theorem provers into a single powerful integrated reasoning system. By appropriately applying multiple proof techniques to discharge different subformulas, this reasoning system can effectively prove the complex and challenging verification conditions that arise in this context.

[67] 
On Linear Arithmetic with Stars. EPFL Technical Report LARA-REPORT-2008-005. 2008.
[ps]
[ bib 
list ]
We consider an extension of integer linear arithmetic with a star operator that takes the closure under vector addition of the set of solutions of a linear arithmetic subformula. We show that the satisfiability problem for this language is in NP (and therefore NP-complete). Our proof uses a generalization of a recent result on sparse solutions of integer linear programming problems. We present two consequences of our result. The first one is an optimal decision procedure for a logic of sets, multisets, and cardinalities that has applications in verification, interactive theorem proving, and description logics. The second is NP-completeness of the reachability problem for a class of “homogeneous” transition systems whose transitions are defined using integer linear arithmetic formulas.

[66] 
On Static Analysis for Expressive Pattern Matching. EPFL Technical Report LARA-REPORT-2008-004. 2008.
[ps]
[ bib 
list ]
Pattern matching is a widespread programming language construct that enables definitions of values by cases, generalizing if-then-else and case statements. The cases in a pattern matching expression should be exhaustive: when the value does not match any of the cases, the expression throws a runtime exception. Similarly, each pattern should be reachable, and, if possible, patterns should be disjoint to facilitate reasoning. Current compilers use simple analyses to check patterns. Such analyses ignore pattern guards, use static types to approximate possible expression values, and do not take into account properties of user-defined functions. We present a design and implementation of a new analysis of pattern matching expressions. Our analysis detects a wider class of errors and reports fewer false alarms than previous approaches. It checks disjointness, reachability, and exhaustiveness of patterns by expressing these conditions as formulas and proving them using decision procedures and theorem provers. It achieves precision by propagating possible values through nested expressions and approximating pattern-matching guards with formulas. It supports user-defined “extractor” functions in patterns by relying on specifications of relationships between the domains of such functions. The result is the first analysis that enables verified, declarative pattern matching with guards in the presence of data abstraction. We have implemented our analysis and describe our experience in checking a range of pattern matching expressions in a subset of the Scala programming language.
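The three conditions the analysis checks can be stated semantically. As a toy illustration (deciding them by enumeration over a small finite domain rather than by the formula-based decision procedures the paper uses), each guarded case is modeled as a predicate on values:

```python
# Toy semantic view of pattern checking: over a finite domain,
# exhaustiveness, reachability, and disjointness of guarded cases
# reduce to simple checks on the sets of values each case matches.
domain = range(-5, 6)

# Each case combines pattern and guard into one predicate
# (hypothetical example cases, matching on the sign of an integer).
cases = [
    lambda v: v == 0,   # case 0
    lambda v: v > 0,    # case n if n > 0
    lambda v: v < 0,    # case n if n < 0
]

def exhaustive(cases):
    """Every value is matched by some case (no match failure)."""
    return all(any(c(v) for c in cases) for v in domain)

def reachable(cases, i):
    """Case i matches some value not already claimed by earlier cases."""
    return any(cases[i](v) and not any(c(v) for c in cases[:i])
               for v in domain)

def disjoint(cases):
    """No value is matched by two different cases."""
    return all(sum(bool(c(v)) for c in cases) <= 1 for v in domain)

print(exhaustive(cases))                            # True
print(all(reachable(cases, i) for i in range(3)))   # True
print(disjoint(cases))                              # True
```

The paper's analysis establishes the same properties symbolically, over unbounded domains, by discharging the corresponding formulas with provers.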

[65] 
Decision Procedures for Multisets with Cardinality Constraints. 9th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). 2008.
[ps]
[ bib 
list ]
Applications in software verification and interactive theorem proving often involve reasoning about sets of objects. Cardinality constraints on such collections also arise in these scenarios. Multisets arise for analogous reasons as sets: abstracting the content of a linked data structure with duplicate elements leads to multisets. Interactive theorem provers such as Isabelle specify theories of multisets and prove a number of theorems about them to enable their use in interactive verification. However, the decidability and complexity of constraints on multisets is much less understood than for constraints on sets. The first contribution of this paper is a polynomial-space algorithm for deciding expressive quantifier-free constraints on multisets with cardinality operators. Our decision procedure reduces, in polynomial time, constraints on multisets to constraints in an extension of quantifier-free Presburger arithmetic with certain “unbounded sum” expressions. We prove bounds on solutions of resulting constraints and describe a polynomial-space decision procedure for these constraints. The second contribution of this paper is a proof that adding quantifiers to a constraint language containing subset and cardinality operators yields undecidable constraints. The result follows by reduction from Hilbert's 10th problem.
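The multiset operations and the cardinality operator in such a logic can be sketched with Python's `collections.Counter` (an illustrative model of finite multisets, not connected to the paper's decision procedure):

```python
from collections import Counter

# Counter models multisets with non-negative multiplicities.
# A and B are hypothetical example multisets.
A = Counter({'x': 2, 'y': 1})
B = Counter({'x': 1, 'z': 3})

def card(m):
    """Cardinality of a multiset counts elements with multiplicity."""
    return sum(m.values())

plus = A + B    # multiset sum: multiplicities add pointwise
union = A | B   # union: pointwise maximum of multiplicities
inter = A & B   # intersection: pointwise minimum of multiplicities

# Sample identities that a sound cardinality operator must respect:
print(card(plus) == card(A) + card(B))           # True: |A ⊎ B| = |A| + |B|
print(card(union) + card(inter) == card(plus))   # True: inclusion-exclusion
```

A decision procedure must establish such relationships for symbolic multisets of unbounded size, which is where the reduction to Presburger arithmetic with unbounded sums comes in.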

[64] 
Runtime Checking for Separation Logic. 9th International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). 2008.
[ bib 
list ]
Separation logic is a popular approach for specifying properties of recursive mutable data structures. Several existing systems verify a subclass of separation logic specifications using static analysis techniques. Checking data structure specifications during program execution is an alternative to static verification: it can enforce the sophisticated specifications for which static verification fails, and it can help debug incorrect specifications and code by detecting concrete counterexamples to their validity. This paper presents Separation Logic Invariant ChecKer (SLICK), a runtime checker for separation logic specifications. We show that, although the recursive style of separation logic predicates is well suited for runtime execution, the implicit footprint and existential quantification make efficient runtime checking challenging. To address these challenges we introduce a coloring technique for efficiently checking method footprints and describe techniques for inferring values of existentially quantified variables. We have implemented our runtime checker in the context of a tool for enforcing specifications of Java programs. Our experience suggests that our runtime checker is a useful companion to a static verifier for separation logic specifications.

[63] 
Decision Procedures for Multisets with
Cardinality Constraints. EPFL Technical Report LARA-REPORT-2007-002. 2007.
[ bib 
list ]
Applications in software verification and interactive theorem proving often involve reasoning about sets of objects. Cardinality constraints on such collections also arise in these applications. Multisets arise in these applications for analogous reasons as sets: abstracting the content of a linked data structure with duplicate elements leads to multisets. Interactive theorem provers such as Isabelle specify theories of multisets and prove a number of theorems about them to enable their use in interactive verification. However, the decidability and complexity of constraints on multisets is much less understood than for constraints on sets. The first contribution of this paper is a polynomial-space algorithm for deciding expressive quantifier-free constraints on multisets with cardinality operators. Our decision procedure reduces, in polynomial time, constraints on multisets to constraints in an extension of quantifier-free Presburger arithmetic with certain “unbounded sum” expressions. We prove bounds on solutions of resulting constraints and describe a polynomial-space decision procedure for these constraints. The second contribution of this paper is a proof that adding quantifiers to a constraint language containing subset and cardinality operators yields undecidable constraints. The result follows by reduction from Hilbert's 10th problem.

[62] 
Runtime Checking for Separation Logic. EPFL Technical Report LARA-REPORT-2007-003. 2007.
[ bib 
list ]
Separation logic is a popular approach for specifying properties of recursive mutable data structures. Several existing systems verify a subclass of separation logic specifications using static analysis techniques. Checking data structure specifications during program execution is an alternative to static verification: it can enforce the sophisticated specifications for which static verification fails, and it can help debug incorrect specifications and code by detecting concrete counterexamples to their validity. This paper presents Separation Logic Invariant ChecKer (SLICK), a runtime checker for separation logic specifications. We show that, although the recursive style of separation logic predicates is well suited for runtime execution, the implicit footprint and existential quantification make efficient runtime checking challenging. To address these challenges we introduce a coloring technique for efficiently checking method footprints and describe techniques for inferring values of existentially quantified variables. We have implemented our runtime checker in the context of a tool for enforcing specifications of Java programs. Our experience suggests that our runtime checker is a useful companion to a static verifier for separation logic specifications.

[61] 
Towards Efficient Satisfiability Checking for
Boolean Algebra with Presburger Arithmetic. Conference on Automated Deduction (CADE-21). 2007.
[ps]
[ bib 
list ]
Boolean Algebra with Presburger Arithmetic (BAPA) is a decidable logic that combines 1) Boolean algebra of sets of uninterpreted elements (BA) and 2) Presburger arithmetic (PA). BAPA can express relationships between integer variables and cardinalities of unbounded sets. In combination with other decision procedures and theorem provers, BAPA is useful for automatically verifying quantitative properties of data structures. This paper examines QFBAPA, the quantifier-free fragment of BAPA. The computational complexity of QFBAPA satisfiability was previously unknown; previous QFBAPA algorithms have nondeterministic exponential time complexity due to an explosion in the number of introduced integer variables. This paper shows, for the first time, how to avoid such exponential explosion. We present an algorithm for checking satisfiability of QFBAPA formulas by reducing them to formulas of quantifier-free PA, with only O(n log(n)) increase in formula size. We prove the correctness of our algorithm using a theorem about sparse solutions of integer linear programming problems. This is the first proof that QFBAPA satisfiability is in NP and therefore NP-complete. We implemented our algorithm in the context of the Jahob verification system. Our preliminary experiments suggest that our algorithm, although not necessarily better for proving formula unsatisfiability, is more effective in detecting formula satisfiability than previous approaches.
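The exponential explosion mentioned above comes from the classical encoding, which introduces one integer variable per Venn region of the set variables. A minimal sketch of that classical encoding (the one the paper improves on, here decided by brute force for two sets):

```python
# Classical Venn-region encoding for two sets A, B: introduce one
# non-negative integer per region (A only, B only, A and B) and
# express every cardinality as a sum of region sizes. With n set
# variables this needs 2^n regions, which is the blow-up QFBAPA's
# O(n log n) reduction avoids.
from itertools import product

def satisfiable(card_A, card_B, card_AB, card_union, bound=10):
    """Brute-force search for region sizes consistent with the
    cardinality constraints |A|, |B|, |A ∩ B|, |A ∪ B|."""
    for only_a, only_b, both in product(range(bound + 1), repeat=3):
        if (only_a + both == card_A and
                only_b + both == card_B and
                both == card_AB and
                only_a + only_b + both == card_union):
            return True
    return False

print(satisfiable(2, 2, 1, 3))  # True:  |A|=2, |B|=2, |A∩B|=1, |A∪B|=3
print(satisfiable(2, 2, 1, 4))  # False: violates inclusion-exclusion
```

In practice the resulting integer constraints are handed to a Presburger arithmetic solver rather than enumerated; the point of the sketch is only the region-variable translation.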

[60] 
Polynomial Constraints for Sets with Cardinality Bounds. Foundations of Software Science and Computation Structures (FOSSACS). 2007.
[ps]
[ bib 
list ]
Logics that can reason about sets and their cardinality bounds are useful in program analysis, program verification, databases, and knowledge bases. This paper presents a class of constraints on sets and their cardinalities for which the satisfiability and the entailment problems are computable in polynomial time. Our class of constraints, based on tree-shaped formulas, is unique in being simultaneously tractable and able to express 1) that a set is a union of other sets, 2) that sets are disjoint, and 3) that a set has cardinality within a given range. As the main result we present a polynomial-time algorithm for checking entailment of our constraints.

[59]  Verifying Complex Properties using Symbolic Shape Analysis. Workshop on Heap Abstraction and Verification (collocated with ETAPS). 2007. [ps] [ bib  list ] 
[58] 
Runtime Checking for Program Verification. Workshop on Runtime Verification. 2007.
[ bib 
list ]
The process of verifying that a program conforms to its specification is often hampered by errors in both the program and the specification. A runtime checker that can evaluate formal specifications can be useful for quickly identifying such errors. This paper describes our preliminary experience with incorporating runtime checking into the Jahob verification system and discusses some lessons we learned in this process. One of the challenges in building a runtime checker for a program verification system is that the language of invariants and assertions is designed for simplicity of semantics and tractability of proofs, and not for runtime checking. Some of the more challenging constructs include existential and universal quantification, set comprehension, specification variables, and formulas that refer to past program states. In this paper, we describe how we handle these constructs in our runtime checker, and describe directions for future work.

[57] 
Modular Data Structure Verification. PhD Thesis, EECS Department, Massachusetts Institute of Technology. 2007.
[ps]
[ bib 
http 
list ]
This dissertation describes an approach for automatically verifying data structures, focusing on techniques for automatically proving formulas that arise in such verification. I have implemented this approach with my colleagues in a verification system called Jahob. Jahob verifies properties of Java programs with dynamically allocated data structures. Developers write Jahob specifications in classical higher-order logic (HOL); Jahob reduces the verification problem to deciding the validity of HOL formulas. I present a new method for proving HOL formulas by combining automated reasoning techniques. My method consists of 1) splitting formulas into individual HOL conjuncts, 2) soundly approximating each HOL conjunct with a formula in a more tractable fragment and 3) proving the resulting approximation using a decision procedure or a theorem prover. I present three concrete logics; for each logic I show how to use it to approximate HOL formulas, and how to decide the validity of formulas in this logic. First, I present an approximation of HOL based on a translation to first-order logic, which enables the use of existing resolution-based theorem provers. Second, I present an approximation of HOL based on field constraint analysis, a new technique that enables decision procedures for special classes of graphs (such as monadic second-order logic over trees) to be applied to arbitrary graphs. Third, I present an approximation using Boolean Algebra with Presburger Arithmetic (BAPA), a logic that combines reasoning about sets of elements with reasoning about cardinalities of sets. BAPA can express relationships between sizes of data structures and invariants that correlate data structure size with integer variables. I present the first implementation of a BAPA decision procedure, and establish the exact complexity bounds for BAPA and quantifier-free BAPA. 
Together, these techniques enabled Jahob to modularly and automatically verify data structure implementations based on singly- and doubly-linked lists, trees with parent pointers, priority queues, and hash tables. In particular, Jahob was able to prove that data structure implementations satisfy their specifications, maintain key data structure invariants expressed in a rich logical notation, and never produce runtime errors such as null dereferences or out of bounds accesses.

[56] 
Quantifier-Free Boolean Algebra with Presburger Arithmetic is NP-complete. MIT CSAIL Technical Report MIT-CSAIL-TR-2007-001. 2007.
[ps]
[ bib 
http 
list ]
Boolean Algebra with Presburger Arithmetic (BAPA) combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of unbounded finite sets and can be used to express verification conditions in verification of data structure consistency properties. In this report I consider the Quantifier-Free fragment of Boolean Algebra with Presburger Arithmetic (QFBAPA). Previous algorithms for QFBAPA had nondeterministic exponential time complexity. In this report I show that QFBAPA is in NP, and is therefore NP-complete. My result yields an algorithm for checking satisfiability of QFBAPA formulas by converting them to polynomially sized formulas of quantifier-free Presburger arithmetic. I expect this algorithm to substantially extend the range of QFBAPA problems whose satisfiability can be checked in practice.

[55] 
Using First-Order Theorem Provers in the Jahob Data Structure Verification System. Verification, Model Checking and Abstract Interpretation. 2007.
[ps]
[ bib 
list ]
This paper presents our integration of efficient resolution-based theorem provers into the Jahob data structure verification system. Our experimental results show that this approach enables Jahob to automatically verify the correctness of a range of complex dynamically instantiable data structures, such as hash tables and search trees, without the need for interactive theorem proving or techniques tailored to individual data structures. Our primary technical results include: (1) a translation from higher-order logic to first-order logic that enables the application of resolution-based theorem provers and (2) a proof that eliminating type (sort) information in formulas is both sound and complete, even in the presence of a generic equality operator. Our experimental results show that the elimination of type information often dramatically decreases the time required to prove the resulting formulas. These techniques enabled us to verify complex correctness properties of Java programs such as a mutable set implemented as an imperative linked list, a finite map implemented as a functional ordered tree, a hash table with a mutable array, and a simple library system example that uses these container data structures. Our system verifies (in a matter of minutes) that data structure operations correctly update the finite map, that they preserve data structure invariants (such as ordering of elements, membership in appropriate hash table buckets, or relationships between sets and relations), and that there are no runtime errors such as null dereferences or array out of bounds accesses.

[54] 
On Using First-Order Theorem Provers in the Jahob Data Structure Verification System. MIT Technical Report MIT-CSAIL-TR-2006-072. 2006.
[ps]
[ bib 
http 
list ]
This paper presents our integration of efficient resolution-based theorem provers into the Jahob data structure verification system. Our experimental results show that this approach enables Jahob to automatically verify the correctness of a range of complex dynamically instantiable data structures, including data structures such as hash tables and search trees, without the need for interactive theorem proving or techniques tailored to individual data structures. Our primary technical results include: (1) a translation from higher-order logic to first-order logic that enables the application of resolution-based theorem provers and (2) a proof that eliminating type (sort) information in formulas is both sound and complete, even in the presence of a generic equality operator. Our experimental results show that the elimination of type information dramatically decreases the time required to prove the resulting formulas. These techniques enabled us to verify complex correctness properties of Java programs such as a mutable set implemented as an imperative linked list, a finite map implemented as a functional ordered tree, a hash table with a mutable array, and a simple library system example that uses these container data structures. Our system verifies (in a matter of minutes) that data structure operations correctly update the finite map, that they preserve data structure invariants (such as ordering of elements, membership in appropriate hash table buckets, or relationships between sets and relations), and that there are no runtime errors such as null dereferences or array out of bounds accesses.

[53] 
Modular Pluggable Analyses for Data Structure Consistency. IEEE Transactions on Software Engineering (TSE). 2006.
[ps]
[ bib 
http 
list ]
Hob is a program analysis system that enables the focused application of multiple analyses to different modules in the same program. In our approach, each module encapsulates one or more data structures and uses membership in abstract sets to characterize how objects participate in data structures. Each analysis verifies that the implementation of the module 1) preserves important internal data structure consistency properties and 2) correctly implements a set algebra interface that characterizes the effects of operations on the data structure. Collectively, the analyses use the set algebra to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We implemented our system and deployed several pluggable analyses, including a flag analysis for modules in which abstract set membership is determined by a flag field in each object, a PALE shape analysis plugin, and a theorem proving plugin for analyzing arbitrarily complicated data structures. Our experience shows that our system can effectively 1) verify the consistency of data structures encapsulated within a single module and 2) combine analysis results from different analysis plugins to verify properties involving objects shared by multiple modules analyzed by different analyses.

[52] 
On Verifying Complex Properties using Symbolic Shape Analysis. Max-Planck Institute for Computer Science Technical Report MPI-I-2006-2-1. 2006.
[ bib 
http 
list ]
One of the main challenges in the verification of software systems is the analysis of statically unbounded data structures with dynamic memory allocation, such as linked data structures and arrays. We describe Bohne, a new analysis for verifying data structures. Bohne verifies data structure operations and shows that 1) the operations preserve data structure invariants and 2) the operations satisfy their specifications expressed in terms of changes to the set of objects stored in the data structure. During the analysis, Bohne infers loop invariants in the form of disjunctions of universally quantified Boolean combinations of formulas, represented as sets of binary decision diagrams. To synthesize loop invariants of this form, Bohne uses a combination of decision procedures for Monadic Second-Order Logic over trees, SMT-LIB decision procedures (currently CVC Lite), and an automated reasoner within the Isabelle interactive theorem prover. This architecture shows that synthesized loop invariants can serve as a useful communication mechanism between different decision procedures. In addition, Bohne uses field constraint analysis, a combination mechanism that enables the use of uninterpreted function symbols within formulas of Monadic Second-Order Logic over trees. Using Bohne, we have verified operations on data structures such as linked lists with iterators and back pointers, trees with and without parent pointers, two-level skip lists, array data structures, and sorted lists. We have deployed Bohne in the Hob and Jahob data structure analysis systems, enabling us to combine Bohne with analyses of data structure clients and apply it in the context of larger programs. This paper describes the Bohne algorithm, the techniques that Bohne uses to reduce the amount of annotations and the running time of the analysis. We also describe the analysis architecture that enables Bohne to effectively take advantage of multiple reasoning procedures when proving complex invariants.

[51] 
Deciding Boolean Algebra with Presburger Arithmetic. Journal of Automated Reasoning. 2006.
[ps]
[ bib 
http 
list ]
We describe an algorithm for deciding the first-order multisorted theory BAPA, which combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of unbounded finite sets, and supports arbitrary quantification over sets and integers. Our original motivation for BAPA is deciding verification conditions that arise in the static analysis of data structure consistency properties. Data structures often use an integer variable to keep track of the number of elements they store; an invariant of such a data structure is that the value of the integer variable is equal to the number of elements stored in the data structure. When the data structure content is represented by a set, the resulting constraints can be captured in BAPA. BAPA formulas with quantifier alternations arise when verifying programs with annotations containing quantifiers, or when proving simulation relation conditions for refinement and equivalence of program fragments. Furthermore, BAPA constraints can be used for proving the termination of programs that manipulate data structures, as well as in constraint database query evaluation and loop invariant inference. We give a formal description of an algorithm for deciding BAPA. We analyze our algorithm and show that it has optimal alternating time complexity, and that the complexity of BAPA matches the complexity of PA. Because it works by a reduction to PA, our algorithm yields the decidability of a combination of sets of uninterpreted elements with any decidable extension of PA. When restricted to BA formulas, the algorithm can be used to decide BA in optimal alternating time. Furthermore, the algorithm can eliminate individual quantifiers from a formula with free variables and therefore perform projection onto a desirable set of variables. 
We have implemented our algorithm and used it to discharge verification conditions in the Jahob system for data structure consistency checking of Java programs; our experience suggests that a straightforward implementation of the algorithm is effective on nontrivial formulas as long as the number of set variables is small. We also report on a new algorithm for solving the quantifier-free fragment of BAPA.

[50] 
An overview of the Jahob analysis system: Project Goals and Current Status. NSF Next Generation Software Workshop. 2006.
[ps]
[ bib 
list ]
We present an overview of the Jahob system for modular analysis of data structure properties. Jahob uses a subset of Java as the implementation language and annotations with formulas in a subset of Isabelle as the specification language. It uses monadic second-order logic over trees to reason about reachability in linked data structures, the Isabelle theorem prover and Nelson-Oppen-style theorem provers to reason about high-level properties and arrays, and a new technique to combine reasoning about constraints on uninterpreted function symbols with other decision procedures. It also incorporates new decision procedures for reasoning about sets with cardinality constraints. The system can infer loop invariants using a new symbolic shape analysis. Initial results in the use of our system are promising; we are continuing to develop and evaluate it.

[49] 
Field Constraint Analysis. Proc. Int. Conf. Verification, Model Checking, and Abstract Interpretation. 2006.
[ps]
[ bib 
list ]
We introduce field constraint analysis, a new technique for verifying data structure invariants. A field constraint for a field is a formula specifying a set of objects to which the field can point. Field constraints enable the application of decidable logics to data structures which were originally beyond the scope of these logics, by verifying the backbone of the data structure and then verifying constraints on fields that crosscut the backbone in arbitrary ways. Previously, such crosscutting fields could only be verified when they were uniquely determined by the backbone, which significantly limits the range of analyzable data structures. Field constraint analysis permits nondeterministic field constraints on crosscutting fields, which allows the verification of invariants for data structures such as skip lists. Nondeterministic field constraints also enable the verification of invariants between data structures, yielding an expressive generalization of static type declarations. The generality of our field constraints requires new techniques. We present one such technique and prove its soundness. We have implemented this technique as part of a symbolic shape analysis deployed in the context of the Hob system for verifying data structure consistency. Using this implementation we were able to verify data structures that were previously beyond the reach of similar techniques.

[48] 
On Field Constraint Analysis. MIT CSAIL Technical Report MIT-CSAIL-TR-2005-072, MIT-LCS-TR-1010. 2005.
[ps]
[ bib 
list ]
We introduce field constraint analysis, a new technique for verifying data structure invariants. A field constraint for a field is a formula specifying a set of objects to which the field can point. Field constraints enable the application of decidable logics to data structures which were originally beyond the scope of these logics, by verifying the backbone of the data structure and then verifying constraints on fields that crosscut the backbone in arbitrary ways. Previously, such crosscutting fields could only be verified when they were uniquely determined by the backbone, which significantly limited the range of analyzable data structures. Our field constraint analysis permits nondeterministic field constraints on crosscutting fields, which allows us to verify invariants of data structures such as skip lists. Nondeterministic field constraints also enable the verification of invariants between data structures, yielding an expressive generalization of static type declarations. The generality of our field constraints requires new techniques, which are orthogonal to the traditional use of structure simulation. We present one such technique and prove its soundness. We have implemented this technique as part of a symbolic shape analysis deployed in the context of the Hob system for verifying data structure consistency. Using this implementation we were able to verify data structures that were previously beyond the reach of similar techniques.

[47] 
Implications of a Data Structure Consistency Checking System. International Conference on Verified Software: Theories, Tools, Experiments (VSTTE, IFIP Working Group 2.3 Conference). 2005.
[ps]
[ bib 
list ]
We present a framework for verifying that programs correctly preserve important data structure consistency properties. Results from our implemented system indicate that it can effectively enable the scalable verification of very precise data structure consistency properties within complete programs. Our system treats both internal properties, which deal with a single data structure implementation, and external properties, which deal with properties that involve multiple data structures. A key aspect of our system is that it enables multiple analysis and verification packages to productively interoperate to analyze a single program. In particular, it supports the targeted use of very precise, unscalable analyses in the context of a larger analysis and verification system. The integration of different analyses in our system is based on a common set-based specification language: precise analyses verify that data structures conform to set specifications, whereas scalable analyses verify relationships between data structures and preconditions of data structure operations. There are several reasons why our system may be of interest in a broader program analysis and verification effort. First, it can ensure that the program satisfies important data structure consistency properties, which is an important goal in and of itself. Second, it can provide information that insulates other analysis and verification tools from having to deal directly with pointers and data structure implementations, thereby enabling these tools to focus on the key properties that they are designed to analyze. Finally, we expect other developers to be able to leverage its basic structuring concepts to enable the scalable verification of other program safety and correctness properties.

[46] 
Relational Analysis of Algebraic Datatypes. 10th European Soft. Eng. Conf. (ESEC) and 13th Symp. Foundations of Software Engineering (FSE). 2005.
[ps]
[ bib 
list ]
We present a technique that enables the use of finite model finding to check the satisfiability of certain formulas whose intended models are infinite. Such formulas arise when using the language of sets and relations to reason about structured values such as algebraic datatypes. The key idea of our technique is to identify a natural syntactic class of formulas in relational logic for which reasoning about infinite structures can be reduced to reasoning about finite structures. As a result, when a formula belongs to this class, we can use existing finite model finding tools to check whether the formula holds in the desired infinite model.

[45] 
On Algorithms and Complexity for Sets with Cardinality Constraints. MIT CSAIL Technical Report. 2005.
[ps]
[ bib 
http 
list ]
Typestate systems ensure many desirable properties of imperative programs, including initialization of object fields and correct use of stateful library interfaces. Abstract sets with cardinality constraints naturally generalize typestate properties: relationships between the typestates of objects can be expressed as subset and disjointness relations on sets, and elements of sets can be represented as sets of cardinality one. In addition, sets with cardinality constraints provide a natural language for specifying operations and invariants of data structures. Motivated by these program analysis applications, this paper presents new algorithms and new complexity results for constraints on sets and their cardinalities. We study several classes of constraints and demonstrate a tradeoff between their expressive power and their complexity. Our first result concerns a quantifier-free fragment of Boolean Algebra with Presburger Arithmetic. We give a nondeterministic polynomial-time algorithm for reducing the satisfiability of sets with symbolic cardinalities to constraints on constant cardinalities, and give a polynomial-space algorithm for the resulting problem. The best previously existing algorithm runs in exponential space and nondeterministic exponential time. In a quest for more efficient fragments, we identify several subclasses of sets with cardinality constraints whose satisfiability is NP-hard. Finally, we identify a class of constraints that has polynomial-time satisfiability and entailment problems and can serve as a foundation for efficient program analysis. We give a system of rewriting rules for enforcing certain consistency properties of these constraints and show how to extract complete information from constraints in normal form. This result implies the soundness and completeness of our algorithms.
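To convey what reasoning about sets with cardinality constraints involves, here is a brute-force sketch (not the paper's algorithm): constraints over |A|, |B|, |A∪B|, |A∩B| become linear conditions over the sizes of the Venn regions, and an entailment can be refuted by searching for a countermodel:

```python
# Toy illustration of reasoning about set cardinalities via Venn regions
# (a brute-force version of the reduction idea, not the paper's method).
# For two sets A, B there are three relevant Venn regions: A\B, B\A, A∩B.
# Any constraint on |A|, |B|, |A u B|, |A n B| is linear in these sizes.
from itertools import product

def countermodel(bound=5):
    """Search for region sizes where |A u B| = |A| + |B| but A n B != {}."""
    for only_a, only_b, both in product(range(bound), repeat=3):
        card_a = only_a + both
        card_b = only_b + both
        card_union = only_a + only_b + both
        if card_union == card_a + card_b and both != 0:
            return (only_a, only_b, both)
    return None

# The entailment  |A u B| = |A| + |B|  implies  A n B = {}  has no
# countermodel, so the search comes back empty:
print(countermodel())  # None
```

The small search bound is only for illustration; the paper's algorithms decide such constraints exactly, including symbolic cardinalities.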

[44] 
An Algorithm for Deciding BAPA: Boolean Algebra with Presburger Arithmetic. 20th International Conference on Automated Deduction, CADE-20. 2005.
[ps]
[ bib 
list ]
We describe an algorithm for deciding the first-order multisorted theory BAPA, which combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of a priori unbounded finite sets, and supports arbitrary quantification over sets and integers. Our motivation for BAPA is deciding verification conditions that arise in the static analysis of data structure consistency properties. Data structures often use an integer variable to keep track of the number of elements they store; an invariant of such a data structure is that the value of the integer variable is equal to the number of elements stored in the data structure. When the data structure content is represented by a set, the resulting constraints can be captured in BAPA. BAPA formulas with quantifier alternations arise when verifying programs with annotations containing quantifiers, or when proving simulation relation conditions for refinement and equivalence of program fragments. Furthermore, BAPA constraints can be used for proving the termination of programs that manipulate data structures, and have applications in constraint databases. We give a formal description of a decision procedure for BAPA, which implies the decidability of BAPA. We analyze our algorithm and obtain an elementary upper bound on the running time, thereby giving the first complexity bound for BAPA. Because it works by a reduction to PA, our algorithm yields the decidability of a combination of sets of uninterpreted elements with any decidable extension of PA. Our algorithm can also be used to yield an optimal decision procedure for BA through a reduction to PA with bounded quantifiers. We have implemented our algorithm and used it to discharge verification conditions in the Jahob system for data structure consistency checking of Java programs; our experience with the algorithm is promising.

[43] 
On Relational Analysis of Algebraic Datatypes. MIT Technical Report 985. 2005.
[ps]
[ bib 
list ]
We present a technique that enables the use of finite model finding to check the satisfiability of certain formulas whose intended models are infinite. Such formulas arise when using the language of sets and relations to reason about structured values such as algebraic datatypes. The key idea of our technique is to identify a natural syntactic class of formulas in relational logic for which reasoning about infinite structures can be reduced to reasoning about finite structures. As a result, when a formula belongs to this class, we can use existing finite model finding tools to check whether the formula holds in the desired infinite model.

[42] 
Hob: A Tool for Verifying Data Structure Consistency. 14th International Conference on Compiler Construction (tool demo). 2005.
[ps]
[ bib 
list ]
This tool demonstration presents Hob, a system for verifying data structure consistency for programs written in a generalpurpose programming language. Our tool enables the focused application of multiple communicating static analyses to different modules in the same program. Using our tool throughout the program development process, we have successfully identified several bugs in both specifications and implementations of programs.

[41] 
Cross-Cutting Techniques in Program Specification and Analysis. 4th International Conference on Aspect-Oriented Software Development. 2005.
[ps]
[ bib 
list ]
We present three aspect-oriented constructs (formats, scopes, and defaults) that, in combination with a specification language based on abstract sets of objects, enable the modular application of multiple arbitrarily precise (and therefore arbitrarily unscalable) analyses to scalably verify data structure consistency properties in sizable programs. Formats use a form of field introduction to group together the declarations of all of the fields that together comprise a given data structure. Scopes and defaults enable the developer to state certain data structure consistency properties once in a single specification construct that cuts across the preconditions and postconditions of the procedures in the system. Standard approaches, in contrast, scatter and duplicate such properties across the preconditions and postconditions. We have implemented a prototype specification, analysis, and verification system based on these constructs and used this system to successfully verify a range of data structure consistency properties in several programs. Most previous research in the field of aspect-oriented programming has focused on the use of aspect-oriented concepts in design and implementation. Our experience indicates that aspect-oriented concepts can also be extremely useful for specification, analysis, and verification.

[40] 
Decision Procedures for Set-Valued Fields. Electr. Notes Theor. Comput. Sci.; Proc. Abstract Interpretation of Object-Oriented Languages. 2005.
[ps]
[ bib 
list ]
A central feature of current object-oriented languages is the ability to dynamically instantiate user-defined container data structures such as lists, trees, and hash tables. Implementations of these data structures typically use references to dynamically allocated objects, which complicates reasoning about the resulting program. Reasoning is simplified if data structure operations are specified in terms of abstract sets of objects associated with each data structure. For example, an insertion into a data structure in this approach becomes simply an insertion into a dynamically changing set-valued field of an object, as opposed to a manipulation of a dynamically linked structure attached to the object. In this paper we explore reasoning techniques for programs that manipulate data structures specified using set-valued abstract fields associated with container objects. We compare the expressive power and the complexity of specification languages based on 1) decidable prefix vocabulary classes of first-order logic, 2) two-variable logic with counting, and 3) Nelson-Oppen combinations of multisorted theories. Such specification logics can be used for verification of object-oriented programs with supplied invariants. Moreover, by selecting an appropriate subset of properties expressible in such logic, the decision procedures for these logics enable automated computation of lattice operations in an abstract interpretation domain, as well as automated computation of abstract program semantics.

[39] 
Generalized Typestate Checking for Data Structure Consistency. Verification, Model Checking and Abstract Interpretation. 2005.
[ps]
[ bib 
list ]
We present an analysis to verify abstract set specifications for programs that use object field values to determine the membership of objects in abstract sets. In our approach, each module may encapsulate several data structures and use membership in abstract sets to characterize how objects participate in its data structures. Each module's specification uses set algebra formulas to characterize the effects of its operations on the abstract sets. The program may define abstract set membership in a variety of ways; arbitrary analyses (potentially with multiple analyses applied to different modules in the same program) may verify the corresponding set specifications. The analysis we present in this paper verifies set specifications by constructing and verifying set algebra formulas whose validity implies the validity of the set specifications. We have implemented our analysis and annotated several programs (75-2500 lines of code) with set specifications. We found that our original analysis algorithm did not scale; this paper describes several optimizations that improve the scalability of our analysis. It also presents experimental data comparing the original and optimized versions of our analysis.
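A minimal sketch of the field-based set abstraction described above (with invented set and field names): each object's membership in an abstract set is determined by a field value, and the module invariant is a set-algebra formula over those sets:

```python
# Toy version of flag-determined set membership: an object's abstract-set
# membership is given by a field value, and module invariants are
# set-algebra formulas over those sets. Names are hypothetical.
class Proc:
    def __init__(self, state):
        self.state = state  # flag field: 'running' or 'stopped'

def abstract_sets(objs):
    """Compute the abstract sets Running and Stopped from field values."""
    running = {id(o) for o in objs if o.state == 'running'}
    stopped = {id(o) for o in objs if o.state == 'stopped'}
    return running, stopped

def invariant_holds(objs):
    """Set-algebra invariant: Running and Stopped partition the objects."""
    running, stopped = abstract_sets(objs)
    universe = {id(o) for o in objs}
    return running & stopped == set() and running | stopped == universe

procs = [Proc('running'), Proc('stopped'), Proc('running')]
print(invariant_holds(procs))  # True
```

The analysis in the paper verifies such formulas statically for all executions; the dynamic check here only shows what the formulas mean.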

[38]  File Refinement. The Archive of Formal Proofs. 2004. [ bib  http  list ] 
[37]  Binary Search Trees. The Archive of Formal Proofs. 2004. [ bib  http  list ] 
[36]  Modular Pluggable Analyses for Data Structure Consistency. Monterey Workshop on Software Engineering Tools: Compatibility and Integration. 2004. [ bib  list ] 
[35] 
On Decision Procedures for Set-Valued Fields. MIT CSAIL Technical Report 975. 2004.
[ps]
[ bib 
list ]
An important feature of object-oriented programming languages is the ability to dynamically instantiate user-defined container data structures such as lists, trees, and hash tables. Programs implement such data structures using references to dynamically allocated objects, which allows data structures to store unbounded numbers of objects, but makes reasoning about programs more difficult. Reasoning about object-oriented programs with complex data structures is simplified if data structure operations are specified in terms of abstract sets of objects associated with each data structure. For example, an insertion into a data structure in this approach becomes simply an insertion into a dynamically changing set-valued field of an object, as opposed to a manipulation of a dynamically linked structure linked to the object. In this paper we explore reasoning techniques for programs that manipulate data structures specified using set-valued abstract fields associated with container objects. We compare the expressive power and the complexity of specification languages based on 1) decidable prefix vocabulary classes of first-order logic, 2) two-variable logic with counting, and 3) Nelson-Oppen combinations of multisorted theories. Such specification logics can be used for verification of object-oriented programs with supplied invariants. Moreover, by selecting an appropriate subset of properties expressible in such logic, the decision procedures for these logics yield automated computation of lattice operations in an abstract interpretation domain, as well as automated computation of abstract program semantics.

[34] 
Combining Theorem Proving with Static Analysis for Data Structure Consistency. International Workshop on Software Verification and Validation. 2004.
[ps]
[ bib 
list ]
We describe an approach for combining theorem proving techniques with static analysis to analyze data structure consistency for programs that manipulate heterogeneous data structures. Our system uses interactive theorem proving and shape analysis to verify that data structure implementations conform to set interfaces. A simpler static analysis then uses the verified set interfaces to verify properties that characterize how shared objects participate in multiple data structures. We have successfully applied this technique to several programs and found that theorem proving within circumscribed regions of the program combined with static analysis enables the verification of large-scale program properties.

[33] 
On Spatial Conjunction as Second-Order Logic. MIT CSAIL Technical Report 970. 2004.
[ps]
[ bib 
http 
list ]
Spatial conjunction is a powerful construct for reasoning about dynamically allocated data structures, as well as concurrent, distributed and mobile computation. While researchers have identified many uses of spatial conjunction, its precise expressive power compared to traditional logical constructs was not previously known. In this paper we establish the expressive power of spatial conjunction. We construct an embedding from first-order logic with spatial conjunction into second-order logic, and more surprisingly, an embedding from full second-order logic into first-order logic with spatial conjunction. These embeddings show that the satisfiability of formulas in first-order logic with spatial conjunction is equivalent to the satisfiability of formulas in second-order logic. These results explain the great expressive power of spatial conjunction and can be used to show that adding unrestricted spatial conjunction to a decidable logic leads to an undecidable logic. As one example, we show that adding unrestricted spatial conjunction to two-variable logic leads to undecidability. On the side of decidability, the embedding into second-order logic immediately implies the decidability of first-order logic with a form of spatial conjunction over trees. The embedding into spatial conjunction also has useful consequences: because a restricted form of spatial conjunction in two-variable logic preserves decidability, we obtain that a correspondingly restricted form of second-order quantification in two-variable logic is decidable. The resulting language generalizes the first-order theory of boolean algebra over sets and is useful in reasoning about the contents of data structures in object-oriented languages.

[32] 
On Our Experience with Modular Pluggable Analyses. MIT CSAIL Technical Report 965. 2004.
[ps]
[ bib 
list ]
We present a technique that enables the focused application of multiple analyses to different modules in the same program. Our research has two goals: 1) to address the scalability limitations of precise analyses by focusing the analysis on only those parts of the program that are relevant to the properties that the analysis is designed to verify, and 2) to enable the application of specialized analyses that verify properties of specific classes of data structures to programs that simultaneously manipulate several different kinds of data structures. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We characterize the key soundness property that an analysis plugin must satisfy to successfully participate in our system and present several analysis plugins that satisfy this property: a flag plugin that analyzes modules in which abstract set membership is determined by a flag field in each object, and a graph types plugin that analyzes modules in which abstract set membership is determined by reachability properties of objects stored in tree-like data structures.

[31] 
The First-Order Theory of Sets with Cardinality Constraints is Decidable. MIT CSAIL Technical Report 958. 2004.
[ps]
[ bib 
http 
list ]
We show the decidability of the first-order theory of the language that combines Boolean algebras of sets of uninterpreted elements with Presburger arithmetic operations. We thereby disprove a recent conjecture that this theory is undecidable. Our language allows relating the cardinalities of sets to the values of integer variables, and can distinguish finite and infinite sets. We use quantifier elimination to show the decidability and obtain an elementary upper bound on the complexity. Precise program analyses can use our decidability result to verify representation invariants of data structures that use an integer field to represent the number of stored elements.

[30] 
Verifying a File System Implementation. Sixth International Conference on Formal Engineering Methods. 2004.
[ps]
[ bib 
list ]
We present a correctness proof for a basic file system implementation. This implementation contains key elements of standard Unix file systems such as inodes and fixed-size disk blocks. We prove the implementation correct by establishing a simulation relation between the specification of the file system (which models the file system as an abstract map from file names to sequences of bytes) and its implementation (which uses fixed-size disk blocks to store the contents of the files). We used the Athena proof system to represent and validate our proof. Our experience indicates that Athena's use of block-structured natural deduction, support for structural induction and proof abstraction, and seamless integration with high-performance automated theorem provers were essential to our ability to successfully manage a proof of this size.
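The simulation-relation idea can be conveyed with a small sketch (the layout and names are invented, and the actual proof is machine-checked rather than tested): an abstraction function maps the block-level disk state back to the abstract name-to-bytes map, and one checks that an abstract write and the corresponding concrete write agree under it:

```python
# Toy simulation-relation check: an abstract file system (name -> bytes)
# versus a block-based implementation (fixed-size blocks plus per-file
# block lists standing in for inodes). All details are hypothetical.
BLOCK = 4  # fixed block size in bytes

class DiskFS:
    def __init__(self):
        self.blocks = {}  # block number -> bytes (length <= BLOCK)
        self.inodes = {}  # file name -> list of block numbers
        self.fresh = 0    # next unused block number

    def write(self, name, data):
        """Store 'data' for 'name' across freshly allocated blocks."""
        nums = []
        for i in range(0, len(data), BLOCK):
            self.blocks[self.fresh] = data[i:i + BLOCK]
            nums.append(self.fresh)
            self.fresh += 1
        self.inodes[name] = nums

def abstraction(fs):
    """Abstraction function: recover the abstract map from disk state."""
    return {name: b''.join(fs.blocks[n] for n in nums)
            for name, nums in fs.inodes.items()}

# Simulation check for one step: abstract write == abstraction of the
# implementation-level write.
abstract = {}
disk = DiskFS()
abstract['readme'] = b'hello world'   # specification-level write
disk.write('readme', b'hello world')  # implementation-level write
print(abstraction(disk) == abstract)  # True
```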

[29] 
Generalized Records and Spatial Conjunction in Role Logic. International Static Analysis Symposium. 2004.
[ps]
[ bib 
list ]
We have previously introduced role logic as a notation for describing properties of relational structures in shape analysis, databases and knowledge bases. A natural fragment of role logic corresponds to two-variable logic with counting and is therefore decidable. We show how to use role logic to describe open and closed records, as well as the dual of records, inverse records. We observe that the spatial conjunction operation of separation logic naturally models record concatenation. Moreover, we show how to eliminate the spatial conjunction of formulas of quantifier depth one in first-order logic with counting. As a result, allowing spatial conjunction of formulas of quantifier depth one preserves the decidability of two-variable logic with counting. This result applies to the two-variable role logic fragment as well. The resulting logic smoothly integrates type system and predicate calculus notation and can be viewed as a natural generalization of the notation for constraints arising in role analysis and similar shape analysis approaches.

[28] 
On Generalized Records and Spatial Conjunction in Role Logic. MIT CSAIL Technical Report 942. 2004.
[ps]
[ bib 
http 
list ]
We have previously introduced role logic as a notation for describing properties of relational structures in shape analysis, databases and knowledge bases. A natural fragment of role logic corresponds to two-variable logic with counting and is therefore decidable. We show how to use role logic to describe open and closed records, as well as the dual of records, inverse records. We observe that the spatial conjunction operation of separation logic naturally models record concatenation. Moreover, we show how to eliminate the spatial conjunction of formulas of quantifier depth one in first-order logic with counting. As a result, allowing spatial conjunction of formulas of quantifier depth one preserves the decidability of two-variable logic with counting. This result applies to the two-variable role logic fragment as well. The resulting logic smoothly integrates type system and predicate calculus notation and can be viewed as a natural generalization of the notation for constraints arising in role analysis and similar shape analysis approaches.

[27] 
On Verifying a File System Implementation. MIT CSAIL Technical Report 946. 2004.
[ps]
[ bib 
list ]
We present a correctness proof for a basic file system implementation. This implementation contains key elements of standard Unix file systems such as inodes and fixed-size disk blocks. We prove the implementation correct by establishing a simulation relation between the specification of the file system (which models the file system as an abstract map from file names to sequences of bytes) and its implementation (which uses fixed-size disk blocks to store the contents of the files). We used the Athena proof checker to represent and validate our proof. Our experience indicates that Athena's use of block-structured natural deduction, support for structural induction and proof abstraction, and seamless connection with high-performance automated theorem provers were essential to our ability to successfully manage a proof of this size.

[26] 
Boolean Algebra of Shape Analysis Constraints. Verification, Model Checking and Abstract Interpretation. 2004.
[ps]
[ bib 
list ]
The parametric shape analysis framework of Sagiv, Reps, and Wilhelm [45, 46] uses three-valued structures as dataflow lattice elements to represent sets of states at different program points. The recent work of Yorsh, Reps, Sagiv, Wilhelm [48, 50] introduces a family of formulas in (classical, two-valued) logic that are isomorphic to three-valued structures [46] and represent the same sets of concrete states. In this paper we introduce a larger syntactic class of formulas that has the same expressive power as the formulas in [48]. The formulas in [48] can be viewed as a normal form of the formulas in our syntactic class; we give an algorithm for transforming our formulas to this normal form. Our formulas make it obvious that the constraints are closed under all boolean operations and therefore form a boolean algebra. Our algorithm also gives a reduction of the entailment and the equivalence problems for these constraints to the satisfiability problem.

[25] 
On computing the fixpoint of a set of boolean equations. Microsoft Research Technical Report MSR-TR-2003-08. 2003.
[ps]
[ bib 
http 
list ]
This paper presents a method for computing a least fixpoint of a system of equations over booleans. The resulting computation can be significantly shorter than the result of iteratively evaluating the entire system until a fixpoint is reached.
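A standard way to realize such a computation is a worklist that revisits only equations whose inputs changed; the sketch below shows this general pattern (it is not the specific method of the report):

```python
# Worklist computation of the least fixpoint of a system of boolean
# equations x_i = f_i(x): start from all-false and, when a variable
# flips to true, revisit only the equations that read it. This is the
# general pattern, not the report's specific algorithm.
def least_fixpoint(equations, deps):
    """equations: var -> function(env) -> bool (monotone in env);
    deps: var -> list of vars its equation reads. Returns the least
    solution as a dict var -> bool."""
    env = {v: False for v in equations}
    # rdeps[u] = set of variables whose equation reads u
    rdeps = {v: set() for v in equations}
    for v, reads in deps.items():
        for u in reads:
            rdeps[u].add(v)
    work = list(equations)
    while work:
        v = work.pop()
        if equations[v](env) and not env[v]:  # monotone: flips once
            env[v] = True
            work.extend(rdeps[v])             # revisit readers of v
    return env

# x = True; y = x and z; z = x  -- least solution sets all three true.
eqs = {'x': lambda e: True,
       'y': lambda e: e['x'] and e['z'],
       'z': lambda e: e['x']}
deps = {'x': [], 'y': ['x', 'z'], 'z': ['x']}
print(least_fixpoint(eqs, deps))  # {'x': True, 'y': True, 'z': True}
```

Each variable flips from false to true at most once, so the total work is bounded by the number of dependency edges rather than by repeated full sweeps of the system.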

[24] 
On Modular Pluggable Analyses Using Set Interfaces. MIT CSAIL Technical Report 933. 2003.
[ps]
[ bib 
list ]
We present a technique that enables the focused application of multiple analyses to different modules in the same program. Our research has two goals: 1) to address the scalability limitations of precise analyses by focusing the analysis on only those parts of the program that are relevant to the properties that the analysis is designed to verify, and 2) to enable the application of specialized analyses that verify properties of specific classes of data structures to programs that simultaneously manipulate several different kinds of data structures. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We characterize the key soundness property that an analysis plugin must satisfy to successfully participate in our system and present several analysis plugins that satisfy this property: a flag plugin that analyzes modules in which abstract set membership is determined by a flag field in each object, and a graph types plugin that analyzes modules in which abstract set membership is determined by reachability properties of objects stored in tree-like data structures.

[23] 
Generalized Typestate Checking Using Set Interfaces and Pluggable Analyses. SIGPLAN Notices. 2004.
[ps]
[ bib 
list ]
We present a generalization of standard typestate systems in which the typestate of each object is determined by its membership in a collection of abstract typestate sets. This generalization supports typestates that model participation in abstract data types, composite typestates that correspond to membership in multiple sets, and hierarchical typestates. Because membership in typestate sets corresponds directly to participation in data structures, our typestate system characterizes global sharing patterns. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses.

[22] 
On Role Logic. MIT CSAIL Technical Report 925. 2003.
[ps]
[ bib 
http 
list ]
We present role logic, a notation for describing properties of relational structures in shape analysis, databases, and knowledge bases. We construct role logic using the ideas of de Bruijn's notation for lambda calculus, an encoding of first-order logic in lambda calculus, and a simple rule for implicit arguments of unary and binary predicates. The unrestricted version of role logic has the expressive power of first-order logic with transitive closure. Using a syntactic restriction on role logic formulas, we identify a natural fragment RL^2 of role logic. We show that the RL^2 fragment has the same expressive power as two-variable logic with counting C^2, and is therefore decidable. We present a translation of an imperative language into the decidable fragment RL^2, which allows compositional verification of programs that manipulate relational structures. In addition, we show how RL^2 encodes boolean shape analysis constraints and an expressive description logic.

[21] 
On the Boolean Algebra of Shape Analysis Constraints. MIT CSAIL Technical Report. 2003.
[ps]
[ bib 
list ]
Shape analysis is a promising technique for statically verifying and extracting properties of programs that manipulate complex data structures. We introduce a new characterization of constraints that arise in parametric shape analysis based on manipulation of three-valued structures as dataflow facts. We identify an interesting syntactic class of first-order logic formulas that captures the meaning of three-valued structures under concretization. This class is broader than previously introduced classes, allowing for a greater flexibility in the formulation of shape analysis constraints in program annotations and internal analysis representations. Three-valued structures can be viewed as one possible normal form of the formulas in our class. Moreover, we characterize the meaning of three-valued structures under “tight concretization”. We show that the seemingly minor change from concretization to tight concretization increases the expressive power of three-valued structures in such a way that the resulting constraints are closed under all boolean operations. We call the resulting constraints boolean shape analysis constraints. The main technical contribution of this paper is a natural syntactic characterization of boolean shape analysis constraints as arbitrary boolean combinations of first-order sentences of certain form, and an algorithm for transforming such boolean combinations into the normal form that corresponds directly to three-valued structures. Our result holds in the presence of arbitrary shape analysis instrumentation predicates. The result enables the reduction (without any approximation) of the entailment and the equivalence of shape analysis constraints to the satisfiability of shape analysis constraints. When the satisfiability of the constraints is decidable, our result implies that the entailment and the equivalence of the constraints are also decidable, which enables the use of constraints in a compositional shape analysis with a predictable behavior.

[20] 
Structural Subtyping of Non-Recursive Types is Decidable. Eighteenth Annual IEEE Symposium on Logic in Computer Science (LICS). 2003.
[ps]
[ bib 
list ]
We show that the first-order theory of structural subtyping of non-recursive types is decidable, as a consequence of a more general result on the decidability of term powers of decidable theories. Let Σ be a language consisting of function symbols and let C (with a finite or infinite domain C) be an L-structure, where L is a language consisting of relation symbols. We introduce the notion of the Σ-term-power of the structure C, denoted P_{Σ}(C). The domain of P_{Σ}(C) is the set of Σ-terms over the set C. P_{Σ}(C) has one term algebra operation for each f in Σ, and one relation for each r in L defined by lifting the operations of C to terms over C. We extend quantifier elimination for term algebras and apply the Feferman-Vaught technique for quantifier elimination in products to obtain the following result. Let K be a family of L-structures and K_{P} the family of their Σ-term-powers. Then the validity of any closed formula F on K_{P} can be effectively reduced to the validity of some closed formula q(F) on K. Our result implies the decidability of the first-order theory of structural subtyping of non-recursive types with covariant constructors, and the construction generalizes to contravariant constructors as well.
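As an informal illustration of the term-power domain described above (a sketch with my own names, not code from the paper): Σ-terms over a base set can be modeled as trees, and a base relation lifted pointwise to terms of identical shape, which for <= on primitive types yields the covariant structural subtyping order.

```python
# Illustrative sketch only: Sigma-terms over a base domain C, with a
# base relation lifted pointwise to terms of identical shape
# (the covariant case). Names here are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class Base:
    value: object          # an element of the base domain C

@dataclass(frozen=True)
class App:
    ctor: str              # a constructor symbol from Sigma
    args: tuple            # argument subterms

def lift(rel, s, t):
    """Lift a binary relation on C to Sigma-terms: terms are related
    exactly when they share the same shape and rel holds at every
    pair of corresponding leaves."""
    if isinstance(s, Base) and isinstance(t, Base):
        return rel(s.value, t.value)
    if isinstance(s, App) and isinstance(t, App):
        return (s.ctor == t.ctor and len(s.args) == len(t.args)
                and all(lift(rel, a, b) for a, b in zip(s.args, t.args)))
    return False
```

For instance, lifting <= relates `App("pair", (Base(1), Base(2)))` to `App("pair", (Base(1), Base(3)))`, mirroring the subtype ordering with covariant constructors that the decidability result covers.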

[19] 
Existential Heap Abstraction Entailment is Undecidable. 10th Annual International Static Analysis Symposium. 2003.
[ps]
[ bib 
list ]
In this paper we study constraints for specifying properties of data structures consisting of linked objects allocated in the heap. Motivated by heap summary graphs in role analysis and shape analysis we introduce the notion of regular graph constraints. A regular graph constraint is a graph representing the heap summary; a heap satisfies a constraint if and only if the heap can be homomorphically mapped to the summary. Regular graph constraints form a very simple and natural fragment of the existential monadic second-order logic over graphs. One of the key problems in a compositional static analysis is proving that procedure preconditions are satisfied at every call site. For role analysis, precondition checking requires determining the validity of implication, i.e., entailment of regular graph constraints. The central result of this paper is the undecidability of regular graph constraint entailment. The undecidability of the entailment problem is surprising because of the simplicity of regular graph constraints: in particular, the satisfiability of regular graph constraints is decidable. Our undecidability result implies that there is no complete algorithm for statically checking procedure preconditions or postconditions, simplifying static analysis results, or checking that given analysis results are correct. While incomplete conservative algorithms for regular graph constraint entailment checking are possible, we argue that heap specification languages should avoid second-order existential quantification in favor of explicitly specifying a criterion for summarizing objects.
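To make the satisfaction condition concrete, here is a brute-force sketch of the homomorphism check over plain directed graphs (my own simplification; the paper's heaps and summaries carry edge labels, but the condition has the same shape): a heap satisfies a summary exactly when some node mapping sends every heap edge to a summary edge.

```python
# Illustrative sketch only: brute-force graph homomorphism test.
from itertools import product

def homomorphism_exists(heap_nodes, heap_edges, summary_nodes, summary_edges):
    """Return True iff some mapping h from heap nodes to summary nodes
    sends every heap edge (u, v) to a summary edge (h(u), h(v))."""
    for image in product(summary_nodes, repeat=len(heap_nodes)):
        h = dict(zip(heap_nodes, image))
        if all((h[u], h[v]) in summary_edges for u, v in heap_edges):
            return True
    return False
```

Checking a single heap against a summary this way is decidable (if exponential); what the paper shows undecidable is entailment, i.e., whether every heap satisfying one summary also satisfies another.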

[18] 
In-Place Refinement for Effect Checking. Workshop on Automated Verification of Infinite-State Systems. 2003.
[ps]
[ bib 
list ]
The refinement calculus is a powerful framework for reasoning about programs, specifications, and refinement relations between programs and specifications. In this paper we introduce a new refinement calculus construct, in-place refinement. We use in-place refinement to prove the correctness of a technique for checking the refinement relation between programs and specifications. The technique is applicable whenever the specification is an idempotent predicate transformer, as is the case for most procedure effects. In-place refinement is a predicate on the current program state. A command in-place refines a specification in a given state if the effect of every execution of the command in the state is no worse than the effect of some execution of the specification in the state. We demonstrate the usefulness of the in-place refinement construct by showing the correctness of a precise technique for checking effects of commands in a computer program. The technique is precise because it takes into account the set of possible states in which each command can execute, using information about the control flow and the expressions of conditional commands. This precision is particularly important for handling aliasing in object-oriented programs that manipulate dynamically allocated data structures. We have implemented the technique as part of a side-effect checker for the programming language C#.

[17] 
On the Theory of Structural Subtyping. MIT LCS Technical Report 879. 2003.
[ps]
[ bib 
http 
list ]
We show that the first-order theory of structural subtyping of non-recursive types is decidable. Let Σ be a language consisting of function symbols (representing type constructors) and C a decidable structure in the relational language L containing a binary relation <=. C represents primitive types; <= represents a subtype ordering. We introduce the notion of the Σ-term-power of C, which generalizes the structure arising in structural subtyping. The domain of the Σ-term-power of C is the set of Σ-terms over the set of elements of C. We show that the decidability of the first-order theory of C implies the decidability of the first-order theory of the Σ-term-power of C. Our decision procedure makes use of quantifier elimination for term algebras and the Feferman-Vaught theorem. Our result implies the decidability of the first-order theory of structural subtyping of non-recursive types.

[16] 
Typestate Checking and Regular Graph Constraints. MIT LCS Technical Report 863. 2002.
[ps]
[ bib 
http 
list ]
We introduce regular graph constraints and explore their decidability properties. The motivation for regular graph constraints is 1) type checking of changing types of objects in the presence of linked data structures, 2) shape analysis techniques, and 3) generalization of similar constraints over trees and grids. We define a subclass of graphs called heaps as an abstraction of the data structures that a program constructs during its execution. We prove that determining the validity of implication for regular graph constraints over the class of heaps is undecidable. We show undecidability by exhibiting a characterization of certain corresponder graphs in terms of the presence and absence of homomorphisms to a finite number of fixed graphs. The undecidability of implication of regular graph constraints implies that there is no algorithm that will verify that procedure preconditions are met or that invariants are maintained when these properties are expressed in any specification language at least as expressive as regular graph constraints.

[15] 
Role Analysis. ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). 2002.
[ps]
[ bib 
list ]
We present a new role system in which the type (or role) of each object depends on its referencing relationships with other objects, with the role changing as these relationships change. Roles capture important object and data structure properties and provide useful information about how the actions of the program interact with these properties. Our role system enables the programmer to specify the legal aliasing relationships that define the set of roles that objects may play, the roles of procedure parameters and object fields, and the role changes that procedures perform while manipulating objects. We present an interprocedural, compositional, and context-sensitive role analysis algorithm that verifies that a program maintains role constraints.

[14] 
Designing an Algorithm for Role Analysis. Master's Thesis, MIT LCS. 2001.
[ps]
[ bib 
list ]
This thesis presents a system for specifying constraints on dynamically changing referencing relationships of heap objects, and an analysis for static verification of these constraints. The constraint specification system is based on the concept of role. The role of an object depends, in large part, on its aliasing relationships with other objects, with the role of each object changing as its aliasing relationships change. In this way roles capture object and data structure properties such as unique references, membership of objects in data structures, disjointness of data structures, absence of representation exposure, bidirectional associations, treeness, and absence or presence of cycles in the heap. Roles generalize linear types by allowing multiple aliases of heap objects that participate in recursive data structures. Unlike graph grammars and graph types, roles contain sufficiently general constraints to conservatively approximate any data structure. We give a semantics for mutually recursive role definitions and derive properties of roles as an invariant specification language. We introduce a programming model that allows temporary violations of role constraints. We describe a static role analysis for verifying that a program conforms to the programming model. The analysis uses fixpoint computation to synthesize loop invariants in each procedure. We introduce a procedure interface specification language and its semantics. We present an interprocedural, compositional, and context-sensitive role analysis that verifies that a program respects the role constraints across procedure calls.

[13] 
Roles are really great!. MIT LCS Technical Report 822. 2001.
[ps]
[ bib 
list ]
We present a new role system for specifying changing referencing relationships of heap objects. The role of an object depends, in large part, on its aliasing relationships with other objects, with the role of each object changing as its aliasing relationships change. Roles therefore capture important object and data structure properties and provide useful information about how the actions of the program interact with these properties. Our role system enables the programmer to specify the legal aliasing relationships that define the set of roles that objects may play, the roles of procedure parameters and object fields, and the role changes that procedures perform while manipulating objects. We present an interprocedural, compositional, and context-sensitive role analysis algorithm that verifies that a program respects the role constraints.

[12] 
A Language for Role Specifications. Workshop on Languages and Compilers for Parallel Computing. 2001.
[ps]
[ bib 
list ]
This paper presents a new language for identifying the changing roles that objects play over the course of the computation. Each object's points-to relationships with other objects determine the role that it currently plays. Roles therefore reflect the object's membership in specific data structures, with the object's role changing as it moves between data structures. We provide a programming model which allows the developer to specify the roles of objects at different points in the computation. The model also allows the developer to specify the effect of each operation at the granularity of role changes that occur in identified regions of the heap.

[11] 
Object Models, Heaps, and Interpretations. MIT LCS Technical Report 816. 2001.
[ps]
[ bib 
list ]
This paper explores the use of object models for specifying verifiable heap invariants. We define a simple language based on sets and relations and illustrate its use through examples. We give a formal semantics of the language by translation into predicate calculus and an interpretation of predicates in terms of objects and references in the program heap.

[10]  Numerical Representations as Purely Functional Data Structures: A New Approach. INFORMATICA, Institute of Mathematics and Informatics, Vilnius. 2002. [ bib  list ] 
[9]  Types and confluence in lambda calculus. 3rd Panhellenic Logic Symposium. 2001. [ bib  list ] 
[8]  Confluence of untyped lambda calculus via simple types. Proceedings of the 7th Italian Conference on Theoretical Computer Science, ICTCS 2001. 2001. [ bib  list ] 
[7]  Reducibility method for termination properties of typed lambda terms. Fifth International Workshop on Termination. 2001. [ bib  list ] 
[6] 
Modular Language Specifications in Haskell. Theoretical Aspects of Computer Science with practical application. 2000.
[ps]
[ bib 
list ]
We propose a framework for the specification of programming language semantics, abstract and concrete syntax, and lexical structure. The framework is based on Modular Monadic Semantics and allows independent specification of various language features. Features such as arithmetic, conditionals, exceptions, state, and nondeterminism can be freely combined into working interpreters, facilitating experiments in language design. A prototype implementation of this system in Haskell is described and possibilities for a more sophisticated interpreter generator are outlined.

[5]  Modular Interpreters in Haskell. B.Sc. Thesis, University of Novi Sad. 2000. [ps] [ bib  list ] 
[4]  Numerical Representations as Purely Functional Data Structures. XIV Conference on Applied Mathematics PRIM, Palić. 2000. [ bib  list ] 
[3]  Reducibility Method in Simply Typed Lambda Calculus. XIV Conference on Applied Mathematics PRIM, Palić. 2000. [ bib  list ] 
[2] 
Developing a Multigrid Solver for Standing Wave Equation. Proceedings of the 28th Dr. Bessie F. Lawrence International Summer Science Institute Participants, Weizmann Institute of Science. 1996.
[ps]
[ bib 
list ]
In this paper the multigrid technique is adapted for solving the standing one-dimensional wave equation with radiation boundary conditions. The solver, consisting of a wave cycle and a ray cycle, uses Gauss-Seidel and Kaczmarz relaxation sweeps and is designed to work efficiently for all error components.

[1] 
PLS: Programming Language for Simulations. Proceedings of the Petnica Science Center Seminar '93. 1993.
[ bib 
list ]
I describe the design and implementation of PLS, a programming language for writing simulations of physical processes. PLS supports descriptions of dynamically changing components of systems and simplifies the specification of interactions between multiple objects. The language is implemented as a translator to Pascal. I give the EBNF syntax of the language and the translation rules from PLS to Pascal, and present an example PLS program for the simulation of n-body gravitational interaction.
