Verification Projects
Proofs and program analysis
Integration of bounded and unbounded verification
Supervisor:
Nadia Polikarpova
Suitability: master thesis
Status: available
Bounded constraint solvers such as Alloy,
cannot be used to prove that a program is correct.
However, when a program or its specification is incorrect,
such tools are very good at finding compact counterexamples
and presenting them to the user in an understandable way.
The goal of this project is to build a bounded checking back-end
for a an existing program verifier
and find out if it helps users in understanding failed verification attempts.
Proof-Transforming Compilation
Supervisor:
Martin Nordio
Proof-transforming compilers (PTC) are similar to certifying compilers
in proof-carrying code,
but take a source proof as input and produce the bytecode proof.
An important property of proof-transforming compilers
is that they do not have to b e trusted.
If the compiler produces a wrong specification
or a wrong proof for a component,
the proof checker of the component platform will reject the component.
We are developing a library to model source proofs and bytecode proofs.
Furthermore, we are developing a proof-transforming compiler
that takes an Eiffel proof (in a Hoare-style logic)
and produces a bytecode proof (CIL bytecode).
ProveIt user interface improvement
Supervisor:
Stephan van Staden
Suitability: Master's thesis
Status: available
ProveIt
is a powerful O-O verification tool based on separation logic.
The user interface of ProveIt
is lagging a bit behind theoretical developments.
Error reporting,
the selection of classes and/or methods for verification,
and the presentation of Hoare-style proofs
are prime candidates for improvement.
The goal of this project is to improve the user-interface of ProveIt
to make it one of the best automatic O-O verification tools out there.
Proof management
Supervisors:
Stephan van Staden
and
Julian Tschannen
Suitability: Master's thesis
Status: available
Proof tools such as EVE Proofs and
ProveIt
must currently re-verify the whole program if a change occurs.
Since verification is modular,
this technique can be improved by managing proofs
and re-verifying only the changed method and its clients.
Creation of a sophisticated system
where re-verification happens only when needed
and preferably in the background
is the topic of this project.
Integration of tests and proofs
Supervisors:
Stephan van Staden
and
Julian Tschannen
Suitability: Master's thesis
Status: available
Testing and correctness proofs can benefit from each other.
Testing can provide concrete reasons why a correctness proof failed,
and successfully proving a method would make testing it pointless.
The goal of this project is to find innovative ways
to integrate tests and proofs.
Automated test and proof tools are already integrated in EiffelStudio
and will provide the springboard for this project.
INTEGERs with contracts
Supervisor:
Stephan van Staden
Suitability: Bachelor thesis, Master's thesis
Status: available
The EiffelStudio compiler translates operations on INTEGERs
directly to those on C ints for efficiency reasons.
Contracts of INTEGER classes are discarded in this process,
thereby making errors such as overflow extremely hard to detect.
For example,
this overflow-related precondition of plus would be discarded,
allowing overflow to happen silently:
(Current < 0 ^ other < 0) implies
(Current >= (min_value - other)) and
(Current > 0 ^ other > 0) implies
(Current <= (max_value - other)).
The goal of this project is
to enable contracts for INTEGER classes by modifying the compiler.
Although undesirable in a production setting,
tools such as AutoTest will be able to exploit these contracts
for testing purposes.
Overflow-avoiding preconditions
Supervisor: Stephan van StadenSuitability: Master's thesis
Status: available
Programmers like to think about INTEGERs as mathematical objects with infinite precision. There is only a finite number of INTEGERs, however, and overflow occurs if the result of an operation is either too big or too small. Restrictions on the magnitude of INTEGERs are rarely included in contracts, thereby making many routines susceptible to overflow (e.g. make of class ARRAY). The goal of this project is to infer weakest preconditions for routines with respect to overflow. Thus an additional precondition clause will be calculated that, if satisfied, will ensure absence of overflow in the routine body. This can be extended to include other kinds of arithmetic errors, e.g. division by zero.
Proof of absence of CATCALLs
Supervisor: Stephan van StadenSuitability: Master's thesis
Status: available
CATCALLs are runtime errors related to Eiffel's covariant type system. In particular, o.r (a) might fail if o has dynamic type T, a has dynamic type U, and routine r applicable to objects of type T expects its argument to be an instance of a subtype of U. The goal of this project is to verify absence of CATCALLs in Eiffel programs. The analysis should be sound yet fine-grained enough to verify useful programs, which entails that control-flow should be taken into account. Whenever a potential CATCALL is identified, the user must be informed and the scenario that could trigger it should be presented.
Proof of absence of agent call errors
Supervisor:
Stephan van Staden
Suitability: Master's thesis
Status: available
Agent call errors can happen
due to the conformance rules for TUPLEs in Eiffel.
For example, an agent of type
PROCEDURE [ANY, TUPLE [INTEGER, STRING]]
can be assigned to an entity e of type
PROCEDURE [ANY, TUPLE [INTEGER]].
Calling e.call ([2]) will yield an error
because the agent attached to e
expects an additional STRING argument.
The goal of this project is to prove the absence of such errors
by static analysis of the code.
Simple techniques should already be helpful,
and more sophisticated techniques (e.g. data-flow analysis)
can enhance such a tool's power.
Why Boogie?: Supporting multiple program proving engines
Supervisor: Carlo A. Furia
Suitability: Bachelor or Master's thesis
Status: available
Verifiers increasingly rely on back-end program provers to
perform verification. Program provers constitute an
intermediate layer between the front-end -- that handles the
specifics of the high-level language, such as Eiffel or C# --
and the back-end theorem prover that discharges low-level
verification conditions. The prover takes care of generating
verification conditions efficiently from an intermediate
language that is still human-readable but abstracts away
high-level features such as inheritance. Two commonly used
general-purpose program provers are Boogie (from Microsoft
Research) and Why3 (developed by INRIA). In this project, you
will develop a translation of Boogie into Why3. The
translation will make it possible to use Why3 as
a back-end for verifiers that output Boogie code.
Sequence master: Reasoning automatically about sequence-manipulating programs
Supervisor: Carlo A. Furia
Suitability: Bachelor, Master's thesis, or EiffelStudio lab
Status: available
Sequences represent the salient features of many data structures,
such as lists, arrays, and queues.
Reasoning about programs manipulating such data structures
requires a logic of sequences and a decision procedure for it.
Recent work developed a decidable theory of sequences
based on the first-order theory of concatenation
(http://arxiv.org/abs/1001.2100).
The decision procedure relies on the decidability
of the quantifier-free fragment of the theory of concatenation
with a sophisticated algorithm
for solving equations over sequences known as Makanin's algorithm
after his inventor. There is also a
QFIS prototype verifier
based on this theory of sequences, which, however, does not use
Makanin's algorithm but relies on a partial encoding for SMT solvers.
There are several distinct projects based on this background.
- An implementation of Makanin's algorithm and of the original decision procedure for the theory of sequence.
- A redesign and improvement of the QFIS verifier, to make it extensible and robust.
- The development of annotations (exact or approximate) for the operations offered by common data-structure libraries in the form of formulae of the theory of sequences. This will make it possible to verify automatically client code of these libraries.
Contract-based testing
AutoTest: An environment for automatic test generation based on Design by Contract
Supervisor: Yi Wei
If contracts - preconditions, postconditions, class invariants -
are systematically associated with classes,
they provide an invaluable source of information
for producing systematic tests,
directly based on the software's expressly intended semantics.
A first version of a tool
which automatically generates test cases from contracts,
runs the tests and then evaluates the results,
has been developed.
The goal of this project is to extend the existing tool
with new functionalities.
-
Guided test input generation
Status: available as master thesis or semester thesis
In long testing sessions, AutoTest spends most of the time generating duplicated test inputs. The key to come up with new test inputs is to know how to change the state of existing objects. For example, calling extend (v) can increase the count of a list by 1. This knowledge can be learnt from test cases that are already generated by AutoTest.
In the previous AutoFix work, we developed a simple object behavior model, which describes the behavior of commands such as extend. This behavior model is further enhanced in more recent work on the semantic search system. In this project, we'll going to use the object behavior model to guide test input generation, with the goal to reach more object states more quickly. -
Explaining the reason of program failures
Status: available as master thesis or semester thesis
When AutoTest finds a fault in program, it provides a test case, which reproduces the relevant failure with an exception trace. It will be very helpful to also provide some hints on why the program fails. In this project, we will use both static information (for example, control flow graph of the faulty feature) and dynamic information (for example, object states retrieved through debugger) to find the reason of program failures. -
Tuple generation
Status: available as semester thesis or SE-lab project
Tuples are convenient to group a list of objects of different types, for example register_person ([name, age, is_married]). Unfortunately, AutoTest cannot test such features because it does not know how to generate tuple objects. In this project, we'll going to help AutoTest to generate tuple objects.
Extracting code usage from code repository
Supervisor: Yi Wei
Status: Available as master or semester thesis
Many researchers try to synthesize code,
that, after all, is a long term dream of programmers.
The synthesized code should
1) performs as specified, and
2) looks natural to humans.
For both points,
learning from existing human-written code seems to be a good start.
As more and more open source software is added to code repositories
such as Origo,
Google Code
or Koders,
it is possible to learn how code is actually used.
In this project, we'll explore ways to extract code usage patterns
from code repositories.
Topics include:
- Search code snippets based on keywords — How to identify code for LINKED_LIST
- Usage pattern — How LINKED_LIST is used? forth after start?
- Interface matching — Which classes have similar interface? LINKED_LIST and ARRAYED_LIST?
- Topic extraction — What does this loop do? Sorting?
Programs that fix themselves
Diff library in Eiffel (Diffeif)
Supervisor: Max (Yu) Pei
Status: Available as semester thesis or bachelor thesis
Programmers often need to compare two pieces of code in their daily
job, and this project aims to develop a library to ease the pain of
making such comparisons manually. The library contains two closely
related parts: a diff tool that compares automatically the code and
reports only the differences between them, in lines, words, or even AST
nodes, and a diff widget that takes the output report from the diff
tool and highlights the differences in GUI.
Fault Localization for OO Programs (Floop)
Supervisor: Max (Yu) Pei
Status: Available as semester thesis or master thesis
Existing
fault localization techniques mostly target at procedure-oriented
programs: even though some of them were evaluated using OO programs as
subjects, the techniques are not designed with OO specific features,
e.g. polymorphism, in mind. Intuitively, information about classes and
objects in the program should be useful in locating faults. In this
project, we will try to incorporate different object information into
the localization process and evaluate the performance gain/penalty, if
any.
Spectrum-based Fault Localization in Programs with Contracts
Supervisor: Max (Yu) Pei
Status: Available as semester thesis or master thesis
Spectrum-based fault localization techniques compare program behaviors in passing and failing executions and prioritize program elements in the descending order of their likelihood of being the cause of a failure. Such techniques usually report their results as a list of program elements, i.e. expressions, locations, etc, which are the most suspicious. In the context of Design-by-Contract, fault localization can be done on the basis of the assumption that faults usually trigger contract violations, and that the recipient feature of the exception raised due to a contract violation is responsible for the fault, i.e. being faulty. In order to approve, or disprove, this assumption, you need to implement in this project several fault localization techniques in Eiffel, and compare the results with the scope of the recipient feature.
Improving software specifications
FOL4All: Complex loop invariants with a theorem prover
Supervisor: Carlo A. Furia
Suitability: Master's thesis or Research in computer science
Status: available
Loop invariants are necessary to verify any non-trivial program.
At the same time, they are more difficult for a programmer
to write than other annotations (such as pre and postconditions).
This suggests that techniques to automate the inference of loop
invariants can have a very significant impact and improve
software verification in practice.
This project considers an approach
[KV09]
to find loop invariants
based on translating the semantics of loops into first-order
formulas. These formulas are then analyzed with a fully-automated
saturation theorem prover
(such as Vampire).
The theorem prover generates all consequences of the loop's semantics,
among which are loop invariants.
The approach is best-effort,
because the generation of the possible invariants
cannot be exhaustive, but it can be effective in practice
and generate complex invariants for some loops.
In this project, you will develop and implement a loop analysis engine
based on translation of loops in first-order formulas
to be manipulated by a saturation theorem prover. The translation
leverages intermediate languages that provide various abstraction
levels, from the program source down to first-order logic.
LAAID (Loop Analysis And IDentikit): An empirical analysis of loops
Supervisor: Carlo A. Furia
Suitability: Bachelor thesis or EiffelStudio lab
Status: available
Loops are the hardest part when it comes to reasoning about a program.
Several techniques for automating reasoning about loops
have been developed,
each placing a particular restriction on the loops it can handle
(e.g., the number of nested loops,
the type of operations in the loop body, etc.).
It is not known, however, how does the "usual" loop look like,
how complex it is,
and how far is the state-of-the-art from being able to handle
these most frequently occurring loops automatically.
In this project, you will develop a few metrics
to characterize the complexity of loops
with respect to the currently available technology for automated proofs,
and you will mine codebases in various languages
to determine what kind of loops are most commonly found in programs.
EVE-Pink: Discovering loop invariants by postcondition weakening
Supervisor: Carlo A. Furia
Suitability: Bachelor thesis, Master's thesis, or EiffelStudio lab
Status: available
Loop invariants are an indispensable ingredient
to perform correctness proofs of programs.
At the same time, annotating a loop with significant invariants
is often difficult and tricky.
The availability of postconditions can help in this regard,
as loop invariants can often be expressed
as a weakening of postconditions.
In this project,
you will be asked to implement a loop invariant inference technique
based on postcondition weakening.
The implementation will re-design and improve the
current prototype,
and integrate it within a development environment.
All my invariants: Cataloguing loop invariant patterns
Supervisor: Carlo A. Furia
Suitability: Bachelor thesis
Status: available
In spite of the extensive amount of available research
on loop invariant inference,
a rich catalog of algorithms
fully annotated with their contracts and invariants is still lacking.
Most "classic" works on formal program construction
list only a limited number of patterns
that link a loop's postcondition to its invariants.
This projects aims at advancing the knowledge of these patterns
by compiling a catalog of diverse algorithms
complete with detailed postconditions and loop invariants.
The algorithms will be selected appropriately,
fully annotated, and implemented in a verification environment
to mechanically check the consistency of the annotations
and their adequacy.
Model-based contracts and EiffelBase2
Supervisor: Nadia Polikarpova
Complete model-based contracts
are an approach to providing full functional specifications
for reusable components.
EiffelBase2
is a data structure library for Eiffel
specified using model-based contracts.
We offer several projects
connected to further development of the approach and the library.
-
Model-based contracts for Java / C# collections
Suitability: bachelor thesis, research in computer science, open-source EiffelStudio lab, or master thesis
Status: not available
Our preliminary studies show that complete contracts combined with automated contract-based testing can help reveal subtle faults in Eiffel classes. In this project you would have to provide contracts, similar to ones found in EiffelBase2, for java.util or .NET collections. Then you would have to perform an experiment to find out if complete contracts can help find subtle faults (real or injected) in Java / C# code as well. -
Efficient model classes
Suitability: bachelor thesis, research in computer science or open-source EiffelStudio lab
Status: available
Specifications in EiffelBase2 rely on model classes from a library called MML (Mathematical Model Library). Model classes are immutable classes that represent mathematical notions such as sets, maps or sequences. Current implementation of MML is naive and inefficient, which makes runtime checking of model-based contracts very slow. Your task in this project would be to provide a more efficient implementation for model classes and justify your choice empirically.
Dynamic contract inference
Supervisor:
Jason (Yi) Wei
Dynamic contract inference is a technique
to generate software specification automatically,
based on observations made during program execution.
We implemented this technique in the
AutoInfer tool;
however there are still many potential improvements to the tool
and many exciting new ideas to try out:
-
Program point inheritance
(Making inferred assertions consistent with behavioral inheritance)
Status: Available
When contracts are inferred independently for different classes in an inheritance hierarchy, contradictions may appear. Consider the following example: let class A define a feature f: INTEGER and class B inherit from A and redefine f. Let the implementation of f in A be such that it always returns 0, and in B - such that it always returns 1. Then the tool will most likely infer postconditions ensure Result = 0 for A.f and ensure then Result = 1 for B.f. Here is the contradiction: the full (flattened) postcondition of B.f in this case cannot be satisfied.
The solution is to consider the return values of both A.f and B.f when inferring the postcondition for A.f. Then we'll probably get something like ensure Result = 0 or Result = 1 for A.f and again ensure then Result = 1 for B.f, which is perfectly ok.
Things are even more delicate with preconditions. For more details you can address the paper "Dynamically discovering likely interface invariants" by Csallner and Smaragdakis, where the idea originally comes from. They consider the JML model of behavioral inheritance, which slightly differs from the one used in Eiffel, but the main idea is the same.
In this project you will be asked to implement the program point inheritance in AutoInfer. -
Variable comparability analysis for Eiffel
Status: Currently not available
One of the reasons for inferred assertions to be uninteresting is that they are comparing unrelated variables. An example is person.shoe_size < person.bank_account_number, which is probably always true for a certain implementation, but is not a meaningful semantic property of a program. For other programming languages there are tools that perform static analysis of the code to find out which variables are comparable and which are not. This information can be then supplied to the contract inference tool to make it infer comparisons only between comparable variables. Comparability analysis is done, e.g., based on whether two variables participate together in assignments and equality/comparison expressions. Your goal in this project is to implement such a comparability analysis tool for Eiffel. -
Inferring conditional assertions
Status: Available
We would like to be able to infer contract clauses of the form P or Q or P implies Q. Doing this requires splitting the set of value samples for a program point into two according to some predicate and then inferring properties separately for the two subsets. The main challenge here is to choose interesting splitting predicates.
In this project you will have to come up with a set of rules for extracting interesting splitting predicates from the source code.