Misplaced Pages

KeY

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

The KeY tool is used in formal verification of Java programs. It accepts specifications written in the Java Modeling Language to Java source files. These are transformed into theorems of dynamic logic and then compared against program semantics that are likewise defined in terms of dynamic logic. KeY is significantly powerful in that it supports both interactive (i.e. by hand) and fully automated correctness proofs. Failed proof attempts can be used for a more efficient debugging or verification-based testing . There have been several extensions to KeY in order to apply it to the verification of C programs or hybrid systems . KeY is jointly developed by Karlsruhe Institute of Technology , Germany; Technische Universität Darmstadt , Germany; and Chalmers University of Technology in Gothenburg, Sweden and is licensed under the GPL .

#838161

47-458: The usual user input to KeY consists of a Java source file with annotations in JML. Both are translated to KeY's internal representation, dynamic logic . From the given specifications, several proof obligations arise which are to be discharged, i.e. a proof has to be found. To this ends, the program is symbolically executed with the resulting changes to program variables stored in so-called updates . Once

94-428: A measure with each step of an algorithm. The measure is taken from the domain of a well-founded relation , such as from the ordinal numbers . If the measure "decreases" according to the relation along every possible step of the algorithm, it must terminate, because there are no infinite descending chains with respect to a well-founded relation. Some types of termination analysis can automatically generate or imply

141-451: A total function. It is closely related to the halting problem , which is to determine whether a given program halts for a given input and which is undecidable . The termination analysis is even more difficult than the Halting problem: the termination analysis in the model of Turing machines as the model of programs implementing computable functions would have the goal of deciding whether

188-483: A concept called updates , which are similar to substitutions but are only applied once the program modality has been eliminated. Syntactically, updates are consist of parallel (side-effect free) assignments written in curly braces in front of a modality. An example of symbolic execution with updates: [ x = 3 ; x = x + 1 ; ] x   = ˙   4 {\displaystyle [x=3;x=x+1;]x\ {\dot {=}}\ 4}

235-416: A copy of the program state at the branch instruction as well as a path constraint. In this example, the path constraint is λ * 2 == 12 for the if branch and λ * 2 != 12 for the else branch. Both paths can be symbolically executed independently. When paths terminate (e.g., as a result of executing fail() or simply exiting), symbolic execution computes a concrete value for λ by solving

282-464: A finite set of methods being able to extract relevant information out of a given program. A method might look at how variables change with respect to some loop condition (possibly showing termination for that loop), other methods might try to transform the program's calculation to some mathematical construct and work on that, possibly getting information about the termination behaviour out of some properties of this mathematical model. But because each method

329-410: A given Turing machine is a total Turing machine , and this problem is at level Π 2 0 {\displaystyle \Pi _{2}^{0}} of the arithmetical hierarchy and thus is strictly more difficult than the Halting problem. Now as the question whether a computable function is total is not semi-decidable , each sound termination analyzer (i.e. an affirmative answer

376-446: A program grows exponentially with an increase in program size and can even be infinite in the case of programs with unbounded loop iterations. Solutions to the path explosion problem generally use either heuristics for path-finding to increase code coverage, reduce execution time by parallelizing independent paths, or by merging similar paths. One example of merging is veritesting , which "employs static symbolic execution to amplify

423-449: A separate location. The problem with treating each array element separately is that a reference such as "A[i]" can only be specified dynamically, when the value for i has a concrete value. Programs interact with their environment by performing system calls , receiving signals, etc. Consistency problems may arise when execution reaches components that are not under control of the symbolic execution tool (e.g., kernel or libraries). Consider

470-591: Is a deductive verification tool for hybrid systems based on a calculus for the differential dynamic logic dL [2] . It extends the KeY tool with computer algebra systems like Mathematica and corresponding algorithms and proof strategies such that it can be used for practical verification of hybrid systems . KeYmaera has been developed at the University of Oldenburg and the Carnegie Mellon University . The name of

517-429: Is a version of a first-order dynamic logic tailored to Java Card programs. As such, it for example allows statements (formulas) like ϕ → [ α ] ψ {\displaystyle \phi \rightarrow [\alpha ]\psi } , which intuitively says that the post-condition ψ {\displaystyle \psi } must hold in all program states reachable by executing

SECTION 10

#1733094253839

564-585: Is logically equivalent to x   = ˙   0 → x   = ˙   0 {\displaystyle x\ {\dot {=}}\ 0\rightarrow x\ {\dot {=}}\ 0} . As this example shows, symbolic execution in dynamic logic is very similar to calculating weakest preconditions . Both [ α ] ψ {\displaystyle [\alpha ]\psi } and w p ( α , ψ ) {\displaystyle wp(\alpha ,\psi )} essentially denote

611-474: Is never given for a non-terminating program) is incomplete , i.e. must fail in determining termination for infinitely many terminating programs, either by running forever or halting with an indefinite answer. A termination proof is a type of mathematical proof that plays a critical role in formal verification because total correctness of an algorithm depends on termination. A simple, general method for constructing termination proofs involves associating

658-535: Is only able to "see" some specific reasons for (non)termination, even through combination of such methods one cannot cover all possible reasons for (non)termination. Recursive functions and loops are equivalent in expression; any expression involving loops can be written using recursion, and vice versa. Thus the termination of recursive expressions is also undecidable in general. Most recursive expressions found in common usage (i.e. not pathological ) can be shown to terminate through various means, usually depending on

705-569: Is possible. There is also a dual modality ⟨ α ⟩ {\displaystyle \langle \alpha \rangle } which includes termination . This dynamic logic can be seen as a special multi-modal logic (with an infinite number of modalities) where for each Java block α {\displaystyle \alpha } there are modalities [ α ] {\displaystyle [\alpha ]} and ⟨ α ⟩ {\displaystyle \langle \alpha \rangle } . At

752-550: Is shown to be constructible from just fundamental first-order axioms (such as equality e   = ˙   e {\displaystyle e\ {\dot {=}}\ e} ). During that, program modalities are eliminated by symbolic execution . For instance, the formula x   = ˙   0 → [ x + + ; ] x   = ˙   1 {\displaystyle x\ {\dot {=}}\ 0\rightarrow [x++;]x\ {\dot {=}}\ 1}

799-501: Is to annotate the types over which we can recurse with size annotations and allow recursive calls only on smaller arguments. Sized types are implemented in Agda as a syntactic extension. There are several research teams that work on new methods that can show (non)termination. Many researchers include these methods into programs that try to analyze the termination behavior automatically (so without human interaction). An ongoing aspect of research

846-405: Is transformed to { x := 3 } [ x = x + 1 ; ] x   = ˙   4 {\displaystyle \{x:=3\}[x=x+1;]x\ {\dot {=}}\ 4} in the first step and to { x := 4 } [ ] x   = ˙   4 {\displaystyle \{x:=4\}[]x\ {\dot {=}}\ 4} in

893-840: The Java Card program α {\displaystyle \alpha } in any state that satisfies the pre-condition ϕ {\displaystyle \phi } . This is equivalent to { ϕ } α { ψ } {\displaystyle \{\phi \}\alpha \{\psi \}} in Hoare calculus if ϕ {\displaystyle \phi } and ψ {\displaystyle \psi } are purely first order. Dynamic logic, however, extends Hoare logic in that formulas may contain nested program modalities such as [ α ] {\displaystyle [\alpha ]} , or that quantification over formulas which contain modalities

940-410: The accumulated path constraints on each path. These concrete values can be thought of as concrete test cases that can, e.g., help developers reproduce bugs. In this example, the constraint solver would determine that in order to reach the fail() statement, λ would need to equal 6. Symbolically executing all feasible program paths does not scale to large programs. The number of feasible paths in

987-536: The bottom and deduction steps go upwards. The proof can be seen in the figure on the right. The Symbolic Execution Debugger visualizes the control flow of a program as a symbolic execution tree that contains all feasible execution paths through the program up to a certain point. It is provided as a plugin to the Eclipse development platform. KeY is usable as a model-based testing tool that can generate unit tests for Java programs. The model from which test data and

SECTION 20

#1733094253839

1034-428: The concept to the analysis of mathematical expressions. Consider the program below, which reads in a value and fails if the input is 6. During a normal execution ("concrete" execution), the program would read a concrete input value (e.g., 5) and assign it to y . Execution would then proceed with the multiplication and the conditional branch, which would evaluate to false and print OK . During symbolic execution,

1081-613: The definition of the expression itself. As an example, the function argument in the recursive expression for the factorial function below will always decrease by 1; by the well-ordering property of natural numbers , the argument will eventually reach 1 and the recursion will terminate. Termination check is very important in dependently typed programming language and theorem proving systems like Coq and Agda . These systems use Curry-Howard isomorphism between programs and proofs. Proofs over inductively defined data types were traditionally described using induction principles. However, it

1128-412: The effect of dynamic symbolic execution". Symbolic execution is used to reason about a program path-by-path which is an advantage over reasoning about a program input-by-input as other testing paradigms use (e.g. dynamic program analysis ). However, if few inputs take the same path through the program, there is little savings over testing each of the inputs separately. Symbolic execution is harder when

1175-536: The entire system state. Symbolic execution tools based on virtual machines solve the environment problem by forking the entire VM state. For example, in S2E each state is an independent VM snapshot that can be executed separately. This approach alleviates the need for writing and maintaining complex models and allows virtually any program binary to be executed symbolically. However, it has higher memory usage overheads (VM snapshots may be large). The concept of symbolic execution

1222-403: The existence of a termination proof. An example of a programming language construct which may or may not terminate is a loop , as they can be run repeatedly. Loops implemented using a counter variable as typically found in data processing algorithms will usually terminate, demonstrated by the pseudocode example below: If the value of SIZE_OF_DATA is non-negative, fixed and finite,

1269-468: The following example: This program opens a file and, based on some condition, writes different kind of data to the file. It then later reads back the written data. In theory, symbolic execution would fork two paths at line 5 and each path from there on would have its own copy of the file. The statement at line 11 would therefore return data that is consistent with the value of "condition" at line 5. In practice, file operations are implemented as system calls in

1316-402: The heart of the KeY system lies a first-order theorem prover based on a sequent calculus . A sequent is of the form Γ ⊢ Δ {\displaystyle \Gamma \vdash \Delta } where Γ {\displaystyle \Gamma } (assumptions) and Δ {\displaystyle \Delta } (propositions) are sets of formulas with

1363-401: The intuitive meaning that ⋀ γ ∈ Γ γ → ⋁ δ ∈ Δ δ {\displaystyle \bigwedge _{\gamma \in \Gamma }\gamma \rightarrow \bigvee _{\delta \in \Delta }\delta } holds true. By means of deduction , an initial sequent representing the proof obligation

1410-490: The kernel, and are outside the control of the symbolic execution tool. The main approaches to address this challenge are: Executing calls to the environment directly. The advantage of this approach is that it is simple to implement. The disadvantage is that the side effects of such calls will clobber all states managed by the symbolic execution engine. In the example above, the instruction at line 11 would return "some datasome other data" or "some other datasomedata" depending on

1457-487: The loop condition to be fulfilled. In termination analysis one may also try to determine the termination behaviour of some program depending on some unknown input. The following example illustrates this problem. Here the loop condition is defined using some value UNKNOWN, where the value of UNKNOWN is not known (e.g. defined by the user's input when the program is executed). Here the termination analysis must take into account all possible values of UNKNOWN and find out that in

KeY - Misplaced Pages Continue

1504-411: The loop will eventually terminate, assuming process_data terminates too. Some loops can be shown to always terminate or never terminate through human inspection. For example, the following loop will, in theory, never stop. However, it may halt when executed on a physical machine due to arithmetic overflow : either leading to an exception or causing the counter to wrap to a negative value and enabling

1551-536: The need for compilation and installation. KeY-Hoare is built on top of KeY and features a Hoare calculus with state updates. State updates are a means of describing state transitions in a Kripke structure . This calculus can be seen as a subset to the one that is used in the main branch of KeY. Due to the simplicity of the Hoare calculus, this implementation is essentially meant to exemplify formal methods in undergraduate classes. KeYmaera [1] (previously called HyKeY)

1598-623: The possible case of UNKNOWN = 0 (as in the original example) the termination cannot be shown. There is, however, no general procedure for determining whether an expression involving looping instructions will halt, even when humans are tasked with the inspection. The theoretical reason for this is the undecidability of the Halting Problem: there cannot exist some algorithm which determines whether any given program stops after finitely many computation steps. In practice one fails to show termination (or non-termination) because every algorithm works with

1645-465: The program has been processed completely, there remains a first-order logic proof obligation. At the heart of the KeY system lies a first-order theorem prover based on sequent calculus , which is used to close the proof. Interference rules are captured in so called taclets which consist of an own simple language to describe changes to a sequent. The theoretical foundation of KeY is a formal logic called Java Card DL. DL stands for Dynamic Logic. It

1692-402: The program reads a symbolic value (e.g., λ ) and assigns it to y . The program would then proceed with the multiplication and assign λ * 2 to z . When reaching the if statement, it would evaluate λ * 2 == 12 . At this point of the program, λ could take any value, and symbolic execution can therefore proceed along both branches, by "forking" two paths. Each path gets assigned

1739-421: The program would. It thus arrives at expressions in terms of those symbols for expressions and variables in the program, and constraints in terms of those symbols for the possible outcomes of each conditional branch. Finally, the possible inputs that trigger a branch can be determined by solving the constraints. The field of symbolic simulation applies the same concept to hardware. Symbolic computation applies

1786-429: The proof with the premise x ≥ 0 ∧ y ≥ 0 {\displaystyle x\geq 0\land y\geq 0} and the to-be-shown conclusion z   = ˙   x ⋅ y {\displaystyle z\ {\dot {=}}\ x\cdot y} . Note that tableaux of sequent calculi are usually written "upside-down", i.e., the starting sequent appears at

1833-400: The same memory location can be accessed through different names ( aliasing ). Aliasing cannot always be recognized statically, so the symbolic execution engine can't recognize that a change to the value of one variable also changes the other. Since an array is a collection of many distinct values, symbolic executors must either treat the entire array as one value or treat each array element as

1880-437: The same thing – with two exceptions: Firstly, w p {\displaystyle wp} is a function of some meta-calculus while [ α ] ψ {\displaystyle [\alpha ]\psi } really is a formula of the given calculus. Secondly, symbolic execution runs through the program forward just as an actual execution would. To save intermediate results of assignments, KeY introduces

1927-425: The second step. The modality then is empty and "backwards application" of the update to the postcondition yields a precondition where x {\displaystyle x} could take any value. Suppose one wants to prove that the following method calculates the product of some non-negative integers x {\displaystyle x} and y {\displaystyle y} . One thus starts

KeY - Misplaced Pages Continue

1974-599: The sequential ordering of the states. Modeling the environment. In this case, the engine instruments the system calls with a model that simulates their effects and that keeps all the side effects in per-state storage. The advantage is that one would get correct results when symbolically executing programs that interact with the environment. The disadvantage is that one needs to implement and maintain many potentially complex models of system calls. Tools such as KLEE, Cloud9, and Otter take this approach by implementing models for file system operations, sockets, IPC , etc. Forking

2021-481: The symbolic execution of Abstract State Machines , that was developed at ETH Zürich . This variant is no longer supported. Symbolic execution In computer science , symbolic execution (also symbolic evaluation or symbex ) is a means of analyzing a program to determine what inputs cause each part of a program to execute . An interpreter follows the program, assuming symbolic values for inputs rather than obtaining actual inputs as normal execution of

2068-532: The test case are derived consists of a formal specification (provided in JML ) and a symbolic execution tree of the implementation under test which is computed by the KeY system. KeY is free software written in Java and licensed under GPL . It can be downloaded from the project website in source; currently there are no pre-compiled binaries available. As another possibility, KeY can be executed directly via Java web start without

2115-464: The tool was chosen as a homophone to Chimera , the hybrid animal from ancient Greek mythology. KeYmaeraX [3] developed at the Carnegie Mellon University is the successor of KeYmaera. It has been completely rewritten. KeY for C is an adaption of the KeY System to MISRA C , a subset of the C programming language . This variant is no longer supported. There is also an adaptation to use KeY for

2162-459: Was found later that describing a program via a recursively defined function with pattern matching is a more natural way of proving than using induction principles directly. Unfortunately, allowing non-terminating definitions leads to logical inconsistency in type theories , which is why Agda and Coq have termination checkers built-in. One of the approaches to termination checking in dependently typed programming languages are sized types. The main idea

2209-548: Was introduced academically in the 1970s with descriptions of: the Select system, the EFFIGY system, the DISSECT system, and Clarke's system. Termination analysis In computer science , termination analysis is program analysis which attempts to determine whether the evaluation of a given program halts for each input. This means to determine whether the input program computes

#838161