In computer science, program synthesis is the task of constructing a program that provably satisfies a given high-level formal specification. In contrast to program verification, the program is to be constructed rather than given; however, both fields make use of formal proof techniques, and both comprise approaches with different degrees of automation. In contrast to automatic programming techniques, specifications in program synthesis are usually non-algorithmic statements in an appropriate logical calculus.
The primary application of program synthesis is to relieve the programmer of the burden of writing correct, efficient code that satisfies a specification. However, program synthesis also has applications to superoptimization and inference of loop invariants. During the Summer Institute of Symbolic Logic at Cornell University in 1957, Alonzo Church defined the problem of synthesizing a circuit from mathematical requirements. Even though
a, if both premises hold, then the conclusion b ∨ c is true. The resolution rule can be generalized to first-order logic: where φ is a most general unifier of L₁ and the complement of L₂, and Γ₁ and Γ₂ have no common variables. This rule can be applied to the clauses P(x) ∨ Q(x) and ¬P(b), with [b/x] as unifier. Here x
a context-free grammar of expressions that constrains the syntax of valid solutions. For example, to synthesize a function f that returns the maximum of two integers, the logical specification might look like this: (f(x, y) = x ∨ f(x, y) = y) ∧ f(x, y) ≥ x ∧ f(x, y) ≥ y and the grammar might be: where "ite" stands for "if-then-else". The expression would be
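As an illustration of how a solver might search such a grammar, here is a minimal enumerative sketch in Python; the hard-coded grammar E ::= x | y | ite(x ≤ y, E, E) and all names are illustrative, not the behavior of any particular synthesizer:

```python
# Enumerative syntax-guided synthesis sketch for the max example.
# Candidate expressions are generated from a tiny hard-coded grammar,
# simplest first, and checked against the specification on sample inputs.

def spec_holds(f, x, y):
    """The logical specification from the text: f returns one of its
    arguments and is greater than or equal to both of them."""
    m = f(x, y)
    return (m == x or m == y) and m >= x and m >= y

def exprs(depth):
    """Yield (name, function) pairs from the grammar up to the given depth."""
    yield ("x", lambda x, y: x)
    yield ("y", lambda x, y: y)
    if depth > 0:
        for n1, f1 in exprs(depth - 1):
            for n2, f2 in exprs(depth - 1):
                yield (f"ite(x<=y, {n1}, {n2})",
                       lambda x, y, f1=f1, f2=f2:
                           f1(x, y) if x <= y else f2(x, y))

def synthesize(samples, max_depth=2):
    for depth in range(max_depth + 1):
        for name, f in exprs(depth):
            if all(spec_holds(f, x, y) for x, y in samples):
                return name
    return None

print(synthesize([(0, 0), (1, 2), (2, 1), (-3, 5)]))
# → ite(x<=y, y, x)
```

Checking candidates only on finitely many samples is what distinguishes this sketch from a sound synthesizer; a full solver would verify the winning candidate against the specification for all inputs.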
a derivation of the empty clause using only resolution, enhanced by factoring. An example of an unsatisfiable clause set for which factoring is needed to derive the empty clause is: Since each clause consists of two literals, so does each possible resolvent. Therefore, by resolution without factoring, the empty clause can never be obtained. Using factoring, it can be obtained, e.g., as follows: Generalizations of
a human can write additional rules. The term superoptimization was first coined by Alexia Massalin in the 1987 paper Superoptimizer: A Look at the Smallest Program. The label "program optimization" has been given to a field that does not aspire to optimize but only to improve. This misnomer forced Massalin to call her system a superoptimizer: an optimizer that actually finds an optimal program. In 1992,
a minimalist, yet Turing-complete, purely functional programming language (consisting of conditionals, recursion, arithmetic, and other operators) is supported. Case studies performed within this framework synthesized algorithms to compute, e.g., division, remainder, square root, term unification, answers to relational database queries, and several sorting algorithms. Proof rules include: Murray has shown these rules to be complete for first-order logic. In 1986, Manna and Waldinger added generalized E-resolution and paramodulation rules to handle equality as well; later, these rules turned out to be incomplete (but nevertheless sound). As
a similar way, line 14 yields line 15 and then line 16 by resolution. Also, the second case, x ≤ M ∧ y ≤ M ∧ y = M in line 13, is handled similarly, eventually yielding line 18. In a last step, both cases (i.e., lines 16 and 18) are joined, using the resolution rule from line 58; to make that rule applicable,
a small p is replaced multiple times with a larger G[true] and/or G[false]. As an example, starting from the user-given assumptions, the Murray rule can be used as follows to infer a contradiction: For
a toy example, a functional program to compute the maximum M of two numbers x and y can be derived as follows. Starting from the requirement description "The maximum is larger than or equal to any given number, and is one of the given numbers", the first-order formula ∀X ∀Y ∃M: X ≤ M ∧ Y ≤ M ∧ (X = M ∨ Y = M)
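As a quick sanity check of this formula, it can be verified by brute force over a small finite domain (a sketch only; the formula itself quantifies over all integers):

```python
# Check ∀X ∀Y ∃M: X ≤ M ∧ Y ≤ M ∧ (X = M ∨ Y = M) over a finite domain.
def spec_holds(m, x, y):
    return x <= m and y <= m and (x == m or y == m)

domain = range(-5, 6)
# For every X and Y in the domain there must exist a witness M.
valid = all(any(spec_holds(m, x, y) for m in domain)
            for x in domain for y in domain)
print(valid)  # → True
```

The witness M found for each pair is of course max(x, y), which is exactly the program the derivation below extracts from a constructive proof of the formula.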
a tree representation also makes it clear how the resolution rule is a special case of the cut rule, restricted to atomic cut-formulas. However, tree representations are not as compact as set or list representations, because they explicitly show redundant subderivations of clauses that are used more than once in the derivation of the empty clause. Graph representations can be as compact in
a user-given first-order specification formula. For that formula, a proof is constructed, thereby also synthesizing a functional program from unifying substitutions. The framework is presented in a table layout, the columns containing: Initially, background knowledge, pre-conditions, and post-conditions are entered into the table. After that, appropriate proof rules are applied manually. The framework has been designed to enhance human readability of intermediate formulas: contrary to classical resolution, it does not require clausal normal form, but allows one to reason with formulas of arbitrary structure and containing any junctors ("non-clausal resolution"). The proof
a valid solution, because it conforms to the grammar and the specification. From 2014 through 2019, the yearly Syntax-Guided Synthesis Competition (or SyGuS-Comp) compared the different algorithms for program synthesis in a competitive event. The competition used a standardized input format, SyGuS-IF, based on SMT-Lib 2. For example, the following SyGuS-IF encodes the problem of synthesizing the maximum of two integers (as presented above): A compliant solver might return
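The encoding itself is not reproduced in this excerpt. Under standard SyGuS-IF conventions, an encoding of the max problem might look roughly like the following sketch; the function name max2 is illustrative, and exact syntax can differ between SyGuS-IF versions:

```scheme
(set-logic LIA)
(synth-fun max2 ((x Int) (y Int)) Int)
(declare-var x Int)
(declare-var y Int)
(constraint (or (= (max2 x y) x) (= (max2 x y) y)))
(constraint (>= (max2 x y) x))
(constraint (>= (max2 x y) y))
(check-synth)
```

A solver's answer would be a define-fun term satisfying all constraints, e.g. something equivalent to (ite (<= x y) y x); the exact shape of the returned expression is solver-dependent.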
is a single valid inference rule that produces a new clause implied by two clauses containing complementary literals. A literal is a propositional variable or the negation of a propositional variable. Two literals are said to be complements if one is the negation of the other (in the following, ¬c is taken to be the complement of c). The resulting clause contains all
is a variable and b is a constant. In first-order logic, resolution condenses the traditional syllogisms of logical inference down to a single rule. To understand how resolution works, consider the following example syllogism of term logic: Or, more generally: To recast the reasoning using the resolution technique, first the clauses must be converted to conjunctive normal form (CNF). In this form, all quantification becomes implicit: universal quantifiers on variables (X, Y, ...) are simply omitted as understood, while existentially quantified variables are replaced by Skolem functions. So
is also possible to use an SMT solver to approach the problem, vastly improving the search efficiency (although inputs more complex than a basic block remain out of reach). In 2001, goal-directed superoptimization was demonstrated in the Denali project by Compaq research. In 2006, answer set declarative programming was applied to superoptimization in the Total Optimisation using Answer Set Technology (TOAST) project at
is complete if augmented by appropriate logical transformation rules. Traugott uses the rule where the exponents of p indicate the polarity of its occurrences. While G[true] and G[false] are built as before,
is complete when true has been derived in the Goals column, or, equivalently, false in the Assertions column. Programs obtained by this approach are guaranteed to satisfy the specification formula they started from; in this sense they are correct by construction. Only
is false. In order for the premise a ∨ b to be true, b must be true. Alternatively, suppose a is true. In order for the premise ¬a ∨ c to be true, c must be true. Therefore, regardless of the falsehood or veracity of
is intended to be simplified using rules like q ∧ true ⟹ q, etc. In order to prevent generating useless trivial resolvents, the rule shall be applied only when p has at least one "negative" and one "positive" occurrence in F and G, respectively. Murray has shown that this rule
is obtained as its formal translation. This formula is to be proved. By reverse Skolemization, the specification in line 10 is obtained, an upper- and a lower-case letter denoting a variable and a Skolem constant, respectively. After applying a transformation rule for the distributive law in line 11, the proof goal is a disjunction, and hence can be split into two cases, viz. lines 12 and 13. Turning to
is possible to encode program synthesis problems in Boolean logic and use algorithms for the Boolean satisfiability problem to automatically find programs. In 2013, a unified framework for program synthesis problems called Syntax-guided Synthesis (stylized SyGuS) was proposed by researchers at UPenn, UC Berkeley, and MIT. The input to a SyGuS algorithm consists of a logical specification along with
is valid"; the step 15→16 established that both cases 16 and 18 are complementary. Since both lines 16 and 18 come with a program term, a conditional expression results in the program column. Since the goal formula true has been derived, the proof is done, and the program column of the "true" line contains
the University of Bath. Superoptimization can be used to automatically generate general-purpose peephole optimizers. Several superoptimizers are available for free download.

Resolution (logic)

In mathematical logic and automated theorem proving, resolution is a rule of inference leading to a refutation-complete theorem-proving technique for sentences in propositional logic and first-order logic. For propositional logic, systematically applying
the validity of a sentence under a set of axioms. This resolution technique uses proof by contradiction and is based on the fact that any sentence in propositional logic can be transformed into an equivalent sentence in conjunctive normal form. The steps are as follows. One instance of this algorithm is the original Davis–Putnam algorithm, which was later refined into the DPLL algorithm, which removed
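The refutation procedure just described can be sketched in a few lines; literals are encoded as signed integers and clauses as frozensets, a common but illustrative representation:

```python
from itertools import combinations

# Propositional resolution prover (sketch).  A literal p is the int +n,
# its negation ¬p is -n; a clause is a frozenset of literals.
def resolvents(c1, c2):
    """Yield every clause obtainable by resolving c1 with c2 on one
    pair of complementary literals."""
    for lit in c1:
        if -lit in c2:
            yield frozenset((c1 - {lit}) | (c2 - {-lit}))

def unsatisfiable(clauses):
    """Saturate the clause set under resolution; deriving the empty
    clause refutes the input, saturation without it means satisfiable."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True          # empty clause derived
                new.add(r)
        if new <= clauses:
            return False                 # saturated, no refutation
        clauses |= new

# (a ∨ b) ∧ (¬a ∨ b) ∧ (a ∨ ¬b) ∧ (¬a ∨ ¬b) is unsatisfiable:
print(unsatisfiable([frozenset({1, 2}), frozenset({-1, 2}),
                     frozenset({1, -2}), frozenset({-1, -2})]))  # → True
```

Representing clauses as sets silently merges duplicate literals in a resolvent, and the saturation loop makes no attempt at the subsumption or ordering refinements that practical provers rely on.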
the 1969 automata-theoretic approach by Büchi and Landweber, and the works by Manna and Waldinger (c. 1980). The development of modern high-level programming languages can also be understood as a form of program synthesis. The early 21st century has seen a surge of practical interest in the idea of program synthesis in the formal verification community and related fields. Armando Solar-Lezama showed that it
the GNU Superoptimizer (GSO) was developed to integrate into the GNU Compiler Collection (GCC). Later work further developed and extended these ideas. Traditionally, superoptimization is performed via exhaustive brute-force search in the space of valid instruction sequences. This is a costly method, and largely impractical for general-purpose compilers. Yet, it has been shown to be useful in optimizing performance-critical inner loops. It
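Such a brute-force search can be sketched as follows; the three-instruction stack machine is purely illustrative, not any real superoptimizer's target language:

```python
from itertools import product

# Toy superoptimizer (sketch): exhaustively search for the shortest
# straight-line program over a tiny stack language that agrees with a
# reference function on all test inputs.
OPS = {
    "dup": lambda s: s + [s[-1]],
    "add": lambda s: s[:-2] + [s[-2] + s[-1]],
    "mul": lambda s: s[:-2] + [s[-2] * s[-1]],
}

def run(program, x):
    stack = [x]
    for op in program:
        if op in ("add", "mul") and len(stack) < 2:
            return None                  # invalid instruction sequence
        stack = OPS[op](stack)
    return stack[-1]

def superoptimize(reference, tests, max_len=4):
    """Return the shortest instruction sequence matching the reference
    on every test input, searching shorter sequences first."""
    for n in range(1, max_len + 1):
        for program in product(OPS, repeat=n):
            if all(run(program, x) == reference(x) for x in tests):
                return program
    return None

# Find a sequence computing x -> 2*x using only dup/add/mul:
print(superoptimize(lambda x: 2 * x, tests=[0, 1, 2, 5, 7]))
# → ('dup', 'add')
```

Because shorter sequences are enumerated first, the first match is length-optimal for this instruction set; real superoptimizers replace the test-input check with a full equivalence check (e.g. via an SMT solver), since agreement on samples does not prove equivalence.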
1215-468: The above resolution rule have been devised that do not require the originating formulas to be in clausal normal form . These techniques are useful mainly in interactive theorem proving where it is important to preserve human readability of intermediate result formulas. Besides, they avoid combinatorial explosion during transformation to clause-form, and sometimes save resolution steps. For propositional logic, Murray and Manna and Waldinger use
the first case, resolving line 12 with the axiom in line 1 leads to instantiation of the program variable M in line 14. Intuitively, the last conjunct of line 12 prescribes the value that M must take in this case. Formally, the non-clausal resolution rule shown in line 57 above is applied to lines 12 and 1, yielding ¬(true ∧ false) ∧ (x ≤ x ∧ y ≤ x ∧ true), which simplifies to x ≤ x ∧ y ≤ x. In
the following output: Counter-example guided inductive synthesis (CEGIS) is an effective approach to building sound program synthesizers. CEGIS involves the interplay of two components: a generator, which generates candidate programs, and a verifier, which checks whether the candidates satisfy the specification. Given a set of inputs I, a set of possible programs P, and a specification S,
the formula F[G[true], ¬G[false]] is obtained by replacing each positive and each negative occurrence of p in F with G[true] and G[false], respectively. Similar to Murray's approach, appropriate simplifying transformations are to be applied to
the given formula. This source of combinatorial explosion was eliminated in 1965 by John Alan Robinson's syntactical unification algorithm, which allowed one to instantiate the formula during the proof "on demand" just as far as needed to keep refutation completeness. The clause produced by a resolution rule is sometimes called a resolvent. The resolution rule in propositional logic
Program synthesis - Misplaced Pages Continue
the goal of program synthesis is to find a program p in P such that for all inputs i in I, S(p, i) holds. CEGIS is parameterized over a generator and a verifier: CEGIS runs the generator and the verifier in a loop, accumulating counter-examples: Implementations of CEGIS typically use SMT solvers as verifiers. CEGIS was inspired by counterexample-guided abstraction refinement (CEGAR). The framework of Manna and Waldinger, published in 1980, starts from
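The CEGIS loop described above can be sketched as follows, with naive brute-force searches standing in for the generator and the SMT-based verifier; the candidate pool and all names are illustrative:

```python
# CEGIS sketch: the generator proposes a candidate consistent with all
# counter-examples seen so far; the verifier searches for an input on
# which the candidate violates the specification.
CANDIDATES = [
    ("x",   lambda x, y: x),
    ("y",   lambda x, y: y),
    ("max", lambda x, y: y if x <= y else x),
]

def spec(f, x, y):
    m = f(x, y)
    return m >= x and m >= y and (m == x or m == y)

def generate(counter_examples):
    for name, f in CANDIDATES:
        if all(spec(f, x, y) for x, y in counter_examples):
            return name, f
    return None

def verify(f, domain=range(-4, 5)):
    for x in domain:
        for y in domain:
            if not spec(f, x, y):
                return (x, y)            # counter-example found
    return None                          # candidate verified

def cegis():
    counter_examples = []
    while True:
        cand = generate(counter_examples)
        if cand is None:
            return None                  # no candidate fits
        name, f = cand
        cex = verify(f)
        if cex is None:
            return name                  # verified solution
        counter_examples.append(cex)

print(cegis())  # prints "max"
```

Each iteration either terminates or strictly shrinks the set of candidates consistent with the accumulated counter-examples, which is what makes the loop progress.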
the literals that do not have complements. Formally: where The above may also be written as: Or schematically as: We have the following terminology: The clause produced by the resolution rule is called the resolvent of the two input clauses. It is the principle of consensus applied to clauses rather than terms. When the two clauses contain more than one pair of complementary literals,
the need for explicit representation of the resolvents. This description of the resolution technique uses a set S as the underlying data structure to represent resolution derivations. Lists, trees, and directed acyclic graphs are other possible and common alternatives. Tree representations are more faithful to the fact that the resolution rule is binary. Together with a sequent notation for clauses,
the number of clauses as list representations, and they also store structural information regarding which clauses were resolved to derive each resolvent.

    a ∨ b,    ¬a ∨ c
    ----------------
         b ∨ c

In plain language: Suppose a
the preparatory step 15→16 was needed. Intuitively, line 18 could be read as "in case x ≤ y, the output y is valid (with respect to the original specification)", while line 15 says "in case y ≤ x, the output x
the program.

Superoptimization

Superoptimization is the process where a compiler automatically finds the optimal sequence for a loop-free sequence of instructions. Real-world compilers generally cannot produce genuinely optimal code, and while most standard compiler optimizations only improve code partly, a superoptimizer's goal is to find the optimal sequence, the canonical form. Superoptimizers can be used to improve conventional optimizers by highlighting missed opportunities so that
the question is, how does the resolution technique derive the last clause from the first two? The rule is simple: To apply this rule to the above example, we find the predicate P occurs in negated form in the first clause, and in non-negated form in the second clause. X is an unbound variable, while a is a bound value (term). Unifying the two produces the substitution Discarding the unified predicates, and applying this substitution to
the remaining predicates (just Q(X), in this case) produces the conclusion: For another example, consider the syllogistic form Or more generally, In CNF, the antecedents become: (Note that the variable in the second clause was renamed to make it clear that variables in different clauses are distinct.) Now, unifying Q(X) in the first clause with ¬Q(Y) in the second clause means that X and Y become
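The unification and substitution steps used in these examples can be sketched as a small implementation; the term encoding and naming convention (uppercase strings as variables) are assumptions of this sketch, and the occurs check is omitted for brevity:

```python
# First-order unification (sketch), as used by the resolution rule.
# Variables are strings starting with an uppercase letter; constants are
# lowercase strings; compound terms are tuples (functor, arg1, ...).
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings in the substitution."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst=None):
    """Return a most general unifier extending subst, or None if the
    terms clash.  (No occurs check, for brevity.)"""
    subst = dict(subst or {})
    stack = [(t1, t2)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        elif isinstance(a, tuple) and isinstance(b, tuple) \
                and a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None                  # functor or constant clash
    return subst

# Unify P(X) with P(b): the most general unifier is [b/X].
print(unify(("P", "X"), ("P", "b")))  # → {'X': 'b'}
```

Applying the returned substitution to the remaining literals of the two clauses, as in the worked example above, yields the resolvent.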
the resolution rule acts as a decision procedure for formula unsatisfiability, solving the (complement of the) Boolean satisfiability problem. For first-order logic, resolution can be used as the basis for a semi-algorithm for the unsatisfiability problem of first-order logic, providing a more practical method than one following from Gödel's completeness theorem. The resolution rule can be traced back to Davis and Putnam (1960); however, their algorithm required trying all ground instances of
the resolution rule can be applied (independently) for each such pair; however, the result is always a tautology. Modus ponens can be seen as a special case of resolution (of a one-literal clause and a two-literal clause). is equivalent to When coupled with a complete search algorithm, the resolution rule yields a sound and complete algorithm for deciding the satisfiability of a propositional formula, and, by extension,
the resolvent. Traugott proved his rule to be complete, provided ∧, ∨, →, ¬ are the only connectives used in formulas. Traugott's resolvent is stronger than Murray's. Moreover, it does not introduce new binary junctors, thus avoiding a tendency towards clausal form in repeated resolution. However, formulas may grow longer when
the rule where p denotes an arbitrary formula, F[p] denotes a formula containing p as a subformula, and F[true] is built by replacing in F[p] every occurrence of p by true; likewise for G. The resolvent F[true] ∨ G[false]
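Reassembled from this description, the rule can be displayed as the inference from the two premises to their resolvent:

```latex
% Murray-style non-clausal resolution, as described in the text:
% from F[p] and G[p], infer F[true] ∨ G[false].
\[
  \frac{F[p] \qquad G[p]}{F[\mathit{true}] \,\lor\, G[\mathit{false}]}
\]
```

The resolvent is then simplified with truth-constant rules such as q ∧ true ⟹ q before further resolution steps.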
the same variable anyway. Substituting this into the remaining clauses and combining them gives the conclusion: The resolution rule, as defined by Robinson, also incorporated factoring, which unifies two literals in the same clause, before or during the application of resolution as defined above. The resulting inference rule is refutation-complete, in that a set of clauses is unsatisfiable if and only if there exists
the work only refers to circuits and not programs, the work is considered one of the earliest descriptions of program synthesis, and some researchers refer to program synthesis as "Church's Problem". In the 1960s, a similar idea for an "automatic programmer" was explored by researchers in artificial intelligence. Since then, various research communities have considered the problem of program synthesis. Notable works include