1 Introduction

Quantifier Elimination (qelim) is used in many automated reasoning tasks including program synthesis [18], exist-forall solving [8, 9], quantified SMT [5], and Model Checking [17]. Complete qelim, even when possible, is computationally expensive, and solvers often approximate it. We call these approximations quantifier reductions, to separate them from qelim. The difference is that quantifier reduction might leave some free variables in the formula.

For example, Z3 [19] performs quantifier reduction, called QeLite, by greedily substituting variables by definitions syntactically appearing in the formulas. While it is very useful, it is necessarily sensitive to the order in which variables are substituted and depends on definitions appearing explicitly in the formula. Even though it may seem that these shortcomings need to be tolerated to keep QeLite fast, in this paper we show that it is not actually the case; we propose an egraph-based algorithm, QEL, to perform fast quantifier reduction that is complete relative to some semantic properties of the formula.

An egraph [20] is a data structure that compactly represents infinitely many terms and their equivalence classes. It was initially proposed as a decision procedure for EUF [20] and used for theorem proving (e.g., Simplify [7]). Since then, the applications of egraphs have grown. Egraphs are now used as term rewrite systems in equality saturation [15, 23], for theory combination in SMT solvers [7, 21], and for term abstract domains in Abstract Interpretation [6, 10, 12].

Using egraphs for rewriting or other formula manipulations (like qelim) requires a special operation, called extract, that converts nodes in the egraph back into terms. Term extraction was not considered when egraphs were first designed [20]. As far as we know, extraction was first studied in the application of egraphs for compiler optimization. Specifically, equality saturation [15, 22] is an optimization technique over egraphs that consists in populating an egraph with many equivalent terms inferred by applying rules. When the egraph is saturated, i.e., applying the rules has no effect, the equivalent term that is most desired, e.g., smallest in size, is extracted. This is a recursive process that extracts each sub-term by choosing one representative among its equivalents.

Applications of egraphs to rewriting have recently resurged, driven by the egg library [24] and the associated workshop (Footnote 1). In [24], the authors show, once again, the power and versatility of this data structure. Motivated by applications of equality saturation, they provide a generic and efficient framework equipped with term extraction, based on an extensible class analysis.

Egraphs seem to be the perfect data-structure to address the challenges of quantifier reduction: they allow reasoning about infinitely many equivalent terms and consider all available variable definitions and orderings at once. However, things are not always what they appear. The key to quantifier reduction is finding ground (i.e., variable-free) representatives for equivalence classes with free variables. This goes against existing techniques for term extraction since it requires selecting larger, rather than smaller, terms to be representatives. Selecting representatives carelessly makes term extraction diverge. To our surprise, this problem has not been studied so far. In fact, egg [24] incorrectly claims that any representative function can be used with its term extraction, while the implementation diverges. In this paper, we bridge this gap by providing necessary and sufficient conditions for a representative function to be admissible for term extraction as defined in [15, 24]. Furthermore, we extend extraction from terms to formulas to enable extracting a formula of the egraph.

Our main contribution is a new quantifier reduction algorithm, called QEL. Building on the term extraction described above, it is formulated as finding a representative function that maximizes the number of ground terms as representatives. Furthermore, it greedily attempts to represent variables without ground representatives in terms of other variables, thus further reducing the number of variables in the output. We show that QEL is complete relative to ground definitions entailed by the formula. Specifically, QEL guarantees to eliminate a variable if it is equivalent to a ground term.

Whenever an application requires eliminating all free variables, incomplete techniques such as QeLite or QEL are insufficient. In this case, qelim is under-approximated using a Model-based Projection (MBP) that uses a model M of a formula to guide under-approximation using equalities and variable definitions that are consistent with M. In this paper, we show that MBP can be implemented using our new techniques for QEL together with the machinery from equality saturation. Just like SMT solvers use egraphs as glue to combine different theory solvers, we use egraphs as glue to combine projection for different theories. In particular, we give an algorithm for MBP in the combined theory of Arrays and Algebraic DataTypes (ADTs). The algorithm uses insights from QEL to produce less under-approximate MBPs.

We implemented QEL and the new MBP using egraphs inside the state-of-the-art SMT solver Z3 [19]. Our implementation (referred to as Z3eg) replaces the existing QeLite and MBP. We evaluate our algorithms in two contexts. First, inside the QSAT [5] algorithm for quantified satisfiability. The performance of QSAT in Z3eg is improved, compared to QSAT in Z3, when ADTs are involved. Second, we evaluate our algorithms inside the Constrained Horn Clause (CHC) solver Spacer [17]. Our experiments show that Spacer in Z3eg solves many more benchmarks containing nested Arrays and ADTs.

Related Work. Quantifier reduction by variable substitution is widely used in quantified SMT [5, 11]. To our knowledge, we are the first to look at this problem semantically and provide an algorithm that guarantees that the variable is eliminated if the formula entails that it has a ground definition.

Term extraction for egraphs comes from equality saturation [15, 22]. The egg Rust library [24] is a recent implementation of equality saturation that supports rewriting and term extraction. However, we did not use egg because we integrated QEL within Z3 and built it using Z3 data structures instead.

Model-based projection was first introduced for the Spacer CHC solver for LIA and LRA [17] and extended to the theory of Arrays [16] and ADTs [5]. Until now, it was implemented by syntactic rewriting. Our egraph-based MBP implementation is less sensitive to syntax and, more importantly, allows for combining MBPs of multiple theories for MBP of the combination. As a result, our MBP is more general and less model dependent. Specifically, it requires fewer model equalities and produces more general under-approximations than [5, 16].

Outline. The rest of the paper is organized as follows. Section 2 provides background. Section 3 introduces term extraction, extends it to formulas, and characterizes representative-based term extraction for egraphs. Section 4 presents QEL, our algorithm for fast quantifier reduction that is relatively complete. Section 5 shows how to compute MBP combining equality saturation and the ideas from Sect. 4 for the theories of ADTs and Arrays. All algorithms have been implemented in Z3 and evaluated in Sect. 6.

2 Background

We assume the reader is familiar with multi-sorted first-order logic (FOL) with equality and the theory of equality with uninterpreted functions (EUF) (for an introduction see, e.g., [4]). We use \(\approx \) to denote the designated logical equality symbol. For simplicity of presentation, we assume that the FOL signature \(\varSigma \) contains only functions (i.e., no predicates) and constants (i.e., 0-ary functions). To represent predicates, we assume the FOL signature has a designated sort \(\textsf{Bool}\), and two \(\textsf{Bool}\) constants \(\top \) and \(\bot \), representing true and false, respectively. We then use \(\textsf{Bool}\)-valued functions to represent predicates, using \(P(a) \approx \top \) and \(P(a) \approx \bot \) to mean that P(a) is true or false, respectively. Informally, we continue to write P(a) and \(\lnot P(a)\) as syntactic sugar for \(P(a) \approx \top \) and \(P(a) \approx \bot \), respectively. We use lowercase letters like a, b for constants, and f, g for functions, and uppercase letters like P, Q for \(\textsf{Bool}\) functions that represent predicates. We denote by \(\psi ^{\exists }\) the existential closure of \(\psi \).

Quantifier Elimination (qelim). Given a quantifier-free (QF) formula \(\varphi \) with free variables \(\boldsymbol{v}\), quantifier elimination of \(\varphi ^\exists \) is the problem of finding a QF formula \(\psi \) with no free variables such that \(\psi \equiv \varphi ^\exists \). For example, a qelim of \(\exists a\cdot (a \approx x \wedge f(a) > 3)\) is \(f(x) > 3\); and, there is no qelim of \(\exists x \cdot (f(x) > 3)\), because it is impossible to restrict f to have “at least one value in its range that is greater than 3” without a quantifier.

Model Based Projection (MBP). Let \(\varphi \) be a formula with free variables \(\boldsymbol{v} \), and M a model of \(\varphi \). A model-based projection of \(\varphi \) relative to M is a QF formula \(\psi \) such that \(\psi \Rightarrow \varphi ^\exists \) and \(M \models \psi \). That is, \(\psi \) has no free variables, is an under-approximation of \(\varphi ^\exists \), and satisfies the designated model M, just like \(\varphi \). MBP is used by many algorithms to under-approximate qelim, when the computation of qelim is too expensive or, for some reason, undesirable.

Egraphs. An egraph is a well-known data structure to compactly represent a set of terms and an equivalence relation on those terms [20]. Throughout the paper, we assume that graphs have an ordered successor relation and use n[i] to denote the ith successor (child) of a node n. An out-degree of a node n, \(\texttt{deg}(n)\), is the number of edges leaving n. Given a node n, \(\texttt {parents}(n)\) denotes the set of nodes with an outgoing edge to n and \(\texttt{children}(n)\) denotes the set of nodes with an incoming edge from n.

Definition 1

Let \(\varSigma \) be a first-order logic signature. An egraph is a tuple \(G = \langle N , E , L, \texttt{root} \rangle \), where

  (a) \(\langle N , E \rangle \) is a directed acyclic graph,

  (b) \(L\) maps nodes to function symbols in \(\varSigma \) or logical variables, and

  (c) \(\texttt{root}: N \mapsto N \) maps a node to its root such that the relation \(\rho _{\texttt{root}} \triangleq \{ (n, n') \mid \texttt{root} (n) = \texttt{root} (n') \}\) is an equivalence relation on N that is closed under congruence: \((n,n') \in \rho _{\texttt{root}}\) whenever n and \(n'\) are congruent under \(\texttt{root} \), i.e., whenever \(L (n) = L (n')\), \(\texttt{deg}(n) = \texttt{deg}(n') > 0\), and, \(\forall 1 \le i \le \texttt{deg}(n) \cdot (n[i],n'[i]) \in \rho _{\texttt{root}}\).

Given an egraph G, the class of a node \(n \in G\), \( class (n) \triangleq \rho _{\texttt{root}}(n)\), is the set of all nodes that are equivalent to n. The term of n, \( term (n)\), with \(L (n) = f\) is f if \(\texttt{deg}(n) = 0\) and \(f( term (n[1]),\ldots , term (n[\texttt{deg}(n)]))\), otherwise. We assume that the terms of different nodes are different, and refer to a node n by its term.
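Definition 1 and the notions of class and term can be illustrated with a minimal Python sketch. The class name `EGraph`, its field names, and the dictionary-based encoding are assumptions made for this example only. The node ids follow the numbering the text uses for Fig. 1, restricted to the nodes for z, read(a, x), read(a, y), a, x, and y.

```python
from dataclasses import dataclass


@dataclass
class EGraph:
    label: dict     # L: node id -> function symbol or variable name
    children: dict  # E: node id -> ordered tuple of child ids
    root: dict      # root: node id -> id of the root of its class

    def deg(self, n):
        return len(self.children[n])

    def class_of(self, n):
        """class(n): all nodes that share a root with n."""
        return {m for m in self.label if self.root[m] == self.root[n]}

    def term(self, n):
        """term(n): built from the actual children of n."""
        if self.deg(n) == 0:
            return self.label[n]
        args = ", ".join(self.term(c) for c in self.children[n])
        return f"{self.label[n]}({args})"

    def congruent(self, n, m):
        """Same label, same positive out-degree, pairwise equivalent children."""
        return (self.label[n] == self.label[m]
                and self.deg(n) == self.deg(m) > 0
                and all(self.root[a] == self.root[b]
                        for a, b in zip(self.children[n], self.children[m])))


# Fragment of an egraph: z, read(a, x), read(a, y), a, x, y (hypothetical encoding).
g = EGraph(
    label={3: "z", 4: "read", 5: "read", 7: "a", 8: "x", 9: "y"},
    children={3: (), 4: (7, 8), 5: (7, 9), 7: (), 8: (), 9: ()},
    root={3: 3, 4: 3, 5: 3, 7: 7, 8: 8, 9: 8},
)
print(g.term(5))          # read(a, y)
print(g.congruent(4, 5))  # True, so closure under congruence forces root(4) == root(5)
```

Because x and y share a root, the two read nodes are congruent, which is why condition (c) requires them to be in the same class.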

Fig. 1. Example egraph of \(\varphi _{1}\).

An example of an egraph \(G = \langle N , E , L, \texttt{root} \rangle \) is shown in Fig. 1. A symbol f inside a circle depicts a node n with label \(L (n) = f\), solid black and dashed red arrows depict \( E \) and \(\texttt{root} \), respectively. The order of the black arrows from left to right defines the order of the children. In our examples, we refer to a specific node i by its number using \(\textsf{N}(i)\) or its term, e.g., \(\textsf{N}(k + 1)\). A node n without an outgoing red arrow is its own root. A set of nodes connected to the same node with red edges forms an equivalence class. In this example, \(\texttt{root} \) defines the equivalence classes \(\{\textsf{N}(3), \textsf{N}(4), \textsf{N}(5), \textsf{N}(6)\}\), \(\{\textsf{N}(8), \textsf{N}(9)\}\), and a class for each of the remaining nodes. Examples of some terms in G are \( term (\textsf{N}(9)) = y\) and \( term (\textsf{N}(5)) = read (a,y)\).

An Egraph of a Formula. We consider formulas that are conjunctions of equality literals (recall that we represent predicate applications by equality literals). Given a formula \(\varphi ~\triangleq ~(t_1 \approx u_1 \wedge \cdots \wedge t_k \approx u_k)\), an egraph from \(\varphi \) is built (following the standard procedure [20]) by creating nodes for each \(t_i\) and \(u_i\), recursively creating nodes for their subexpressions, and merging the classes of each pair \(t_i\) and \(u_i\), computing the congruence closure for \(\texttt{root} \). We write \( egraph (\varphi )\) for an egraph of \(\varphi \) constructed via some deterministic procedure based on the recipe above. Figure 1 shows an \( egraph (\varphi _1)\) of \(\varphi _{1}\). The equality \(z \approx read (a,x)\) is captured by \(\textsf{N}(3)\) and \(\textsf{N}(4)\) belonging to the same class (i.e., red arrow from \(\textsf{N}(4)\) to \(\textsf{N}(3)\)). Similarly, the equality \(x \approx y\) is captured by a red arrow from \(\textsf{N}(9)\) to \(\textsf{N}(8)\). Note that by congruence, \(\varphi _{1}\) implies \( read (a,x) \approx read (a,y)\), which, by transitivity, implies that \(k + 1 \approx read (a,x)\). In Fig. 1, this corresponds to red arrows from \(\textsf{N}(5)\) and \(\textsf{N}(6)\) to \(\textsf{N}(3)\). The predicate application \(3 > z\) is captured by the red arrow from \(\textsf{N}(1)\) to \(\textsf{N}(0)\). From now on, we omit \(\top \) and \(\bot \) and the corresponding edges from figures to avoid clutter.
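The construction recipe above can be sketched in Python as follows. This is a deliberately naive illustration, not the procedure of [20] nor the one used in Z3: the names `Builder` and `build_egraph` are hypothetical, terms are encoded as strings (leaves) or tuples `(f, arg, ...)`, congruence closure is computed by a simple fixpoint loop, and \(\varphi _1\) is written out as it can be read off from the description of Fig. 1.

```python
class Builder:
    def __init__(self):
        self.nodes = []   # nodes[i] = (label, tuple of child ids)
        self.ids = {}     # hash-consing table: (label, child ids) -> node id
        self.parent = []  # union-find parent pointers, playing the role of root

    def add(self, t):
        """Add term t (a str leaf or a tuple (f, arg, ...)) with its subterms."""
        if isinstance(t, str):
            key = (t, ())
        else:
            key = (t[0], tuple(self.add(a) for a in t[1:]))
        if key not in self.ids:
            self.ids[key] = len(self.nodes)
            self.nodes.append(key)
            self.parent.append(self.ids[key])
        return self.ids[key]

    def find(self, n):
        while self.parent[n] != n:
            n = self.parent[n]
        return n

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def close(self):
        """Merge congruent nodes until no more merges are possible."""
        changed = True
        while changed:
            changed = False
            seen = {}  # (label, roots of children) -> first node with that signature
            for n, (lab, kids) in enumerate(self.nodes):
                if not kids:
                    continue
                sig = (lab, tuple(self.find(k) for k in kids))
                m = seen.setdefault(sig, n)
                if self.find(m) != self.find(n):
                    self.merge(m, n)
                    changed = True


def build_egraph(equalities):
    b = Builder()
    for t, u in equalities:
        b.merge(b.add(t), b.add(u))
    b.close()
    return b


# phi_1 as inferred from the text; "true" stands for the Bool constant top.
PHI1 = [("z", ("read", "a", "x")),
        (("+", "k", "1"), ("read", "a", "y")),
        ("x", "y"),
        ((">", "3", "z"), "true")]
b = build_egraph(PHI1)
print(b.find(b.add("z")) == b.find(b.add(("+", "k", "1"))))  # True
```

The last line checks the implicit equality discussed above: z and k + 1 end up in the same class only because congruence merges read(a, x) with read(a, y).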

Fig. 2. Different egraph interpretations for \(\varphi _{2}\).

Explicit and Implicit Equality. Note that egraphs represent equality implicitly by placing nodes with equal terms in the same equivalence class. Sometimes, it is necessary to represent equality explicitly, for example, when using egraphs for equality-aware rewriting (e.g., in egg [24]). To represent equality explicitly, we introduce a binary \(\textsf{Bool}\) function \( eq \) and write \( eq (a, b)\) for an equality that has to be represented explicitly. We change the \( egraph \) algorithm to treat \( eq (a, b)\) as both a function application, and as a logical equality \(a \approx b\): when processing term \( eq (a, b)\), the algorithm both adds \( eq (a, b)\) to the egraph, and merges the nodes for a and b into one class. For example, Fig. 2 shows three different interpretations of a formula \(\varphi _{2}\) with equality interpreted: implicitly (as in [20]), explicitly (as in [24]), and both implicitly and explicitly (as in this paper).

3 Extracting Formulas from Egraphs

Egraphs were proposed as a decision procedure for EUF [20] – a setting in which converting an egraph back to a formula, or extracting, is irrelevant. Term extraction has been studied in the context of equality saturation and term rewriting [15, 24]. However, existing literature presents extraction as a heuristic, and, to the best of our knowledge, has not been exhaustively explored. In this section, we fill these gaps in the literature and extend extraction from terms to formulas.

Term Extraction. We begin by recalling how to extract the term of a node. The function \(\texttt{ntt} \) (node-to-term) in Fig. 3 does an extraction parametrized by a representative function \(\texttt{repr}: N \mapsto N\) (same as in [24]). A function \(\texttt{repr}\) assigns each class a unique representative node (i.e., nodes in the same class are mapped to the same representative) so that \(\rho _{\texttt{root}} = \rho _{\texttt{repr}}\). The function \(\texttt{ntt} \) extracts a term of a node recursively, similarly to \( term \), except that the representatives of the children of a node are used instead of the actual children. We refer to terms built in this way by \(\texttt{ntt} (n, \texttt{repr}) \) and omit \(\texttt{repr}\) when it is clear from the context.

As an example, consider \(\texttt{repr} _1 \triangleq \{\textsf{N}(3), \textsf{N}(8)\}\) for Fig. 1. For readability, we denote representative functions by sets of nodes that are the class representatives, omitting \(\textsf{N}(\top )\) that always represents its class, and omitting all singleton classes. Thus, \(\texttt{repr} _1\) maps all nodes in \( class (\textsf{N}(3))\) to \(\textsf{N}(3)\), nodes in \( class (\textsf{N}(8))\) to \(\textsf{N}(8)\), nodes in \( class (\textsf{N}(\top ))\) to \(\textsf{N}(\top )\), and all singleton classes to themselves. For example, \(\texttt{ntt} (\textsf{N}(5))\) extracts \( read (a,x)\), since \(\textsf{N}(9)\) has as representative \(\textsf{N}(8)\).
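The extraction just described can be sketched in a few lines of Python. The encoding below is an assumption made for illustration: dictionaries stand in for the egraph of Fig. 1, the helper `repr_from` is hypothetical, and the ids used for the nodes of 3, k, and 1 are guesses, since the text does not state them. The recursion itself follows the description of \(\texttt{ntt}\): children are replaced by their representatives before recursing.

```python
# Egraph of Fig. 1 as described in the text; ids 2, 10, 11 are assumed.
LABEL = {0: "true", 1: ">", 2: "3", 3: "z", 4: "read", 5: "read",
         6: "+", 7: "a", 8: "x", 9: "y", 10: "k", 11: "1"}
CHILDREN = {1: (2, 3), 4: (7, 8), 5: (7, 9), 6: (10, 11)}
ROOT = {0: 0, 1: 0, 2: 2, 3: 3, 4: 3, 5: 3, 6: 3,
        7: 7, 8: 8, 9: 8, 10: 10, 11: 11}


def repr_from(reps, root):
    """Expand a set of chosen representatives into a total map on nodes.

    Classes without a listed representative fall back to their root, which
    covers singleton classes and the class of the node for true.
    """
    by_root = {root[r]: r for r in reps}
    return {n: by_root.get(root[n], root[n]) for n in root}


def ntt(n, repr_):
    """Node-to-term: like term(n), but over representatives of the children."""
    kids = CHILDREN.get(n, ())
    if not kids:
        return LABEL[n]
    args = ", ".join(ntt(repr_[c], repr_) for c in kids)
    return f"{LABEL[n]}({args})"


repr_1 = repr_from({3, 8}, ROOT)
print(ntt(5, repr_1))  # read(a, x)
print(ntt(1, repr_1))  # >(3, z)
```

Note that this sketch recurses without any guard, so it only terminates for representative functions that do not send the recursion back to a node already being extracted.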

Fig. 3. Producing formulas from an egraph.

Formula Extraction. Let \(G = egraph (\varphi )\) be an egraph of some formula \(\varphi \). A formula \(\psi \) is a formula of G, written \( isFormula (G,\psi )\), if \(\psi ^\exists \equiv \varphi ^\exists \).

Figure 3 shows an algorithm \(\mathtt {to\_formula} (\texttt{repr}, S)\) to compute a formula \(\psi \) that satisfies \( isFormula (G, \psi )\) for a given egraph G. In addition to \(\texttt{repr} \), \(\mathtt {to\_formula} \) is parameterized by a set of nodes \(S \subseteq N \) to exclude (Footnote 2). To produce the equalities corresponding to the classes, for each representative r, for each \(n \in ( class (r) \setminus \{r\})\) the output formula has a literal \(\texttt{ntt} (r) \approx \texttt{ntt} (n) \). For example, using \(\texttt{repr} _1\) for the egraph in Fig. 1, we obtain for \( class (\textsf{N}(8))\), \((x \approx y)\); for \( class (\textsf{N}(3))\), \((z \approx read (a,x) \wedge z \approx read (a,x) \wedge z \approx k + 1)\); and for \( class (\textsf{N}(0))\), \((\top \approx 3 > z)\). The final result (slightly simplified) is: \( x \approx y \wedge z \approx read (a,x) \wedge z \approx k + 1 \wedge 3 > z\).
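A possible rendering of this step in Python is sketched below. It is not the algorithm of Fig. 3 itself: the dictionary encoding of Fig. 1, the ids of the nodes for 3, k, and 1, the treatment of the excluded set (excluded nodes simply contribute no literal), and the removal of duplicate literals are all assumptions made for this illustration.

```python
# Egraph of Fig. 1 as described in the text; ids 2, 10, 11 are assumed.
LABEL = {0: "true", 1: ">", 2: "3", 3: "z", 4: "read", 5: "read",
         6: "+", 7: "a", 8: "x", 9: "y", 10: "k", 11: "1"}
CHILDREN = {1: (2, 3), 4: (7, 8), 5: (7, 9), 6: (10, 11)}
ROOT = {0: 0, 1: 0, 2: 2, 3: 3, 4: 3, 5: 3, 6: 3,
        7: 7, 8: 8, 9: 8, 10: 10, 11: 11}


def repr_from(reps, root):
    """Total representative map; unlisted classes are represented by their root."""
    by_root = {root[r]: r for r in reps}
    return {n: by_root.get(root[n], root[n]) for n in root}


def ntt(n, repr_):
    kids = CHILDREN.get(n, ())
    if not kids:
        return LABEL[n]
    return f"{LABEL[n]}({', '.join(ntt(repr_[c], repr_) for c in kids)})"


def to_formula(repr_, exclude):
    """One literal ntt(r) = ntt(n) per non-representative node n outside exclude."""
    lits = []
    for n in sorted(LABEL):
        r = repr_[n]
        if n == r or n in exclude:
            continue
        lit = f"{ntt(r, repr_)} = {ntt(n, repr_)}"
        if lit not in lits:  # drop syntactic duplicates
            lits.append(lit)
    return lits


print(to_formula(repr_from({3, 8}, ROOT), set()))
# ['true = >(3, z)', 'z = read(a, x)', 'z = +(k, 1)', 'x = y']
```

Calling the same function with the representatives \(\{\textsf{N}(6), \textsf{N}(8)\}\) and the excluded nodes \(\{\textsf{N}(3), \textsf{N}(5), \textsf{N}(9)\}\) yields only the two literals `true = >(3, +(k, 1))` and `+(k, 1) = read(a, x)`, which is the shape of output that Sect. 4 aims for.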

Let \(G = egraph (\varphi )\) for some formula \(\varphi \). Note that, \(\psi \) computed by \(\mathtt {to\_formula} \) is not syntactically the same as \(\varphi \). That is, \(\mathtt {to\_formula} \) is not an inverse of \( egraph \). Furthermore, since \(\mathtt {to\_formula} \) commits to one representative per class, it is limited in what formulas it can generate. For example, since \(x \approx y\) is in \(\varphi _{1}\), for any \(\texttt{repr} \), \(\varphi _{1}\) cannot be the result of \(\mathtt {to\_formula} \), because the output can contain only one of \( read (a,x)\) or \( read (a,y)\).

Representative Functions. The representative function is instrumental for determining the terms that appear in the extracted formula. To illustrate the importance of representative choice, consider the formula \(\varphi _{4}\) of Fig. 4 and its egraph \(G_4 = egraph (\varphi _{4})\). For now, ignore the blue dotted lines. For \(\texttt{repr} _{4a}\), \(\mathtt {to\_formula}\) obtains \(\psi _a \triangleq (x \approx g(6)\wedge f(x) \approx 6 \wedge y \approx 6)\). For \(\texttt{repr} _{4b}\), \(\mathtt {to\_formula}\) produces \(\psi _b \triangleq (g(6) \approx x \wedge f(g(6)) \approx 6 \wedge y \approx 6)\). In some applications (like qelim considered in this paper) \(\psi _b\) is preferred to \(\psi _a\): simply removing the literals \(g(6) \approx x\) and \(y \approx 6\) from \(\psi _b\) results in a formula equivalent to \(\exists x, y \cdot \varphi _{4}\) that does not contain variables. Now consider a third representative choice, \(\texttt{repr} _{4c}\). For node \(\textsf{N}(1)\), \(\texttt{ntt} \) does not terminate: to produce a term for \(\textsf{N}(1)\), a term for \(\textsf{N}(3)\), the representative of its child \(\textsf{N}(2)\), is required. Similarly, to produce a term for \(\textsf{N}(3)\), a term for \(\textsf{N}(1)\), the representative of its child \(\textsf{N}(5)\), is necessary. Thus, none of the terms can be extracted with \(\texttt{repr} _{4c}\).

For extraction, representative functions \(\texttt{repr} \) are provided either explicitly or implicitly (as in [24]), the latter by associating a cost to nodes and/or terms and letting the representative be a node with minimal cost. However, observe that not all costs guarantee that the chosen \(\texttt{repr} \) can be used, since the computation may not terminate. For example, the ill-defined \(\texttt{repr} _{4c}\) from above is a representative function that is induced by the cost function that assigns function applications cost 0 and variables and constants cost 1. A commonly used cost function is term AST size, which is sufficient to ensure termination of \(\texttt{ntt} (n,\texttt{repr})\).
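The effect of the cost function can be reproduced with a small Python sketch. The egraph below is reconstructed from the prose about Fig. 4 (nodes g(y), y, f(x), 6, and x, numbered 1 to 5, with classes {1, 5} and {2, 3, 4}); the function names, the tie-breaking by node id, and the explicit divergence check are assumptions of this example and are not taken from egg or from the paper.

```python
# Egraph of phi_4 as reconstructed from the text (an assumption of this sketch).
LABEL = {1: "g", 2: "y", 3: "f", 4: "6", 5: "x"}
CHILDREN = {1: (2,), 2: (), 3: (5,), 4: (), 5: ()}
CLASSES = [{1, 5}, {2, 3, 4}]


def ast_size(n):
    """Size of term(n), counting one unit per node."""
    return 1 + sum(ast_size(c) for c in CHILDREN[n])


def apps_are_free(n):
    """The problematic cost: applications cost 0, variables and constants 1."""
    return 0 if CHILDREN[n] else 1


def pick(cost):
    """Cheapest node of each class becomes its representative (ties: lowest id)."""
    repr_ = {}
    for cls in CLASSES:
        best = min(cls, key=lambda n: (cost(n), n))
        for n in cls:
            repr_[n] = best
    return repr_


def ntt(n, repr_, visiting=()):
    """Node-to-term that reports divergence instead of looping forever."""
    if n in visiting:
        raise ValueError(f"extraction diverges at N({n})")
    args = [ntt(repr_[c], repr_, visiting + (n,)) for c in CHILDREN[n]]
    return LABEL[n] + (f"({', '.join(args)})" if args else "")


print(ntt(1, pick(ast_size)))  # g(y)
try:
    ntt(1, pick(apps_are_free))
except ValueError as err:
    print(err)  # extraction diverges at N(1)
```

With the second cost, the chosen representatives are exactly those of \(\texttt{repr} _{4c}\), and extraction of \(\textsf{N}(1)\) comes back to \(\textsf{N}(1)\) through \(\textsf{N}(3)\).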

We are thus interested in characterizing representative functions motivated by two observations: not every cost function guarantees that \(\texttt{ntt} (n) \) terminates; and the kind of representative choices that are most suitable for qelim (\(\texttt{repr} _{4b}\)) cannot be expressed over term AST size.

Fig. 4. Egraphs of \(\varphi _{4}\) with \(G_\texttt{repr} \) (Color figure online).

Definition 2

Given an egraph \(G = \langle N , E , L, \texttt{root} \rangle \), a representative function \(\texttt{repr}: N \rightarrow N\) is admissible for G if

  (a) \(\texttt{repr}\) assigns a unique representative per class,

  (b) \(\rho _\texttt{root} = \rho _\texttt{repr} \), and

  (c) the graph \(G_\texttt{repr} \) is acyclic, where \(G_\texttt{repr} = \langle N , E_{\texttt{repr}} \rangle \) and \( E_{\texttt{repr}} \triangleq \{ (n, \texttt{repr} (c)) \mid c \in \texttt {children}(n), n \in N \}\).

Dotted blue edges in the graphs of Fig. 4 show the corresponding \(G_\texttt{repr} \). Intuitively, for each node n, all reachable nodes in \(G_{\texttt{repr}}\) are the nodes whose \(\texttt{ntt}\) term is necessary to produce the \(\texttt{ntt} (n) \). Observe that \(G_{\texttt{repr} _{4c}}\) has a cycle, thus, \(\texttt{repr} _{4c}\) is not admissible.
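Definition 2 translates directly into a check that can be run on a candidate representative function. The Python sketch below is illustrative only: the function name `is_admissible`, the encoding of the egraph of \(\varphi _4\) (reconstructed from the prose, with nodes g(y), y, f(x), 6, x numbered 1 to 5), and the depth-first cycle detection are assumptions of this example.

```python
# Egraph of phi_4 as reconstructed from the text (an assumption of this sketch).
CHILDREN = {1: (2,), 2: (), 3: (5,), 4: (), 5: ()}
CLASSES = [{1, 5}, {2, 3, 4}]


def is_admissible(repr_, children, classes):
    # (a), (b): repr is constant on each class and picks a member of that class.
    for cls in classes:
        reps = {repr_[n] for n in cls}
        if len(reps) != 1 or not reps <= cls:
            return False
    # (c): G_repr has an edge n -> repr(c) for every child c of n.
    succ = {n: {repr_[c] for c in children[n]} for n in children}
    state = {}  # "active" while a node is on the DFS stack, "done" afterwards

    def has_cycle(n):
        if state.get(n) == "done":
            return False
        if state.get(n) == "active":
            return True
        state[n] = "active"
        if any(has_cycle(m) for m in succ[n]):
            return True
        state[n] = "done"
        return False

    return not any(has_cycle(n) for n in succ)


repr_4a = {1: 5, 5: 5, 2: 4, 3: 4, 4: 4}  # representatives x and 6
repr_4b = {1: 1, 5: 1, 2: 4, 3: 4, 4: 4}  # representatives g(y) and 6
repr_4c = {1: 1, 5: 1, 2: 3, 3: 3, 4: 3}  # representatives g(y) and f(x)
print(is_admissible(repr_4a, CHILDREN, CLASSES))  # True
print(is_admissible(repr_4b, CHILDREN, CLASSES))  # True
print(is_admissible(repr_4c, CHILDREN, CLASSES))  # False
```

For \(\texttt{repr} _{4c}\) the check fails on condition (c): the edges of \(G_\texttt{repr} \) go from \(\textsf{N}(1)\) to \(\textsf{N}(3)\) and back.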

Theorem 1

Given an egraph G and a representative function \(\texttt{repr}\), the function \(G.\mathtt {to\_formula} (\texttt{repr},\emptyset )\) terminates with result \(\psi \) such that \( isFormula (G,\psi )\) iff \(\texttt{repr} \) is admissible for G.

To the best of our knowledge, Theorem 1 is the first complete characterization of all terms of a node that can be obtained by extraction based on class representatives (obtained by describing all admissible \(\texttt{repr}\); note that their number is finite). This result contradicts [24], where it is claimed to be possible to extract a term of a node for any cost function. The counterexample is \(\texttt{repr} _{4c}\). Importantly, this characterization allows us to explore representative functions outside those in the existing literature, which, as we show in the next section, is key for qelim.

4 Quantifier Reduction

Quantifier reduction is a relaxation of quantifier elimination: given two formulas \(\varphi \) and \(\psi \) with free variables \(\boldsymbol{v}\) and \(\boldsymbol{u}\), respectively, \(\psi \) is a quantifier reduction of \(\varphi \) if \(\boldsymbol{u} \subseteq \boldsymbol{v}\) and \(\varphi ^{\exists } \equiv \psi ^{\exists }\). If \(\boldsymbol{u} \) is empty, then \(\psi \) is a quantifier elimination of \(\varphi ^\exists \). Note that quantifier reduction is possible even when quantifier elimination is not (e.g., for EUF). We are interested in an efficient quantifier reduction algorithm (that can be used as pre-processing for qelim), even if a complete qelim is possible (e.g., for LIA). In this section, we present such an algorithm called QEL.

Intuitively, QEL is based on the well-known substitution rule: \((\exists x \cdot x \approx t \wedge \varphi ) \equiv \varphi [x \mapsto t]\). A naive implementation of this rule, called QeLite in Z3, looks for syntactic definitions of the form \(x \approx t\) for a variable x and an x-free term t and substitutes x with t. While efficient, QeLite is limited because of: (a) dependence on syntactic equality in the formula (specifically, it misses implicit equalities due to transitivity and congruence); (b) sensitivity to the order in which variables are eliminated (eliminating one variable may affect available syntactic equalities for another); and (c) difficulty in dealing with circular equalities such as \(x \approx f(x)\).

For example, consider the formula \(\varphi _{4}(x, y)\) in Fig. 4. Assume that y is eliminated first using \(y \approx f(x)\), resulting in \(x \approx g(f(x)) \wedge f(x) \approx 6\). Now, x cannot be eliminated since the only equality for x is circular. Alternatively, assume that QeLite somehow noticed that by transitivity, \(\varphi _{4}\) implies \(y \approx 6\), and obtains \((\exists y \cdot \varphi _{4}) \equiv x \approx g(6) \wedge f(x) \approx 6\). This time, \(x \approx g(6)\) can be used to obtain \(f(g(6)) \approx 6\) that is a qelim of \(\varphi _{4}^\exists \). Thus, both the elimination order and implicit equalities are crucial.
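The two scenarios can be replayed with a toy substitution pass written in Python. This is only in the spirit of QeLite and is not Z3's implementation; the function names are hypothetical, terms are encoded as strings or tuples `(f, arg, ...)`, and the literals of \(\varphi _4\) are reconstructed from the prose above, which is an assumption.

```python
def occurs(x, t):
    """Does variable x occur in term t?"""
    return t == x if isinstance(t, str) else any(occurs(x, a) for a in t[1:])


def subst(t, x, s):
    """Replace every occurrence of x in t by s."""
    if isinstance(t, str):
        return s if t == x else t
    return (t[0],) + tuple(subst(a, x, s) for a in t[1:])


def qe_lite(lits, order):
    """Eliminate variables in the given order using syntactic definitions only.

    Returns the remaining literals and the variables that could not be removed.
    """
    lits, left = list(lits), []
    for x in order:
        for i, (l, r) in enumerate(lits):
            if l == x and not occurs(x, r):
                d = r
            elif r == x and not occurs(x, l):
                d = l
            else:
                continue
            rest = lits[:i] + lits[i + 1:]
            lits = [(subst(a, x, d), subst(b, x, d)) for a, b in rest]
            break
        else:
            left.append(x)  # no x-free definition of x is syntactically present
    return lits, left


# phi_4, assumed to be  x = g(y)  and  y = f(x)  and  f(x) = 6.
PHI4 = [("x", ("g", "y")), ("y", ("f", "x")), (("f", "x"), "6")]
print(qe_lite(PHI4, ["y", "x"])[1])  # ['x']
print(qe_lite(PHI4, ["x", "y"])[1])  # ['y']
# Adding the implied equality y = 6 up front lets both variables go.
print(qe_lite([("y", "6")] + PHI4, ["y", "x"])[1])  # []
```

In the last call the remaining literals are two copies of f(g(6)) = 6 up to orientation, matching the qelim result stated above; without the implied equality, neither order succeeds in this sketch.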

In QEL, we address the above issues by using an egraph data structure to concisely capture all implicit equalities and terms. Furthermore, egraphs allow eliminating multiple variables together, ensuring that a variable is eliminated if it is equivalent (explicitly or implicitly) to a ground term in the egraph.

Algorithm 1. QEL (pseudocode figure).

Pseudocode for QEL is shown in Algorithm 1. Given an input formula \(\varphi \), QEL first builds its egraph G (line 1). Then, it finds a representative function \(\texttt{repr} \) that maps variables to equivalent ground terms, as much as possible (line 2). Next, it further reduces the remaining free variables by refining \(\texttt{repr} \) to map each variable x to an equivalent x-free (but not variable-free) term (line 3). At this point, QEL is committed to the variables to eliminate. To produce the output, \(\mathtt {find\_core}\) identifies the subset of the nodes of G, which we call core, that must be considered in the output (line 4). Finally, \(\mathtt {to\_formula}\) converts the core of G to the resulting formula (line 5). We show that the combination of these steps is even stronger than variable substitution.

To illustrate QEL, we apply it on \(\varphi _{1}\) and its egraph G from Fig. 1. The function \(\mathtt {find\_defs}\) returns \(\texttt{repr} = \{\textsf{N}(6),\textsf{N}(8)\}\) (Footnote 3). Node \(\textsf{N}(6)\) is the only node with a ground term in the equivalence class \( class (\textsf{N}(3))\). This corresponds to the definition \(z \approx k + 1\). Node \(\textsf{N}(8)\) is chosen arbitrarily since \( class (\textsf{N}(8))\) has no ground terms. There is no refinement possible, so \(\mathtt {refine\_defs} \) returns \(\texttt{repr} \). The core is \(N \setminus \{\textsf{N}(3),\textsf{N}(5),\textsf{N}(9)\}\). Nodes \(\textsf{N}(3)\) and \(\textsf{N}(9)\) are omitted because they correspond to variables with definitions (under \(\texttt{repr}\)), and \(\textsf{N}(5)\) is omitted because it is congruent to \(\textsf{N}(4)\) so only one of them is needed. Finally, \(\mathtt {to\_formula} \) produces \(k + 1 \approx read (a,x) \wedge 3 > k + 1\). Variables z and y are eliminated.

In the rest of this section we present QEL in detail, along with its key properties.

Finding Ground Definitions. Ground variable definitions are found by selecting a representative function \(\texttt{repr}\) that ensures that the maximum number of terms in the formula are rewritten into ground equivalent ones, which, in turn, means finding a ground definition for all variables that have one.

Computing a representative function \(\texttt{repr}\) that is admissible and ensures finding ground definitions when they exist is not trivial. Naive approaches for identifying ground terms, such as iterating arbitrarily over the classes and selecting a representative based on \( term (n)\), are not enough, because \( term (n)\) may not be in the output formula. It is also not possible to make a choice based on \(\texttt{ntt} (n) \), since, in general, it cannot yet be computed (\(\texttt{repr}\) is not known yet).

Fig. 5. Egraphs of \(\varphi _{5}\) including \(G_\texttt{repr} \) (Color figure online).

Admissibility raises an additional challenge since choosing a node that appears to be a definition (e.g., not a leaf) may cause cycles in \(G_\texttt{repr} \). For example, consider \(\varphi _{5}\) of Fig. 5. Assume that \(\textsf{N}(1)\) and \(\textsf{N}(4)\) are chosen as representatives of their equivalence classes. At this point, \(G_\texttt{repr} \) has two edges: \(\langle \textsf{N}(5),\textsf{N}(4)\rangle \) and \(\langle \textsf{N}(2),\textsf{N}(1)\rangle \), shown by blue dotted lines in Fig. 5a. Next, if either \(\textsf{N}(2)\) or \(\textsf{N}(5)\) is chosen as a representative (the only choices in their class), then \(G_\texttt{repr} \) becomes cyclic (shown in blue in Fig. 5a). Furthermore, backtracking on representative choices needs to be avoided if we are to find a representative function efficiently.

Algorithm 2. \(\mathtt {find\_defs} \) (pseudocode figure).

Algorithm 2 finds a representative function \(\texttt{repr} \) while overcoming these challenges. To ensure that the computed representative function is admissible (without backtracking), Algorithm 2 selects representatives for each class using a “bottom up” approach. Namely, leaves cannot be part of cycles in \(G_\texttt{repr} \) because they have no outgoing edges. Thus, they can always be safely chosen as representatives. Similarly, a node whose children have already been assigned representatives in this way (leaves initially), will also never be part of a cycle in \(G_\texttt{repr} \). Therefore, these nodes are also safe to be chosen as representatives.

This intuition is implemented in \(\mathtt {find\_defs} \) by initializing \(\texttt{repr} \) to be undefined (\(\bigstar \)) for all nodes, and maintaining a workset, \( todo \), containing nodes that, if chosen for the remaining classes (under the current selection), maintain acyclicity of \(G_\texttt{repr} \). The initialization of \( todo \) includes leaves only. The specific choice of leaves ensures that ground definitions are preferred, and we return to it later. After initialization, the function \(\texttt{process} \) extracts an element from \( todo \) and sets it as the representative of its class if the class has not been assigned yet (lines 9 and 10). Once a class representative has been chosen, on lines 11 to 14, the parents of all the nodes in the class such that all the children have been chosen (the condition on line 13) are added to \( todo \).

So far, we discussed how admissibility of \(\texttt{repr} \) is guaranteed. To also ensure that ground definitions are found whenever possible, we observe that a similar bottom up approach identifies terms that can be rewritten into ground ones. This builds on the notion of constructively ground nodes, defined next.

A class c is ground if c contains a constructively ground, or c-ground for short, node n, where a node n is c-ground if either (a) \( term (n)\) is ground, or (b) n is not a leaf and the class \( class (n[i])\) of every child n[i] is ground. Note that nodes labeled by variables are never c-ground.

In the example in Fig. 1, \( class (\textsf{N}(7))\) and \( class (\textsf{N}(8))\) are not ground, because all their nodes represent variables; \( class (\textsf{N}(6))\) is ground because \(\textsf{N}(6)\) is c-ground. Nodes \(\textsf{N}(4)\) and \(\textsf{N}(5)\) are not c-ground because the class of \(\textsf{N}(8)\) (a child of both nodes) is not ground. Interestingly, \(\textsf{N}(1)\) is c-ground, because \( class (\textsf{N}(3)) = class (\textsf{N}(6))\) is ground, even though its term \(3 > z\) is not ground.
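The recursive definition of c-ground nodes can be evaluated by a simple fixpoint iteration, sketched below in Python. The function name `c_ground_nodes` and the dictionary encoding of Fig. 1 are assumptions of this example, as are the ids chosen for the nodes of 3, k, and 1. Case (a) of the definition is covered implicitly: a non-variable leaf has no children, so the test over its children holds vacuously, and any node with a ground term is then reached through case (b).

```python
# Egraph of Fig. 1 as described in the text; ids 2, 10, 11 are assumed.
CHILDREN = {0: (), 1: (2, 3), 2: (), 3: (), 4: (7, 8), 5: (7, 9),
            6: (10, 11), 7: (), 8: (), 9: (), 10: (), 11: ()}
ROOT = {0: 0, 1: 0, 2: 2, 3: 3, 4: 3, 5: 3, 6: 3,
        7: 7, 8: 8, 9: 8, 10: 10, 11: 11}
VARIABLES = {3, 7, 8, 9}  # nodes labeled by z, a, x, y


def c_ground_nodes(children, root, variables):
    """Return the c-ground nodes and the roots of the ground classes."""
    c_ground, ground_roots = set(), set()
    changed = True
    while changed:
        changed = False
        for n, kids in children.items():
            if n in c_ground or n in variables:
                continue  # variables are never c-ground
            if all(root[c] in ground_roots for c in kids):
                c_ground.add(n)
                ground_roots.add(root[n])  # the class of n is now ground
                changed = True
    return c_ground, ground_roots


nodes, classes = c_ground_nodes(CHILDREN, ROOT, VARIABLES)
print(sorted(nodes))    # [0, 1, 2, 6, 10, 11]
print(sorted(classes))  # [0, 2, 3, 10, 11]
```

The output agrees with the discussion above: \(\textsf{N}(1)\) is c-ground although its term mentions z, while \(\textsf{N}(4)\) and \(\textsf{N}(5)\) are not.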

Ground classes and c-ground nodes are of interest because whenever \(\varphi \models term (n) \approx t\) for some node n and ground term t, then \( class (n)\) is ground, i.e., it contains a c-ground node, where c-ground nodes can be found recursively starting from ground leaves. Furthermore, the recursive definition ensures that when the aforementioned c-ground nodes are selected as representatives, the corresponding terms w.r.t. \(\texttt{repr}\) are ground.

As a result, to maximize the ground definitions found, we are interested in finding an admissible representative function \(\texttt{repr} \) that is maximally ground, which means that for every node \(n \in N\), if \( class (n)\) is ground, then \(\texttt{repr} (n)\) is c-ground. That means that c-ground nodes are always chosen if they exist.

Theorem 2

Let \(G= egraph (\varphi )\) be an egraph and \(\texttt{repr} \) an admissible representative function that is maximally ground. For all \(n \in N\), if \(\varphi \models term (n) \approx t\) for some ground term t, then \(\texttt{repr} (n)\) is c-ground and \(\texttt{ntt} (\texttt{repr} (n)) \) is ground.

We note that not every choice of c-ground nodes as representatives results in an admissible representative function. For example, consider the formula \(\varphi _{4}\) of Fig. 4 and its egraph. All nodes except for \(\textsf{N}(5)\) and \(\textsf{N}(2)\) are c-ground. However, a \(\texttt{repr}\) with \(\textsf{N}(3)\) and \(\textsf{N}(1)\) as representatives is not admissible. Intuitively, this is because the “witness” for c-groundness of \(\textsf{N}(1)\) in \( class (\textsf{N}(2))\) is \(\textsf{N}(4)\) and not \(\textsf{N}(3)\). Therefore, it is important to incorporate the selection of c-ground representatives into the bottom up procedure that ensures admissibility of \(\texttt{repr} \).

To promote c-ground nodes over non c-ground in the construction of an admissible representative function, \(\mathtt {find\_defs} \) chooses representatives in two steps. First, only the ground leaves are processed (line 2). This ensures that c-ground representatives are chosen while guaranteeing the absence of cycles. Then, the remaining leaves are added to \( todo \) (line 4). This triggers representative selection of the remaining classes (those that are not ground).

We illustrate \(\mathtt {find\_defs} \) with two examples. For \(\varphi _{4}\) of Fig. 4, there is only one leaf that is ground, \(\textsf{N}(4)\), which is added to \( todo \) on line 2, and \( todo \) is processed. \(\textsf{N}(4)\) is chosen as representative and, as a consequence, its parent \(\textsf{N}(1)\) is added to \( todo \). \(\textsf{N}(1)\) is chosen as representative, so \(\textsf{N}(3)\), even though added to the queue later, is not chosen as representative, obtaining \(\texttt{repr} _{4b} = \{\textsf{N}(4),\textsf{N}(1)\}\). For \(\varphi _{5}\) of Fig. 5, no nodes are added to \( todo \) on line 2. \(\textsf{N}(3)\) and \(\textsf{N}(6)\) are added on line 4. In \(\texttt{process} \), both are chosen as representatives, obtaining \(\texttt{repr} _{5b}\).
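The two-step worklist procedure described above can be sketched as follows. This is an illustrative reading of the prose, not the pseudocode of Algorithm 2: the `Node` record, the FIFO queue, and the example egraph (for the hypothetical formula x ≈ g(y) ∧ y ≈ f(x) ∧ y ≈ 6) are our assumptions.

```python
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()    # ids of the child nodes
    is_var: bool = False    # True if the node is labeled by a variable

def find_defs(nodes, cls):
    """Pick one representative per class, bottom-up, ground leaves first."""
    repr_of = {}                                   # class id -> representative
    members, parents = {}, {n: [] for n in nodes}
    for n, node in nodes.items():
        members.setdefault(cls[n], []).append(n)
        for c in node.children:
            parents[c].append(n)

    def process(todo):
        while todo:
            n = todo.popleft()
            if cls[n] in repr_of:
                continue                           # class already assigned
            repr_of[cls[n]] = n
            # Parents of any node of this class become candidates once
            # all of their children belong to classes with a representative.
            for m in members[cls[n]]:
                for p in parents[m]:
                    if all(cls[c] in repr_of for c in nodes[p].children):
                        todo.append(p)

    leaves = [n for n, node in nodes.items() if not node.children]
    process(deque(n for n in leaves if not nodes[n].is_var))  # ground leaves
    process(deque(n for n in leaves if nodes[n].is_var))      # remaining leaves
    return repr_of

# Hypothetical egraph for: x = g(y) /\ y = f(x) /\ y = 6
NODES = {
    "six": Node("6"),
    "y": Node("y", is_var=True),
    "x": Node("x", is_var=True),
    "f": Node("f", ("x",)),
    "g": Node("g", ("y",)),
}
CLS = {"six": "A", "y": "A", "f": "A", "x": "B", "g": "B"}

REPR = find_defs(NODES, CLS)
print(REPR)
```

On this example the constant 6 is chosen for the class of y and g(y) for the class of x. The node f(x) is queued later but skipped because its class already has a representative, mirroring the behaviour described for \(\varphi _{4}\).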

Algorithm 2 guarantees that \(\texttt{repr} \) is maximally ground. Together with Theorem 2, this implies that all terms that can be rewritten into ground equivalent ones will be rewritten, which, in turn, means that for each variable that has a ground definition, its representative is one such definition.

Finding Additional (Non-ground) Definitions. At this point, QEL found ground definitions while avoiding cycles in \(G_\texttt{repr} \). However, this does not mean that as many variables as possible are eliminated. A variable can also be eliminated if it can be expressed as a function of other variables. This is not achieved by \(\mathtt {find\_defs}\). For example, in \(\texttt{repr} _{5b}\) both variables are representatives, hence none is eliminated, even though, since \(x \approx g(f(y))\), x could be eliminated in \(\varphi _{5}\) by rewriting it as a function of y, namely g(f(y)). Algorithm 3 shows function \(\mathtt {refine\_defs} \) that refines maximally ground \(\texttt{repr} {s}\) to further find such definitions while keeping admissibility and ground maximality. This is done by greedily attempting to change class representatives if they are labeled with a variable. \(\mathtt {refine\_defs} \) iterates over the nodes in the class checking if there is a different node that is not a variable and that does not create a cycle in \(G_\texttt{repr} \) (line 6). The resulting \(\texttt{repr} \) remains maximally ground because representatives of ground classes are not changed.


For example, let us refine \(\texttt{repr} _{5b} = \{\textsf{N}(3),\textsf{N}(6),\textsf{N}(5)\}\) obtained for \(\varphi _{5}\). Assume that x is processed first. For \( class (\textsf{N}(x))\), changing the representative to \(\textsf{N}(1)\) does not introduce a cycle (see Fig. 5c), so \(\textsf{N}(1)\) is selected. Next, for \( class (\textsf{N}(y))\), choosing \(\textsf{N}(4)\) causes \(G_\texttt{repr} \) to be cyclic since \(\textsf{N}(1)\) was already chosen (Fig. 5a), so the representative of \( class (\textsf{N}(y))\) is not changed. The final refinement is \(\texttt{repr} _{5c} = \{\textsf{N}(1),\textsf{N}(6),\textsf{N}(5)\}\).
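The greedy refinement just illustrated can be sketched as below. This is a simplified reading, not Algorithm 3 itself: the acyclicity check is a plain depth-first search over classes, and the `Node` record, the helper `ntt`, and the example egraph (for the hypothetical formula x ≈ g(f(x)) ∧ y ≈ h(f(y)) ∧ f(x) ≈ f(y), which entails x ≈ g(f(y))) are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()
    is_var: bool = False

def is_acyclic(nodes, cls, repr_of):
    """DFS over classes; class c points to the classes of repr(c)'s children."""
    state = {}  # class id -> "open" while on the DFS stack, "done" afterwards
    def visit(c):
        if state.get(c) == "done":
            return True
        if state.get(c) == "open":
            return False                      # back edge, hence a cycle
        state[c] = "open"
        ok = all(visit(cls[ch]) for ch in nodes[repr_of[c]].children)
        state[c] = "done"
        return ok
    return all(visit(c) for c in repr_of)

def refine_defs(nodes, cls, repr_of):
    """Greedily replace variable representatives by non-variable nodes."""
    repr_of = dict(repr_of)
    for c in list(repr_of):
        if not nodes[repr_of[c]].is_var:
            continue                          # non-variable representatives stay
        for n in (m for m in nodes if cls[m] == c and not nodes[m].is_var):
            trial = {**repr_of, c: n}
            if is_acyclic(nodes, cls, trial):
                repr_of = trial
                break
    return repr_of

def ntt(nodes, cls, repr_of, n):
    """Term of node n in which every child is replaced by its representative."""
    args = [ntt(nodes, cls, repr_of, repr_of[cls[c]]) for c in nodes[n].children]
    return nodes[n].label + (f"({', '.join(args)})" if args else "")

NODES = {
    "x": Node("x", is_var=True), "y": Node("y", is_var=True),
    "fx": Node("f", ("x",)), "fy": Node("f", ("y",)),
    "g": Node("g", ("fx",)), "h": Node("h", ("fy",)),
}
CLS = {"x": "X", "g": "X", "y": "Y", "h": "Y", "fx": "F", "fy": "F"}

START = {"X": "x", "Y": "y", "F": "fy"}       # both variables are representatives
REFINED = refine_defs(NODES, CLS, START)
print(REFINED, ntt(NODES, CLS, REFINED, REFINED["X"]))
```

In this sketch the class of x is processed first and receives the definition g(f(y)). The class of y then keeps y, because choosing h(f(y)) would make the representative graph cyclic.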

At this point, QEL found a representative function \(\texttt{repr} \) with as many ground definitions as possible and attempted to refine \(\texttt{repr} \) to have fewer variables as representatives. Next, QEL finds a core of the nodes of the egraph, based on \(\texttt{repr} \), that will govern the translation of the egraph to a formula. While \(\texttt{repr}\) determines the semantic rewrites of terms that enable variable elimination, it is the use of the core in the translation that actually eliminates them.

Variable Elimination Based on a Core. A core of an egraph \(G = \langle N , E , L, \texttt{root} \rangle \) and a representative function \(\texttt{repr} \), is a subset of the nodes \(N_c \subseteq N\) such that \(\psi _\textit{c} = G.\mathtt {to\_formula} (\texttt{repr},N\setminus N_c)\) satisfies \( isFormula (G,\psi _\textit{c})\).

Algorithm 3 shows pseudocode for \(\mathtt {find\_core} \) that computes a core of an egraph for a given representative function. The idea is that non-representative nodes that are labeled by variables, as well as nodes congruent to nodes that are already in the core, need not be included in the core. The former are not needed since we are only interested in preserving the existential closure of the output, while the latter are not needed since congruent nodes introduce the same syntactic terms in the output. For example, for \(\varphi _1\) and \(\texttt{repr} _1\), \(\mathtt {find\_core} \) returns \(\texttt{core} _1 = N_1 \setminus \{\textsf{N}(3), \textsf{N}(5), \textsf{N}(9)\}\). Nodes \(\textsf{N}(3)\) and \(\textsf{N}(9)\) are excluded because they are labeled with variables; and node \(\textsf{N}(5)\) because it is congruent with \(\textsf{N}(4)\).

Finally, QEL produces a quantifier reduction by applying \(\mathtt {to\_formula} \) with the computed \(\texttt{repr} \) and \(\texttt{core} \). Variables that are not in the core (they are not representatives) are eliminated – this includes variables that have a ground definition. However, QEL may eliminate a variable even if it is a representative (and thus it is in the core). As an example, consider \(\psi (x,y) \triangleq f(x) \approx f(y) \wedge x \approx y\), whose egraph G contains 2 classes with 2 nodes each. The core \(N_c\) relative to any admissible \(\texttt{repr} \) contains only one representative per class: in the \( class (\textsf{N}(x))\) because both nodes are labeled with variables, and in the \( class (\textsf{N}(f(x)))\) because nodes are congruent. In this case, \(\mathtt {to\_formula} (\texttt{repr},N_c)\) results in \(\top \) (since singleton classes in the core produce no literals in the output formula), a quantifier elimination of \(\psi \). More generally, the variables are eliminated because none of them is reachable in \(G_\texttt{repr} \) from a non-singleton class in the core (only such classes contribute literals to the output).
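The core computation for the example \(\psi \) above can be sketched as follows. This is a simplified reading of the idea behind \(\mathtt {find\_core} \), not its pseudocode: congruence is approximated by comparing a node's label together with the classes of its children, and the `Node` record and all identifiers are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()
    is_var: bool = False

def find_core(nodes, cls, repr_of):
    """Keep representatives, drop other variables and congruent duplicates."""
    def sig(n):
        # Nodes with the same label and the same child classes are congruent.
        return (nodes[n].label, tuple(cls[c] for c in nodes[n].children))

    core = set(repr_of.values())              # representatives are always kept
    seen = {sig(n) for n in core}
    for n in nodes:
        if n in core or nodes[n].is_var:
            continue                          # non-representative variable
        if sig(n) in seen:
            continue                          # congruent to a node in the core
        core.add(n)
        seen.add(sig(n))
    return core

# Egraph of psi(x, y) = f(x) = f(y) /\ x = y: two classes with two nodes each.
NODES = {
    "x": Node("x", is_var=True), "y": Node("y", is_var=True),
    "fx": Node("f", ("x",)), "fy": Node("f", ("y",)),
}
CLS = {"x": "V", "y": "V", "fx": "F", "fy": "F"}
REPR = {"V": "x", "F": "fx"}

CORE = find_core(NODES, CLS, REPR)
# Each class contributes (number of core nodes - 1) equalities to the output.
LITERALS = sum(sum(1 for n in CORE if CLS[n] == c) - 1 for c in set(CLS.values()))
print(sorted(CORE), LITERALS)
```

Both classes end up with a single core node, so no equality literal is produced, which matches the output \(\top \) discussed above.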

We conclude the presentation of QEL by showing its output for our examples. For \(\varphi _1\), QEL obtains \((k + 1 \approx read (a,x) \wedge 3 > k + 1)\), a quantifier reduction, using \(\texttt{repr} _1 = \{\textsf{N}(3), \textsf{N}(8)\} \) and \(\texttt{core} _1 = N_1 \setminus \{\textsf{N}(3), \textsf{N}(5), \textsf{N}(9)\}\). For \(\varphi _{4}\), QEL obtains \((6 \approx f(g(6)))\), a quantifier elimination, using \(\texttt{repr} _{4b} = \{\textsf{N}(4),\textsf{N}(1)\}\), and \(\texttt{core} _{4b} = N_{4} \setminus \{\textsf{N}(3), \textsf{N}(2)\}\). Finally, for \(\varphi _{5}\), QEL obtains \((y \approx h(f(y)) \wedge f(g(f(y))) \approx f(y))\), a quantifier reduction, using \(\texttt{repr} _{5c} = \{\textsf{N}(1),\textsf{N}(6),\textsf{N}(5)\}\) and \(\texttt{core} _{5c} = N_5\setminus \{\textsf{N}(3)\}\).

Guarantees of QEL. Correctness of QEL is straightforward. We conclude this section by providing two conditions that ensure that a variable is eliminated by QEL. The first condition guarantees that a variable is eliminated whenever a ground definition for it exists (regardless of the specific representative function and core computed by QEL). This makes QEL complete relative to quantifier elimination based on ground definitions. Relative completeness is an important property since it means that QEL is unaffected by variable orderings and syntactic rewrites, unlike QeLite. The second condition, illustrated by \(\psi \) above, depends on the specific representative function and core computed by QEL.

Theorem 3

Let \(\varphi \) be a QF conjunction of literals with free variables \(\boldsymbol{v} \), and let \(v \in \boldsymbol{v} \). Let \(G = egraph (\varphi )\), let \(n_v\) be the node in G such that \(L(n_v) = v\), and let \(\texttt{repr} \) and \(\texttt{core}\) be computed by QEL. We denote by \( NS = \{n \in \texttt{core} \mid ( class (n) \cap \texttt{core}) \ne \{n\} \}\) the set of nodes from classes with two or more nodes in \(\texttt{core}\). If one of the following conditions holds, then v does not appear in \( QEL (\varphi ,\boldsymbol{v})\):

  (1) there exists a ground term t s.t. \(\varphi \models v \approx t\), or

  (2) \(n_v\) is not reachable from any node in \( NS \) in \(G_\texttt{repr} \).

As a corollary, if every variable meets one of the two conditions, then QEL finds a quantifier elimination.
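Condition (2) of Theorem 3 can be checked mechanically once \(\texttt{repr} \) and \(\texttt{core}\) are known. The sketch below is an illustration under stated assumptions: we take the edges of \(G_\texttt{repr} \) to go from a node to the representatives of its children's classes, and the `Node` record, function name, and example egraph (for the hypothetical formula x ≈ g(f(x)) ∧ y ≈ h(f(y)) ∧ f(x) ≈ f(y)) are ours.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    label: str
    children: tuple = ()
    is_var: bool = False

def unreachable_from_ns(nodes, cls, repr_of, core, n_v):
    """True if n_v is not reachable from NS (condition (2) of Theorem 3)."""
    per_class = {}
    for n in core:
        per_class.setdefault(cls[n], set()).add(n)
    # NS: core nodes whose class has two or more nodes in the core.
    ns = {n for n in core if per_class[cls[n]] != {n}}
    reached, stack = set(ns), list(ns)
    while stack:
        n = stack.pop()
        for c in nodes[n].children:
            r = repr_of[cls[c]]    # assumed edge of G_repr: n -> repr of child class
            if r not in reached:
                reached.add(r)
                stack.append(r)
    return n_v not in reached

NODES = {
    "x": Node("x", is_var=True), "y": Node("y", is_var=True),
    "fx": Node("f", ("x",)), "fy": Node("f", ("y",)),
    "g": Node("g", ("fx",)), "h": Node("h", ("fy",)),
}
CLS = {"x": "X", "g": "X", "y": "Y", "h": "Y", "fx": "F", "fy": "F"}
REPR = {"X": "g", "Y": "y", "F": "fy"}
CORE = set(NODES) - {"x"}

X_ELIMINATED = unreachable_from_ns(NODES, CLS, REPR, CORE, "x")
Y_ELIMINATED = unreachable_from_ns(NODES, CLS, REPR, CORE, "y")
print(X_ELIMINATED, Y_ELIMINATED)
```

Under these assumptions the check succeeds for x and fails for y, which is consistent with an output such as \(y \approx h(f(y)) \wedge f(g(f(y))) \approx f(y)\), where only y remains.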

This concludes the presentation of our quantifier reduction algorithm. Next, we show how QEL can be used to under-approximate quantifier elimination, which allows working with formulas for which QEL does not result in a qelim.

Fig. 6. Two MBP rules from [16]. The notation \(\varphi [t]\) means that \(\varphi \) contains term t. The rules rewrite all occurrences of \( read ( write (t,i, v),j)\) with v and \( read (t, j)\), respectively.

Fig. 7. Adaptation of the rules in Fig. 6 using the QEL API.

5 Model Based Projection Using QEL

Applications like model checking and quantified satisfiability require efficient computation of under-approximations of quantifier elimination. They make use of model-based projection (MBP) algorithms to project variables that cannot be eliminated cheaply. Our QEL algorithm is efficient and relatively complete, but it does not guarantee to eliminate all variables. In this section, we use a model and theory-specific projection rules to implement an MBP algorithm on top of QEL.

We focus on two important theories: Arrays and Algebraic DataTypes (ADT). They are widely used to encode program verification tasks. Prior works separately develop MBP algorithms for Arrays [16] and ADTs [5]. Both MBPs were presented as a set of syntactic rewrite rules applied until fixed point.

Combining the MBP algorithms for Arrays and ADTs is non-trivial because applying projection rules for one theory may produce terms of the other theory. Therefore, separately achieving saturation in either theory is not sufficient to reach saturation in the combined setting. The MBP for the combined setting has to call both MBPs, check whether either one of them produced terms that can be processed by the other, and, if so, call the other algorithm. This is similar to theory combination in SMT solving where the core SMT solver has to keep track of different theory solvers and exchange terms between them.

Our main insight is that egraphs can be used as a glue to combine MBP algorithms for different theories, just like egraphs are used in SMT solvers to combine satisfiability checking for different theories. Implementing MBP using egraphs allows us to use the insights from QEL to combine MBP with on-the-fly quantifier reduction to produce less under-approximate formulas than what we get by syntactic application of MBP rules.

To implement MBP using egraphs, we implement all rewrite rules for MBP in Arrays [16] and ADTs [5] on top of egraphs. In the interest of space, we explain the implementation of just a couple of the MBP rules for Arrays (see footnote 4).

Figure 6 shows two Array MBP rules from [16]: ElimWrRd1 and ElimWrRd2. Here, \(\varphi \) is a formula with arrays and M is a model for \(\varphi \). Both rules rewrite terms which match the pattern \( read ( write (t, i, v), j)\), where t, i, v, j are all terms and t contains a variable to be projected. ElimWrRd1 is applicable when \(M \models i \approx j\). It rewrites the term \( read ( write (t, i, v), j)\) to v. ElimWrRd2 is applicable when \(M\not \models i \approx j\) and rewrites \( read ( write (t, i, v), j)\) to \( read (t, j)\).

Figure 7 shows the egraph implementation of ElimWrRd1 and ElimWrRd2. The \(\texttt{match} (t)\) method checks if t syntactically matches \( read ( write (s, i, v), j)\), where s contains a variable to be projected. The \(\texttt{apply} (t)\) method assumes that t is \( read ( write (s, i, v), j)\). It first checks if \(M\models i \approx j\), and, if so, it adds \(i\approx j\) and \(t\approx v\) to the egraph G. Otherwise, if \(M \not \models i \approx j\), \(\texttt{apply} (t)\) adds a disequality \(i \not \approx j\) and an equality \(t \approx read (s, j)\) to G. That is, the egraph implementation of the rules only adds (and does not remove) literals that capture the side condition and the conclusion of the rule.
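The behaviour of \(\texttt{match} \) and \(\texttt{apply} \) can be sketched as follows. This is not the QEL API of Fig. 7. As simplifying assumptions, terms are nested tuples, index terms are atoms whose values are looked up in a dictionary standing in for the model M, and the literals that would be added to the egraph are returned as a plain list.

```python
def contains_var(t, to_project):
    """True if term t mentions one of the variables to be projected."""
    if isinstance(t, str):
        return t in to_project
    return any(contains_var(a, to_project) for a in t[1:])

def match(t, to_project):
    """Does t have the shape read(write(s, i, v), j) with a projected var in s?"""
    return (isinstance(t, tuple) and t[0] == "read"
            and isinstance(t[1], tuple) and t[1][0] == "write"
            and contains_var(t[1][1], to_project))

def apply(t, model):
    """Literals added for t = read(write(s, i, v), j); nothing is removed."""
    _, (_, s, i, v), j = t
    if model[i] == model[j]:                      # ElimWrRd1: M |= i = j
        return [("=", i, j), ("=", t, v)]
    return [("!=", i, j), ("=", t, ("read", s, j))]   # ElimWrRd2

T = ("read", ("write", "a", "i", "v"), "j")
SAME = apply(T, {"i": 1, "j": 1})
DIFF = apply(T, {"i": 1, "j": 2})
print(match(T, {"a"}), SAME, DIFF)
```

Both branches record the side condition on the indices together with the conclusion, so the original term t stays available alongside its rewritten form.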

Our algorithm for MBP based on egraphs, MBP-QEL, is shown in Alg. 4. It initializes an egraph with the input formula (line 1), applies MBP rules until saturation (line 4), and then uses the steps of QEL (lines 7–12) to generate the projected formula.

Applying rules is as straightforward as iterating over all terms t in the egraph, and for each rule r such that \(r.\texttt{match} (t)\) is true, calling \(r.\texttt{apply} (t, M, G)\) (lines 14–22). As opposed to the standard approach based on formula rewriting, here the terms are not rewritten – both remain. Therefore, it is possible to get into an infinite loop by re-applying the same rules on the same terms over and over again. To avoid this, MBP-QEL marks terms as seen (line 23) and avoids them in the next iteration (line 15). Some rules in MBP are applied to pairs of terms. For example, Ackermann rewrites pairs of \( read \) terms over the same variable. This is different from usual applications where rewrite rules are applied to individual expressions. Yet, it is easy to adapt such pairwise rewrite rules to egraphs by iterating over pairs of terms (lines 25–30).

MBP-QEL does not apply MBP rules to terms that contain variables but are already c-ground (line 16), which is sound because such terms are replaced by ground terms in the output (Theorem 3). This prevents unnecessary applications of MBP rules, thus allowing MBP-QEL to compute MBPs that are closer to a quantifier elimination (less model-specific).

Just like each application of a rewrite rule introduces a new term to a formula, each call to the \(\texttt{apply} \) method of a rule adds new terms to the egraph. Therefore, each call to \( ApplyRules \) (line 4) makes the egraph bigger. However, provided that the original MBP combination is terminating, the iterative application of \( ApplyRules \) terminates as well (due to marking).
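The saturation loop with marking can be sketched as below. This is a toy illustration, not Alg. 4: the egraph is replaced by a plain set of terms, terms are nested tuples, the model is a dictionary over index atoms, and the class and function names are assumptions. The check that skips c-ground terms is omitted.

```python
class ElimWrRd:
    """Toy version of ElimWrRd1/ElimWrRd2 that returns the new term."""
    def __init__(self, model):
        self.model = model

    def match(self, t):
        return (isinstance(t, tuple) and t[0] == "read"
                and isinstance(t[1], tuple) and t[1][0] == "write")

    def apply(self, t):
        _, (_, s, i, v), j = t
        return {v} if self.model[i] == self.model[j] else {("read", s, j)}

def apply_rules(terms, rules, seen):
    """One pass over the unseen terms; returns True if some rule fired."""
    new_terms, progress = set(), False
    for t in sorted(terms - seen, key=repr):
        for r in rules:
            if r.match(t):
                new_terms |= r.apply(t)
                progress = True
        seen.add(t)                    # never revisit t in later passes
    terms |= new_terms                 # original terms are kept, not rewritten
    return progress

def saturate(terms, rules):
    seen = set()
    while apply_rules(terms, rules, seen):
        pass
    return terms

MODEL = {"i1": 0, "i2": 1, "j": 0}
T0 = ("read", ("write", ("write", "a", "i1", "v1"), "i2", "v2"), "j")
T1 = ("read", ("write", "a", "i1", "v1"), "j")
TERMS = saturate({T0}, [ElimWrRd(MODEL)])
print(len(TERMS))
```

Starting from a read over two nested writes, the first pass adds the inner read, the second pass adds v1, and the third pass finds no unseen term that matches, so the loop stops with all three terms present.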

Some MBP rules introduce new variables to the formula. MBP-QEL computes \(\texttt{repr} \) based on both original and newly introduced variables (line 7). This allows MBP-QEL to eliminate all variables, including non-Array, non-ADT variables, that are equivalent to ground terms (Theorem 3).

As mentioned earlier, MBP-QEL never removes terms while rewrite rules are saturating. Therefore, after saturation, the egraph still contains all original terms and variables. From soundness of the MBP rules, it follows that after each invocation of \(\texttt{apply} \), MBP-QEL creates an under-approximation of \(\varphi ^\exists \) based on the model M. From completeness of MBP rules, it follows that, after saturation, all terms containing Array or ADT variables can be removed from the egraph without affecting equivalence of the saturated egraph. Hence, when calling \(\mathtt {to\_formula} \), MBP-QEL removes all terms containing Array or ADT variables (line 7). This includes, in particular, all the terms on which rewrite rules were applied, but potentially more.


We demonstrate our MBP algorithm on an example with nested ADTs and Arrays. Let \(P\triangleq \langle A_{I\times I}, I\rangle \) be the datatype of a pair of an integer array and an integer, and let \( pair : A_{I\times I}\times I\rightarrow P\) be its sole constructor with destructors \( fst : P\rightarrow A_{I\times I}\) and \( snd : P\rightarrow I\). In the following, let i, l, j be integers, a an integer array, p, \(p'\) pairs, and \(\boldsymbol{p}_1\), \(\boldsymbol{p}_2\) arrays of pairs (\(A_{I\times P}\)). Consider the formula:

$$\varphi _{ mbp }(p,a) \;\triangleq \; read (a, i) \approx i \wedge p \approx pair (a, l) \wedge \boldsymbol{p}_2 \approx write (\boldsymbol{p}_1, j, p)\wedge p \not \approx p'$$

where p and a are free variables that we want to project and all of \(i, j, l, \boldsymbol{p}_1, \boldsymbol{p}_2, p'\) are constants that we want to keep. MBP is guided by a model \(M_{ mbp }\models \varphi _{ mbp }\). To eliminate p and a, MBP-QEL constructs the egraph of \(\varphi _{ mbp }\) and applies the MBP rules. In particular, it uses Array MBP rules to rewrite the \( write (\boldsymbol{p}_1, j, p)\) term by adding the equality \( read (\boldsymbol{p}_2, j) \approx p\) and merging it with the equivalence class of \(\boldsymbol{p}_2 \approx write (\boldsymbol{p}_1, j, p)\). It then applies ADT MBP rules to deconstruct the equality \(p \approx pair (a, l)\) by creating two equalities \( fst (p) \approx a\) and \( snd (p) \approx l\). Finally, the call to \(\mathtt {to\_formula} \) produces

$$\begin{aligned} & read ( fst ( read (\boldsymbol{p}_1, j)), i) \approx i \wedge snd ( read (\boldsymbol{p}_1, j)) \approx l \;\wedge \\ & read (\boldsymbol{p}_2, j) \approx pair ( fst ( read (\boldsymbol{p}_1, j)), l) \;\wedge \\ & \boldsymbol{p}_2 \approx write (\boldsymbol{p}_1, j, read (\boldsymbol{p}_2, j))\wedge read (\boldsymbol{p}_2, j) \not \approx p' \end{aligned}$$

The output is easy to understand by tracing it back to the input. For example, the first literal is a rewrite of the literal \( read (a, i) \approx i\) where a is represented with \( fst (p)\) and p is represented with \( read (\boldsymbol{p}_1, j)\). While the interaction of these rules might seem straightforward in this example, the MBP implementation in Z3 fails to project a in this example because of the multilevel nesting.

Notably, in this example, the c-ground computation during projection allows MBP-QEL to avoid splitting on the disequality \(p\not \approx p'\) based on the model. While ADT MBP rules eliminate disequalities by using the model to split them, MBP-QEL benefits from the fact that, after the application of Array MBP rules, the class of p becomes ground, making \(p\not \approx p'\) c-ground. Thus, the c-ground computation allows MBP-QEL to produce a formula that is less approximate than those produced by syntactic application of MBP rules. In fact, in this example, a quantifier elimination is obtained (the model \(M_{ mbp }\) was not used).

In the next section, we show that our improvements to MBP translate to significant improvements in a CHC-solving procedure that relies on MBP.

6 Evaluation

We implement QEL (Alg. 1) and MBP-QEL (Alg. 4) inside Z3 [19] (version 4.12.0), a state-of-the-art SMT solver. Our implementation, referred to as Z3eg, is publicly available on GitHub (see footnote 5). Z3eg replaces QeLite with QEL, and the existing MBP with MBP-QEL.

We evaluate Z3eg using two solving tasks. Our first evaluation is on the QSAT algorithm [5] for checking satisfiability of formulas with alternating quantifiers. In QSAT, Z3 uses both QeLite and MBP to under-approximate quantified formulas. We compare three QSAT implementations: the existing version in Z3 with the default QeLite and MBP; the existing version in Z3 in which QeLite and MBP are replaced by our egraph-based algorithms, Z3eg; and the QSAT implementation in YicesQS (see footnote 6), based on the Yices [8] SMT solver. During the evaluation, we found a bug in the QSAT implementation of Z3 and fixed it (see footnote 7). The fix resulted in Z3 solving over 40 sat instances and over 120 unsat instances more than before. In the following, we use the fixed version of Z3.

We use benchmarks in the theory of (quantified) LIA and LRA from SMT-LIB [2, 3], with alternating quantifiers. LIA and LRA are the only tracks in which Z3 uses the QSAT tactic by default. To make our experiments more comprehensive, we also consider two modified variants of the LIA and LRA benchmarks, where we add some non-recursive ADT variables to the benchmarks. Specifically, we wrap all existentially quantified arithmetic variables using a record type ADT and unwrap them whenever they get used (see footnote 8). Since these benchmarks are similar to the original, we force Z3 to use the QSAT tactic on them with a tactic.default_tactic=qsat command line option.

Table 1 summarizes the results for the SMT-LIB benchmarks. In LIA, both Z3eg and Z3 solve all benchmarks in under a minute, while YicesQS is unable to solve many instances. In LRA, YicesQS solves all instances with very good performance. Z3 is able to solve only some benchmarks, and our Z3eg performs similarly to Z3. We found that in the LRA benchmarks, the new algorithms in Z3eg are not being used since there are not many equalities in the formula, and no equalities are inferred during the run of QSAT. Thus, any differences between Z3 and Z3eg are due to inherent randomness of the solving process.

Table 2 summarizes the results for the categories of mixed ADT and arithmetic. YicesQS is not able to compete because it does not support ADTs. As expected, Z3eg solves many more instances than Z3.

Table 1. Instances solved within 20 min by different implementations. Benchmarks are quantified LIA and LRA formulas from SMT-LIB [2].
Table 2. Instances solved within 60 s for our handcrafted benchmarks.

The second part of our evaluation shows the efficacy of MBP-QEL for Arrays and ADTs (Alg. 4) in the context of CHC-solving. Z3 uses both QeLite and MBP inside the CHC-solver Spacer [17]. Therefore, we compare Z3 and Z3eg on CHC problems containing Arrays and ADTs. We use two sets of benchmarks to test the efficacy of our MBP. The benchmarks in the first set were generated for verification of Solidity smart contracts [1] (we exclude benchmarks with non-linear arithmetic because they are not supported by Spacer). These benchmarks have a very complex structure that nests ADTs and Arrays. Specifically, they contain both ADTs of Arrays and Arrays of ADTs. This makes them suitable to test our MBP-QEL. Row 1 of Table 3 shows the number of instances solved by Z3 (Spacer) with and without MBP-QEL. Z3eg solves 29 instances more than Z3. Even though MBP is just one part of the overall Spacer algorithm, we see that for these benchmarks, MBP-QEL makes a significant impact on Spacer. Digging deeper, we find that many of these instances come from the category called abi (row 2 in Table 3). Z3eg solves all of these benchmarks, while Z3 fails to solve 20 of them. We traced the problem down to the MBP implementation in Z3: it fails to eliminate all variables, causing a runtime exception. In contrast, MBP-QEL eliminates all variables successfully, allowing Z3eg to solve these benchmarks.

We also compare Z3eg with Eldarica [14], a state-of-the-art CHC-solver that is particularly effective on these benchmarks. Z3eg solves almost as many instances as Eldarica. Furthermore, like Z3, Z3eg is orders of magnitude faster than Eldarica. Finally, we compare the performance of Z3eg on Array benchmarks from the CHC competition [13]. Z3eg is competitive with Z3, solving 2 additional safe instances and almost as many unsafe instances as Z3 (row 3 of Table 3). Both Z3eg and Z3 solve quite a few instances more than Eldarica.

Table 3. Instances solved within 20 min by Z3eg, Z3, and Eldarica. Benchmarks are CHCs from Solidity [1] and CHC competition [13]. The abi benchmarks are a subset of Solidity benchmarks.

Our experiments show the effectiveness of our QEL and MBP-QEL in different settings inside the state-of-the-art SMT solver Z3. While we maintain performance on quantified arithmetic benchmarks, we improve Z3's QSAT algorithm on quantified benchmarks with ADTs. On verification tasks, QEL and MBP-QEL help Spacer solve 30 new instances, even though MBP is only a relatively small part of the overall Spacer algorithm.

7 Conclusion

Quantifier elimination and its under-approximation, Model-Based Projection, are used by many SMT-based decision procedures, including quantified SAT and Constrained Horn Clause solving. Traditionally, these are implemented by a series of syntactic rules, operating directly on the syntax of an input formula. In this paper, we argue that these procedures should be implemented directly on the egraph data structure, already used by most SMT solvers. This results in algorithms that better handle implicit equality reasoning and in procedures that are faster and easier to implement. We justify this argument by implementing quantifier reduction and MBP in Z3 using egraphs and show that the new implementation translates into significant improvements to the target decision procedures. Thus, our work provides both theoretical foundations for quantifier reduction and practical contributions to the Z3 SMT solver.