12 Beyond Substitution: Cosubstitution and Bisubstitution

In making our “metamodel” of mathematics we’ve been discussing the rewriting of expressions according to rules. But there’s a subtle issue that we’ve so far avoided, that has to do with the fact that the expressions we’re rewriting are often themselves patterns that stand for whole classes of expressions. And this turns out to allow for additional kinds of transformations that we’ll call cosubstitution and bisubstitution.

Let’s talk first about cosubstitution. Imagine we have the expression f[a]. The rule ab would do a substitution for a to give f[b]. But if we have the expression f[c] the rule ab will do nothing.

Now imagine that we have the expression f[x_]. This stands for a whole class of expressions, including f[a], f[c], etc. For most of this class of expressions, the rule ab will do nothing. But in the specific case of f[a], it applies, and gives the result f[b].

If our rule is f[x_] s then this will apply as an ordinary substitution to f[a], giving the result s. But if the rule is f[b] s this will not apply as an ordinary substitution to f[a]. However, it can apply as a cosubstitution to f[x_] by picking out the specific case where x_ stands for b, then using the rule to give s.

In general, the point is that ordinary substitution specializes patterns that appear in ruleswhile what one can think of as the “dual operation” of cosubstitution specializes patterns that appear in the expressions to which the rules are being applied. If one thinks of the rule that’s being applied as like an operator, and the expression to which the rule is being applied as an operand, then in effect substitution is about making the operator fit the operand, and cosubstitution is about making the operand fit the operator.

It’s important to realize that as soon as one’s operating on expressions involving patterns, cosubstitution is not something “optional”: it’s something that one has to include if one is really going to interpret patternswherever they occuras standing for classes of expressions.

When one’s operating on a literal expression (without patterns) only substitution is ever possible, as in

corresponding to this fragment of a token-event graph:

Let’s say we have the rule f[a] s (where f[a] is a literal expression). Operating on f[b] this rule will do nothing. But what if we apply the rule to f[x_]? Ordinary substitution still does nothing. But cosubstitution can do something. In fact, there are two different cosubstitutions that can be done in this case:

What’s going on here? In the first case, f[x_] has the “special case” f[a], to which the rule applies (“by cosubstitution”)giving the result s. In the second case, however, it’s x_ on its own which has the special case f[a], that gets transformed by the rule to s, giving the final cosubstitution result f[s].

There’s an additional wrinkle when the same pattern (such as x_) appears multiple times:

In all cases, x_ is matched to a. But which of the x_’s is actually replaced is different in each case.

Here’s a slightly more complicated example:

In ordinary substitution, replacements for patterns are in effect always made “locally”, with each specific pattern separately being replaced by some expression. But in cosubstitution, a “special case” found for a pattern will get used throughout when the replacement is done.

Let’s see how this all works in an accumulative axiomatic system. Consider the very simple rule:

One step of substitution gives the token-event graph (where we’ve canonicalized the names of pattern variables to a_ and b_):

But one step of cosubstitution gives instead:

Here are the individual transformations that were made (with the rule at least nominally being applied only in one direction):

The token-event graph above is then obtained by canonicalizing variables, and combining identical expressions (though for clarity we don’t merge rules of the form ab and ba).

If we go another step with this particular rule using only substitution, there are additional events (i.e. transformations) but no new theorems produced:

Cosubstitution, however, produces another 27 theorems

or altogether

or as trees:

We’ve now seen examples of both substitution and cosubstitution in action. But in our metamodel for mathematics we’re ultimately dealing not with each of these individually, but rather with the “symmetric” concept of bisubstitution, in which both substitution and cosubstitution can be mixed together, and applied even to parts of the same expression.

In the particular case of a_b_b_a_, bisubstitution adds nothing beyond cosubstitution. But often it does. Consider the rule:

Here’s the result of applying this to three different expressions using substitution, cosubstitution and bisubstitution (where we consider only matches for “whole expressions”, not subparts):

Cosubstitution very often yields substantially more transformations than substitutionbisubstitution then yielding modestly more than cosubstitution. For example, for the axiom system

the number of theorems derived after 1 and 2 steps is given by:

In some cases there are theorems that can be produced by full bisubstitution, but noteven after any number of stepsby substitution or cosubstitution alone. However, it is also common to find that theorems can in principle be produced by substitution alone, but that this just takes more steps (and sometimes vastly more) than when full bisubstitution is used. (It’s worth noting, however, that the notion of “how many steps” it takes to “reach” a given theorem depends on the foliation one chooses to use in the token-event graph.)

The various forms of substitution that we’ve discussed here represent different ways in which one theorem can entail others. But our overall metamodel of mathematicsbased as it is purely on the structure of symbolic expressions and patternsimplies that bisubstitution covers all entailments that are possible.

In the history of metamathematics and mathematical logic, a whole variety of “laws of inference” or “methods of entailment” have been considered. But with the modern view of symbolic expressions and patterns (as used, for example, in the Wolfram Language), bisubstitution emerges as the fundamental form of entailment, with other forms of entailment corresponding to the use of particular types of expressions or the addition of further elements to the pure substitutions we’ve used here.

It should be noted, however, that when it comes to the ruliad different kinds of entailments correspond merely to different foliationswith the form of entailment that we’re using representing just a particularly straightforward case.

The concept of bisubstitution has arisen in the theory of term rewriting, as well as in automated theorem proving (where it is often viewed as a particular “strategy”, and called “paramodulation”). In term rewriting, bisubstitution is closely related to the concept of unificationwhich essentially asks what assignment of values to pattern variables is needed in order to make different subterms of an expression be identical.