featured | Bjørn Kjos-Hanssen

My book published by de Gruyter is scheduled for March 12, 2024.

Notes:

Theorem 2.28 (The Thue word is overlap-free) is also Theorem 1.3.1 in Shallit’s new book “The logical approach to automatic sequences”. It is proved in Walnut as follows:

[Walnut]$ eval test "Ei,n (n >= 1) & Aj (j<=n) => T[i+j]=T[i+j+n]";
computed ~:1 states - 11ms
computed ~:2 states - 1ms
_____
FALSE

[Walnut]$

Theorem 2.36 (overlap-free words have an $A_N$-deficiency bound of 1) uses Theorem 2.25, one direction of which is that overlap-free words can only contain squares whose lengths are of the form $2^a$ or $3\cdot 2^a$, $a\ge 1$.
If that is true then the Thue word better only have squares of such lengths.
After running:

[Walnut]$ eval thuesquare "n >= 1 & Ei Aj (j T[i+j]=T[i+j+n]";

[Walnut]$

the file

thuesquare.gv

contains some code that we further process with

brew install graphviz

(which takes a long time, and produces many error messages, but not to worry)
and then from the MacOS command line,

$ dot -Tpdf thuesquare.gv > thuesquare.pdf

The PDF looks like this:
$thuetest$
We can ignoring the arrow from state 0 to state 0 because it corresponds to initial zeros in the binary representation of the length $n$. Then we accept any word of the form $10^n$ or $110^n$ which is binary representation of $2^n$ and $3\cdot 2^n$, respectively. Phew!

In Lemma 2.29 it is mentioned that the Thue-Morse sequence has no infinite arithmetic progressions.
An infinite arithmetic progression would be characterized by $a$ and $b$ such that
$T[a*k+b]=1$ for all sufficiently large $k$. Since multiplication is not included in the Walnut language, it is not clear that we can express this. But we can express $T[2*k+b]=1$:

eval test "Ei Aj (j>=i) => T[j+j+n]=T[j+j+2+n]";

Indeed, Walnut reports that this cannot happen, as follows:
$test$
It is easier to do it like this:

[Walnut]$ eval test "En Ei Aj (j>=i) => T[j+j+n]=T[j+j+2+n]";   
_____
FALSE

[Walnut]$

If we try to multiply two variables we get the following trouble:

[Walnut]$ eval test "En Ei Aj (j>=i) => T[k*j+n]=T[k*(j+1)+n]";
java.lang.Exception: the operator * cannot be applied to two variables
	at Automata.NumberSystem.arithmetic(NumberSystem.java:791)
	at Token.ArithmeticOperator.act(ArithmeticOperator.java:188)
	at Main.Computer.compute(Computer.java:111)
	at Main.Computer.(Computer.java:49)
	at Main.Prover.eval_def_commands(Prover.java:466)
	at Main.Prover.dispatch(Prover.java:328)
	at Main.Prover.readBuffer(Prover.java:280)
	at Main.Prover.run(Prover.java:234)
	at Main.Prover.main(Prover.java:201)
the operator * cannot be applied to two variables
	: char at 22
	: eval test "En Ei Aj (j>=i) => T[k*j+n]=T[k*(j+1)+n]";

[Walnut]$

But we can check arithmetic sequences with longer and longer periods:

[Walnut]$ eval test "En Ei Aj (j>=i) => T[3*j+n]=T[3*(j+1)+n]";
_____
FALSE

[Walnut]$ eval test "En Ei Aj (j>=i) => T[4*j+n]=T[4*(j+1)+n]";
_____
FALSE

[Walnut]$

Let us now check if the infinite Fibonacci word is overlap-free.

[Walnut]$ eval test "Ei,n (n >= 1) & Aj (j<=n) => F[i+j]=F[i+j+n]";    
_____
FALSE

[Walnut]$

This seems to say there are no overlaps, which is false since the Fibonacci word starts:
$0, 1, 0, 0, 1, 0, 1, 0$ which has overlapping occurrences of $010$.

However, on page 136 it is suggested to add ?msd_fib so let us try that:

[Walnut]$ eval test "?msd_fib Ei,n (n >= 1) & Aj (j<=n) => F[i+j]=F[i+j+n]";
____
TRUE

[Walnut]$

Indeed, it is explained on page 108 that “you may need to add a ?msd specification if your sequence is automatic for other than base 2″.

Now we can check whether the infinite Tribonacci word contains any overlaps:
This takes a while and surprisingly gives an error. However, making a simple change in the formula, as explained in the book, fixes the problem:

[Walnut]$ eval test "?msd_trib Ei,n (n >= 1) & Aj (j<=n) => TR[i+j]=TR[i+j+n]";
computed ~:1 states - 2ms
computed ~:2 states - 1ms
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.ArrayList.grow(ArrayList.java:239)
	at java.base/java.util.ArrayList.grow(ArrayList.java:244)
	at java.base/java.util.ArrayList.add(ArrayList.java:454)
	at java.base/java.util.ArrayList.add(ArrayList.java:467)
	at Automata.Automaton.subsetConstruction(Automaton.java:3477)
	at Automata.Automaton.minimize_valmari(Automaton.java:281)
	at Automata.Automaton.minimize(Automaton.java:3022)
	at Automata.Automaton.quantifyHelper(Automaton.java:845)
	at Automata.Automaton.quantify(Automaton.java:743)
	at Token.LogicalOperator.actQuantifier(LogicalOperator.java:144)
	at Token.LogicalOperator.act(LogicalOperator.java:59)
	at Main.Computer.compute(Computer.java:111)
	at Main.Computer.(Computer.java:49)
	at Main.Prover.eval_def_commands(Prover.java:466)
	at Main.Prover.dispatch(Prover.java:328)
	at Main.Prover.readBuffer(Prover.java:280)
	at Main.Prover.run(Prover.java:234)
	at Main.Prover.main(Prover.java:201)
(base) Bjrns-MacBook-Pro:bin bjoernkjos-hanssen$ java Main/Prover

[Walnut]$ eval test "?msd_trib Ei,n (n >= 1) & Aj (i <= j & j <= i+n) => TR[j]=TR[j+n]";
computed ~:1 states - 9ms
computed ~:2 states - 1ms
____
TRUE

[Walnut]$

Actually, the Tribonacci word is
$0, 1, 0, 2, 0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 1, 0\dots$
so finding an overlap should be trivial. The situation is explained on page 111.
When we consider TR[i+j+n] the extra variable compared to TR[j+n] leads to an exponentially larger automaton being generated. Thus, this is not really the implementation of Walnut’s fault but a feature of the underlying algorithm.

We can ponder which prefixes of the infinite Fibonacci word are of the form $xyxyx z uvuvu$.
Arithmetically this says that we have a number $n$ (the length) and numbers $l_x, l_y>0, l_z>0, l_u, l_v>0$ such that they add up to $n$ in the right way, and all the appropriate agreements occur. This is related to automatic complexity of Fibonacci words. A simpler question is which prefixes are squares.

[Walnut]$ eval test "?msd_fib Elx (lx+lx=n & At (t F[t]=F[t+lx])";

[Walnut]$

This produces the following automaton which in Fibonacci representation gives the lengths of the prefixes of the infinite Fibonacci word that are squares. They are 6, 10, and in general any number of the form $F_{k+3}+F_k)$.
We see here the usefulness of the Fibonacci representation, since the word is not 2-automatic. If we run the same command without ?msd_fib the result is a 1-state automaton with a self-loop labeled 0.

test
A VS Code extension for Walnut would be great, as editing formulas is cumbersome.

In the Thue sequence $t_0t_1t_2\dots=011010011001…$, each $t_n$ is the parity of the number of 1s in the binary representation $(n)_2$ of $n$. This leads to a 2-state DFAO (deterministic finite automaton with output) that reads $(n)_2$ and keeps track of the parity so far. This is an prototypical example of an automatic sequence.

The Fibonacci word $f_0f_1f_2\dots = 01001010$ where $f_n=f_{n-1}f_{n-2}$ can be handled similarly using the Fibonacci representation of numbers. In it, we write $n$ as $a_1\cdot 1 + a_2\cdot 2 + a_3\cdot 3+a_4\cdot 5+a_5\cdot 8+\dots$ where $1,2,3,5,8,\dots$ are Fibonacci numbers and each $a_i\in\{0,1\}$. It is unique provided that no two consecutive 1s in the representation are allowed.

These two examples already hint at a rich theory of automatic sequences with respect to various representation systems. Such a theory has been established in the old-fashioned manner of mathematicians over the years. Now, due to the decidability of Presburger arithmetic and some of its extensions, and a computer implementation thereof called Walnut and due to Hamoon Mousavi, researchers in the area are able to programmatically answer many research questions and verify or correct results from the literature.

As a theoretical backbone for the approach, it is shown that a sequence is $k$-automatic if and only if it is first order definable in the structure $(\mathbb N, +, V_k)$, where $V_k(n)$ is the greatest power of $k$ that divides $n$.

The author emphasizes that Walnut computations are not formally guaranteed to be correct; however, experience has lead to large and growing confidence in its results. As a piece of software Walnut is fairly easy to use and install.

The underlying algorithm has bad worst-case complexity and this affects the use of Walnut in interesting ways. When we consider the Tribonacci word in position $i+j+n$, \texttt{TR[i+j+n]}, the three variables lead to an exponentially larger automaton being generated than if we had stuck with two, and the program crashing. Rewriting the query to use \texttt{TR[j+n]} fixes the problem, as explained in the book. Thus, this program crash is not really the implementation of Walnut’s fault but an aspect of the underlying algorithm.

The topics covered include squares, palindromes, and a wide variety of notions from combinatorics on words. The book is very well-written with few errors, and many easily stated open problems. The many entertaining examples of English words exemplifying the concepts introduced can even be appreciated by non-mathematicians: for example, the word “distinguishing” is a concatenation of two equal-length words differ in only two positions.