# A computable measure of irregularity

My book published by de Gruyter is scheduled for March 12, 2024.

Notes:

Theorem 2.28 (The Thue word is overlap-free) is also Theorem 1.3.1 in Shallit’s new book “The logical approach to automatic sequences”. It is proved in Walnut as follows:

[Walnut]$ eval test "Ei,n (n >= 1) & Aj (j<=n) => T[i+j]=T[i+j+n]"; computed ~:1 states - 11ms computed ~:2 states - 1ms _____ FALSE [Walnut]$

Theorem 2.36 (overlap-free words have an $A_N$-deficiency bound of 1) uses Theorem 2.25, one direction of which is that overlap-free words can only contain squares whose lengths are of the form $2^a$ or $3\cdot 2^a$, $a\ge 1$.

If that is true then the Thue word better only have squares of such lengths.

After running:

[Walnut]$ eval thuesquare "n >= 1 & Ei Aj (jT[i+j]=T[i+j+n]"; [Walnut]$

the file

thuesquare.gv

contains some code that we further process with

brew install graphviz

(which takes a long time, and produces many error messages, but not to worry)

and then from the MacOS command line,

$ dot -Tpdf thuesquare.gv > thuesquare.pdf

The PDF looks like this:

We can ignoring the arrow from state 0 to state 0 because it corresponds to initial zeros in the binary representation of the length $n$. Then we accept any word of the form $10^n$ or $110^n$ which is binary representation of $2^n$ and $3\cdot 2^n$, respectively. Phew!

In Lemma 2.29 it is mentioned that the Thue-Morse sequence has no infinite arithmetic progressions.

An infinite arithmetic progression would be characterized by $a$ and $b$ such that

$T[a*k+b]=1$ for all sufficiently large $k$. Since multiplication is not included in the `Walnut`

language, it is not clear that we can express this. But we can express $T[2*k+b]=1$:

eval test "Ei Aj (j>=i) => T[j+j+n]=T[j+j+2+n]";

Indeed, `Walnut`

reports that this cannot happen, as follows:

It is easier to do it like this:

[Walnut]$ eval test "En Ei Aj (j>=i) => T[j+j+n]=T[j+j+2+n]"; _____ FALSE [Walnut]$

If we try to multiply two variables we get the following trouble:

[Walnut]$ eval test "En Ei Aj (j>=i) => T[k*j+n]=T[k*(j+1)+n]"; java.lang.Exception: the operator * cannot be applied to two variables at Automata.NumberSystem.arithmetic(NumberSystem.java:791) at Token.ArithmeticOperator.act(ArithmeticOperator.java:188) at Main.Computer.compute(Computer.java:111) at Main.Computer.(Computer.java:49) at Main.Prover.eval_def_commands(Prover.java:466) at Main.Prover.dispatch(Prover.java:328) at Main.Prover.readBuffer(Prover.java:280) at Main.Prover.run(Prover.java:234) at Main.Prover.main(Prover.java:201) the operator * cannot be applied to two variables : char at 22 : eval test "En Ei Aj (j>=i) => T[k*j+n]=T[k*(j+1)+n]"; [Walnut]$

But we can check arithmetic sequences with longer and longer periods:

[Walnut]$ eval test "En Ei Aj (j>=i) => T[3*j+n]=T[3*(j+1)+n]"; _____ FALSE [Walnut]$ eval test "En Ei Aj (j>=i) => T[4*j+n]=T[4*(j+1)+n]"; _____ FALSE [Walnut]$

Let us now check if the infinite Fibonacci word is overlap-free.

[Walnut]$ eval test "Ei,n (n >= 1) & Aj (j<=n) => F[i+j]=F[i+j+n]"; _____ FALSE [Walnut]$

This seems to say there are no overlaps, which is false since the Fibonacci word starts:

$0, 1, 0, 0, 1, 0, 1, 0$ which has overlapping occurrences of $010$.

However, on page 136 it is suggested to add `?msd_fib`

so let us try that:

[Walnut]$ eval test "?msd_fib Ei,n (n >= 1) & Aj (j<=n) => F[i+j]=F[i+j+n]"; ____ TRUE [Walnut]$

Indeed, it is explained on page 108 that “you may need to add a `?msd`

specification if your sequence is automatic for other than base 2″.

Now we can check whether the infinite Tribonacci word contains any overlaps:

This takes a while and surprisingly gives an error. However, making a simple change in the formula, as explained in the book, fixes the problem:

[Walnut]$ eval test "?msd_trib Ei,n (n >= 1) & Aj (j<=n) => TR[i+j]=TR[i+j+n]"; computed ~:1 states - 2ms computed ~:2 states - 1ms Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.base/java.util.ArrayList.grow(ArrayList.java:239) at java.base/java.util.ArrayList.grow(ArrayList.java:244) at java.base/java.util.ArrayList.add(ArrayList.java:454) at java.base/java.util.ArrayList.add(ArrayList.java:467) at Automata.Automaton.subsetConstruction(Automaton.java:3477) at Automata.Automaton.minimize_valmari(Automaton.java:281) at Automata.Automaton.minimize(Automaton.java:3022) at Automata.Automaton.quantifyHelper(Automaton.java:845) at Automata.Automaton.quantify(Automaton.java:743) at Token.LogicalOperator.actQuantifier(LogicalOperator.java:144) at Token.LogicalOperator.act(LogicalOperator.java:59) at Main.Computer.compute(Computer.java:111) at Main.Computer.(Computer.java:49) at Main.Prover.eval_def_commands(Prover.java:466) at Main.Prover.dispatch(Prover.java:328) at Main.Prover.readBuffer(Prover.java:280) at Main.Prover.run(Prover.java:234) at Main.Prover.main(Prover.java:201) (base) Bjrns-MacBook-Pro:bin bjoernkjos-hanssen$ java Main/Prover [Walnut]$ eval test "?msd_trib Ei,n (n >= 1) & Aj (i <= j & j <= i+n) => TR[j]=TR[j+n]"; computed ~:1 states - 9ms computed ~:2 states - 1ms ____ TRUE [Walnut]$

Actually, the Tribonacci word is

$0, 1, 0, 2, 0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 1, 0\dots$

so finding an overlap should be trivial. The situation is explained on page 111.

When we consider `TR[i+j+n]`

the extra variable compared to `TR[j+n]`

leads to an exponentially larger automaton being generated. Thus, this is not really the implementation of Walnut’s fault but a feature of the underlying algorithm.

We can ponder which prefixes of the infinite Fibonacci word are of the form $xyxyx z uvuvu$.

Arithmetically this says that we have a number $n$ (the length) and numbers $l_x, l_y>0, l_z>0, l_u, l_v>0$ such that they add up to $n$ in the right way, and all the appropriate agreements occur. This is related to automatic complexity of Fibonacci words. A simpler question is which prefixes are squares.

[Walnut]$ eval test "?msd_fib Elx (lx+lx=n & At (tF[t]=F[t+lx])"; [Walnut]$

This produces the following automaton which in Fibonacci representation gives the lengths of the prefixes of the infinite Fibonacci word that are squares. They are 6, 10, and in general any number of the form $F_{k+3}+F_k)$.

We see here the usefulness of the Fibonacci representation, since the word is not 2-automatic. If we run the same command without `?msd_fib`

the result is a 1-state automaton with a self-loop labeled 0.

test

A VS Code extension for Walnut would be great, as editing formulas is cumbersome.

In the Thue sequence $t_0t_1t_2\dots=011010011001…$, each $t_n$ is the parity of the number of 1s in the binary representation $(n)_2$ of $n$. This leads to a 2-state DFAO (deterministic finite automaton with output) that reads $(n)_2$ and keeps track of the parity so far. This is an prototypical example of an automatic sequence.

The Fibonacci word $f_0f_1f_2\dots = 01001010$ where $f_n=f_{n-1}f_{n-2}$ can be handled similarly using the Fibonacci representation of numbers. In it, we write $n$ as $a_1\cdot 1 + a_2\cdot 2 + a_3\cdot 3+a_4\cdot 5+a_5\cdot 8+\dots$ where $1,2,3,5,8,\dots$ are Fibonacci numbers and each $a_i\in\{0,1\}$. It is unique provided that no two consecutive 1s in the representation are allowed.

These two examples already hint at a rich theory of automatic sequences with respect to various representation systems. Such a theory has been established in the old-fashioned manner of mathematicians over the years. Now, due to the decidability of Presburger arithmetic and some of its extensions, and a computer implementation thereof called Walnut and due to Hamoon Mousavi, researchers in the area are able to programmatically answer many research questions and verify or correct results from the literature.

As a theoretical backbone for the approach, it is shown that a sequence is $k$-automatic if and only if it is first order definable in the structure $(\mathbb N, +, V_k)$, where $V_k(n)$ is the greatest power of $k$ that divides $n$.

The author emphasizes that Walnut computations are not formally guaranteed to be correct; however, experience has lead to large and growing confidence in its results. As a piece of software Walnut is fairly easy to use and install.

The underlying algorithm has bad worst-case complexity and this affects the use of Walnut in interesting ways. When we consider the Tribonacci word in position $i+j+n$, \texttt{TR[i+j+n]}, the three variables lead to an exponentially larger automaton being generated than if we had stuck with two, and the program crashing. Rewriting the query to use \texttt{TR[j+n]} fixes the problem, as explained in the book. Thus, this program crash is not really the implementation of Walnut’s fault but an aspect of the underlying algorithm.

The topics covered include squares, palindromes, and a wide variety of notions from combinatorics on words. The book is very well-written with few errors, and many easily stated open problems. The many entertaining examples of English words exemplifying the concepts introduced can even be appreciated by non-mathematicians: for example, the word “distinguishing” is a concatenation of two equal-length words differ in only two positions.

# Math 100 student’s protein folding research

Madison Koskey had the highest score in the class on this project task:

Fold an idealized protein known as S64 to obtain a highest possible score in the hydrophobic-polar protein folding model in three dimensions.

This problem was studied by Lesh, Mitzenmacher, and Whitesides (2003).

Above we see Madison’s intricate folding of S64 which earned a score of 27.

To try your hand at rotating it in 3D, go directly to this link.

# Recent PhDs

#### Three new papers

with my recent PhDs Birns (2023, left) and Webb (2022, right). Webb is now an Assistant Professor at Chaminade University, and Birns is a data scientist at Pacific Life in California.- KL-randomness and effective dimension under strong reducibility (with David J. Webb). Computability in Europe, Lectures Notes in Computer Science, 2021.
- On the degrees of constructively immune sets (with Samuel D. Birns). Computability in Europe, Lectures Notes in Computer Science, 2021.
- Strong Medvedev reducibilities and the KL-randomness problem (with David J. Webb). Computability in Europe, Lectures Notes in Computer Science, 2022.

# Philosophy for Children P4C

# Protein folding and Hawaiian language in Math 100

# The number of languages with maximum state complexity

Lei Liu completed her Master’s degree with the project title *Complexity of Options* in 2017.

An extension to monotone options (pictured) was presented at

ALH-2018. The new paper is called **The number of languages with maximum state complexity** and has been accepted for TAMC 2019.

As of 2022, the paper has been through 7 revisions and has been accepted for publication in the journal Algebra Universalis.

Some of the 168 monotone Boolean functions of 4 variables

- Slides for ALH 2018
- Slides for TAMC 2019

# Master’s projects in Lean

- Hugh Chou, 2021: Formalizing my paper on a conflict in the Carmo and Jones approach to contrary-to-duty deontic obligations.
- Ryan T. Sasaki, 2022: Formalizing a time-invariance theorem for Redington immunization in financial mathematics.