- First-Derivative Test (2024, with Patrick Massot and Floris van Doorn)
- Equivalence between one-point compactification of $\mathbb R$ and the projective line (2024, with Oliver Nash)
Lean blueprint projects
- Marginis – formalia marginalia for the Journal of Logic and Analysis
- Protein folding – formalizations related to the paper Nonmonotonicity of protein folding under concatenation
- Automatic complexity – formalized exercise solutions for my book
- Dyadic deontic logic – formalized results from a paper in preparation (with Rachelle Edwards)
- Jaccard – formalization research paper Interpolating between the Jaccard distance and an analogue of the normalized information distance (Journal of Logic and Computation, 2022)
Notes on Shallit’s new book
[Walnut]$ eval test "Ei,n (n >= 1) & Aj (j<=n) => T[i+j]=T[i+j+n]"; computed ~:1 states - 11ms computed ~:2 states - 1ms _____ FALSE [Walnut]$Theorem 2.36 (overlap-free words have an $A_N$-deficiency bound of 1) uses Theorem 2.25, one direction of which is that overlap-free words can only contain squares whose lengths are of the form $2^a$ or $3\cdot 2^a$, $a\ge 1$. If that is true then the Thue word better only have squares of such lengths. After running:
[Walnut]$ eval thuesquare "n >= 1 & Ei Aj (jthe fileT[i+j]=T[i+j+n]"; [Walnut]$
thuesquare.gvcontains some code that we further process with
brew install graphviz(which takes a long time, and produces many error messages, but not to worry) and then from the MacOS command line,
$ dot -Tpdf thuesquare.gv > thuesquare.pdfThe PDF looks like this: We can ignoring the arrow from state 0 to state 0 because it corresponds to initial zeros in the binary representation of the length $n$. Then we accept any word of the form $10^n$ or $110^n$ which is binary representation of $2^n$ and $3\cdot 2^n$, respectively. Phew! In Lemma 2.29 it is mentioned that the Thue-Morse sequence has no infinite arithmetic progressions. An infinite arithmetic progression would be characterized by $a$ and $b$ such that $T[a*k+b]=1$ for all sufficiently large $k$. Since multiplication is not included in the
Walnut
language, it is not clear that we can express this. But we can express $T[2*k+b]=1$:
eval test "Ei Aj (j>=i) => T[j+j+n]=T[j+j+2+n]";Indeed,
Walnut
reports that this cannot happen, as follows:
It is easier to do it like this:
[Walnut]$ eval test "En Ei Aj (j>=i) => T[j+j+n]=T[j+j+2+n]"; _____ FALSE [Walnut]$If we try to multiply two variables we get the following trouble:
[Walnut]$ eval test "En Ei Aj (j>=i) => T[k*j+n]=T[k*(j+1)+n]"; java.lang.Exception: the operator * cannot be applied to two variables at Automata.NumberSystem.arithmetic(NumberSystem.java:791) at Token.ArithmeticOperator.act(ArithmeticOperator.java:188) at Main.Computer.compute(Computer.java:111) at Main.Computer.But we can check arithmetic sequences with longer and longer periods:(Computer.java:49) at Main.Prover.eval_def_commands(Prover.java:466) at Main.Prover.dispatch(Prover.java:328) at Main.Prover.readBuffer(Prover.java:280) at Main.Prover.run(Prover.java:234) at Main.Prover.main(Prover.java:201) the operator * cannot be applied to two variables : char at 22 : eval test "En Ei Aj (j>=i) => T[k*j+n]=T[k*(j+1)+n]"; [Walnut]$
[Walnut]$ eval test "En Ei Aj (j>=i) => T[3*j+n]=T[3*(j+1)+n]"; _____ FALSE [Walnut]$ eval test "En Ei Aj (j>=i) => T[4*j+n]=T[4*(j+1)+n]"; _____ FALSE [Walnut]$Let us now check if the infinite Fibonacci word is overlap-free.
[Walnut]$ eval test "Ei,n (n >= 1) & Aj (j<=n) => F[i+j]=F[i+j+n]"; _____ FALSE [Walnut]$This seems to say there are no overlaps, which is false since the Fibonacci word starts: $0, 1, 0, 0, 1, 0, 1, 0$ which has overlapping occurrences of $010$. However, on page 136 it is suggested to add
?msd_fib
so let us try that:
[Walnut]$ eval test "?msd_fib Ei,n (n >= 1) & Aj (j<=n) => F[i+j]=F[i+j+n]"; ____ TRUE [Walnut]$Indeed, it is explained on page 108 that “you may need to add a
?msd
specification if your sequence is automatic for other than base 2″.
Now we can check whether the infinite Tribonacci word contains any overlaps:
This takes a while and surprisingly gives an error. However, making a simple change in the formula, as explained in the book, fixes the problem:
[Walnut]$ eval test "?msd_trib Ei,n (n >= 1) & Aj (j<=n) => TR[i+j]=TR[i+j+n]"; computed ~:1 states - 2ms computed ~:2 states - 1ms Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.base/java.util.ArrayList.grow(ArrayList.java:239) at java.base/java.util.ArrayList.grow(ArrayList.java:244) at java.base/java.util.ArrayList.add(ArrayList.java:454) at java.base/java.util.ArrayList.add(ArrayList.java:467) at Automata.Automaton.subsetConstruction(Automaton.java:3477) at Automata.Automaton.minimize_valmari(Automaton.java:281) at Automata.Automaton.minimize(Automaton.java:3022) at Automata.Automaton.quantifyHelper(Automaton.java:845) at Automata.Automaton.quantify(Automaton.java:743) at Token.LogicalOperator.actQuantifier(LogicalOperator.java:144) at Token.LogicalOperator.act(LogicalOperator.java:59) at Main.Computer.compute(Computer.java:111) at Main.Computer.Actually, the Tribonacci word is $0, 1, 0, 2, 0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 1, 0\dots$ so finding an overlap should be trivial. The situation is explained on page 111. When we consider(Computer.java:49) at Main.Prover.eval_def_commands(Prover.java:466) at Main.Prover.dispatch(Prover.java:328) at Main.Prover.readBuffer(Prover.java:280) at Main.Prover.run(Prover.java:234) at Main.Prover.main(Prover.java:201) (base) Bjrns-MacBook-Pro:bin bjoernkjos-hanssen$ java Main/Prover [Walnut]$ eval test "?msd_trib Ei,n (n >= 1) & Aj (i <= j & j <= i+n) => TR[j]=TR[j+n]"; computed ~:1 states - 9ms computed ~:2 states - 1ms ____ TRUE [Walnut]$
TR[i+j+n]
the extra variable compared to TR[j+n]
leads to an exponentially larger automaton being generated. Thus, this is not really the implementation of Walnut’s fault but a feature of the underlying algorithm.
We can ponder which prefixes of the infinite Fibonacci word are of the form $xyxyx z uvuvu$.
Arithmetically this says that we have a number $n$ (the length) and numbers $l_x, l_y>0, l_z>0, l_u, l_v>0$ such that they add up to $n$ in the right way, and all the appropriate agreements occur. This is related to automatic complexity of Fibonacci words. A simpler question is which prefixes are squares.
[Walnut]$ eval test "?msd_fib Elx (lx+lx=n & At (tThis produces the following automaton which in Fibonacci representation gives the lengths of the prefixes of the infinite Fibonacci word that are squares. They are 6, 10, and in general any number of the form $F_{k+3}+F_k)$. We see here the usefulness of the Fibonacci representation, since the word is not 2-automatic. If we run the same command withoutF[t]=F[t+lx])"; [Walnut]$
?msd_fib
the result is a 1-state automaton with a self-loop labeled 0.
test
A VS Code extension for Walnut would be great, as editing formulas is cumbersome.
In the Thue sequence $t_0t_1t_2\dots=011010011001…$, each $t_n$ is the parity of the number of 1s in the binary representation $(n)_2$ of $n$. This leads to a 2-state DFAO (deterministic finite automaton with output) that reads $(n)_2$ and keeps track of the parity so far. This is an prototypical example of an automatic sequence. The Fibonacci word $f_0f_1f_2\dots = 01001010$ where $f_n=f_{n-1}f_{n-2}$ can be handled similarly using the Fibonacci representation of numbers. In it, we write $n$ as $a_1\cdot 1 + a_2\cdot 2 + a_3\cdot 3+a_4\cdot 5+a_5\cdot 8+\dots$ where $1,2,3,5,8,\dots$ are Fibonacci numbers and each $a_i\in\{0,1\}$. It is unique provided that no two consecutive 1s in the representation are allowed. These two examples already hint at a rich theory of automatic sequences with respect to various representation systems. Such a theory has been established in the old-fashioned manner of mathematicians over the years. Now, due to the decidability of Presburger arithmetic and some of its extensions, and a computer implementation thereof called Walnut and due to Hamoon Mousavi, researchers in the area are able to programmatically answer many research questions and verify or correct results from the literature. As a theoretical backbone for the approach, it is shown that a sequence is $k$-automatic if and only if it is first order definable in the structure $(\mathbb N, +, V_k)$, where $V_k(n)$ is the greatest power of $k$ that divides $n$. The author emphasizes that Walnut computations are not formally guaranteed to be correct; however, experience has lead to large and growing confidence in its results. As a piece of software Walnut is fairly easy to use and install. The underlying algorithm has bad worst-case complexity and this affects the use of Walnut in interesting ways. When we consider the Tribonacci word in position $i+j+n$, \texttt{TR[i+j+n]}, the three variables lead to an exponentially larger automaton being generated than if we had stuck with two, and the program crashing. Rewriting the query to use \texttt{TR[j+n]} fixes the problem, as explained in the book. Thus, this program crash is not really the implementation of Walnut’s fault but an aspect of the underlying algorithm. The topics covered include squares, palindromes, and a wide variety of notions from combinatorics on words. The book is very well-written with few errors, and many easily stated open problems. The many entertaining examples of English words exemplifying the concepts introduced can even be appreciated by non-mathematicians: for example, the word “distinguishing” is a concatenation of two equal-length words differ in only two positions.
Formal marginalia in computability theory
A talk I gave at the ASL Annual Meeting in Ames, Iowa, May 15, 2024.
Slides:
ASL-2024-slides-Kjos-Hanssen
Automatic complexity of distinguishing
My book “Automatic complexity: a computable measure of irregularity” was published by de Gruyter on February 19, 2024. It concerns a version of Sipser’s complexity of distinguishing that was introduced by Shallit and Wang in 2001 and has been developed by me and others since 2013.
Erratum
Theorem 6.6: Two constructions $M_1\times_1 M_2$ and $M_1\times_2 M_2$ are described. The text suggests that $\times_1$ is used for the theorem, but actually $\times_2$ should be used. The operation $\times_1$ can be used to prove the inequality $A_N(x)\le A_N(x\mid y)\cdot A_N(y)$ instead.
Math 100 student’s protein folding research
Madison Koskey had the highest score in the class on this project task:
Fold an idealized protein known as S64 to obtain a highest possible score in the hydrophobic-polar protein folding model in three dimensions.
This problem was studied by Lesh, Mitzenmacher, and Whitesides (2003).
Above we see Madison’s intricate folding of S64 which earned a score of 27.
To try your hand at rotating it in 3D, go directly to this link.
Recent PhDs
Three new papers
with my recent PhDs Birns (2023, left) and Webb (2022, right). Webb is now an Assistant Professor at Chaminade University, and Birns is a data scientist at Pacific Life in California.- KL-randomness and effective dimension under strong reducibility (with David J. Webb). Computability in Europe, Lectures Notes in Computer Science, 2021.
- On the degrees of constructively immune sets (with Samuel D. Birns). Computability in Europe, Lectures Notes in Computer Science, 2021.
- Strong Medvedev reducibilities and the KL-randomness problem (with David J. Webb). Computability in Europe, Lectures Notes in Computer Science, 2022.